cshalizi + markov_models 194
[1204.3915] Theory and Inference for a Class of Observation-driven Models with Application to Time Series of Counts
5 weeks ago by cshalizi
"This paper studies theory and inference related to a class of time series models that incorporates nonlinear dynamics. It is assumed that the observations follow a one-parameter exponential family of distributions given an accompanying process that evolves as a function of lagged observations. We employ an iterated random function approach and a special coupling technique to show that, under suitable conditions on the parameter space, the conditional mean process is a geometric moment contracting Markov chain and that the observation process is absolutely regular with geometrically decaying coefficients. Moreover the asymptotic theory of the maximum likelihood estimates of the parameters is established under some mild assumptions. These models are applied to two examples; the first is the number of transactions per minute of Ericsson stock and the second is related to return times of extreme events of Goldman Sachs Group stock."
--- Without reading beyond the abstract, I'm guessing chains with complete connections.
to:NB
time_series
markov_models
statistics
--- Without reading beyond the abstract, I'm guessing chains with complete connections.
5 weeks ago by cshalizi
[1204.2612] Computing bounds for entropy of stationary Z^d Markov random fields
6 weeks ago by cshalizi
"For any stationary $mZ^d$-Gibbs measure that satisfies strong spatial mixing, we obtain sequences of upper and lower approximations that converge to its entropy. In the case, $d=2$, these approximations are efficient in the sense that the approximations are accurate to within $epsilon$ and can be computed in time polynomial in $1/epsilon$."
to:NB
information_theory
markov_models
stochastic_processes
entropy
6 weeks ago by cshalizi
[1204.0608] Mixing times in evolutionary game dynamics
6 weeks ago by cshalizi
"Without mutation and migration, evolutionary dynamics ultimately leads to the extinction of all but one species. Such fixation processes are well understood and can be characterized analytically with methods from statistical physics. However, many biological arguments focus on stationary distributions in a mutation-selection equilibrium. Here, we address the equilibration time required to reach stationarity in the presence of mutation, this is known as the mixing time in the theory of Markov processes. We show that mixing times in evolutionary games have the opposite behaviour from fixation times when the intensity of selection increases: In coordination games with bistabilities, the fixation time decreases, but the mixing time increases. In coexistence games with metastable states, the fixation time increases, but the mixing time decreases. Our results are based on simulations and the WKB approximation of the master equation."
to:NB
evolutionary_game_theory
markov_models
mixing
re:do-institutions-evolve
stochastic_processes
6 weeks ago by cshalizi
[1204.2477] A Simple Explanation of A Spectral Algorithm for Learning Hidden Markov Models
6 weeks ago by cshalizi
"A simple linear algebraic explanation of the algorithm in "A Spectral Algorithm for Learning Hidden Markov Models" (COLT 2009). Most of the content is in Figure 2; the text just makes everything precise in four nearly-trivial claims."
to:NB
to_read
statistics
markov_models
re:AoS_project
spectral_methods
6 weeks ago by cshalizi
Relative Entropy and Exponential Deviation Bounds for General Markov Chains
6 weeks ago by cshalizi
"We develop explicit, general bounds for the prob- ability that the normalized partial sums of a function of a Markov chain on a general alphabet will exceed the steady-state mean of that function by a given amount. Our bounds combine simple information-theoretic ideas together with techniques from optimization and some fairly elementary tools from analysis. In one direction, we obtain a general bound for the important class of Doeblin chains; this bound is optimal, in the sense that in the special case of independent and identically distributed random variables it essentially reduces to the classical Hoeffding bound. In another direction, motivated by important problems in simulation, we develop a series of bounds in a form which is particularly suited to these problems, and which apply to the more general class of “geometrically ergodic” Markov chains."
to:NB
to_read
deviation_bounds
markov_models
stochastic_processes
via:ded-maxim
meyn.sean
kontoyiannis.ioannis
mixing
information_theory
6 weeks ago by cshalizi
Ferré , Hervé , Ledoux : Limit theorems for stationary Markov processes with L2-spectral gap
6 weeks ago by cshalizi
"Let be a discrete or continuous-time Markov process with state space where is an arbitrary measurable set. Its transition semigroup is assumed to be additive with respect to the second component, i.e. is assumed to be a Markov additive process. In particular, this implies that the first component is also a Markov process. Markov random walks or additive functionals of a Markov process are special instances of Markov additive processes. In this paper, the process is shown to satisfy the following classical limit theorems:
(a) the central limit theorem,
(b) the local limit theorem,
(c) the one-dimensional Berry–Esseen theorem,
(d) the one-dimensional first-order Edgeworth expansion,
provided that we have with the expected order α with respect to the independent case (up to some ε > 0 for (c) and (d)). For the statements (b) and (d), a Markov nonlattice condition is also assumed as in the independent case. All the results are derived under the assumption that the Markov process has an invariant probability distribution π, is stationary and has the -spectral gap property (that is, (Xt)t∈ℕ is ρ-mixing in the discrete-time case). The case where is non-stationary is briefly discussed. As an application, we derive a Berry–Esseen bound for the M-estimators associated with ρ-mixing Markov chains."
to:NB
stochastic_processes
markov_models
ergodic_theory
mixing
statistical_inference_for_stochastic_processes
(a) the central limit theorem,
(b) the local limit theorem,
(c) the one-dimensional Berry–Esseen theorem,
(d) the one-dimensional first-order Edgeworth expansion,
provided that we have with the expected order α with respect to the independent case (up to some ε > 0 for (c) and (d)). For the statements (b) and (d), a Markov nonlattice condition is also assumed as in the independent case. All the results are derived under the assumption that the Markov process has an invariant probability distribution π, is stationary and has the -spectral gap property (that is, (Xt)t∈ℕ is ρ-mixing in the discrete-time case). The case where is non-stationary is briefly discussed. As an application, we derive a Berry–Esseen bound for the M-estimators associated with ρ-mixing Markov chains."
6 weeks ago by cshalizi
[1203.0683] A Method of Moments for Mixture Models and Hidden Markov Models
7 weeks ago by cshalizi
"Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm) which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it also constitutes a viable alternative to EM for practical deployment."
Clever: some mixture models can be characterized by expectations, covariances, and third-order mixed moments, so you just need to estimate tensors up to third order, and not very high moments of vectors (which are very noisy) and do some linear algebra. I should probably re-read because I couldn't reproduce this at the board.
in_NB
statistics
estimation
mixture_models
markov_models
state-space_models
have_read
Clever: some mixture models can be characterized by expectations, covariances, and third-order mixed moments, so you just need to estimate tensors up to third order, and not very high moments of vectors (which are very noisy) and do some linear algebra. I should probably re-read because I couldn't reproduce this at the board.
7 weeks ago by cshalizi
[1203.6898] Long-term stability of sequential Monte Carlo methods under verifiable conditions
8 weeks ago by cshalizi
"This paper discusses particle filtering in general hidden Markov models (HMMs) and presents novel theoretical results on the long-term stability of bootstrap-type particle filters. More specifically, we establish that the asymptotic variance of the Monte Carlo estimates produced by the bootstrap filter is uniformly bounded in time. On the contrary to most previous results of this type, which in general presuppose that the state space of the hidden state process is compact (an assumption that is rarely satisfied in practice), our very mild assumptions are satisfied for a large class of HMMs with possibly non-compact state space. In addition, we derive a similar time uniform bound on the asymptotic Lp error. Importantly, our results hold for misspecified models, i.e. we do not at all assume that the data entering into the particle filter originate from the model governing the dynamics of the particles or not even from an HMM."
to:NB
particle_filters
stochastic_processes
time_series
state_estimation
state-space_models
markov_models
statistics
8 weeks ago by cshalizi
[0803.0835] Goodness-of-fit tests for Markovian time series models: Central limit theory and bootstrap approximations
8 weeks ago by cshalizi
"New goodness-of-fit tests for Markovian models in time series analysis are developed which are based on the difference between a fully nonparametric estimate of the one-step transition distribution function of the observed process and that of the model class postulated under the null hypothesis. The model specification under the null allows for Markovian models, the transition mechanisms of which depend on an unknown vector of parameters and an unspecified distribution of i.i.d. innovations. Asymptotic properties of the test statistic are derived and the critical values of the test are found using appropriate bootstrap schemes. General properties of the bootstrap for Markovian processes are derived. A new central limit theorem for triangular arrays of weakly dependent random variables is obtained. For the proof of stochastic equicontinuity of multidimensional empirical processes, we use a simple approach based on an anisotropic tiling of the space. The finite-sample behavior of the proposed test is illustrated by some numerical examples and a real-data application is given."
in_NB
statistics
statistical_inference_for_stochastic_processes
bootstrap
markov_models
goodness-of-fit
8 weeks ago by cshalizi
[1203.5351] Activity driven modeling of dynamic networks
8 weeks ago by cshalizi
"Network modeling plays a critical role in identifying statistical regularities and structural principles common to many systems. The large majority of recent modeling approaches are connectivity driven, in the sense that the structural pattern of the network is at the basis of the mechanisms ruling the network formation. Connectivity driven models necessarily provide a time-aggregated representation that may fail to describe the instantaneous and fluctuating dynamics of many networks. We address this challenge by defining the activity potential, a time invariant function characterizing the agents' interactions in real-world networks and constructing an activity driven model capable of encoding the instantaneous time description of the network dynamics. The model provides an explanation of structural features such as the presence of hubs, which simply originate from the heterogeneous activity of agents. Additionally, we find that diffusive processes in highly dynamical networks can be described analytically in terms of the activity potential, allowing a quantitative discussion of the biases induced by the time-aggregated network representation in the analysis of dynamical processes in evolving networks."
to:NB
network_data_analysis
networks
stochastic_processes
markov_models
transaction_networks
to_read
re:stacs
8 weeks ago by cshalizi
[1203.5930] Large deviations for the empirical measure of Markov renewal processes
8 weeks ago by cshalizi
"A large deviations principle is established for the joint law of the empirical measure and the flow measure of a renewal Markov process on a finite graph. We do not assume any bound on the arrival times, allowing heavy tailed distributions. In particular, the rate functional is in general degenerate (it has a nontrivial set of zeros) and not strictly convex. These features show a behavior highly different from what one may guess with a heuristic Donsker-Varadhan analysis of the problem."
to:NB
large_deviations
markov_models
stochastic_processes
re:almost_none
8 weeks ago by cshalizi
Taylor & Francis Online :: Graphical Diagnostics for Markov Models for Categorical Data - Journal of Computational and Graphical Statistics - Volume 20, Issue 2
8 weeks ago by cshalizi
"Markov models are widely used as a method for describing categorical data that exhibit stationary and nonstationary autocorrelation. However, diagnostic methods are a largely overlooked topic for Markov models. We introduce two types of residuals for this purpose: one for assessing the length of runs between state changes, and the other for assessing the frequency with which the process moves from any given state to the other states. Methods for calculating the sampling distribution of both types of residuals are presented, enabling objective interpretation through graphical summaries. The graphical summaries are formed using a modification of the probability integral transformation that is applicable for discrete data. Residuals from simulated datasets are presented to demonstrate when the model is, and is not, adequate for the data. The two types of residuals are used to highlight inadequacies of a model posed for real data on seabed fauna from the marine environment."
to:NB
visual_display_of_quantitative_information
statistics
markov_models
to_teach:undergrad-ADA
8 weeks ago by cshalizi
[1203.6130] Spectral dimensionality reduction for HMMs
8 weeks ago by cshalizi
"Hidden Markov Models (HMMs) can be accurately approximated using co-occurrence frequencies of pairs and triples of observations by using a fast spectral method in contrast to the usual slow methods like EM or Gibbs sampling. We provide a new spectral method which significantly reduces the number of model parameters that need to be estimated, and generates a sample complexity that does not depend on the size of the observation vocabulary. We present an elementary proof giving bounds on the relative accuracy of probability estimates from our model. (Correlaries show our bounds can be weakened to provide either L1 bounds or KL bounds which provide easier direct comparisons to previous work.) Our theorem uses conditions that are checkable from the data, instead of putting conditions on the unobservable Markov transition matrix."
to:NB
to_read
markov_models
statistics
machine_learning
dimension_reduction
re:AoS_project
spectral_clustering
8 weeks ago by cshalizi
[1203.6432] Equilibrium states and invariant measures for random dynamical systems
8 weeks ago by cshalizi
"The existence of invariant Borel probability measures for random dynamical systems on complete metric spaces is proved under assumptions that the systems have countably many maps and admit finite Markov partitions such that the resulting Markov systems are uniformly continuous and contractive, and satisfy some integrability condition in the infinite case. A one-to-one map between these measures and equilibrium states associated with such systems is established. Some properties of the map and the measures are given."
to:NB
stochastic_processes
dynamical_systems
markov_models
ergodic_theory
8 weeks ago by cshalizi
Kaiser , Lahiri , Nordman : Goodness of fit tests for a class of Markov random field models
10 weeks ago by cshalizi
"This paper develops goodness of fit statistics that can be used to formally assess Markov random field models for spatial data, when the model distributions are discrete or continuous and potentially parametric. Test statistics are formed from generalized spatial residuals which are collected over groups of nonneighboring spatial observations, called concliques. Under a hypothesized Markov model structure, spatial residuals within each conclique are shown to be independent and identically distributed as uniform variables. The information from a series of concliques can be then pooled into goodness of fit statistics. Under some conditions, large sample distributions of these statistics are explicitly derived for testing both simple and composite hypotheses, where the latter involves additional parametric estimation steps. The distributional results are verified through simulation, and a data example illustrates the method for model assessment."
to:NB
to_read
statistics
spatial_statistics
random_fields
goodness-of-fit
hypothesis_testing
re:stacs
markov_models
10 weeks ago by cshalizi
[1203.2035] A Noether Theorem for Markov Processes
11 weeks ago by cshalizi
"Noether's theorem links the symmetries of a quantum system with its conserved quantities, and is a cornerstone of quantum mechanics. Here we prove a version of Noether's theorem for Markov processes. In quantum mechanics, an observable commutes with the Hamiltonian if and only if its expected value remains constant in time for every state. For Markov processes that no longer holds, but an observable commutes with the Hamiltonian if and only if both its expected value and standard deviation are constant in time for every state."
--- For "Hamiltonian" of a Markov process, read "generator".
to:NB
stochastic_processes
markov_models
noethers_theorem
baez.john
re:almost_none
have_read
--- For "Hamiltonian" of a Markov process, read "generator".
11 weeks ago by cshalizi
[1203.1199] Lagrangian and Hamiltonian Feynman formulae for some Feller semigroups and their perturbations
11 weeks ago by cshalizi
"A Feynman formula is a representation of a solution of an initial (or initial-boundary) value problem for an evolution equation (or, equivalently, a representation of the semigroup resolving the problem) by a limit of $n$-fold iterated integrals of some elementary functions as $ntoinfty$. In this note we obtain some Feynman formulae for a class of semigroups associated with Feller processes. Finite dimensional integrals in the Feynman formulae give approximations for functional integrals in some Feynman--Kac formulae corresponding to the underlying processes. Hence, these Feynman formulae give an effective tool to calculate functional integrals with respect to probability measures generated by these Feller processes and, in particular, to obtain simulations of Feller processes."
to:NB
stochastic_processes
markov_models
path_integrals
11 weeks ago by cshalizi
[1202.2341] Measure concentration through non-Lipschitz observables and functional inequalities
12 weeks ago by cshalizi
"Non-Gaussian concentration estimates are obtained for invariant probability measures of reversible Markov processes. We show that the functional inequalities approach combined with a suitable Lyapunov condition allows us to circumvent the classical Lipschitz assumption of the observables. Our method is general and covers diffusions as well as pure-jump Markov processes on unbounded spaces."
in_NB
concentration_of_measure
markov_models
stochastic_processes
12 weeks ago by cshalizi
[0810.2123] Forgetting of the initial distribution for non-ergodic Hidden Markov Chains
12 weeks ago by cshalizi
"In this paper, the forgetting of the initial distribution for a non-ergodic Hidden Markov Models (HMM) is studied. A new set of conditions is proposed to establish the forgetting property of the filter, which significantly extends all the existing results. Both a pathwise-type convergence of the total variation distance of the filter started from two different initial distributions, and a convergence in expectation are considered. The results are illustrated using generic models of non-ergodic HMM and extend all the results known so far."
to:NB
filtering
markov_models
state_estimation
stochastic_processes
12 weeks ago by cshalizi
Online Learning with Hidden Markov Models
february 2012 by cshalizi
"We present an online version of the expectation-maximization (EM) algorithm for hidden Markov models (HMMs). The sufficient statistics required for parameters estimation is computed recursively with time, that is, in an online way instead of using the batch forward-backward procedure. This computational scheme is generalized to the case where the model parameters can change with time by introducing a discount factor into the recurrence relations. The resulting algorithm is equivalent to the batch EM algorithm, for appropriate discount factor and scheduling of parameters update. On the other hand, the online algorithm is able to deal with dynamic environments, i.e., when the statistics of the observed data is changing with time. The implications of the online algorithm for probabilistic modeling in neuroscience are briefly discussed."
to:NB
markov_models
filtering
state_estimation
statistics
em_algorithm
february 2012 by cshalizi
[1201.6307] Statistical convergence of Markov experiments to diffusion limits
february 2012 by cshalizi
"Assume that one observes the $k$-th, $2k$-th, ...., $nk$-th value of a Markov chain $X_{1,h},...,X_{nk,h}$. That means we assume that a high frequency Markov chain runs in the background on a very fine time grid but that it is only observed on a coarser grid. This asymptotics reflects a set up occurring in the high frequency statistical analysis for financial data where diffusion approximations are used only for coarser time scales. In this paper we show that under appropriate conditions the L$_1$-distance between the joint distribution of the Markov chain and the distribution of the discretized diffusion limit converges to zero. The result implies that the LeCam deficiency distance between the statistical Markov experiment and its diffusion limit converges to zero. This result can be applied to Euler approximations for the joint distribution of diffusions observed at points $Delta, 2 Delta, ,,,, nDelta$. The joint distribution can be approximated by generating Euler approximations at the points $Delta k^{-1}, 2 Delta k^{-1}, ,,,, nDelta$. Our result implies that under our regularity conditions the Euler approximation is consistent for $n to infty$ if $nk^{-2}to 0$."
in_NB
convergence_of_stochastic_processes
markov_models
stochastic_processes
stochastic_differential_equations
re:almost_none
february 2012 by cshalizi
[1201.6381] Fluctuation relations: a pedagogical overview
february 2012 by cshalizi
"The fluctuation relations have received considerable attention since their emergence and development in the 1990s. We present a summary of the main results and suggest ways to interpret this material. Starting with a consideration of the under-determined time evolution of a simple open system, formulated using continuous Markovian stochastic dy- namics, an expression for the entropy generated over a time interval is developed in terms of the probability of observing a trajectory associated with a prescribed driving protocol, and the probability of its time-reverse. This forms the basis for a general theoretical description of non-equilibrium thermodynamic pro- cesses. Having established a connection between entropy production and an inequivalence in probability for forward and time-reversed events, we proceed in the manner of Sekimoto and Seifert, in particular, to derive results in stochastic thermodynamics: a description of the evolution of a system between equilibrium states that ties in with well-established thermodynamic expectations. We derive fluctuation relations, state conditions for their validity, and illustrate their op- eration in some simple cases, thereby providing some introductory insight into the various celebrated symmetry relations that have emerged in this field."
to:NB
non-equilibrium
statistical_mechanics
stochastic_processes
markov_models
re:almost_none
thermodynamics
february 2012 by cshalizi
[1201.4579] Limit theorems for stationary Markov processes with L2-spectral gap
january 2012 by cshalizi
"Let $(X_t, Y_t)_{tin T}$ be a discrete or continuous-time Markov process with state space $X times R^d$ where $X$ is an arbitrary measurable set. Its transition semigroup is assumed to be additive with respect to the second component, i.e. $(X_t, Y_t)_{tin T}$ is assumed to be a Markov additive process. In particular, this implies that the first component $(X_t)_{tin T}$ is also a Markov process. Markov random walks or additive functionals of a Markov process are special instances of Markov additive processes. In this paper, the process $(Y_t)_{tin T}$ is shown to satisfy the following classical limit theorems: (a) the central limit theorem, (b) the local limit theorem, (c) the one-dimensional Berry-Esseen theorem, (d) the one-dimensional first-order Edgeworth expansion, provided that we have sup{tin(0,1]cap T : E{pi,0}[|Y_t| ^{alpha}] < 1 with the expected order with respect to the independent case (up to some $varepsilon > 0$ for (c) and (d)). For the statements (b) and (d), a Markov nonlattice condition is also assumed as in the independent case. All the results are derived under the assumption that the Markov process $(X_t)_{tin T}$ has an invariant probability distribution $pi$, is stationary and has the $L^2(pi)$-spectral gap property (that is, $(X_t)tin N}$ is $rho$-mixing in the discrete-time case). The case where $(X_t)_{tin T}$ is non-stationary is briefly discussed. As an application, we derive a Berry-Esseen bound for the M-estimators associated with $rho$-mixing Markov chains."
in_NB
markov_models
stochastic_processes
central_limit_theorem
mixing
ergodic_theory
january 2012 by cshalizi
[1201.3569] Exponential Concentration Inequalities for Additive Functionals of Markov Chains
january 2012 by cshalizi
"Using the renewal approach we prove exponential inequalities for additive functionals and empirical processes of ergodic Markov chains, thus obtaining counterparts of inequalities for sums of independent random variables. The inequalities do not require functions of the chain to be bounded and moreover have all the constants accessible whenever the usual drift condition holds, which is crucial for practical applications e.g. in MCMC algorithms."
in_NB
stochastic_processes
empirical_processes
markov_models
concentration_of_measure
january 2012 by cshalizi
[1201.2265] Hoeffding's inequalities for geometrically ergodic Markov chains on general state space
january 2012 by cshalizi
We consider Markov chain with spectral gap in $L^2$ space. Assume that $f$ is a bounded function. Then the probabilities of large deviations of average along trajectory satisfy Hoeffding's-type inequalities. These bounds depend only on the stationary mean, spectral gap and the end-points of support of $f$.
to:NB
deviation_bounds
markov_models
stochastic_processes
january 2012 by cshalizi
[1201.2256] Empirical Processes of Markov Chains and Dynamical Systems Indexed by Classes of Functions
january 2012 by cshalizi
"We study weak convergence of empirical processes of dependent data, indexed by classes of functions. We obtain results that are especially suitable for data arising from dynamical systems and Markov chains, where the Central Limit Theorem for partial sums is commonly derived via the spectral gap technique. Our results apply, e.g. to the empirical process of ergodic torus automorphisms."
in_NB
empirical_processes
stochastic_processes
markov_models
central_limit_theorem
dynamical_systems
january 2012 by cshalizi
[1201.2056] Adaptive Context Tree Weighting
january 2012 by cshalizi
"We describe an adaptive context tree weighting (ACTW) algorithm, as an extension to the standard context tree weighting (CTW) algorithm. Unlike the standard CTW algorithm, which weights all observations equally regardless of the depth, ACTW gives increasing weight to more recent observations, aiming to improve performance in cases where the input sequence is from a non-stationary distribution. Data compression results show ACTW variants improving over CTW on merged files from standard compression benchmark tests while never being significantly worse on any individual file."
to:NB
information_theory
non-stationarity
markov_models
stochastic_processes
re:AoS_project
january 2012 by cshalizi
[1112.3257] Exact Computation of Kullback-Leibler Distance for Hidden Markov Trees and Models
december 2011 by cshalizi
"We suggest new recursive formulas to compute the exact value of the Kullback-Leibler distance (KLD) between two general Hidden Markov Trees (HMTs). For homogeneous HMTs with regular topology, such as homogeneous Hidden Markov Models (HMMs), we obtain a closed-form expression for the KLD when no evidence is given. We generalize our recursive formulas to the case of HMMs conditioned on the observable variables. Our proposed formulas are validated through several numerical examples in which we compare the exact KLD value with Monte Carlo estimations."
to:NB
to_read
re:AoS_project
markov_models
stochastic_processes
information_theory
december 2011 by cshalizi
An Asymptotic Behavior of the Marginal Likelihood for General Markov Models
december 2011 by cshalizi
"The standard Bayesian Information Criterion (BIC) is derived under regularity conditions which are not always satisfied in the case of graphical models with hidden variables. In this paper we derive the BIC for the binary graphical tree models where all the inner nodes of a tree represent binary hidden variables. This provides an extension of a similar formula given by Rusakov and Geiger for naive Bayes models. The main tool used in this paper is the connection between the growth behavior of marginal likelihood integrals and the real log-canonical threshold."
in_NB
markov_models
graphical_models
statistics
information_criteria
december 2011 by cshalizi
[1111.5369] Joint probability distributions and fluctuation theorems
december 2011 by cshalizi
"We derive various exact results for Markovian systems that spontaneously relax to a non-equilibrium steady-state by using joint probability distributions symmetries of different entropy production decompositions. The analytical approach is applied to diverse problems such as the description of the fluctuations induced by experimental errors, for unveiling symmetries of correlation functions appearing in fluctuation-dissipation relations recently generalised to non-equilibrium steady-states, and also for mapping averages between different trajectory-based dynamical ensembles. Many known fluctuation theorems arise as special instances of our approach, for particular two-fold decompositions of the total entropy production. As a complement, we also briefly review and synthesise the variety of fluctuation theorems applying to stochastic dynamics of both, continuous systems described by a Langevin dynamics and discrete systems obeying a Markov dynamics, emphasising how these results emerge from distinct symmetries of the dynamical entropy of the trajectory followed by the system For Langevin dynamics, we embed the "dual dynamics" with a physical meaning, and for Markov systems we show how the fluctuation theorems translate into symmetries of modified evolution operators."
to:NB
statistical_mechanics
non-equilibrium
markov_models
fluctuation-response
arrow_of_time
december 2011 by cshalizi
Quantile regression for longitudinal data based on latent Markov subject-specific parameters Alessio Farcomeni - Statistics and Computing, Volume 22, Number 1
december 2011 by cshalizi
"We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set."
to:NB
regression
time_series
prediction
markov_models
december 2011 by cshalizi
Truquet : On a nonparametric resampling scheme for Markov random fields
november 2011 by cshalizi
"We study an extension to general Markov random fields of the resampling scheme given in Bickel and Levina (2006) [4] for texture synthesis with stationary Markov mesh models. The procedure generates bootstrap replicates of a sample using kernel regression and the principle of Gibbs sampling. Consistency of the bootstrap distribution is investigated under the Dobrushin contraction condition. Some simulation examples are given, in particular for the texture synthesis, for which the multiscale algorithm of Paget and Longstaff (1998) [27] is revisited."
in_NB
to_read
random_fields
bootstrap
statistics
spatial_statistics
markov_models
november 2011 by cshalizi
[1111.3182] Context Tree Switching
november 2011 by cshalizi
"This paper describes the Context Tree Switching technique, a modification of Context Tree Weighting for the prediction of binary, stationary, n-Markov sources. By modifying Context Tree Weighting's recursive weighting scheme, it is possible to mix over a strictly larger class of models without increasing the asymptotic time or space complexity of the original algorithm. We prove that this generalization preserves the desirable theoretical properties of Context Tree Weighting on stationary n-Markov sources, and show empirically that this new technique leads to consistent improvements over Context Tree Weighting as measured on the Calgary Corpus."
to:NB
re:AoS_project
markov_models
november 2011 by cshalizi
[1111.1177] Partially observed Markov random fields are variable neighborhood random fields
november 2011 by cshalizi
"The present paper has two goals. First to present a natural example of a new class of random fields which are the variable neighborhood random fields. The example we consider is a partially observed nearest neighbor binary Markov random field. The second goal is to establish sufficient conditions ensuring that the variable neighborhoods are almost surely finite. We discuss the relationship between the almost sure finiteness of the interaction neighborhoods and the presence/absence of phase transition of the underlying Markov random field. In the case where the underlying random field has no phase transition we show that the finiteness of neighborhoods depends on a specific relation between the noise level and the Dobrushin coefficient. The case in which there is phase transition is addressed in the frame of the ferromagnetic Ising model. We prove that the existence of infinite interaction neighborhoods depends on the phase. The first result has a probabilistic proof using a Kalikow type decomposition of a Glauber dynamics associated to the field. The second result is proved using cluster expansion."
to:NB
to_read
markov_models
random_fields
phase_transitions
re:AoS_project
stochastic_processes
november 2011 by cshalizi
Banff blog « An Ergodic Walk
october 2011 by cshalizi
Sounds delightful (and Banff is beautiful).
conferences
statistics
learning_theory
statistical_inference_for_stochastic_processes
track_down_references
markov_models
network_data_analysis
estimation
van_handel.ramon
nonparametrics
re:smoothing_adjacency_matrices
information_theory
october 2011 by cshalizi
[math/0305026] Chains with complete connections: General theory, uniqueness, loss of memory and mixing properties
october 2011 by cshalizi
Published version: J. Stat. Phys., 118 (2005): 555--588. Changes seem minor.
in_NB
chains_with_complete_connections
re:AoS_project
markov_models
stochastic_processes
re:almost_none
mixing
ergodic_theory
october 2011 by cshalizi
Ergodicity of Hidden Markov Model - Mathematics of Control, Signals, and Systems (MCSS), Volume 17, Number 4
october 2011 by cshalizi
"In this paper we study ergodic properties of hidden Markov models with a generalized observation structure. In particular sufficient conditions for the existence of a unique invariant measure for the pair filter-observation are given. Furthermore, necessary and sufficient conditions for the existence of a unique invariant measure of the triple state-observation-filter are provided in terms of asymptotic stability in probability of incorrectly initialized filters. We also study the asymptotic properties of the filter and of the state estimator based on the observations as well as on the knowledge of the initial state. Their connection with minimal and maximal invariant measures is also studied."
in_NB
stochastic_processes
ergodic_theory
markov_models
filtering
re:almost_none
october 2011 by cshalizi
Relational Learning with One Network: An Asymptotic Analysis
october 2011 by cshalizi
An attempt on the "n=1" problem. Alternative home: http://www.cs.purdue.edu/homes/neville/papers/xiang-neville-aistat2011.pdf
in_NB
re:XV_for_networks
re:your_favorite_ergm_sucks
network_data_analysis
relational_learning
neville.jennifer
have_read
random_fields
markov_models
statistics
machine_learning
estimation
october 2011 by cshalizi
[1110.1338] Robustness and Conditional Independence Ideals
october 2011 by cshalizi
"We study notions of robustness of Markov kernels and probability distribution of a system that is described by $n$ input random variables and one output random variable. Markov kernels can be expanded in a series of potentials that allow to describe the system's behaviour after knockouts. Robustness imposes structural constraints on these potentials. Robustness of probability distributions is defined via conditional independence statements. These statements can be studied algebraically. The corresponding conditional independence ideals are related to binary edge ideals. The set of robust probability distributions lies on an algebraic variety. We compute a Gr"obner basis of this ideal and study the irreducible decomposition of the variety. These algebraic results allow to parametrize the set of all robust probability distributions."
probability
markov_models
graphical_models
information_geometry
algebraic_geometry
ay.nihat
in_NB
october 2011 by cshalizi
[1110.0356] Asymptotic properties of the maximum likelihood estimation in misspecified Hidden Markov models
october 2011 by cshalizi
"Let $(Y_k)$ be a stationary sequence on a probability space taking values in a standard Borel space. Consider the associated maximum likelihood estimator with respect to a parametrized family of Hidden Markov models such that the law of the observations $(Y_k)$ is not assumed to be described by any of the Hidden Markov models of this family. In this paper we investigate the consistency of this estimator in such mispecified models under mild assumptions."
statistical_inference_for_stochastic_processes
markov_models
state-space_models
re:your_favorite_dsge_sucks
in_NB
to_read
misspecification
randal.douc
moulines.eric
october 2011 by cshalizi
[1108.3968] Online Expectation Maximization based algorithms for inference in hidden Markov models
august 2011 by cshalizi
"The Expectation Maximization (EM) algorithm is a versatile tool for model parameter estimation in latent data models. When processing large data sets or data stream however, EM becomes intractable since it requires the whole data set to be available at each iteration of the algorithm. In this contribution, a new generic online EM algorithm for model parameter inference in general Hidden Markov Model is proposed. This new algorithm updates the parameter estimate after a block of observations is processed (online). The convergence of this new algorithm is established, and the rate of convergence is studied showing the impact of the block size. An averaging procedure is also proposed to improve the rate of convergence. Finally, practical illustrations are presented as well as extensions to some online stochastic EM when Sequential Monte Carlo methods have to be used in combination, in order to make the E-step tractable."
filtering
expectation-maximization
markov_models
statistics
statistical_inference_for_stochastic_processes
in_NB
august 2011 by cshalizi
[0908.3666] On the minimal penalty for Markov order estimation
august 2011 by cshalizi
I think this is the only time I have seen the LIL actually _used_ for anything.
van_handel.ramon
empirical_processes
model_selection
markov_models
statistics
stochastic_processes
law_of_the_iterated_logarithm
in_NB
august 2011 by cshalizi
[1107.1283] Spectral Methods for Learning Multivariate Latent Tree Structure
july 2011 by cshalizi
Huh, sounds like they're using tetrad equations?
markov_models
to_read
re:AoS_project
in_NB
graphical_models
inference_to_latent_objects
zhang.tong
kakade.sham
hsu.daniel
song.le
july 2011 by cshalizi
[1107.2612] Commuting time geometry of ergodic Markov chains
july 2011 by cshalizi
"We show how to map the states of an ergodic Markov chain to Euclidean space so that the squared distance between states is the expected commuting time. We find a minimax characterization of commuting times, and from this we get monotonicity of commuting times with respect to equilibrium transition rates. All of these results are familiar in the case of time-reversible chains, where techniques of classical electrical theory apply. In presenting these results, we take the opportunity to develop Markov chain theory in a `conformally correct' way"
markov_models
stochastic_processes
in_NB
waiting_times
july 2011 by cshalizi
[0812.0567] The ensemble of random Markov matrices
july 2011 by cshalizi
"The ensemble of random Markov matrices is introduced as a set of Markov or stochastic matrices with the maximal Shannon entropy. The statistical properties of the stationary distribution pi, the average entropy growth rate $h$ and the second largest eigenvalue nu across the ensemble are studied. It is shown and heuristically proven that the entropy growth-rate and second largest eigenvalue of Markov matrices scale in average with dimension of matrices d as h ~ log(O(d)) and nu ~ d^(-1/2), respectively, yielding the asymptotic relation h tau_c ~ 1/2 between entropy h and correlation decay time tau_c = -1/log|nu| . Additionally, the correlation between h and and tau_c is analysed and is decreasing with increasing dimension d."
markov_models
stochastic_processes
spectral_gap
information_theory
in_NB
july 2011 by cshalizi
[1107.4353] Upper Bounds for Markov Approximations of Ergodic Processes
july 2011 by cshalizi
"Chains of infinite order are generalizations of Markov chains that constitute a flexible class of models in the general theory of stochastic processes. These processes can be naturally studied using approximating Markov chains. Here we derive new upper bounds for the $bar{d}$-distance and the K"ullback-Leibler divergence between chains of infinite order and their canonical $k$-step Markov approximations. In contrast to the bounds available in the literature our results apply to chains of infinite order compatible with general classes of probability kernels. In particular, we allow kernels with discontinuities and null transition probabilities."" (Pedantry: Pretty sure Kullback did not spell his name with an umlaut!)
markov_models
stochastic_processes
re:AoS_project
to_read
in_NB
approximation
re:your_favorite_dsge_sucks
july 2011 by cshalizi
[1107.1794] Some Aspects of Modeling Dependence in Copula-based Markov chains
july 2011 by cshalizi
"Dependence coefficients have been widely studied for Markov processes defined by a set of transition probabilities and an initial distribution. This work clarifies some aspects of the theory of dependence structure of Markov chains generated by copulas... relationship between the notions of geometric ergodicity and geometric {rho}-mixing ... for a large number of well known copulas, such as Clayton, Gumbel or Student, these notions are equivalent. Some of the results published in the last years appear to be redundant if one takes into account this fact. We apply this equivalence to show that any mixture of Clayton, Gumbel or Student copulas generate both geometrically ergodic and geometric {rho}-mixing stationary Markov chains, answering in this way an open question in the literature. We shall also point out that a sufficient condition for {rho}-mixing, used in the literature, actually implies Doeblin recurrence."
markov_models
copulas
ergodic_theory
mixing
statistics
in_NB
july 2011 by cshalizi
[1107.1948] On the concentration properties of Interacting particle processes
july 2011 by cshalizi
"These lecture notes present some new concentration inequalities for Feynman-Kac particle processes. We analyze different types of stochastic particle models, including particle profile occupation measures, genealogical tree based evolution models, particle free energies, as well as backward Markov chain particle models. We illustrate these results with a series of topics related to computational physics and biology, stochastic optimization, signal processing and bayesian statistics, and many other probabilistic machine learning algorithms. Special emphasis is given to the stochastic modeling and the quantitative performance analysis of a series of advanced Monte Carlo methods, including particle filters, genetic type island models, Markov bridge models, interacting particle Markov chain Monte Carlo methodologies."
interacting_particle_systems
concentration_of_measure
stochastic_processes
markov_models
branching_processes
particle_filters
in_NB
del_moral.pierre
july 2011 by cshalizi
IEEE Xplore - Computation and Estimation of Generalized Entropy Rates for Denumerable Markov Chains
june 2011 by cshalizi
The estimation part is just for parametric models, and amounts to plugging in the MLE. But it might be interesting to see how well that works...
entropy_estimation
entropy
markov_models
in_NB
june 2011 by cshalizi
Markov Processes and Related Problems of Analysis: Selected Papers of E. B. Dynkin
may 2011 by cshalizi
Should check how many of them aren't in Project Euclid before paying $$$, I guess. OTOH, perhaps think of it as a contribution to the Fund for Aged Probabilists?
books:noted
stochastic_processes
markov_models
re:almost_none
in_NB
may 2011 by cshalizi
Language Log » Word-order “universals” are lineage-specific?
april 2011 by cshalizi
I don't see how they could possibly have enough data to reliably estimate that many differences in transition rates, but I need to read the paper.
historical_linguistics
linguistic_evolution
markov_models
track_down_references
april 2011 by cshalizi
[1102.3107] Regenerative block empirical likelihood for Markov chains
february 2011 by cshalizi
Can't tell from abstract whether this is related to my query here (http://bactra.org/notebooks/mixtures-of-processes.html) or not.
statistics
stochastic_processes
markov_models
empirical_likelihood
february 2011 by cshalizi
[1102.0365] Limit Theorems for the Sample Entropy of Hidden Markov Chains
february 2011 by cshalizi
"The Shannon-McMillan-Breiman theorem asserts that the sample entropy of a stationary and ergodic stochastic process converges to the entropy rate of the same process almost surely. In this paper, we focus our attention on the convergence behavior of the sample entropy of a hidden Markov chain. Under certain positivity assumption, we prove that a central limit theorem (CLT) with some Berry-Esseen bound for the sample entropy of a hidden Markov chain, and we use this CLT to establish a law of iterated logarithm (LIL) for the sample entropy."
information_theory
markov_models
central_limit_theorem
february 2011 by cshalizi
Suzuki : A Markov chain analysis of genetic algorithms: large deviation principle approach
january 2011 by cshalizi
"In this paper we prove that the stationary distribution of populations in genetic algorithms focuses on the uniform population with the highest fitness value as the selective pressure goes to ∞ and the mutation probability goes to 0. The obtained sufficient condition is based on the work of Albuquerque and Mazza (2000), who, following Cerf (1998), applied the large deviation principle approach (Freidlin-Wentzell theory) to the Markov chain of genetic algorithms. The sufficient condition is more general than that of Albuquerque and Mazza, and covers a set of parameters which were not found by Cerf."
large_deviations
markov_models
genetic_algorithms
evolution
re:bayes_as_evol
re:do-institutions-evolve
january 2011 by cshalizi
Stochastic Automata: Stability, Nondeterminism and Prediction
january 2011 by cshalizi
From 1981 but perhaps still worth looking at?
books:noted
automata_theory
markov_models
re:AoS_project
coveted
january 2011 by cshalizi
[1012.5687] Coupling and Applications
january 2011 by cshalizi
"This paper presents a self-contained account for coupling arguments and applications in the context of Markov processes. We first use coupling to describe the transport problem, which leads to the concepts of optimal coupling and probability distance (or transportation-cost), then introduce applications of coupling to the study of ergodicity, Liouville theorem, convergence rate, gradient estimate, and Harnack inequality for Markov processes."
markov_models
stochastic_processes
coupling
ergodic_theory
january 2011 by cshalizi
related tags
agent-based_models ⊕ ahmed.amr ⊕ algebraic_geometry ⊕ amaral.luis ⊕ approximation ⊕ arbitrage ⊕ arrow_of_time ⊕ automata_theory ⊕ ay.nihat ⊕ baez.john ⊕ baiesi.marco ⊕ basu.sumit ⊕ beirl.wolfgang ⊕ birds ⊕ books:noted ⊕ books:recommended ⊕ boots.byron ⊕ bootstrap ⊕ branching_processes ⊕ brockwell.anthony ⊕ cai.t._tony ⊕ calls_for_papers ⊕ cartoons ⊕ causality ⊕ causal_inference ⊕ central_limit_theorem ⊕ chains_with_complete_connections ⊕ change-point_problem ⊕ choudhury.tanzeem ⊕ clarkson.brian ⊕ classifiers ⊕ clustering ⊕ coarse-graining ⊕ collective_cognition ⊕ community_discovery ⊕ complexity_measures ⊕ computational_complexity ⊕ computational_statistics ⊕ concentration_of_measure ⊕ conferences ⊕ control_theory ⊕ convergence_of_stochastic_processes ⊕ copulas ⊕ coupling ⊕ coveted ⊕ crespi.valentino ⊕ cross-validation ⊕ cybenko.george ⊕ data_mining ⊕ del_moral.pierre ⊕ deviation_bounds ⊕ deviation_inequalities ⊕ dietterich.thomas ⊕ dimension_reduction ⊕ distributed_systems ⊕ doeblinwolfgang ⊕ domingos.pedro ⊕ dupuis.paul ⊕ dynamical_systems ⊕ ellis.richard ⊕ empirical_likelihood ⊕ empirical_processes ⊕ em_algorithm ⊕ ensemble_methods ⊕ entropy ⊕ entropy_estimation ⊕ epidemic_models ⊕ ergodic_theory ⊕ estimation ⊕ estimation_of_dynamical_systems ⊕ evolution ⊕ evolutionary_game_theory ⊕ expectation-maximization ⊕ exponential_families ⊕ filtering ⊕ finance ⊕ financial_speculation ⊕ fluctuation-dissipation_relations ⊕ fluctuation-response ⊕ fluid_mechanics ⊕ fmri ⊕ fox.emily ⊕ fractals ⊕ fraser.andrew ⊕ functional_connectivity ⊕ funny:geeky ⊕ galstyan.aram ⊕ genetic_algorithms ⊕ geometry ⊕ gibbs_distributions ⊕ goernerup.olof ⊕ goodness-of-fit ⊕ gordon.geoffrey_j. ⊕ grammar_induction ⊕ graphical_models ⊕ graph_theory ⊕ guttorp.peter ⊕ have_read ⊕ heard_the_talk ⊕ heavy_tails ⊕ hilbert_space ⊕ historical_linguistics ⊕ history_of_mathematics ⊕ hofling.holger ⊕ hofman.jake ⊕ hsu.daniel ⊕ hypothesis_testing ⊕ identifiability ⊕ inference_to_latent_objects ⊕ information_criteria ⊕ information_geometry ⊕ information_retrieval ⊕ information_theory ⊕ interacting_particle_systems ⊕ in_NB ⊕ ising_model ⊕ jaeger.herbert ⊕ jordan.michael_i. ⊕ kakade.sham ⊕ kernel_methods ⊕ kifer.yuri ⊕ kitchens.bruce ⊕ kith_and_kin ⊕ klinkner.kristina ⊕ knight.frank_b. ⊕ kolar.mladen ⊕ kontorovich.aryeh ⊕ kontoyiannis.ioannis ⊕ kotelentz.peter_m. ⊕ kurtz.thomas ⊕ langford.john ⊕ laplace_approximation ⊕ large_deviations ⊕ lasso ⊕ latent_variables ⊕ law_of_the_iterated_logarithm ⊕ learning_theory ⊕ lee.ann ⊕ likelihood ⊕ limit_theorems ⊕ linguistics ⊕ linguistic_evolution ⊕ lives_of_the_scientists ⊕ logistic_regression ⊕ machine_learning ⊕ macro_from_micro ⊕ maes.christian ⊕ manifold_learning ⊕ markov_models ⊖ martingales ⊕ meyn.sean ⊕ meyn.sean_p. ⊕ misspecification ⊕ mixing ⊕ mixture_models ⊕ modeling ⊕ model_selection ⊕ monte_carlo ⊕ moulines.eric ⊕ movies ⊕ multiple_comparisons ⊕ multiple_testing ⊕ networks ⊕ network_data_analysis ⊕ neural_data_analysis ⊕ neville.jennifer ⊕ nilsson_jacobi.martin ⊕ noethers_theorem ⊕ non-equilibrium ⊕ non-stationarity ⊕ nonparametrics ⊕ paninski.liam ⊕ particle_filters ⊕ path_integrals ⊕ pentland.alex ⊕ phase_transitions ⊕ point_processes ⊕ prediction ⊕ predictive_state_representations ⊕ probability ⊕ R ⊕ randal.douc ⊕ random_fields ⊕ random_walks ⊕ rare_event_simulation ⊕ re:almost_none ⊕ re:AoS_project ⊕ re:bayes_as_evol ⊕ re:do-institutions-evolve ⊕ re:growing_ensemble_project ⊕ re:homophily_and_confounding ⊕ re:simulating_coupled_markov_chains ⊕ re:smoothing_adjacency_matrices ⊕ re:social-networks-as-sensor-networks ⊕ re:stacs ⊕ re:what_is_a_macrostate ⊕ re:XV_for_mixing ⊕ re:XV_for_networks ⊕ re:your_favorite_dsge_sucks ⊕ re:your_favorite_ergm_sucks ⊕ recurrence_times ⊕ regression ⊕ reinforcement_learning ⊕ relational_learning ⊕ salakhutdinov.ruslan ⊕ search_engines ⊕ self-centered ⊕ send_a_note ⊕ separation_of_time-scales ⊕ siddiqi.sajid_m. ⊕ signal_processing ⊕ signal_transduction ⊕ simulation ⊕ sleep ⊕ snijders.tom ⊕ social_influence ⊕ social_networks ⊕ sofic_processes ⊕ song.le ⊕ sparsity ⊕ spatial_statistics ⊕ spectral_clustering ⊕ spectral_gap ⊕ spectral_methods ⊕ state-space_models ⊕ state_estimation ⊕ statistical_inference_for_stochastic_processes ⊕ statistical_learning ⊕ statistical_mechanics ⊕ statistics ⊕ stochastic_differential_equations ⊕ stochastic_models ⊕ stochastic_processes ⊕ stochastic_volatility ⊕ symbolic_dynamics ⊕ synchronizing_words ⊕ thermodynamics ⊕ tibshirani.robert ⊕ time_series ⊕ to:blog ⊕ to:NB ⊕ to_be_shot_after_a_fair_trial ⊕ to_read ⊕ to_teach:advanced-stochastic-processes ⊕ to_teach:complexity-and-inference ⊕ to_teach:data-mining ⊕ to_teach:financial-time-series ⊕ to_teach:undergrad-ADA ⊕ track_down_references ⊕ transaction_networks ⊕ tuncel.selim ⊕ universal_prediction ⊕ van_handel.ramon ⊕ varadhan.s.r.s ⊕ variable-length_markov_models ⊕ via:arthegall ⊕ via:ded-maxim ⊕ via:dpfeldman ⊕ via:guslacerda ⊕ via:paper_I_refereed_and_can't_tell_you_about ⊕ visual_display_of_quantitative_information ⊕ waiting_times ⊕ wasserman.larry ⊕ watts.duncan ⊕ xing.eric ⊕ zhang.tong ⊕Copy this bookmark: