cshalizi + em_algorithm 11
[1203.5181] $k$-MLE: A fast algorithm for learning statistical mixture models
9 weeks ago by cshalizi
"We describe $k$-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique that monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering $k$-MLE algorithm iteratively assigns data to the most likely weighted component and update the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of $k$-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of $k$-MLE can be implemented using any $k$-means heuristic like the celebrated Lloyd's batched or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion that implies to update weights by taking the relative proportion of cluster points, and reiterate the mixture parameter update and mixture weight update processes until convergence. Hard EM is interpreted as a special case of $k$-MLE when both the component update and the weight update are performed successively in the inner loop. To initialize $k$-MLE, we propose $k$-MLE++, a careful initialization of $k$-MLE guaranteeing probabilistically a global bound on the best possible complete likelihood."
in_NB
em_algorithm
mixture_models
statistics
machine_learning
clustering
estimation
9 weeks ago by cshalizi
[0712.4273] Online EM Algorithm for Latent Data Models
12 weeks ago by cshalizi
"In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the Expectation-Maximisation (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator. In addition, the proposed approach is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model."
to:NB
statistics
em_algorithm
12 weeks ago by cshalizi
Online Learning with Hidden Markov Models
february 2012 by cshalizi
"We present an online version of the expectation-maximization (EM) algorithm for hidden Markov models (HMMs). The sufficient statistics required for parameters estimation is computed recursively with time, that is, in an online way instead of using the batch forward-backward procedure. This computational scheme is generalized to the case where the model parameters can change with time by introducing a discount factor into the recurrence relations. The resulting algorithm is equivalent to the batch EM algorithm, for appropriate discount factor and scheduling of parameters update. On the other hand, the online algorithm is able to deal with dynamic environments, i.e., when the statistics of the observed data is changing with time. The implications of the online algorithm for probabilistic modeling in neuroscience are briefly discussed."
to:NB
markov_models
filtering
state_estimation
statistics
em_algorithm
february 2012 by cshalizi
[1201.5913] A Component-wise EM Algorithm for Mixtures
february 2012 by cshalizi
"In some situations, EM algorithm shows slow convergence problems. One possible reason is that standard procedures update the parameters simultaneously. In this paper we focus on finite mixture estimation. In this framework, we propose a component-wise EM, which updates the parameters sequentially. We give an interpretation of this procedure as a proximal point algorithm and use it to prove the convergence. Illustrative numerical experiments show how our algorithm compares to EM and a version of the SAGE algorithm."
Huh, is this related to the way that updating partial response functions simultaneously in a GAM can be trouble, and it's better, as in standard backfitting, to update sequentially?
in_NB
statistics
em_algorithm
mixture_models
Huh, is this related to the way that updating partial response functions simultaneously in a GAM can be trouble, and it's better, as in standard backfitting, to update sequentially?
february 2012 by cshalizi
[1111.4954] Estimation for general birth-death processes
november 2011 by cshalizi
"Birth-death processes (BDPs) are continuous-time Markov chains that track the number of "particles" in a system over time. While widely used in population biology, genetics and ecology, statistical inference of the instantaneous particle birth and death rates remains largely limited to restrictive linear BDPs in which per-particle birth and death rates are constant. Researchers often observe the number of particles at discrete times, necessitating data augmentation procedures such as expectation-maximization (EM) to find maximum likelihood estimates. The E-step in the EM algorithm is available in closed-form for some linear BDPs, but otherwise previous work has resorted to approximation or simulation. Remarkably, the E-step conditional expectations can also be expressed as convolutions of computable transition probabilities for any general BDP with arbitrary rates. This important observation, along with a convenient continued fraction representation of the Laplace transforms of the transition probabilities, allows novel and efficient computation of the conditional expectations for all BDPs, eliminating the need for approximation or costly simulation. We use this insight to derive EM algorithms that yield maximum likelihood estimation for general BDPs characterized by various rate models, including generalized linear models. We show that our Laplace convolution technique outperforms competing methods when available and demonstrate a technique to accelerate EM algorithm convergence. Finally, we validate our approach using synthetic data and then apply our methods to estimation of mutation parameters in microsatellite evolution."
to:NB
statistics
statistical_inference_for_stochastic_processes
em_algorithm
november 2011 by cshalizi
A stable estimator of the information matrix under EM for dependent data
december 2010 by cshalizi
"This article develops a new and stable estimator for information matrix when the EM algorithm is used in maximum likelihood estimation. This estimator is constructed using the smoothed individual complete-data scores that are readily available from running the EM algorithm. The method works for dependent data sets and when the expectation step is an irregular function of the conditioning parameters." (When I teach EM, I should say something about how to get uncertainty estimates...)
fisher_information
em_algorithm
estimation
statistics
to_teach:data-mining
to_teach:undergrad-ADA
december 2010 by cshalizi
[0910.2034] Strategies for Online Inference of Model-Based Clustering in large Networks
october 2009 by cshalizi
Can't tell what they're actually doing (other than tweaking estimation procedures). Read carefully.
community_discovery
network_data_analysis
em_algorithm
to_read
re:stacs
october 2009 by cshalizi
Spike Train Decoding without Spike Sorting (Ventura)
february 2008 by cshalizi
Valerie's paper on neural decoding and tuning-curve estimation without spike-sorting. "[A] novel paradigm for spike train decoding, which avoids entirely spike sorting based on waveform measurements. This paradigm directly uses the spike train collected a
em_algorithm
neuroscience
ventura.valerie
have_read
kith_and_kin
neural_coding_and_decoding
february 2008 by cshalizi
related tags
clustering ⊕ community_discovery ⊕ em_algorithm ⊖ estimation ⊕ filtering ⊕ fisher_information ⊕ have_read ⊕ in_NB ⊕ kith_and_kin ⊕ machine_learning ⊕ markov_models ⊕ mixture_models ⊕ network_data_analysis ⊕ neural_coding_and_decoding ⊕ neuroscience ⊕ newman.mark ⊕ programming ⊕ R ⊕ re:stacs ⊕ regression ⊕ state_estimation ⊕ statistical_inference_for_stochastic_processes ⊕ statistics ⊕ to:NB ⊕ to_read ⊕ to_teach:data-mining ⊕ to_teach:undergrad-ADA ⊕ ventura.valerie ⊕Copy this bookmark: