mraginsky + machine-learning 137
[1204.6265] Statistical inference for dynamical systems: a review
4 weeks ago by mraginsky
The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research.
papers
to-read
dynamical-systems
machine-learning
system-identification
ergodic-theory
4 weeks ago by mraginsky
[0911.0280] Causal Inference on Discrete Data using Additive Noise Models
march 2012 by mraginsky
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that whenever the joint distribution $prob^{(X,Y)}$ admits such a model in one direction, e.g. $Y=f(X)+N, N independent X$, it does not admit the reversed model $X=g(Y)+tilde N, tilde N independent Y$ as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. In an extensive experimental study we show that this algorithm works both on synthetic and real data sets.
papers
to-read
machine-learning
causality
march 2012 by mraginsky
[1105.2274] Data-Distributed Weighted Majority and Online Mirror Descent
february 2012 by mraginsky
In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent algorithms, as special cases. We show, using both theoretical analysis and experiments, that compared to a single agent: given the same computation time, these distributed algorithms achieve smaller generalization errors; and given the same generalization errors, they can be $N$ times faster.
papers
to-read
machine-learning
online-learning
optimization
distributed-systems
february 2012 by mraginsky
[1106.4739] Nonasymptotic bounds on the estimation error of MCMC algorithms
june 2011 by mraginsky
"We address the problem of upper bounding the mean square error of MCMC estimators. Our analysis is non-asymptotic. We first establish a general result valid for essentially all ergodic Markov chains encountered in Bayesian computation and a possibly unbounded target function $f.$ The bound is sharp in the sense that the leading term is exactly $\asvar/n$, where $\asvar$ is the CLT asymptotic variance. Next, we proceed to specific assumptions and give explicit computable bounds for geometrically and polynomially ergodic Markov chains. As a corollary we provide results on confidence estimation."
papers
to-read
optimization
algorithms
MCMC
machine-learning
june 2011 by mraginsky
[1010.4207] Convex Analysis and Optimization with Submodular Functions: a Tutorial (Francis Bach)
november 2010 by mraginsky
"Set-functions appear in many areas of computer science and applied mathematics, such as machine learning, computer vision, operations research or electrical networks. Among these set-functions, submodular functions play an important role, similar to convex functions on vector spaces. In this tutorial, the theory of submodular functions is presented, in a self-contained way, with all results shown from first principles. A good knowledge of convex analysis is assumed."
papers
to-read
lecture-notes
convex-programming
optimization
machine-learning
algorithms
submodular-functions
november 2010 by mraginsky
[1011.2952] Balanced Reduction of Nonlinear Control Systems in Reproducing Kernel Hilbert Space
november 2010 by mraginsky
"We introduce a novel data-driven order reduction method for nonlinear control systems, drawing on recent progress in machine learning and statistical dimensionality reduction. The method rests on the assumption that the nonlinear system behaves linearly when lifted into a high (or infinite) dimensional feature space where balanced truncation may be carried out implicitly. This leads to a nonlinear reduction map which can be combined with a representation of the system belonging to a reproducing kernel Hilbert space to give a closed, reduced order dynamical system which captures the essential input-output characteristics of the original model. Empirical simulations illustrating the approach are also provided."
papers
to-read
heard-the-talk
control-theory
machine-learning
dynamical-systems
november 2010 by mraginsky
[1011.0415] Learning Networks of Stochastic Differential Equations
november 2010 by mraginsky
"We consider linear models for stochastic dynamics. To any such model can be associated a network (namely a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval $T$.
We analyze the $\ell_1$-regularized least squares algorithm and, in the setting in which the underlying network is sparse, we prove performance guarantees that are \emph{uniform in the sampling rate} as long as this is sufficiently high. This result substantiates the notion of a well defined `time complexity' for the network inference problem."
papers
to-read
sparsity
complex-systems
system-identification
machine-learning
We analyze the $\ell_1$-regularized least squares algorithm and, in the setting in which the underlying network is sparse, we prove performance guarantees that are \emph{uniform in the sampling rate} as long as this is sufficiently high. This result substantiates the notion of a well defined `time complexity' for the network inference problem."
november 2010 by mraginsky
[1008.4654] Freezing and Sleeping: Tracking Experts that Learn by Evolving Past Posteriors
august 2010 by mraginsky
Hot damn! "... in this paper we re-examine Freund's problem in case the experts have internal structure that enables them to learn. In this case the problem has two possible interpretations: should the experts learn from all data or only from the subsequence on which they are being tracked? The MPP algorithm solves the first case. Our contribution is to generalise MPP to address the second option. The results we obtain apply to any expert structure that can be formalised using (expert) hidden Markov models. Curiously enough, for our interpretation there are \emph{two} natural reference schemes: freezing and sleeping. For each scheme, we provide an efficient prediction strategy and prove the relevant loss bound."
papers
to-read
online-learning
decision-making
machine-learning
algorithms
august 2010 by mraginsky
[1008.1758] Stochastic Data Clustering
august 2010 by mraginsky
Looks very interesting: "In 1961 Herbert Simon and Albert Ando published the theory behind the long-term behavior of a dynamical system that can be described by a nearly completely decomposable matrix. Over the past fifty years this theory has been used in a variety of contexts, including queueing theory, computer performance, and ecology. In all these applications, the structure of the system is known and the point of interest is the various states the system passes through on its way to some long-term equilibrium.
This paper looks at this problem from the other direction. That is, we develop a technique for using the evolution of the system to tell us about its initial structure, and we use this technique to develop a new algorithm for data clustering."
papers
to-read
dynamical-systems
machine-learning
data-mining
clustering
via:arthegall
This paper looks at this problem from the other direction. That is, we develop a technique for using the evolution of the system to tell us about its initial structure, and we use this technique to develop a new algorithm for data clustering."
august 2010 by mraginsky
"A Revisionist History of Connectionism"
july 2010 by mraginsky
A great quote from M. Minsky: "It would seem that Perceptrons has much the same role as The Necronomicon -- that is, often cited but never read."
history_of_cybernetics
AI
cognitive-science
machine-learning
perception
connectionism
neuroscience
essays
july 2010 by mraginsky
Generalizing Apprenticeship Learning across Hypothesis Classes .oO(ML Discuss)
july 2010 by mraginsky
"This paper develops a generalized apprenticeship learning protocol for reinforcement-learning agents with access to a teacher who provides policy traces (transition and reward observations). We characterize sufficient conditions of the underlying models for efficient apprenticeship learning and link this criteria to two established learnability classes (KWIK and Mistake Bound). We then construct efficient apprenticeship-learning algorithms in a number of domains, including two types of relational MDPs. We instantiate our approach in a software agent and a robot agent that learn effectively from a human teacher."
papers
to-read
machine-learning
reinforcement-learning
control-theory
decision-making
july 2010 by mraginsky
[0901.2735] State Space Realization Theorems For Data Mining
may 2010 by mraginsky
"In this paper, we consider formal series associated with events, profiles derived from events, and statistical models that make predictions about events. We prove theorems about realizations for these formal series using the language and tools of Hopf algebras."
papers
to-read
state-space-models
machine-learning
computation
may 2010 by mraginsky
[1004.0557] Applications of Lindeberg Principle in Communications and Statistical Learning
april 2010 by mraginsky
"We use a generalization of the Lindeberg principle developed by Sourav Chatterjee to prove universality properties for various problems in communications, statistical learning and random matrix theory. We also show that these systems can be viewed as the limiting case of a properly defined sparse system. The latter result is useful when the sparse systems are easier to analyze than their dense counterparts. The list of problems we consider is by no means exhaustive. We believe that the ideas can be used in many other problems relevant for information theory."
paper
to-read
random-matrices
probability
machine-learning
signal-processing
information-theory
geometric-functional-analysis
april 2010 by mraginsky
Reasoning in Reduced Information Spaces
april 2010 by mraginsky
"...a website for research and coordination on Decentralized Techniques for Reasoning in Reduced Information Spaces (ONR 2009 Multi-disciplinary University Research Project)"
decision-making
decentralized-control
machine-learning
distributed-systems
AI
algorithms
optimization
april 2010 by mraginsky
Sequences Project
january 2010 by mraginsky
Michael Steele's page on individual sequences, learning in games, etc.
reference
universal-prediction
machine-learning
statistics
game-theory
statistical-learning
optimization
via:mreid
january 2010 by mraginsky
[0911.0054] Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity (Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari)
november 2009 by mraginsky
Abstract: "The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model. A central issue is learning these models in high-dimensions, such as when there is some sparsity pattern of the optimal parameter. This work characterizes a certain strong convexity property of general exponential families, which allow their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L_1 regularization."
papers
have-read
statistics
learning-theory
machine-learning
exponential-families
graphical-models
convex-programming
sparsity
november 2009 by mraginsky
[0910.5761] Which graphical models are difficult to learn? (Authors: Jose Bento, Andrea Montanari)
november 2009 by mraginsky
"We consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms systematically fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it)."
papers
to-read
graphical-models
statistics
machine-learning
statistical-physics
november 2009 by mraginsky
Optimal sequential experimental design (active learning)
september 2009 by mraginsky
The work by Liam Paninski et al. "It is often expensive to run experiments. Thus we would like to choose our experimental stimuli so that we can learn as much as possible in just a few trials. In many cases, we can try to optimize our stimuli online, adapting our experiments to take into account each new observation. We have examined this problem both theoretically (how much does this adaptive experimental approach help, in the long run?) and computationally (is it possible to perform this optimization in real time?), and many interesting questions remain open."
papers
to-read
statistics
machine-learning
information-theory
experimental-design
neuroscience
september 2009 by mraginsky
Data spectroscopy: eigenspaces of convolution operators and clustering [pdf]
july 2009 by mraginsky
Tao Shi, Mikhail Belkin and Bin Yu; to appear in The Annals of Statistics
papers
statistics
machine-learning
learning-theory
filetype:pdf
media:document
july 2009 by mraginsky
Active Learning Tutorial, ICML 2009
june 2009 by mraginsky
Sanjoy Dasgupta and John Langford
machine-learning
learning-theory
reference
papers
conferences
june 2009 by mraginsky
Hilbert Space Embeddings of Conditional Distributions with Applications to Dynamical Systems (2009) [ICML Discussion]
june 2009 by mraginsky
Song, Le and Huang, Jonathan and Smola, Alex and Fukumizu, Kenji
papers
to-read
dynamical-systems
learning-theory
control-theory
machine-learning
june 2009 by mraginsky
[0901.0356] Information, Divergence and Risk for Binary Experiments
june 2009 by mraginsky
Mark D. Reid, Robert C. Williamson
papers
to-read
statistics
learning-theory
information-theory
machine-learning
june 2009 by mraginsky
[0905.3369] Learning Nonlinear Dynamic Models
june 2009 by mraginsky
John Langford, Ruslan Salakhutdinov, Tong Zhang; to appear in ICML 2009
papers
to-read
dynamical-systems
learning-theory
control-theory
machine-learning
june 2009 by mraginsky
Estimation of Sparse Binary Pairwise Markov Networks using Pseudo-likelihoods
may 2009 by mraginsky
Holger Höfling, Robert Tibshirani; JMLR 10(Apr):883--906, 2009 (via cshalizi)
papers
have-read
statistics
machine-learning
graphical-models
optimization
algorithms
may 2009 by mraginsky
[0903.1468] Taking Advantage of Sparsity in Multi-Task Learning
march 2009 by mraginsky
Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov, Sara van de Geer
papers
to-read
statistics
machine-learning
learning-theory
sparsity
optimization
march 2009 by mraginsky
[0903.0649] The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs
march 2009 by mraginsky
Han Liu, John Lafferty, Larry Wasserman
have-read
statistics
network-data-analysis
machine-learning
graph-theory
march 2009 by mraginsky
[0809.2085] Clustered Multi-Task Learning: A Convex Formulation
march 2009 by mraginsky
Laurent Jacob, Francis Bach (INRIA Rocquencourt), Jean-Philippe Vert
papers
have-read
machine-learning
algorithms
optimization
march 2009 by mraginsky
[0902.4380] Kernel Conjugate Gradient is Universally Consistent
february 2009 by mraginsky
by Gilles Blanchard and Nicole Kraemer
papers
to-read
statistics
machine-learning
learning-theory
optimization
february 2009 by mraginsky
[0902.1284] Multi-Label Prediction via Compressed Sensing
february 2009 by mraginsky
by Daniel Hsu, Sham Kakade, John Langford and Tong Zhang
papers
have-read
machine-learning
compressed-sensing
february 2009 by mraginsky
Conditional Independence Models via Filtrations
february 2009 by mraginsky
by Simon Lunagomez, Sayan Mukherjee, Robert L. Wolpert (Duke University)
papers
to-read
statistics
network-data-analysis
machine-learning
graph-theory
bayesian
february 2009 by mraginsky
Sparse online learning with truncated gradient (pdf)
january 2009 by mraginsky
NIPS 2008 paper by John Langford, Lihong Li, and Tong Zhang
papers
to-read
statistics
machine-learning
optimization
sparsity
filetype:pdf
media:document
january 2009 by mraginsky
[0811.0121] Spectral Connectivity Analysis
november 2008 by mraginsky
Ann B. Lee and Larry Wasserman
papers
to-read
statistics
machine-learning
learning-theory
november 2008 by mraginsky
[0712.4273] Online EM Algorithm for Latent Data Models
august 2008 by mraginsky
Olivier Cappe and Eric Moulines
papers
have-read
machine-learning
optimization
statistics
algorithms
august 2008 by mraginsky
[0808.2902] A History of Markov Chain Monte Carlo--Subjective Recollections from Incomplete Data
august 2008 by mraginsky
by Christian Robert and George Casella
papers
to-read
history_of_statistics
algorithms
interesting
machine-learning
august 2008 by mraginsky
[0807.3050] Dimensionally Distributed Learning: Models and Algorithm
july 2008 by mraginsky
Haipeng Zheng, Sanjeev R. Kulkarni, H. Vincent Poor
papers
to-read
machine-learning
learning-theory
distributed-computing
optimization
algorithms
sensor-networks
july 2008 by mraginsky
[0804.0551] Statistical performance of support vector machines
april 2008 by mraginsky
Gilles Blanchard, Olivier Bousquet, Pascal Massart
papers
have-read
machine-learning
statistics
learning-theory
april 2008 by mraginsky
related tags
active-learning ⊕ adaptive-systems ⊕ ai ⊕ algorithms ⊕ bayesian ⊕ biology ⊕ biostatistics ⊕ blogs ⊕ books ⊕ causality ⊕ channel-coding ⊕ clustering ⊕ coding-theory ⊕ cognitive-science ⊕ communications ⊕ complex-systems ⊕ complexity ⊕ compressed-sensing ⊕ computation ⊕ computer-science ⊕ computer-vision ⊕ conferences ⊕ connectionism ⊕ consensus-algorithms ⊕ control-theory ⊕ convex-programming ⊕ crypto ⊕ cybernetics ⊕ data-mining ⊕ decentralized-control ⊕ decision-making ⊕ distributed-computing ⊕ distributed-systems ⊕ Duke ⊕ dynamical-systems ⊕ economics ⊕ energy ⊕ environment ⊕ epistemology ⊕ ergodic-theory ⊕ essays ⊕ experimental-design ⊕ exponential-families ⊕ feature-selection ⊕ filetype:pdf ⊕ finance ⊕ forensics ⊕ game-theory ⊕ geekery ⊕ genomics ⊕ geometric-functional-analysis ⊕ graph-theory ⊕ graphical-models ⊕ have-read ⊕ heard-the-talk ⊕ history_of_cybernetics ⊕ history_of_statistics ⊕ homepages ⊕ information-theory ⊕ interesting ⊕ journals ⊕ learning-theory ⊕ lecture-notes ⊕ machine-learning ⊖ mathematics ⊕ MCMC ⊕ media:document ⊕ methodology ⊕ model-selection ⊕ multiagent-systems ⊕ network-data-analysis ⊕ neuroscience ⊕ omics ⊕ online-learning ⊕ optimization ⊕ paper ⊕ papers ⊕ papes ⊕ people ⊕ perception ⊕ philosophy ⊕ polemics ⊕ prediction ⊕ prediction-markets ⊕ probability ⊕ random-matrices ⊕ re:active_feature_selection_project ⊕ reference ⊕ reinforcement-learning ⊕ research ⊕ reviews ⊕ russian ⊕ satire ⊕ security ⊕ sensor-networks ⊕ signal-processing ⊕ social-networks ⊕ sparsity ⊕ state-space-models ⊕ statistical-learning ⊕ statistical-physics ⊕ statistics ⊕ submodular-functions ⊕ sustainability ⊕ system-identification ⊕ to-read ⊕ universal-prediction ⊕ via:arthegall ⊕ via:mreid ⊕Copy this bookmark: