mraginsky + machine-learning   137

[1204.6265] Statistical inference for dynamical systems: a review
The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research.
papers  to-read  dynamical-systems  machine-learning  system-identification  ergodic-theory 
4 weeks ago by mraginsky
[0911.0280] Causal Inference on Discrete Data using Additive Noise Models
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that whenever the joint distribution $prob^{(X,Y)}$ admits such a model in one direction, e.g. $Y=f(X)+N, N independent X$, it does not admit the reversed model $X=g(Y)+tilde N, tilde N independent Y$ as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. In an extensive experimental study we show that this algorithm works both on synthetic and real data sets.
papers  to-read  machine-learning  causality 
march 2012 by mraginsky
[1105.2274] Data-Distributed Weighted Majority and Online Mirror Descent
In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent algorithms, as special cases. We show, using both theoretical analysis and experiments, that compared to a single agent: given the same computation time, these distributed algorithms achieve smaller generalization errors; and given the same generalization errors, they can be $N$ times faster.
papers  to-read  machine-learning  online-learning  optimization  distributed-systems 
february 2012 by mraginsky
[1106.4739] Nonasymptotic bounds on the estimation error of MCMC algorithms
"We address the problem of upper bounding the mean square error of MCMC estimators. Our analysis is non-asymptotic. We first establish a general result valid for essentially all ergodic Markov chains encountered in Bayesian computation and a possibly unbounded target function $f.$ The bound is sharp in the sense that the leading term is exactly $\asvar/n$, where $\asvar$ is the CLT asymptotic variance. Next, we proceed to specific assumptions and give explicit computable bounds for geometrically and polynomially ergodic Markov chains. As a corollary we provide results on confidence estimation."
papers  to-read  optimization  algorithms  MCMC  machine-learning 
june 2011 by mraginsky
[1010.4207] Convex Analysis and Optimization with Submodular Functions: a Tutorial (Francis Bach)
"Set-functions appear in many areas of computer science and applied mathematics, such as machine learning, computer vision, operations research or electrical networks. Among these set-functions, submodular functions play an important role, similar to convex functions on vector spaces. In this tutorial, the theory of submodular functions is presented, in a self-contained way, with all results shown from first principles. A good knowledge of convex analysis is assumed."
papers  to-read  lecture-notes  convex-programming  optimization  machine-learning  algorithms  submodular-functions 
november 2010 by mraginsky
[1011.2952] Balanced Reduction of Nonlinear Control Systems in Reproducing Kernel Hilbert Space
"We introduce a novel data-driven order reduction method for nonlinear control systems, drawing on recent progress in machine learning and statistical dimensionality reduction. The method rests on the assumption that the nonlinear system behaves linearly when lifted into a high (or infinite) dimensional feature space where balanced truncation may be carried out implicitly. This leads to a nonlinear reduction map which can be combined with a representation of the system belonging to a reproducing kernel Hilbert space to give a closed, reduced order dynamical system which captures the essential input-output characteristics of the original model. Empirical simulations illustrating the approach are also provided."
papers  to-read  heard-the-talk  control-theory  machine-learning  dynamical-systems 
november 2010 by mraginsky
[1011.0415] Learning Networks of Stochastic Differential Equations
"We consider linear models for stochastic dynamics. To any such model can be associated a network (namely a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval $T$.
We analyze the $\ell_1$-regularized least squares algorithm and, in the setting in which the underlying network is sparse, we prove performance guarantees that are \emph{uniform in the sampling rate} as long as this is sufficiently high. This result substantiates the notion of a well defined `time complexity' for the network inference problem."
papers  to-read  sparsity  complex-systems  system-identification  machine-learning 
november 2010 by mraginsky
[1008.4654] Freezing and Sleeping: Tracking Experts that Learn by Evolving Past Posteriors
Hot damn! "... in this paper we re-examine Freund's problem in case the experts have internal structure that enables them to learn. In this case the problem has two possible interpretations: should the experts learn from all data or only from the subsequence on which they are being tracked? The MPP algorithm solves the first case. Our contribution is to generalise MPP to address the second option. The results we obtain apply to any expert structure that can be formalised using (expert) hidden Markov models. Curiously enough, for our interpretation there are \emph{two} natural reference schemes: freezing and sleeping. For each scheme, we provide an efficient prediction strategy and prove the relevant loss bound."
papers  to-read  online-learning  decision-making  machine-learning  algorithms 
august 2010 by mraginsky
[1008.1758] Stochastic Data Clustering
Looks very interesting: "In 1961 Herbert Simon and Albert Ando published the theory behind the long-term behavior of a dynamical system that can be described by a nearly completely decomposable matrix. Over the past fifty years this theory has been used in a variety of contexts, including queueing theory, computer performance, and ecology. In all these applications, the structure of the system is known and the point of interest is the various states the system passes through on its way to some long-term equilibrium.
This paper looks at this problem from the other direction. That is, we develop a technique for using the evolution of the system to tell us about its initial structure, and we use this technique to develop a new algorithm for data clustering."
papers  to-read  dynamical-systems  machine-learning  data-mining  clustering  via:arthegall 
august 2010 by mraginsky
"A Revisionist History of Connectionism"
A great quote from M. Minsky: "It would seem that Perceptrons has much the same role as The Necronomicon -- that is, often cited but never read."
history_of_cybernetics  AI  cognitive-science  machine-learning  perception  connectionism  neuroscience  essays 
july 2010 by mraginsky
Generalizing Apprenticeship Learning across Hypothesis Classes .oO(ML Discuss)
"This paper develops a generalized apprenticeship learning protocol for reinforcement-learning agents with access to a teacher who provides policy traces (transition and reward observations). We characterize sufficient conditions of the underlying models for efficient apprenticeship learning and link this criteria to two established learnability classes (KWIK and Mistake Bound). We then construct efficient apprenticeship-learning algorithms in a number of domains, including two types of relational MDPs. We instantiate our approach in a software agent and a robot agent that learn effectively from a human teacher."
papers  to-read  machine-learning  reinforcement-learning  control-theory  decision-making 
july 2010 by mraginsky
[0901.2735] State Space Realization Theorems For Data Mining
"In this paper, we consider formal series associated with events, profiles derived from events, and statistical models that make predictions about events. We prove theorems about realizations for these formal series using the language and tools of Hopf algebras."
papers  to-read  state-space-models  machine-learning  computation 
may 2010 by mraginsky
[1004.0557] Applications of Lindeberg Principle in Communications and Statistical Learning
"We use a generalization of the Lindeberg principle developed by Sourav Chatterjee to prove universality properties for various problems in communications, statistical learning and random matrix theory. We also show that these systems can be viewed as the limiting case of a properly defined sparse system. The latter result is useful when the sparse systems are easier to analyze than their dense counterparts. The list of problems we consider is by no means exhaustive. We believe that the ideas can be used in many other problems relevant for information theory."
paper  to-read  random-matrices  probability  machine-learning  signal-processing  information-theory  geometric-functional-analysis 
april 2010 by mraginsky
Reasoning in Reduced Information Spaces
"...a website for research and coordination on Decentralized Techniques for Reasoning in Reduced Information Spaces (ONR 2009 Multi-disciplinary University Research Project)"
decision-making  decentralized-control  machine-learning  distributed-systems  AI  algorithms  optimization 
april 2010 by mraginsky
[0911.0054] Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity (Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari)
Abstract: "The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model. A central issue is learning these models in high-dimensions, such as when there is some sparsity pattern of the optimal parameter. This work characterizes a certain strong convexity property of general exponential families, which allow their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L_1 regularization."
papers  have-read  statistics  learning-theory  machine-learning  exponential-families  graphical-models  convex-programming  sparsity 
november 2009 by mraginsky
[0910.5761] Which graphical models are difficult to learn? (Authors: Jose Bento, Andrea Montanari)
"We consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms systematically fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it)."
papers  to-read  graphical-models  statistics  machine-learning  statistical-physics 
november 2009 by mraginsky
Optimal sequential experimental design (active learning)
The work by Liam Paninski et al. "It is often expensive to run experiments. Thus we would like to choose our experimental stimuli so that we can learn as much as possible in just a few trials. In many cases, we can try to optimize our stimuli online, adapting our experiments to take into account each new observation. We have examined this problem both theoretically (how much does this adaptive experimental approach help, in the long run?) and computationally (is it possible to perform this optimization in real time?), and many interesting questions remain open."
papers  to-read  statistics  machine-learning  information-theory  experimental-design  neuroscience 
september 2009 by mraginsky
« earlier      

related tags

active-learning  adaptive-systems  ai  algorithms  bayesian  biology  biostatistics  blogs  books  causality  channel-coding  clustering  coding-theory  cognitive-science  communications  complex-systems  complexity  compressed-sensing  computation  computer-science  computer-vision  conferences  connectionism  consensus-algorithms  control-theory  convex-programming  crypto  cybernetics  data-mining  decentralized-control  decision-making  distributed-computing  distributed-systems  Duke  dynamical-systems  economics  energy  environment  epistemology  ergodic-theory  essays  experimental-design  exponential-families  feature-selection  filetype:pdf  finance  forensics  game-theory  geekery  genomics  geometric-functional-analysis  graph-theory  graphical-models  have-read  heard-the-talk  history_of_cybernetics  history_of_statistics  homepages  information-theory  interesting  journals  learning-theory  lecture-notes  machine-learning  mathematics  MCMC  media:document  methodology  model-selection  multiagent-systems  network-data-analysis  neuroscience  omics  online-learning  optimization  paper  papers  papes  people  perception  philosophy  polemics  prediction  prediction-markets  probability  random-matrices  re:active_feature_selection_project  reference  reinforcement-learning  research  reviews  russian  satire  security  sensor-networks  signal-processing  social-networks  sparsity  state-space-models  statistical-learning  statistical-physics  statistics  submodular-functions  sustainability  system-identification  to-read  universal-prediction  via:arthegall  via:mreid 

Copy this bookmark:



description:


tags: