mraginsky + optimization   94

"Approximate dynamic programming via Bellman inequalities" (Wang and Boyd)
"In this paper we introduce new methods for finding functions that lower bound
the value function of a stochastic control problem, using an iterated form of the Bellman inequality. Our method is based on solving linear or semidefinite programs, and
produces both a bound on the optimal objective, as well as a suboptimal policy that
appears to works very well. These results extend and improve bounds obtained by
authors in a previous paper using a single Bellman inequality condition. We describe
the methods in a general setting, and show how they can be applied in specific cases
including the finite state case, constrained linear quadratic control, switched affine
control, and multi-period portfolio investment."
papers  to-read  control-theory  dynamic-programming  optimization 
6 weeks ago by mraginsky
[1202.6258] A Stochastic Gradient Method with an Exponential Convergence Rate for Strongly-Convex Optimization with Finite Training Sets
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. Numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms.
papers  to-read  optimization  convex-programming 
march 2012 by mraginsky
"Online sequential optimization with biased gradients"
We study a class of stochastic optimization problems where though the objective functions
may not be convex, they satisfy a generalization of convexity, called the sequentially convex
property. We focus on a setting where the distribution of the underlying uncertainty is unknown and the manager must make a decision in real-time based on historical data. Since sequentially convex functions are not necessarily convex, they pose difficulties in applying standard adaptive methods for convex optimization. We propose a nonparametric algorithm based on a gradient descent method, and show that the T-season average expected cost di ffers from the minimum cost by at most O(1/sqrt(T)). Our analysis is based on a careful quanti cation of the bias that is inherent in gradient estimation due to the adaptive nature of the problem. We demonstrate the usefulness of the concept of sequential convexity by applying it to three canonical problems in inventory control, capacity allocation, and the lifetime buy decision, under the assumption that the manager does not know the demand distributions and has access only to historical sales (censored demand) data.
papers  to-read  online-learning  optimization  convex-programming 
february 2012 by mraginsky
[1105.2274] Data-Distributed Weighted Majority and Online Mirror Descent
In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent algorithms, as special cases. We show, using both theoretical analysis and experiments, that compared to a single agent: given the same computation time, these distributed algorithms achieve smaller generalization errors; and given the same generalization errors, they can be $N$ times faster.
papers  to-read  machine-learning  online-learning  optimization  distributed-systems 
february 2012 by mraginsky
Sebastien Bubeck, "Introduction to Online Optimization"
lecture notes for ORF570: Special Topics in Statistics and Operations Research, Prediction Games (Princeton)
lecture-notes  online-learning  optimization  prediction 
december 2011 by mraginsky
Huber: On the Non-Optimality of Optimal Procedures
"This paper discusses some subtle, and largely overlooked, differences between conceptual and mathematical optimization goals in statistics, and illustrates them by examples."
papers  to-read  statistics  robust_statistics  optimization  polemics 
july 2011 by mraginsky
[1106.4739] Nonasymptotic bounds on the estimation error of MCMC algorithms
"We address the problem of upper bounding the mean square error of MCMC estimators. Our analysis is non-asymptotic. We first establish a general result valid for essentially all ergodic Markov chains encountered in Bayesian computation and a possibly unbounded target function $f.$ The bound is sharp in the sense that the leading term is exactly $\asvar/n$, where $\asvar$ is the CLT asymptotic variance. Next, we proceed to specific assumptions and give explicit computable bounds for geometrically and polynomially ergodic Markov chains. As a corollary we provide results on confidence estimation."
papers  to-read  optimization  algorithms  MCMC  machine-learning 
june 2011 by mraginsky
[1103.4296] Randomized Smoothing for Stochastic Optimization
"We analyze convergence rates of stochastic optimization procedures for non-smooth convex optimization problems. By combining randomized smoothing techniques with accelerated gradient methods, we obtain optimal variance-based convergence rates of stochastic optimization procedures, both in expectation and with high probability, To the best of our knowledge, these are the first variance-based rates for non-smooth optimization. We give several applications of our results to statistical estimation problems, and provide experimental results that demonstrate the effectiveness of the proposed algorithms. We also describe how a combination of our algorithm with recent work on decentralized optimization yields a distributed stochastic optimization algorithm that is order-optimal."
papers  to-read  optimization  convex-programming 
march 2011 by mraginsky
[1102.4442] Internal Regret with Partial Monitoring. Calibration-Based Optimal Algorithms
"We provide consistent random algorithms for sequential decision under partial monitoring, i.e. when the decision maker does not observe the outcomes but receives instead random feedback signals. Those algorithms have no internal regret in the sense that, on the set of stages where the decision maker chose his action according to a given law, the average payoff could not have been improved in average by using any other fixed law. They are based on a generalization of calibration, no longer defined in terms of a Voronoi diagram but instead of a Laguerre diagram (a more general concept). This allows us to bound, for the first time in this general framework, the expected average internal -- as well as the usual external -- regret at stage $n$ by $O(n^{-1/3})$, which is known to be optimal."
papers  to-read  learning-theory  online-learning  game-theory  optimization 
february 2011 by mraginsky
Decision Making in Large Scale Systems (Spring 2004)
This course is an introduction to the theory and application of large-scale dynamic programming. Topics include Markov decision processes, dynamic programming algorithms, simulation-based algorithms, theory and algorithms for value function approximation, and policy search methods. The course examines games and applications in areas such as dynamic resource allocation, finance and queueing networks.
lecture-notes  control-theory  dynamic-programming  optimization  via:stochastix 
january 2011 by mraginsky
[1010.4207] Convex Analysis and Optimization with Submodular Functions: a Tutorial (Francis Bach)
"Set-functions appear in many areas of computer science and applied mathematics, such as machine learning, computer vision, operations research or electrical networks. Among these set-functions, submodular functions play an important role, similar to convex functions on vector spaces. In this tutorial, the theory of submodular functions is presented, in a self-contained way, with all results shown from first principles. A good knowledge of convex analysis is assumed."
papers  to-read  lecture-notes  convex-programming  optimization  machine-learning  algorithms  submodular-functions 
november 2010 by mraginsky
[1011.1936] Blackwell Approachability and Low-Regret Learning are Equivalent
"We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. We show that Blackwell's result is equivalent, via efficient reductions, to the existence of "no-regret" algorithms for Online Linear Optimization. Indeed, we show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide a useful application of this reduction: the first efficient algorithm for calibrated forecasting."
to-read  papers  online-learning  decision-making  game-theory  optimization 
november 2010 by mraginsky
Subexponential Lower Bound for Randomized Pivot Rules! | Combinatorics and more
"Oliver Friedman, Thomas Dueholm Hansen, and Uri Zwick have managed to prove subexponential lower bounds of the form for ... two basic randomized pivot rules for the simplex algorithm! This is the first result of its kind and deciding if this is possible was an open problem for several decades." And they do it using MDPs!
papers  to-read  computer-science  computation  optimization  linear-programming  dynamic-programming  Markov-decision-processes  lower-bounds 
november 2010 by mraginsky
[1009.3958] Approximate Inference and Stochastic Optimal Control
"We propose a novel reformulation of the stochastic optimal control problem as an approximate inference problem, demonstrating, that such a interpretation leads to new practical methods for the original problem. In particular we characterise a novel class of iterative solutions to the stochastic optimal control problem based on a natural relaxation of the exact dual formulation. These theoretical insights are applied to the Reinforcement Learning problem where they lead to new model free, off policy methods for discrete and continuous problems."
papers  to-read  control-theory  reinforcement-learning  optimization  information-theory  graphical-models 
september 2010 by mraginsky
[1009.0679] Optimal Uncertainty Quantification
Hmm: "We propose a rigorous framework for Uncertainty Quantification (UQ) in which the UQ objectives and the assumptions/information set are brought to the forefront. This framework, which we call \emph{Optimal Uncertainty Quantification} (OUQ), is based on the observation that, given a set of assumptions and information about the problem, there exist optimal bounds on uncertainties: these are obtained as extreme values of well-defined optimization problems corresponding to extremizing probabilities of failure, or of deviations, subject to the constraints imposed by the scenarios compatible with the assumptions and information. In particular, this framework does not implicitly impose inappropriate assumptions, nor does it repudiate relevant information. ..."
papers  to-read  information-theory  decision-making  optimization 
september 2010 by mraginsky
Niemiro: Asymptotics for $M$-Estimators Defined by Convex Minimization
Found this reference while reading Pollard and Hjort's "Asymptotics for minimisers of convex processes"
papers  to-read  statistics  convex-programming  optimization 
september 2010 by mraginsky
Probability Collectives (David H. Wolpert)
"We recently proved that game theory and statistical physics are identical when cast in terms of information theory.

We call the associated formalism Probability Collectives (PC). PC opens many new lines of research, and provides new approaches to problems in distributed control and distributed optimization."
research  papers  reference  control-theory  distributed-systems  decision-making  game-theory  statistical-physics  optimization 
july 2010 by mraginsky
Belief Propagation for Min-cost Network Flow: Convergence and Correctness
"...we present a simple modification of the BP algorithm which gives a fully polynomial-time randomized approximation scheme (FPRAS)...This is the first instance where BP is proved to have fully-polynomial running time."
papers  to-read  optimization  graphical-models  algorithms  complexity  via:shivak 
april 2010 by mraginsky
Reasoning in Reduced Information Spaces
"...a website for research and coordination on Decentralized Techniques for Reasoning in Reduced Information Spaces (ONR 2009 Multi-disciplinary University Research Project)"
decision-making  decentralized-control  machine-learning  distributed-systems  AI  algorithms  optimization 
april 2010 by mraginsky
[0910.2065v1] Decentralized Multi-Armed Bandit with Multiple Distributed Players (Keqin Liu, Qing Zhao)
From the abstract: "We consider multi-armed bandit with distributed players, where each player independently samples one of N stochastic processes with unknown parameters and accrues reward in each slot without information exchange. Users choosing the same arm collide, and none or only one receives reward depending on the collision model. This problem can be formulated as a decentralized multi-armed bandit problem."
papers  to-read  distributed-systems  algorithms  optimization  decision-making  decentralized-control  control-theory 
november 2009 by mraginsky
« earlier      

related tags

active-learning  AI  algorithms  banach-spaces  bayesian  bayesian-learning  blogs  books  combinatorics  complexity  compressed-sensing  computation  computer-science  computer-vision  conferences  consensus-algorithms  control-theory  convex-programming  data-mining  decentralized-control  decision-making  distributed-computing  distributed-systems  dynamic-programming  dynamical-systems  economics  energy  environment  evolution  experimental-design  filetype:pdf  filtering  game-theory  geometric-functional-analysis  graph-theory  graphical-models  have-read  homepages  information-theory  learning-theory  lecture-notes  linear-programming  lower-bounds  machine-learning  markets  Markov-decision-processes  mathematics  matlab  MCMC  media:document  message-passing  model-selection  online-learning  optimal-search  optimization  papers  papes  people  planning  polemics  policy  prediction  preprints  probability  re:active_feature_selection_project  re:distributed_decisions_project  reference  reinforcement-learning  research  robust_statistics  sensor-networks  signal-processing  socialism  software  sparsity  statistical-learning  statistical-physics  statistics  submodular-functions  sustainability  to-read  universal-prediction  via:mreid  via:shivak  via:stochastix  workshops 

Copy this bookmark:



description:


tags: