cshalizi + learning_in_games   15

Online Learning and Online Convex Optimization
"Online learning is a well established learning paradigm which has both theoretical and practical appeals. The goal of online learning is to make a sequence of accurate predictions given knowledge of the correct answer to previous prediction tasks and possibly additional available information. Online learning has been studied in several research fields including game theory, information theory, and machine learning. It also became of great interest to practitioners due the recent emergence of large scale applications such as online advertisement placement and online web ranking. In this survey we provide a modern overview of online learning. Our goal is to give the reader a sense of some of the interesting ideas and in particular to underscore the centrality of convexity in deriving efficient online learning algorithms. We do not mean to be comprehensive but rather to give a high-level, rigorous yet easy to follow, survey."

Ungated version (via shivak): http://www.cs.huji.ac.il/~shais/papers/OLsurvey.pdf
to:NB  online_learning  individual_sequence_prediction  optimization  learning_theory  machine_learning  learning_in_games  low-regret_learning 
8 weeks ago by cshalizi
[math/0701419] Strategies for prediction under imperfect monitoring
"We propose simple randomized strategies for sequential prediction under imperfect monitoring, that is, when the forecaster does not have access to the past outcomes but rather to a feedback signal. The proposed strategies are consistent in the sense that they achieve, asymptotically, the best possible average reward. It was Rustichini (1999) who first proved the existence of such consistent predictors. The forecasters presented here offer the first constructive proof of consistency. Moreover, the proposed algorithms are computationally efficient. We also establish upper bounds for the rates of convergence. In the case of deterministic feedback, these rates are optimal up to logarithmic terms."
to:NB  prediction  individual_sequence_prediction  learning_in_games  re:growing_ensemble_project 
february 2012 by cshalizi
Learning to Compete, Coordinate and Cooperate in Repeated Games Using Reinforcement Learning
"problem of learning in repeated general-sum matrix games when a learning algorithm can observe the actions but not the payoffs of its associates. ... non-stationarity of the environment caused by learning associates in these games, most state-of-the-art algorithms perform poorly ... due to an inability to make profitable compromises.=,,, agent must effectively balance competing objectives, including bounding losses, playing optimally with respect to current beliefs, and taking calculated, but profitable, risks. ... we present ... M-Qubed, a reinforcement learning algorithm ... balancing best-response, cautious, and optimistic learning biases... learns to make profitable compromises across a wide-range of repeated matrix games played with many kinds of learners... average payoffs meet or exceed its maximin value in the limit.., in two-player games... average payoffs approach the value of the Nash bargaining solution... robust behavior in round-robin and evolutionary tournaments..."
machine_learning  learning_in_games  reinforcement_learning  re:do-institutions-evolve  re:knightian_uncertainty  game_theory 
march 2011 by cshalizi
Mertikopoulos, Moustakas: The emergence of rational behavior in the presence of stochastic perturbations
"We study repeated games where players use an exponential learning scheme in order to adapt to an ever-changing environment. If the game’s payoffs are subject to random perturbations, this scheme leads to a new stochastic version of the replicator dynamics that is quite different from the “aggregate shocks” approach of evolutionary game theory. Irrespective of the perturbations’ magnitude, we find that strategies which are dominated (even iteratively) eventually become extinct and that the game’s strict Nash equilibria are stochastically asymptotically stable. We complement our analysis by illustrating these results in the case of congestion games."
evolutionary_game_theory  learning_in_games  replicator_dynamics 
july 2010 by cshalizi
[0903.5328] A Stochastic View of Optimal Regret through Minimax Duality
"We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. ... the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen's inequality for a concave functional--the minimizer over the player's actions of expected loss--defined on a set of probability distributions. ... obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary."
statistics  game_theory  learning_in_games  minimax 
april 2009 by cshalizi
Prediction, Learning, and Games - Cesa-Bianch and Lugosi (@Labyrinth)
How to predict an individual sequence nearly as well as the best possible predictor would, without any probabilistic assumptions.
books:recommended  statistics  machine_learning  universal_prediction  information_theory  learning_in_games  cesa-bianchi.nicolo  lugosi.gabor 
january 2008 by cshalizi

Copy this bookmark:



description:


tags: