mraginsky + learning-theory   77

Concentration Inequalities for the Missing Mass and for Histogram Rule Error (McAllester and Ortiz)
This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem. The concentration inequality for histogram rule error is motivated by the desire to construct a new class of bounds on the generalization error of decision trees.
papers  to-read  learning-theory  measure-concentration  information-theory  probability  re:erasures_and_concentration_project 
december 2011 by mraginsky
[1102.4442] Internal Regret with Partial Monitoring. Calibration-Based Optimal Algorithms
"We provide consistent random algorithms for sequential decision under partial monitoring, i.e. when the decision maker does not observe the outcomes but receives instead random feedback signals. Those algorithms have no internal regret in the sense that, on the set of stages where the decision maker chose his action according to a given law, the average payoff could not have been improved in average by using any other fixed law. They are based on a generalization of calibration, no longer defined in terms of a Voronoi diagram but instead of a Laguerre diagram (a more general concept). This allows us to bound, for the first time in this general framework, the expected average internal -- as well as the usual external -- regret at stage $n$ by $O(n^{-1/3})$, which is known to be optimal."
papers  to-read  learning-theory  online-learning  game-theory  optimization 
february 2011 by mraginsky
Learnability, Stability, and Uniform Convergence
"... there are non-trivial learning problems where uniform convergence does not hold, empirical risk minimization fails, and yet they are learnable using alternative mechanisms. Instead of uniform convergence, we identify stability as the key necessary and sufficient condition for learnability. ... "  I wonder whether there are any connections to smooth estimators of Buescher and Kumar ...
papers  to-read  learning-theory 
november 2010 by mraginsky
[1006.1138] Online Learning: Random Averages, Combinatorial Parameters, and Learnability
"We study learnability in the online learning model. We define several complexity measures which capture the difficulty of learning in a sequential manner. Among these measures are analogues of Rademacher complexity, covering numbers and fat shattering dimension from statistical learning theory. Relationship among these complexity measures, their connection to online learning, and tools for bounding them are provided. In the setting of supervised learning, finiteness of the introduced scale-sensitive parameters is shown to be equivalent to learnability. The complexities we define also ensure uniform convergence non-i.i.d. data, extending the uniform Glivenko-Cantelli type results. We conclude by showing online learnability for an array of examples."
papers  to-read  learning-theory  online-learning  measure-concentration 
june 2010 by mraginsky
[0911.0054] Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity (Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari)
Abstract: "The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model. A central issue is learning these models in high-dimensions, such as when there is some sparsity pattern of the optimal parameter. This work characterizes a certain strong convexity property of general exponential families, which allow their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L_1 regularization."
papers  have-read  statistics  learning-theory  machine-learning  exponential-families  graphical-models  convex-programming  sparsity 
november 2009 by mraginsky
Machine Learning (Theory) » Computability in Artificial Intelligence
"Let me show by analogy why limiting research to computational questions is bad for any field. Except in computer science, computational aspects play little role in the development of fundamental theories: Consider e.g. set theory with axiom of choice, foundations of logic, exact/full minimax for zero-sum games, quantum (field) theory, string theory, … Indeed, at least in physics, every new fundamental theory seems to be less computable than previous ones."
blogs  have-read  science  complexity  computer-science  computation  philosophy  epistemology  AI  learning-theory 
may 2009 by mraginsky

Copy this bookmark:



description:


tags: