mraginsky + have-read   66

Le Cam Made Simple: No-N Asymptotics
"If the log likelihood is approximately quadratic with constant Hessian, then the maximum likelihood estimator (MLE) is approximately normally distributed. No other assumptions are required. We do not need independent and identically distributed data. We do not need the law of large numbers (LLN) or the central limit theorem (CLT). We do not need sample size going to infinity or anything going to infinity.

The theory presented here is a combination of Le Cam style involving local asymptotic normality (LAN) and local asymptotic mixed normality (LAMN) and Cramér style involving derivatives and Fisher information. The main tool is convergence in law of the log likelihood function and its derivatives considered as random elements of a Polish space of continuous functions with the metric of uniform convergence on compact sets. We obtain results for both one-step-Newton estimators and Newton-iterated-to-convergence estimators."
have-read  statistics  estimation  Le_Cam-theory  asymptotic-normality  information-geometry  via:cshalizi 
november 2011 by mraginsky
[1104.5246] How well can we estimate a sparse vector?
"The estimation of a sparse vector in the linear model is a fundamental problem in signal processing, statistics, and compressive sensing. This paper establishes a lower bound on the mean-squared error, which holds regardless of the sensing/design matrix being used and regardless of the estimation procedure. This lower bound very nearly matches the known upper bound one gets by taking a random projection of the sparse vector followed by an $\ell_1$ estimation procedure such as the Dantzig selector. In this sense, compressive sensing techniques cannot essentially be improved."
papers  compressed-sensing  estimation  information-theory  statistics  have-read 
april 2011 by mraginsky
[1103.1861] Distinguishing and integrating aleatoric and epistemic variation in uncertainty quantification
"Much of uncertainty quantification to date has focused on determining the effect of variables modeled probabilistically, and with a known distribution, on some physical or engineering system. We develop methods to obtain information on the system when the distributions of some variables are known exactly, others are known only approximately, and perhaps others are not modeled as random variables at all. The main tool used is the duality between risk-sensitive integrals and relative entropy, and we obtain explicit bounds on standard performance measures (variances, exceedance probabilities) over families of distributions whose distance from a nominal distribution is measured by relative entropy. The evaluation of the risk-sensitive expectations is based on polynomial chaos expansions, which help keep the computational aspects tractable."
papers  probability  information-theory  re:knightian-uncertainty  have-read  has:re 
march 2011 by mraginsky
[1102.3944] Fixed-length lossy compression in the finite blocklength regime
"This paper studies the minimum achievable source coding rate as a function of blocklength $n$ and tolerable distortion level $d$. Tight general achievability and converse bounds are derived that hold at arbitrary fixed blocklength. For stationary memoryless sources with separable distortion, the minimum rate achievable is shown to be closely approximated by $R(d) + \sqrt{\frac{V(d)}{n}} \Qinv{\epsilon}$, where $R(d)$ is the rate-distortion function, $V(d)$ is the {\it rate dispersion}, a characteristic of the source which measures its stochastic variability, $\Qinv{\cdot}$ is the inverse of the standard Gaussian complementary cdf, and $\epsilon$ is the probability that the distortion exceeds $d$."
papers  information-theory  rate-distortion  source-coding  have-read 
february 2011 by mraginsky
[1011.3854] A probabilistic and RIPless theory of compressed sensing
"This paper introduces a simple and very general theory of compressive sensing. In this theory, the sensing mechanism simply selects sensing vectors independently at random from a probability distribution F; it includes all models - e.g. Gaussian, frequency measurements - discussed in the literature, but also provides a framework for new measurement strategies as well. We prove that if the probability distribution F obeys a simple incoherence property and an isotropy property, one can faithfully recover approximately sparse signals from a minimal number of noisy measurements. The novelty is that our recovery results do not require the restricted isometry property (RIP) - they make use of a much weaker notion - or a random model for the signal. As an example, the paper shows that a signal with s nonzero entries can be faithfully recovered from about s log n Fourier coefficients that are contaminated with noise."
papers  have-read  compressed-sensing  sparsity  geometric-functional-analysis  random-matrices  signal-processing 
november 2010 by mraginsky
Algorithmic Thermodynamics « Azimuth
John Baez blogs about his paper with Mike Stay on algorithmic thermodynamics
have-read  blogs  information-theory  complexity  computation  thermodynamics 
october 2010 by mraginsky
A note on exponential families of distributions
"We show that an arbitrary probability distribution can be represented in an exponential form. In physical contexts, this implies that the equilibrium distribution of any classical or quantum dynamical system is expressible in a grand canonical form." Is there anything really new here, though? ETA: No, cf. Barron and Sheu.
papers  have-read  meh  probability  exponential-families  statistical-physics 
august 2010 by mraginsky
[1006.4039] Cooperative Autonomous Online Learning
Abstract: "Online learning is becoming increasingly popular for training on large datasets. However, the sequential nature of online learning requires a centralized learner to store data and update parameters. In this paper, we consider a fully decentralized setting, cooperative autonomous online learning, with a distributed data source. The learners perform learning with local parameters while periodically communicating with a small subset of neighbors to exchange information. We define the regret in terms of an implicit aggregated parameter of the learners for such a setting and prove regret bounds similar to the classical sequential online learning." Damn, looks like I've been scooped!
papers  have-read  decision-making  online-learning  distributed-systems 
june 2010 by mraginsky
[1003.1377] Entropy: The Markov Ordering Approach
"The focus of this article is on entropy and Markov processes. We study the properties of functionals which are invariant with respect to monotonic transformations and analyze two invariant "additivity" properties: (i) existence of a monotonic transformation which makes the functional additive with respect to the joining of independent systems and (ii) existence of a monotonic transformation which makes the functional additive with respect to the partitioning of the space of states. All Lyapunov functionals for Markov chains which have properties (i) and (ii) are derived. We describe the most general ordering of the distribution space, with respect to which all continuous-time Markov processes are monotonic (the {\em Markov order}). The solution differs significantly from the ordering given by the inequality of entropy growth. For inference, this approach results in a convex compact set of conditionally "most random" distributions."
papers  have-read  statistics  probability  majorization  information-theory 
june 2010 by mraginsky
[1002.0042] Lower bounds for the minimax risk using $f$-divergences and applications
Very nice: "A new lower bound involving $f$-divergences between the underlying probability measures is proved for the minimax risk in estimation problems. The proof just uses the convexity of the function $f$ and is extremely simple. Special cases and straightforward corollaries of our bound include well known inequalities for establishing minimax lower bounds such as Fano's inequality, Pinsker's inequality and inequalities based on global entropy conditions. Two applications are provided: a new minimax lower bound for the reconstruction of convex bodies from noisy support function measurements and a different proof of a recent minimax lower bound for the estimation of a covariance matrix."
papers  have-read  information-theory  statistics  heard-the-talk 
june 2010 by mraginsky
Bayesian Learning in Social Networks
Daron Acemoglu, Munther A. Dahleh, Ilan Lobel, Asuman Ozdaglar; NBER Working Paper No. 14040
papers  have-read  social-networks  bayesian-learning  game-theory  decision-making  distributed-systems 
april 2010 by mraginsky
[0911.0054] Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity (Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari)
Abstract: "The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model. A central issue is learning these models in high-dimensions, such as when there is some sparsity pattern of the optimal parameter. This work characterizes a certain strong convexity property of general exponential families, which allow their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L_1 regularization."
papers  have-read  statistics  learning-theory  machine-learning  exponential-families  graphical-models  convex-programming  sparsity 
november 2009 by mraginsky
[0909.2408] Coordination Capacity
Authors: Paul Cuff (Stanford University), Haim Permuter (Stanford University), Thomas Cover (Stanford University)
papers  have-read  information-theory 
september 2009 by mraginsky
Machine Learning (Theory) » Computability in Artificial Intelligence
"Let me show by analogy why limiting research to computational questions is bad for any field. Except in computer science, computational aspects play little role in the development of fundamental theories: Consider e.g. set theory with axiom of choice, foundations of logic, exact/full minimax for zero-sum games, quantum (field) theory, string theory, … Indeed, at least in physics, every new fundamental theory seems to be less computable than previous ones."
blogs  have-read  science  complexity  computer-science  computation  philosophy  epistemology  AI  learning-theory 
may 2009 by mraginsky
[0711.1460] On the Thermodynamic Temperature of a General Distribution
(by Krishna R. Narayanan, Arun R. Srinivasa) Even though one of the references is B. Roy Frieden's execrable attempt to "derive" physics from Fisher's information, the information theory in this paper is definitely interesting.
papers  have-read  statistics  statistical-physics  information-theory 
march 2009 by mraginsky
Murder in Chicago - The New York Review of Books
Umberto Eco's review of a book about the murder of I. Couliano
books  reviews  have-read 
june 2008 by mraginsky

related tags

active-learning  adaptive-systems  AI  algorithms  asymptotic-normality  bayesian  bayesian-learning  biology  blogs  books  coding-theory  communication  communications  complexity  compressed-sensing  computation  computer-science  control-theory  convex-programming  criticism  culture  decentralized-control  decision-making  distributed-computing  distributed-systems  dynamic-programming  dynamical-systems  epistemology  estimation  evolution  experimental-design  exponential-families  fantasy  feedback  feedback-information-theory  filetype:pdf  film  filtering  frequentist  game-theory  geometric-functional-analysis  graph-theory  graphical-models  has:re  has:via  have-read  heard-the-talk  history_of_statistics  information-geometry  information-theory  interesting  learning-theory  Le_Cam-theory  machine-learning  majorization  markets  markov-chains  martingales  mathematics  measure-concentration  media:document  meh  network-data-analysis  online-learning  optimization  papers  people  philosophy  planning  policy  politics  probability  random-matrices  rate-distortion  re:distributed_decisions_project  re:knightian-uncertainty  re:typical_sequences_project  reviews  robust_statistics  sci-fi  science  sensor-networks  signal-processing  social-networks  socialism  source-coding  sparsity  statistical-learning  statistical-physics  statistics  stochastic-search  submodular-functions  thermodynamics  via:arthegall  via:cshalizi 

Copy this bookmark:



description:


tags: