mraginsky + information-theory   181

A Behavioral Defense of Rational Expectations
This paper studies decision making by agents who value optimism, but are unsure of their environment. As in Brunnermeier and Parker (2005), an agent’s optimism is assumed to be tempered by the decision costs it imposes. As in Hansen and Sargent (2008), an agent’s uncertainty about his environment leads him to formulate ‘robust’ decision rules. It is shown that when combined, these two considerations can lead agents to adhere to the Rational Expectations Hypothesis. Rather than being the outcome of the sophisticated statistical calculations of an impassive expected utility maximizer, Rational Expectations can instead be viewed as a useful approximation in environments where agents struggle to strike a balance between doubt and hope.
papers  to-read  rationality  economics  robust_control  decision-making  information-theory  decision-theory 
11 days ago by mraginsky
[1002.1300] Architecture for communication with a fidelity criterion in unknown networks
We prove that in order to communicate independent sources (this is the unicast problem) between various users over an unknown medium to within various distortion levels, it is sufficient to consider source-channel separation based architectures: architectures which first compress the sources to within the corresponding distortion levels followed by reliable communication over the unknown medium. We are reducing the problem of universal rate-distortion communication of independent sources over a network to the universal reliable communication problem over networks. This is a reductionist view. We are not solving the reliable communication problem in networks.
papers  to-read  networks  information-theory  information-structures 
23 days ago by mraginsky
[1205.0858] Controlled Sensing for Multihypothesis Testing
"The problem of multiple hypothesis testing with observation control is considered in both fixed sample size and sequential settings. In the fixed sample size setting, for binary hypothesis testing, it is shown that the optimal exponent for the maximal error probability corresponds to the maximum Chernoff information over the choice of controls. It is also shown that a pure stationary open-loop control policy is asymptotically optimal within the larger class of all causal control policies. For multihypothesis testing in the fixed sample size setting, lower and upper bounds on the optimal error exponent are derived. It is also shown through an example with three hypotheses that the optimal causal control policy can be strictly better than the optimal open-loop control policy. In the sequential setting, a test based on earlier work by Chernoff for binary hypothesis testing, is shown to be first-order asymptotically optimal for multihypothesis testing for vanishing error probabilities. The role of past information and randomization in designing optimal control policies is discussed."
papers  to-read  information-theory  decision-making  feedback 
24 days ago by mraginsky
Fluctuations and response out of equilibrium (C. Maes)
"We discuss some recently visited positions towards dealing with nonequilibria from the
mathematical point of view of Markov networks."
papers  to-read  statistical-physics  thermodynamics  probability  information-theory 
4 weeks ago by mraginsky
[1107.1222] On the information-theoretic structure of distributed measurements
The internal structure of a measuring device, which depends on what its components are and how they are organized, determines how it categorizes its inputs. This paper presents a geometric approach to studying the internal structure of measurements performed by distributed systems such as probabilistic cellular automata. It constructs the quale, a family of sections of a suitably defined presheaf, whose elements correspond to the measurements performed by all subsystems of a distributed system. Using the quale we quantify (i) the information generated by a measurement; (ii) the extent to which a measurement is context-dependent; and (iii) whether a measurement is decomposable into independent submeasurements, which turns out to be equivalent to context-dependence. Finally, we show that only indecomposable measurements are more informative than the sum of their submeasurements.
papers  to-read  dynamical-systems  information-theory  complex-systems  distributed-systems 
5 weeks ago by mraginsky
[0704.0704] Entropic Measure and Wasserstein Diffusion
We construct a new random probability measure on the sphere and on the unit interval which in both cases has a Gibbs structure with the relative entropy functional as Hamiltonian. It satisfies a quasi-invariance formula with respect to the action of smooth diffeomorphism of the sphere and the interval respectively. The associated integration by parts formula is used to construct two classes of diffusion processes on probability measures (on the sphere or the unit interval) by Dirichlet form methods. The first one is closely related to Malliavin's Brownian motion on the homeomorphism group. The second one is a probability valued stochastic perturbation of the heat flow, whose intrinsic metric is the quadratic Wasserstein distance. It may be regarded as the canonical diffusion process on the Wasserstein space.
papers  to-read  optimal-transportation  information-theory  probability  re:FPF_project 
5 weeks ago by mraginsky
[1204.2980] Realizable Rate Distortion Function and Bayesian FIltering Theory
The relation between rate distortion function (RDF) and Bayesian filtering theory is discussed. The relation is established by imposing a causal or realizability constraint on the reconstruction conditional distribution of the RDF, leading to the definition of a causal RDF. Existence of the optimal reconstruction distribution of the causal RDF is shown using the topology of weak convergence of probability measures. The optimal non-stationary causal reproduction conditional distribution of the causal RDF is derived in closed form; it is given by a set of recursive equations which are computed backward in time. The realization of causal RDF is described via the source-channel matching approach, while an example is briefly discussed to illustrate the concepts.
papers  to-read  information-theory  rate-distortion  filtering  bayesian  estimation 
6 weeks ago by mraginsky
[1109.5417] Finite block length converse results for channel coding and assistance by non-signalling correlations
"A non-signalling assisted code is a block code in which the sender and receiver can be correlated in any way that would not by itself allow signalling between them. The maximum rate that can be achieved by such a code for a given block length and error probability is formulated as a linear program. This provides a finite block length converse for classical codes which we show is identical to one derived by Polyanskiy, Poor and Verd'{u}. This shows that the many classical asymptotic and finite converse results which can be derived from this converse apply equally to entanglement assisted codes which have been recently studied. It also gives an explicit linear programming formulation of the converse which has some useful features."
papers  to-read  information-theory  channel-coding 
6 weeks ago by mraginsky
[1203.6502] Quantifying causal influences
"Common methods of causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between n variables. Given the joint distribution of all these variables, the DAG contains all information about how intervening on one variable would change the distribution of the other n-1 variables. It remains, however, a non-trivial question how to quantify the causal influence of one variable on another one.
Here we propose a measure for causal strength that refers to direct effects and measure the "strength of an arrow" or a set of arrows. It is based on a hypothetical intervention that modifies the joint distribution by cutting the corresponding edge. The causal strength is then the relative entropy distance between the old and the new distribution.
We discuss other measures of causal strength like the average causal effect, transfer entropy and information flow and describe their limitations. We argue that our measure is also more appropriate for time series than the known ones.
Finally, we discuss conceptual problems in defining the strength of indirect effects."

There is no mention of directed information, and yet their measure of causal strength seems to be closely related to it.
papers  to-read  causality  information-theory  feedback-information-theory 
9 weeks ago by mraginsky
Majorizing codes and measures (Andreas Maurer)
An information theoretical interpretation of majorizing and minorizing
measures is given. The expression logarithmic in the reciprocal of the
measure of a ball is replaced by the number of bits needed to achieve
desired precision in some convergent code. We also give a local version of
the majorizing bound.
papers  to-read  measure-concentration  probability  information-theory 
9 weeks ago by mraginsky
[0907.4491] Bounding relative entropy by the relative entropy of local specifications in product spaces
For a class of density functions $q^n(x^n)$ on $Bbb R^n$ we prove an inequality between relative entropy and the sum of average conditional relative entropies of the following form: For any density function $p^n(x^n)$ on $Bbb R^n$, $D(p^n||q^n)leq Const. sum_{i=1}^n Bbb E D(p_i(cdot|Y_1,..., Y_{i-1},Y_{i+1},..., Y_n) || Q_i(cdot|Y_1,..., Y_{i-1},Y_{i+1},..., Y_n)),$ where $p_i(cdot|y_1,..., y_{i-1},y_{i+1},..., y_n)$ and $Q_i(cdot|x_1,..., x_{i-1},x_{i+1},..., x_n)$ denote the local specifications for $p^n$ resp. $q^n$, i.e., the conditional density functions of the $i$'th coordinate, given the other coordinates. The constant depends on the properties of the local specifications of $q^n$. The above inequality implies a logarithmic Sobolev inequality for $q^n$. We get an explicit lower bound for the logarithmic Sobolev constant of $q^n$ under the assumptions that: (i) the local specifications of $q^n$ satisfy logarithmic Sobolev inequalities with constants $rho_i$, and (ii) they also satisfy some condition expressing that the mixed partial derivatives of the Hamiltonian of $q^n$ are not too large relative to the logarithmic Sobolev constants $rho_i$. Condition (ii) may be weaker than that used in Otto and Reznikoff's recent paper on the estimation of logarithmic Sobolev constants of spin systems.
papers  to-read  measure-concentration  information-theory  probability  re:erasures_and_concentration_project 
9 weeks ago by mraginsky
[1203.4626] Active Sequential Hypothesis Testing
Consider a decision maker who is responsible to dynamically collect observations so as to enhance his information in a speedy manner about an underlying phenomena of interest while accounting for the penalty of wrong declaration. The special cases of the problem are shown to be that of variable-length coding with feedback and noisy dynamic search. Due to the sequential nature of the problem, the decision maker relies on his current information state to adaptively select the most "informative" sensing action among the available ones.
In this paper, using results in dynamic programming, a lower bound for the optimal total cost is established. Moreover, upper bounds are obtained via an analysis of heuristic policies for dynamic selection of actions. It is shown that the proposed heuristics achieve asymptotic optimality in many practically relevant problems including the problems of variable-length coding with feedback and noisy dynamic search; where asymptotic optimality implies that the relative difference between the total cost achieved by the proposed policies and the optimal total cost approaches zero as the penalty of wrong declaration or the number of hypotheses (hence the number of collected samples) increases. Furthermore, using the obtained bounds, the gain of adaptive selection of sensing actions is shown to be at least logarithmic in the penalty associated with wrong declarations.
papers  to-read  control-theory  information-theory  adaptive-systems  heard-the-talk 
10 weeks ago by mraginsky
[1201.2999] Spatially Coupled Ensembles Universally Achieve Capacity under Belief Propagation
We investigate spatially coupled code ensembles. For transmission over the binary erasure channel, it was recently shown that spatial coupling increases the belief propagation threshold of the ensemble to essentially the maximum a-priori threshold of the underlying component ensemble. This explains why convolutional LDPC ensembles, originally introduced by Felstrom and Zigangirov, perform so well over this channel. We show that the equivalent result holds true for transmission over general binary-input memoryless output-symmetric channels. More precisely, given a desired error probability and a gap to capacity, we can construct a spatially coupled ensemble which fulfills these constraints universally on this class of channels under belief propagation decoding. In fact, most codes in that ensemble have that property. The quantifier universal refers to the single ensemble/code which is good for all channels but we assume that the channel is known at the receiver. The key technical result is a proof that under belief propagation decoding spatially coupled ensembles achieve essentially the area threshold of the underlying uncoupled ensemble. We conclude by discussing some interesting open problems.
papers  to-read  information-theory  channel-coding 
february 2012 by mraginsky
[1007.1033] A Theory of Network Equivalence, Parts I and II
A family of equivalence tools for bounding network capacities is introduced. Part I treats networks of point-to-point channels. The main result is roughly as follows. Given a network of noisy, independent, memoryless point-to-point channels, a collection of communication demands can be met on the given network if and only if it can be met on another network where each noisy channel is replaced by a noiseless bit pipe with throughput equal to the noisy channel capacity. This result was known previously for the case of a single-source multicast demand. The result given here treats general demands -- including, for example, multiple unicast demands -- and applies even when the achievable rate region for the corresponding demands is unknown in the noiseless network. In part II, definitions of upper and lower bounding channel models for general channels are introduced. By these definitions, a collection of communication demands can be met on a network of independent channels if it can be met on a network where each channel is replaced by its lower bounding model andonly if it can be met on a network where each channel is replaced by its upper bounding model. This work derives general conditions under which a network of noiseless bit pipes is an upper or lower bounding model for a multiterminal channel. Example upper and lower bounding models for broadcast, multiple access, and interference channels are given. It is then shown that bounding the difference between the upper and lower bounding models for a given channel yields bounds on the accuracy of network capacity bounds derived using those models. By bounding the capacity of a network of independent noisy channels by the network coding capacity of a network of noiseless bit pipes, this approach represents one step towards the goal of building computational tools for bounding network capacities.
papers  to-read  information-theory  networks  multiagent-systems  heard-the-talk 
february 2012 by mraginsky
[1111.1977] On Refined Versions of the Azuma-Hoeffding Inequality with Applications in Information Theory
This paper derives some refined versions of the Azuma-Hoeffding inequality for discrete-parameter martingales with uniformly bounded jumps, and it considers some of their potential applications in information theory and related topics. The first part of this paper derives these refined inequalities, followed by a discussion on their relations to some classical results in probability theory. It also considers a geometric interpretation of some of these inequalities, providing an insight on the inter-connections between them. The second part exemplifies the use of these refined inequalities in the context of hypothesis testing, information theory, and communication. The paper is concluded with a discussion on some directions for further research. This work is meant to stimulate the use of some refined versions of the Azuma-Hoeffding inequality in information-theoretic aspects.
papers  to-read  information-theory  measure-concentration  martingales  markov-chains  probability 
january 2012 by mraginsky
Concentration Inequalities for the Missing Mass and for Histogram Rule Error (McAllester and Ortiz)
This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem. The concentration inequality for histogram rule error is motivated by the desire to construct a new class of bounds on the generalization error of decision trees.
papers  to-read  learning-theory  measure-concentration  information-theory  probability  re:erasures_and_concentration_project 
december 2011 by mraginsky
[1112.0708] Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing
We study the compressed sensing reconstruction problem for a broad class of random, band-diagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. cite{KrzakalaEtAl}, message passing algorithms can effectively solve the reconstruction problem for spatially coupled measurements with undersampling rates close to the fraction of non-zero coordinates.
We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate $delta$ exceeds the (upper) R'enyi information dimension of the signal, $uRenyi(p_X)$. More precisely, for a sequence of signals of diverging dimension $n$ whose empirical distribution converges to $p_X$, reconstruction is with high probability successful from $uRenyi(p_X), n+o(n)$ measurements taken according to a band diagonal matrix.
For sparse signals, i.e. sequences of dimension $n$ and $k(n)$ non-zero entries, this implies reconstruction from $k(n)+o(n)$ measurements. For `discrete' signals, i.e. signals whose coordinates take a fixed finite set of values, this implies reconstruction from $o(n)$ measurements. The result is robust with respect to noise, does not apply uniquely to random signals, but requires the knowledge of the empirical distribution of the signal $p_X$.
papers  to-read  compressed-sensing  information-theory  message-passing  sparsity  analog-information-theory  Renyi-dimension 
december 2011 by mraginsky
[1111.6822] Optimal Phase Transitions in Compressed Sensing
Compressed sensing deals with efficient recovery of analog signals from linear measurements. This paper presents a statistical study of compressed sensing by modeling the input signal as random processes. Three classes of encoders are considered, namely optimal nonlinear, optimal linear and random linear encoders. Focusing on optimal decoders, we investigate the fundamental tradeoff between measurement rate and reconstruction fidelity gauged by error probability and noise sensitivity in the absence and presence of measurement noise, respectively. Optimal phase transition thresholds are determined as a functional of the input distribution and compared to suboptimal thresholds achieved by various popular reconstruction algorithms. In particular, we show that Gaussian sensing matrices incur no penalty on the phase transition threshold with respect to optimal nonlinear encoding. Our results also provide a rigorous justification of previous results based on replica heuristics in the weak-noise regime.
papers  to-read  information-theory  compressed-sensing  signal-processing  sparsity 
november 2011 by mraginsky
[1110.6654] Pointwise Relations between Information and Estimation
Many of the classical and recent relations between information and estimation in the presence of Gaussian noise can be viewed as identities between expectations of random quantities. These include the I-MMSE relationship of Guo et al.; the relative entropy and mismatched estimation relationship of Verd\'{u}; the relationship between causal estimation and mutual information of Duncan, and its extension to the presence of feedback by Kadota et al.; the relationship between causal and non-casual estimation of Guo et al., and its mismatched version of Weissman. We dispense with the expectations and explore the nature of the pointwise relations between the respective random quantities.
papers  to-read  information-theory  estimation  filtering 
november 2011 by mraginsky
WRITE OR RADIATE? Inscribed Matter vs. Electromagnetic Communication
"We consider information flow via physical transport of inscribed media through space and compare it to information flow via electromagnetic radiation. Somewhat counterintuitively for point to point links, physical transport of inscribed mass is often energetically more efficient by many orders of magnitude than electromagnetic broadcast. And perhaps more surprising, even in a broadcast setting (depending on the receiver density) inscribed mass transport is still energetically more efficient. We discuss the implications of these results for terrestrial telecommunications networks as well as point to point and broadcast communication over great distances with loose delay constraints."
papers  to-read  communication  physics  speculation  information-theory  filetype:pdf  media:document 
august 2011 by mraginsky
[1104.5246] How well can we estimate a sparse vector?
"The estimation of a sparse vector in the linear model is a fundamental problem in signal processing, statistics, and compressive sensing. This paper establishes a lower bound on the mean-squared error, which holds regardless of the sensing/design matrix being used and regardless of the estimation procedure. This lower bound very nearly matches the known upper bound one gets by taking a random projection of the sparse vector followed by an $\ell_1$ estimation procedure such as the Dantzig selector. In this sense, compressive sensing techniques cannot essentially be improved."
papers  compressed-sensing  estimation  information-theory  statistics  have-read 
april 2011 by mraginsky
Thermodynamics and Concentration
"We show that the thermal subadditivity of entropy provides a common basis to derive a strong form of the bounded dierence inequality and related results as well as more recent inequalities applicable to convex Lipschitz functions, random symmetric matrices, shortest travelling sales-men paths and weakly self-bounding functions. We also give two new concentration inequalities."
papers  to-read  measure-concentration  information-theory  probability  statistical-physics  statistical-learning  via:shivak  filetype:pdf  media:document 
march 2011 by mraginsky
[1103.1861] Distinguishing and integrating aleatoric and epistemic variation in uncertainty quantification
"Much of uncertainty quantification to date has focused on determining the effect of variables modeled probabilistically, and with a known distribution, on some physical or engineering system. We develop methods to obtain information on the system when the distributions of some variables are known exactly, others are known only approximately, and perhaps others are not modeled as random variables at all. The main tool used is the duality between risk-sensitive integrals and relative entropy, and we obtain explicit bounds on standard performance measures (variances, exceedance probabilities) over families of distributions whose distance from a nominal distribution is measured by relative entropy. The evaluation of the risk-sensitive expectations is based on polynomial chaos expansions, which help keep the computational aspects tractable."
papers  probability  information-theory  re:knightian-uncertainty  have-read  has:re 
march 2011 by mraginsky
[1102.3944] Fixed-length lossy compression in the finite blocklength regime
"This paper studies the minimum achievable source coding rate as a function of blocklength $n$ and tolerable distortion level $d$. Tight general achievability and converse bounds are derived that hold at arbitrary fixed blocklength. For stationary memoryless sources with separable distortion, the minimum rate achievable is shown to be closely approximated by $R(d) + \sqrt{\frac{V(d)}{n}} \Qinv{\epsilon}$, where $R(d)$ is the rate-distortion function, $V(d)$ is the {\it rate dispersion}, a characteristic of the source which measures its stochastic variability, $\Qinv{\cdot}$ is the inverse of the standard Gaussian complementary cdf, and $\epsilon$ is the probability that the distortion exceeds $d$."
papers  information-theory  rate-distortion  source-coding  have-read 
february 2011 by mraginsky
[1102.0710] Universal Prior Prediction for Communication
"We consider the problem of communicating over an unknown and arbitrarily varying channel, using feedback. This paper focuses on the problem of determining the input behavior, or more specifically, a prior which is used to randomly generate a codebook. We pose the problem of setting the prior as a universal sequential prediction problem using information theoretic abstractions of the communication channel. For the case where the channel is block-wise constant, we show it is possible to asymptotically approach the best rate that can be attained by any system using a fixed prior. For the case where the channel may change on each symbol, we combine a rateless coding scheme with a prior predictor and asymptotically approach the capacity of the average channel universally for every sequence of channels."
papers  to-read  information-theory  feedback-information-theory  prediction  online-learning 
february 2011 by mraginsky
[1012.5457] Concentration of the information in data with log-concave distributions
A concentration property of the functional $-\log f(X)$ is demonstrated, when a random vector $X$ has a log-concave density $f$ on $\R^n$. This concentration property implies in particular an extension of the Shannon-McMillan-Breiman strong ergodic theorem to the class of discrete-time stochastic processes with log-concave marginals.
papers  to-read  information-theory  probability  measure-concentration 
january 2011 by mraginsky
[1011.6451] Informational derivation of Quantum Theory
"Quantum theory can be derived from purely informational principles. Five elementary axioms-causality, perfect distinguishability, ideal compression, local distinguishability, and pure conditioning-define a broad class of theories of information-processing that can be regarded as a standard. One postulate-purification-singles out quantum theory within this class. The main structures of quantum theory, such as the representation of mixed states as convex combinations of perfectly distinguishable pure states, are derived directly from the principles without using the Hilbert space framework."
papers  to-read  quantum-mechanics  information-theory  probability  philosophy-of-science 
december 2010 by mraginsky
[1012.0602] LDPC Codes for Compressed Sensing
"We present a mathematical connection between channel coding and compressed sensing. In particular, we link, on the one hand, \emph{channel coding linear programming decoding (CC-LPD)}, which is a well-known relaxation of maximum-likelihood channel decoding for binary linear codes, and, on the other hand, \emph{compressed sensing linear programming decoding (CS-LPD)}, also known as basis pursuit, which is a widely used linear programming relaxation for the problem of finding the sparsest solution of an under-determined system of linear equations."
papers  to-read  compressed-sensing  channel-coding  information-theory 
december 2010 by mraginsky
Algorithmic Thermodynamics « Azimuth
John Baez blogs about his paper with Mike Stay on algorithmic thermodynamics
have-read  blogs  information-theory  complexity  computation  thermodynamics 
october 2010 by mraginsky
[1009.3958] Approximate Inference and Stochastic Optimal Control
"We propose a novel reformulation of the stochastic optimal control problem as an approximate inference problem, demonstrating, that such a interpretation leads to new practical methods for the original problem. In particular we characterise a novel class of iterative solutions to the stochastic optimal control problem based on a natural relaxation of the exact dual formulation. These theoretical insights are applied to the Reinforcement Learning problem where they lead to new model free, off policy methods for discrete and continuous problems."
papers  to-read  control-theory  reinforcement-learning  optimization  information-theory  graphical-models 
september 2010 by mraginsky
[1009.0679] Optimal Uncertainty Quantification
Hmm: "We propose a rigorous framework for Uncertainty Quantification (UQ) in which the UQ objectives and the assumptions/information set are brought to the forefront. This framework, which we call \emph{Optimal Uncertainty Quantification} (OUQ), is based on the observation that, given a set of assumptions and information about the problem, there exist optimal bounds on uncertainties: these are obtained as extreme values of well-defined optimization problems corresponding to extremizing probabilities of failure, or of deviations, subject to the constraints imposed by the scenarios compatible with the assumptions and information. In particular, this framework does not implicitly impose inappropriate assumptions, nor does it repudiate relevant information. ..."
papers  to-read  information-theory  decision-making  optimization 
september 2010 by mraginsky
The Compression Lemma ← Inductio Ex Machina
ETA: The lemma discussed in the blog entry is quite well-known in information theory (cf. Theorem 5.2.1 in Gray's Entropy and Information Theory).
blogs  information-theory  bayesian  divergence 
august 2010 by mraginsky
[1008.2471] Projection Pursuit through Relative Entropy Minimization
"Projection Pursuit methodology permits to solve the difficult problem of finding an estimate of a density defined on a set of very large dimension. In his seminal article, Huber (see "Projection pursuit", Annals of Statistics, 1985) evidences the interest of the Projection Pursuit method thanks to the factorisation of a density into a Gaussian component and some residual density in a context of Kullback-Leibler divergence maximisation. In the present article, we introduce a new algorithm, and in particular a test for the factorisation of a density estimated from an iid sample."
papers  to-read  statistics  information-theory  divergence 
august 2010 by mraginsky
Carl T. Bergstrom - University of Washington
"Information in biological systems. How do living organisms acquire, store, and make use of information? How and why does communication evolve? How does information flow through biological or social networks?"
people  homepages  research  information-theory  biology  theoretical-biology  communication  social-networks 
august 2010 by mraginsky
[1008.2008] Rate-Constrained Simulation and Source Coding IID Sources
"Necessary conditions for asymptotically optimal sliding-block or stationary codes for source coding and rate-constrained simulation of memoryless sources are presented and used to motivate a design technique for trellis-encoded source coding and rate-constrained simulation. The code structure has intuitive similarities to classic random coding arguments as well as to ``fake process'' methods and alphabet-constrained methods. Experimental evidence shows that the approach provides comparable or superior performance in comparison with previously published methods on common examples, sometimes by significant margins."
papers  to-read  information-theory  rate-distortion  source-coding  simulation 
august 2010 by mraginsky
[1007.5354] Synchronization and Control in Intrinsic and Designed Computation: An Information-Theoretic Analysis of Competing Models of Stochastic Computation
"We adapt tools from information theory to analyze how an observer comes to synchronize with the hidden states of a finitary, stationary stochastic process. We show that synchronization is determined by both the process's internal organization and by an observer's model of it. We analyze these components using the convergence of state-block and block-state entropies, comparing them to the previously known convergence properties of the Shannon block entropy. Along the way, we introduce a hierarchy of information quantifiers as derivatives and integrals of these entropies, which parallels a similar hierarchy introduced for block entropy. We also draw out the duality between synchronization properties and a process's controllability. The tools lead to a new classification of a process's alternative representations in terms of minimality, synchronizability, and unifilarity."
papers  to-read  information-theory  control-theory  complexity  computation 
august 2010 by mraginsky
[1006.3780] Least Squares Superposition Codes of Moderate Dictionary Size, Reliable at Rates up to Capacity
Nice: "For the additive white Gaussian noise channel with average codeword power constraint, new coding methods are devised in which the codewords are sparse superpositions, that is, linear combinations of subsets of vectors from a given design, with the possible messages indexed by the choice of subset. Decoding is by least squares, tailored to the assumed form of linear combination. Communication is shown to be reliable with error probability exponentially small for all rates up to the Shannon capacity."
papers  to-read  information-theory  communication  sparsity  statistics 
june 2010 by mraginsky
[1003.1377] Entropy: The Markov Ordering Approach
"The focus of this article is on entropy and Markov processes. We study the properties of functionals which are invariant with respect to monotonic transformations and analyze two invariant "additivity" properties: (i) existence of a monotonic transformation which makes the functional additive with respect to the joining of independent systems and (ii) existence of a monotonic transformation which makes the functional additive with respect to the partitioning of the space of states. All Lyapunov functionals for Markov chains which have properties (i) and (ii) are derived. We describe the most general ordering of the distribution space, with respect to which all continuous-time Markov processes are monotonic (the {\em Markov order}). The solution differs significantly from the ordering given by the inequality of entropy growth. For inference, this approach results in a convex compact set of conditionally "most random" distributions."
papers  have-read  statistics  probability  majorization  information-theory 
june 2010 by mraginsky
[1001.4448] R'enyi Divergence and Majorization
"R'enyi divergence is related to R'enyi entropy much like information divergence (also called Kullback-Leibler divergence or relative entropy) is related to Shannon's entropy, and comes up in many settings. It was introduced by R'enyi as a measure of information that satisfies almost the same axioms as information divergence. We review the most important properties of R'enyi divergence, including its relation to some other distances. We show how R'enyi divergence appears when the theory of majorization is generalized from the finite to the continuous setting. Finally, R'enyi divergence plays a role in analyzing the number of binary questions required to guess the values of a sequence of random variables."
papers  to-read  information-theory  statistics  majorization  heard-the-talk 
june 2010 by mraginsky
[1002.0042] Lower bounds for the minimax risk using $f$-divergences and applications
Very nice: "A new lower bound involving $f$-divergences between the underlying probability measures is proved for the minimax risk in estimation problems. The proof just uses the convexity of the function $f$ and is extremely simple. Special cases and straightforward corollaries of our bound include well known inequalities for establishing minimax lower bounds such as Fano's inequality, Pinsker's inequality and inequalities based on global entropy conditions. Two applications are provided: a new minimax lower bound for the reconstruction of convex bodies from noisy support function measurements and a different proof of a recent minimax lower bound for the estimation of a covariance matrix."
papers  have-read  information-theory  statistics  heard-the-talk 
june 2010 by mraginsky
[1004.0557] Applications of Lindeberg Principle in Communications and Statistical Learning
"We use a generalization of the Lindeberg principle developed by Sourav Chatterjee to prove universality properties for various problems in communications, statistical learning and random matrix theory. We also show that these systems can be viewed as the limiting case of a properly defined sparse system. The latter result is useful when the sparse systems are easier to analyze than their dense counterparts. The list of problems we consider is by no means exhaustive. We believe that the ideas can be used in many other problems relevant for information theory."
paper  to-read  random-matrices  probability  machine-learning  signal-processing  information-theory  geometric-functional-analysis 
april 2010 by mraginsky
"Pictish writing?" (Language Log)
Yet another EPIC FAIL: "The "new research" is described in Rob Lee, Philip Jonathan, and Pauline Ziman, "Pictish symbols revealed as a written language through application of Shannon entropy", Proceedings of the Royal Society A, in press." Make it stop, make it stop! Shannon fhtagn!
linguistics  information-theory  via:cshalizi 
april 2010 by mraginsky
« earlier      

related tags

active-learning  adaptive-systems  ai  algorithms  analog-information-theory  analysis  bayesian  biology  blogs  books  causality  channel-coding  coding-theory  cognition  combinatorics  communication  communications  complex-systems  complexity  compressed-sensing  computation  computer-science  conferences  control-theory  convex-programming  crypto  cybernetics  data-mining  decentralized-control  decision-making  decision-theory  distributed-computing  distributed-systems  divergence  dynamic-programming  dynamical-systems  economics  entropy-estimation  epsilon-entropy  ergodic-theory  estimation  evolution  experimental-design  feature-selection  feedback  feedback-information-theory  filetype:pdf  film  filtering  frequentist  game-theory  geekery  geometric-functional-analysis  graph-theory  graphical-models  group-testing  has:re  have-read  heard-the-talk  homepages  information-structures  information-theory  interesting  journals  learning-theory  lecture-notes  linguistics  machine-learning  majorization  markov-chains  martingales  mathematics  measure-concentration  media:document  message-passing  model-selection  multiagent-systems  networks  neuroscience  online-learning  optimal-search  optimal-transportation  optimization  paper  papers  people  perception  philosophy-of-science  physics  prediction  probability  psychology  quantization  quantum-mechanics  random-matrices  rate-distortion  rationality  re:active_feature_selection_project  re:causality_project  re:erasures_and_concentration_project  re:FPF_project  re:knightian-uncertainty  re:posterior_matching_project  re:typical_sequences_project  reference  reinforcement-learning  Renyi-dimension  research  reviews  robust_control  russian  security  sensor-networks  Shannon  signal-processing  simulation  slides  social-networks  software  source-coding  sparsity  speculation  statistical-learning  statistical-physics  statistics  stochastic-search  technology  theoretical-biology  thermodynamics  to-read  via:cshalizi  via:shivak 

Copy this bookmark:



description:


tags: