Loose Connections - Oldest and Fatherless: The Terrible Secret of Tom Bombadil
13 hours ago
I will never look at Tom Bombadil's character the same way again ...
geekery
speculation
LotR
13 hours ago
JSTOR: The American Economic Review, Vol. 70, No. 3 (Jun., 1980), pp. 393-408
yesterday
"On the impossibility of informationally efficient markets" (Grossman and Stiglitz)
papers
to-read
economics
via:cshalizi
yesterday
The Complexity of Exchange - Axtell - 2005 - The Economic Journal - Wiley Online Library
yesterday
"The computational complexity of two classes of market mechanisms is compared. First the Walrasian interpretation in which prices are centrally computed by an auctioneer. Recent results on the computational complexity are reviewed. The non-polynomial complexity of these algorithms makes Walrasian general equilibrium an implausible conception. Second, a decentralised picture of market processes is described, involving concurrent exchange within transient coalitions of agents. These processes feature price dispersion, yield allocations that are not in the core, modify the distribution of wealth, are always stable, but path-dependent. Replacing the Walrasian framing of markets requires substantial revision of conventional wisdom concerning markets."
papers
to-read
economics
decision-making
via:cshalizi
yesterday
Foundations of statistical machine learning and neural networks. The Vapnik-Chervonenkis theory - Winter 2012
3 days ago
Lecture notes from Vladimir Pestov's course
lecture-notes
statistical-learning
probability
teaching
3 days ago
[1106.1401] Volatility of Power Grids under Real-Time Pricing
3 days ago
The paper proposes a framework for modeling and analysis of the dynamics of supply, demand, and clearing prices in power system with real-time retail pricing and information asymmetry. Real-time retail pricing is characterized by passing on the real-time wholesale electricity prices to the end consumers, and is shown to create a closed-loop feedback system between the physical layer and the market layer of the power system. In the absence of a carefully designed control law, such direct feedback between the two layers could increase volatility and lower the system's robustness to uncertainty in demand and generation. A new notion of generalized price-elasticity is introduced, and it is shown that price volatility can be characterized in terms of the system's maximal relative price elasticity, defined as the maximal ratio of the generalized price-elasticity of consumers to that of the producers. As this ratio increases, the system becomes more volatile, and eventually, unstable. As new demand response technologies and distributed storage increase the price-elasticity of demand, the architecture under examination is likely to lead to increased volatility and possibly instability. This highlights the need for assessing architecture systematically and in advance, in order to optimally strike the trade-offs between volatility, economic efficiency, and system reliability.
papers
to-read
economics
markets
electricity-markets
3 days ago
Cognitive Democracy — Crooked Timber
4 days ago
"Over the last couple of years, Cosma Shalizi and I have been working together on various things, including, inter alia, the relationship between complex systems, democracy and the Internet. These are big unwieldy topics, and trying to think about them systematically is hard. Even so, we’ve gotten to the point where we at least feel ready to start throwing stuff at a wider audience, to get feedback on what works and what doesn’t. Here’s a paper we’re working on, which argues that we should (for some purposes at least), think of markets, hierarchy and democracy in terms of their capacity to solve complex collective problems, makes the case that democracy will on average do the job a lot better than the other two ways, and then looks at different forms of collective information processing on the Internet as experiments that democracies can learn from."
to-read
blogs
decision-making
institutions
markets
democracy
policy
4 days ago
A Behavioral Defense of Rational Expectations
11 days ago
This paper studies decision making by agents who value optimism, but are unsure of their environment. As in Brunnermeier and Parker (2005), an agent’s optimism is assumed to be tempered by the decision costs it imposes. As in Hansen and Sargent (2008), an agent’s uncertainty about his environment leads him to formulate ‘robust’ decision rules. It is shown that when combined, these two considerations can lead agents to adhere to the Rational Expectations Hypothesis. Rather than being the outcome of the sophisticated statistical calculations of an impassive expected utility maximizer, Rational Expectations can instead be viewed as a useful approximation in environments where agents struggle to strike a balance between doubt and hope.
papers
to-read
rationality
economics
robust_control
decision-making
information-theory
decision-theory
11 days ago
[1205.1099] From Knothe's rearrangement to Brenier's optimal transport map
23 days ago
"The Brenier optimal map and Knothe-Rosenblatt rearrangement are two instances of a transport map, that is to say a map sending one measure onto another. The main interest of the former is that it solves the Monge-Kantorovich optimal transport problem, while the latter is very easy to compute, being given by an explicit formula. A few years ago, Carlier, Galichon, and Santambrogio showed that the Knothe rearrangement could be seen as the limit of the Brenier map when the quadratic cost degenerates. In this paper, we prove that on the torus (to avoid boundary issues), when all the data are smooth, the evolution is also smooth, and is entirely determined by a PDE for the Kantorovich potential (which determines the map), with a subtle initial condition. The proof requires the use of the Nash-Moser inverse function theorem. This result generalizes the ode discovered by Carlier, Galichon, and Santambrogio when one measure is uniform and the other is discrete, and could pave to way to new numerical methods for optimal transportation."
papers
to-read
optimal-transportation
probability
PDEs
23 days ago
[1002.1300] Architecture for communication with a fidelity criterion in unknown networks
23 days ago
We prove that in order to communicate independent sources (this is the unicast problem) between various users over an unknown medium to within various distortion levels, it is sufficient to consider source-channel separation based architectures: architectures which first compress the sources to within the corresponding distortion levels followed by reliable communication over the unknown medium. We are reducing the problem of universal rate-distortion communication of independent sources over a network to the universal reliable communication problem over networks. This is a reductionist view. We are not solving the reliable communication problem in networks.
papers
to-read
networks
information-theory
information-structures
23 days ago
[1205.1005] Some Refinements of Large Deviation Tail Probabilities
24 days ago
"We study tail probabilities via some Gaussian approximations. Our results make refinements to large deviation theory. The proof builds on classical results by Bahadur and Rao. Binomial distributions and their tail probabilities are discussed in more detail."
papers
to-read
statistics
probability
large-deviations
measure-concentration
statistical-physics
24 days ago
[1205.0858] Controlled Sensing for Multihypothesis Testing
24 days ago
"The problem of multiple hypothesis testing with observation control is considered in both fixed sample size and sequential settings. In the fixed sample size setting, for binary hypothesis testing, it is shown that the optimal exponent for the maximal error probability corresponds to the maximum Chernoff information over the choice of controls. It is also shown that a pure stationary open-loop control policy is asymptotically optimal within the larger class of all causal control policies. For multihypothesis testing in the fixed sample size setting, lower and upper bounds on the optimal error exponent are derived. It is also shown through an example with three hypotheses that the optimal causal control policy can be strictly better than the optimal open-loop control policy. In the sequential setting, a test based on earlier work by Chernoff for binary hypothesis testing, is shown to be first-order asymptotically optimal for multihypothesis testing for vanishing error probabilities. The role of past information and randomization in designing optimal control policies is discussed."
papers
to-read
information-theory
decision-making
feedback
24 days ago
[1204.6265] Statistical inference for dynamical systems: a review
4 weeks ago
The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research.
papers
to-read
dynamical-systems
machine-learning
system-identification
ergodic-theory
4 weeks ago
[1204.5721] Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
4 weeks ago
Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the Thirties, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this survey, we focus on two extreme cases in which the analysis of regret is particularly simple and elegant: i.i.d. payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, we also analyze some of the most important variants and extensions, such as the contextual bandit model.
papers
to-read
bandit-problems
online-learning
control-theory
dynamic-programming
decision-making
4 weeks ago
Fluctuations and response out of equilibrium (C. Maes)
4 weeks ago
"We discuss some recently visited positions towards dealing with nonequilibria from the
mathematical point of view of Markov networks."
papers
to-read
statistical-physics
thermodynamics
probability
information-theory
mathematical point of view of Markov networks."
4 weeks ago
[1107.1222] On the information-theoretic structure of distributed measurements
5 weeks ago
The internal structure of a measuring device, which depends on what its components are and how they are organized, determines how it categorizes its inputs. This paper presents a geometric approach to studying the internal structure of measurements performed by distributed systems such as probabilistic cellular automata. It constructs the quale, a family of sections of a suitably defined presheaf, whose elements correspond to the measurements performed by all subsystems of a distributed system. Using the quale we quantify (i) the information generated by a measurement; (ii) the extent to which a measurement is context-dependent; and (iii) whether a measurement is decomposable into independent submeasurements, which turns out to be equivalent to context-dependence. Finally, we show that only indecomposable measurements are more informative than the sum of their submeasurements.
papers
to-read
dynamical-systems
information-theory
complex-systems
distributed-systems
5 weeks ago
Game of Thrones, US Politics Edition
5 weeks ago
I disagree on the Daenerys analogy, though -- should be Cersei instead.
politics
satire
via:cshalizi
5 weeks ago
[1111.2687] Ricci curvature of finite Markov chains via convexity of the entropy
5 weeks ago
We define and study a new notion of Ricci curvature that applies to Markov chains on discrete spaces. This notion relies on geodesic convexity of the entropy and is analogous to the one introduced by Lott, Sturm, and Villani for geodesic measure spaces. In order to apply to the discrete setting, the role of the Wasserstein metric is taken over by a different metric, having the property that continuous time Markov chains are gradient flows of the entropy.
Using this notion of Ricci curvature we prove discrete analogues of fundamental results by Bakry--Emery and Otto--Villani. Furthermore we show that Ricci curvature bounds are preserved under tensorisation. As a special case we obtain the sharp Ricci curvature lower bound for the discrete hypercube.
papers
to-read
markov-chains
probability
measure-concentration
Using this notion of Ricci curvature we prove discrete analogues of fundamental results by Bakry--Emery and Otto--Villani. Furthermore we show that Ricci curvature bounds are preserved under tensorisation. As a special case we obtain the sharp Ricci curvature lower bound for the discrete hypercube.
5 weeks ago
[0704.0704] Entropic Measure and Wasserstein Diffusion
5 weeks ago
We construct a new random probability measure on the sphere and on the unit interval which in both cases has a Gibbs structure with the relative entropy functional as Hamiltonian. It satisfies a quasi-invariance formula with respect to the action of smooth diffeomorphism of the sphere and the interval respectively. The associated integration by parts formula is used to construct two classes of diffusion processes on probability measures (on the sphere or the unit interval) by Dirichlet form methods. The first one is closely related to Malliavin's Brownian motion on the homeomorphism group. The second one is a probability valued stochastic perturbation of the heat flow, whose intrinsic metric is the quadratic Wasserstein distance. It may be regarded as the canonical diffusion process on the Wasserstein space.
papers
to-read
optimal-transportation
information-theory
probability
re:FPF_project
5 weeks ago
[1204.2980] Realizable Rate Distortion Function and Bayesian FIltering Theory
6 weeks ago
The relation between rate distortion function (RDF) and Bayesian filtering theory is discussed. The relation is established by imposing a causal or realizability constraint on the reconstruction conditional distribution of the RDF, leading to the definition of a causal RDF. Existence of the optimal reconstruction distribution of the causal RDF is shown using the topology of weak convergence of probability measures. The optimal non-stationary causal reproduction conditional distribution of the causal RDF is derived in closed form; it is given by a set of recursive equations which are computed backward in time. The realization of causal RDF is described via the source-channel matching approach, while an example is briefly discussed to illustrate the concepts.
papers
to-read
information-theory
rate-distortion
filtering
bayesian
estimation
6 weeks ago
"Approximate dynamic programming via Bellman inequalities" (Wang and Boyd)
6 weeks ago
"In this paper we introduce new methods for finding functions that lower bound
the value function of a stochastic control problem, using an iterated form of the Bellman inequality. Our method is based on solving linear or semidefinite programs, and
produces both a bound on the optimal objective, as well as a suboptimal policy that
appears to works very well. These results extend and improve bounds obtained by
authors in a previous paper using a single Bellman inequality condition. We describe
the methods in a general setting, and show how they can be applied in specific cases
including the finite state case, constrained linear quadratic control, switched affine
control, and multi-period portfolio investment."
papers
to-read
control-theory
dynamic-programming
optimization
the value function of a stochastic control problem, using an iterated form of the Bellman inequality. Our method is based on solving linear or semidefinite programs, and
produces both a bound on the optimal objective, as well as a suboptimal policy that
appears to works very well. These results extend and improve bounds obtained by
authors in a previous paper using a single Bellman inequality condition. We describe
the methods in a general setting, and show how they can be applied in specific cases
including the finite state case, constrained linear quadratic control, switched affine
control, and multi-period portfolio investment."
6 weeks ago
[1204.2975] Multiple Objects: Error Exponents in Hypotheses Testing and Identification
6 weeks ago
"We servey [sic!] a series of investigations of optimal testing of multiple hypotheses conserning various multiobject models. These studies are a bright instance of application of methods and technics developed in Shannon information theory to solution of typical statistical problems."
papers
to-read
feedback-information-theory
statistics
active-learning
decision-making
6 weeks ago
[1109.5417] Finite block length converse results for channel coding and assistance by non-signalling correlations
6 weeks ago
"A non-signalling assisted code is a block code in which the sender and receiver can be correlated in any way that would not by itself allow signalling between them. The maximum rate that can be achieved by such a code for a given block length and error probability is formulated as a linear program. This provides a finite block length converse for classical codes which we show is identical to one derived by Polyanskiy, Poor and Verd'{u}. This shows that the many classical asymptotic and finite converse results which can be derived from this converse apply equally to entanglement assisted codes which have been recently studied. It also gives an explicit linear programming formulation of the converse which has some useful features."
papers
to-read
information-theory
channel-coding
6 weeks ago
Entropy and the value of information for investors (Cabrales et al.)
8 weeks ago
Consider any investor who fears ruin facing any set of invest-
ments that satisfy no-arbitrage. Before investing, he can purchase informa-
tion about the state of nature in the form of an information structure. Given
his prior, information structure is more informative than information struc-
ture if whenever he rejects at some price, he also rejects at that price.
We show that this complete informativeness ordering is represented by the
decrease in entropy of his beliefs, regardless of his preferences, initial wealth
or investment problem. It is also shown that no prior-independent informa-
tiveness ordering based on similar premises exists.
papers
to-read
economics
decision-making
comparison-of-experiments
re:knightian-uncertainty
ments that satisfy no-arbitrage. Before investing, he can purchase informa-
tion about the state of nature in the form of an information structure. Given
his prior, information structure is more informative than information struc-
ture if whenever he rejects at some price, he also rejects at that price.
We show that this complete informativeness ordering is represented by the
decrease in entropy of his beliefs, regardless of his preferences, initial wealth
or investment problem. It is also shown that no prior-independent informa-
tiveness ordering based on similar premises exists.
8 weeks ago
Information Order in Decision and Agency Problems (Ian Jewitt)
8 weeks ago
This paper establishes information order in a number of special settings. We
discuss Lehmann’s condition, relate it to Blackwell’s and discuss its application in
statistical decision problems, certain Bayesian games and in principal agent problems. We distinguish between KR (Karlin Rubin) monotone preferences and single
crossing preferences and between Lehmann’s and Persico’s theorems. A generalisation of Lehmann’s condition is given that is applicable to situations where monotone
likelihood ratio does not apply. The principal agent problem is treated both via
the first-order approach and for the general finite action model. Points of contact
with Lehmann’s condition are identified in both cases.
papers
to-read
economics
decision-making
comparison-of-experiments
re:knightian-uncertainty
discuss Lehmann’s condition, relate it to Blackwell’s and discuss its application in
statistical decision problems, certain Bayesian games and in principal agent problems. We distinguish between KR (Karlin Rubin) monotone preferences and single
crossing preferences and between Lehmann’s and Persico’s theorems. A generalisation of Lehmann’s condition is given that is applicable to situations where monotone
likelihood ratio does not apply. The principal agent problem is treated both via
the first-order approach and for the general finite action model. Points of contact
with Lehmann’s condition are identified in both cases.
8 weeks ago
[1203.6502] Quantifying causal influences
9 weeks ago
"Common methods of causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between n variables. Given the joint distribution of all these variables, the DAG contains all information about how intervening on one variable would change the distribution of the other n-1 variables. It remains, however, a non-trivial question how to quantify the causal influence of one variable on another one.
Here we propose a measure for causal strength that refers to direct effects and measure the "strength of an arrow" or a set of arrows. It is based on a hypothetical intervention that modifies the joint distribution by cutting the corresponding edge. The causal strength is then the relative entropy distance between the old and the new distribution.
We discuss other measures of causal strength like the average causal effect, transfer entropy and information flow and describe their limitations. We argue that our measure is also more appropriate for time series than the known ones.
Finally, we discuss conceptual problems in defining the strength of indirect effects."
There is no mention of directed information, and yet their measure of causal strength seems to be closely related to it.
papers
to-read
causality
information-theory
feedback-information-theory
Here we propose a measure for causal strength that refers to direct effects and measure the "strength of an arrow" or a set of arrows. It is based on a hypothetical intervention that modifies the joint distribution by cutting the corresponding edge. The causal strength is then the relative entropy distance between the old and the new distribution.
We discuss other measures of causal strength like the average causal effect, transfer entropy and information flow and describe their limitations. We argue that our measure is also more appropriate for time series than the known ones.
Finally, we discuss conceptual problems in defining the strength of indirect effects."
There is no mention of directed information, and yet their measure of causal strength seems to be closely related to it.
9 weeks ago
Majorizing codes and measures (Andreas Maurer)
9 weeks ago
An information theoretical interpretation of majorizing and minorizing
measures is given. The expression logarithmic in the reciprocal of the
measure of a ball is replaced by the number of bits needed to achieve
desired precision in some convergent code. We also give a local version of
the majorizing bound.
papers
to-read
measure-concentration
probability
information-theory
measures is given. The expression logarithmic in the reciprocal of the
measure of a ball is replaced by the number of bits needed to achieve
desired precision in some convergent code. We also give a local version of
the majorizing bound.
9 weeks ago
Social Democracy for the 21st Century: A Post Keynesian Perspective: Hayek the Evil Socialist
9 weeks ago
"He can be seen through a reading of his book Law, Legislation and Liberty: A New Statement of the Liberal Principles of Justice and Political Economy (vol. 3) (London, 1979), pp. 41–64.
In particular, this striking passage could have been written by a Keynesian:
“On the other hand, it is merely common sense that government, as the biggest spender and investor whose activities cannot be guided wholly by profitability, and which for finance is in a great measure independent of the state of the capital market, should so far as practicable distribute its expenditure over time in such a manner that it will step in when private investment flags, and thereby employ resources for public investment at the least cost and with the greatest benefit to society.” (Hayek 1979: 59)."
blogs
economics
libertarianism
In particular, this striking passage could have been written by a Keynesian:
“On the other hand, it is merely common sense that government, as the biggest spender and investor whose activities cannot be guided wholly by profitability, and which for finance is in a great measure independent of the state of the capital market, should so far as practicable distribute its expenditure over time in such a manner that it will step in when private investment flags, and thereby employ resources for public investment at the least cost and with the greatest benefit to society.” (Hayek 1979: 59)."
9 weeks ago
[0907.4491] Bounding relative entropy by the relative entropy of local specifications in product spaces
9 weeks ago
For a class of density functions $q^n(x^n)$ on $Bbb R^n$ we prove an inequality between relative entropy and the sum of average conditional relative entropies of the following form: For any density function $p^n(x^n)$ on $Bbb R^n$, $D(p^n||q^n)leq Const. sum_{i=1}^n Bbb E D(p_i(cdot|Y_1,..., Y_{i-1},Y_{i+1},..., Y_n) || Q_i(cdot|Y_1,..., Y_{i-1},Y_{i+1},..., Y_n)),$ where $p_i(cdot|y_1,..., y_{i-1},y_{i+1},..., y_n)$ and $Q_i(cdot|x_1,..., x_{i-1},x_{i+1},..., x_n)$ denote the local specifications for $p^n$ resp. $q^n$, i.e., the conditional density functions of the $i$'th coordinate, given the other coordinates. The constant depends on the properties of the local specifications of $q^n$. The above inequality implies a logarithmic Sobolev inequality for $q^n$. We get an explicit lower bound for the logarithmic Sobolev constant of $q^n$ under the assumptions that: (i) the local specifications of $q^n$ satisfy logarithmic Sobolev inequalities with constants $rho_i$, and (ii) they also satisfy some condition expressing that the mixed partial derivatives of the Hamiltonian of $q^n$ are not too large relative to the logarithmic Sobolev constants $rho_i$. Condition (ii) may be weaker than that used in Otto and Reznikoff's recent paper on the estimation of logarithmic Sobolev constants of spin systems.
papers
to-read
measure-concentration
information-theory
probability
re:erasures_and_concentration_project
9 weeks ago
[1203.4626] Active Sequential Hypothesis Testing
10 weeks ago
Consider a decision maker who is responsible to dynamically collect observations so as to enhance his information in a speedy manner about an underlying phenomena of interest while accounting for the penalty of wrong declaration. The special cases of the problem are shown to be that of variable-length coding with feedback and noisy dynamic search. Due to the sequential nature of the problem, the decision maker relies on his current information state to adaptively select the most "informative" sensing action among the available ones.
In this paper, using results in dynamic programming, a lower bound for the optimal total cost is established. Moreover, upper bounds are obtained via an analysis of heuristic policies for dynamic selection of actions. It is shown that the proposed heuristics achieve asymptotic optimality in many practically relevant problems including the problems of variable-length coding with feedback and noisy dynamic search; where asymptotic optimality implies that the relative difference between the total cost achieved by the proposed policies and the optimal total cost approaches zero as the penalty of wrong declaration or the number of hypotheses (hence the number of collected samples) increases. Furthermore, using the obtained bounds, the gain of adaptive selection of sensing actions is shown to be at least logarithmic in the penalty associated with wrong declarations.
papers
to-read
control-theory
information-theory
adaptive-systems
heard-the-talk
In this paper, using results in dynamic programming, a lower bound for the optimal total cost is established. Moreover, upper bounds are obtained via an analysis of heuristic policies for dynamic selection of actions. It is shown that the proposed heuristics achieve asymptotic optimality in many practically relevant problems including the problems of variable-length coding with feedback and noisy dynamic search; where asymptotic optimality implies that the relative difference between the total cost achieved by the proposed policies and the optimal total cost approaches zero as the penalty of wrong declaration or the number of hypotheses (hence the number of collected samples) increases. Furthermore, using the obtained bounds, the gain of adaptive selection of sensing actions is shown to be at least logarithmic in the penalty associated with wrong declarations.
10 weeks ago
[1104.1303] A new characterization of Talagrand's transport-entropy inequalities and applications
12 weeks ago
Wow! "We show that Talagrand's transport inequality is equivalent to a restricted logarithmic Sobolev inequality. This result clarifies the links between these two important functional inequalities. As an application, we give the first proof of the fact that Talagrand's inequality is stable under bounded perturbations."
papers
to-read
measure-concentration
optimal-transportation
probability
re:erasures_and_concentration_project
12 weeks ago
[0911.0280] Causal Inference on Discrete Data using Additive Noise Models
march 2012
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that whenever the joint distribution $prob^{(X,Y)}$ admits such a model in one direction, e.g. $Y=f(X)+N, N independent X$, it does not admit the reversed model $X=g(Y)+tilde N, tilde N independent Y$ as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. In an extensive experimental study we show that this algorithm works both on synthetic and real data sets.
papers
to-read
machine-learning
causality
march 2012
[1202.6258] A Stochastic Gradient Method with an Exponential Convergence Rate for Strongly-Convex Optimization with Finite Training Sets
march 2012
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. Numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms.
papers
to-read
optimization
convex-programming
march 2012
Certainty equivalence and model uncertainty
february 2012
Simon’s and Theil’s certainty equivalence property justifies a convenient algorithm for solving dynamic programming problems with quadratic objectives and linear transition laws: first, optimize under perfect foresight, then substitute optimal forecasts for unknown future values. A similar decomposition into separate optimization and forecasting steps prevails when a decision maker wants a decision rule that is robust to model misspecification. Concerns about model misspecification leave the first step of the algorithm intact and affect only the second step of forecasting the future. The decision maker attains robustness by making forecasts with a distorted model that twists probabilities relative to his approximating model. The appropriate twisting emerges from a two-player zero-sum dynamic game.
papers
to-read
adaptive-systems
control-theory
dynamic-programming
estimation
game-theory
uncertainty
february 2012
Hansen, L.P. and Sargent, T.J.: Robustness.
february 2012
The standard theory of decision making under uncertainty advises the decision maker to form a statistical model linking outcomes to decisions and then to choose the optimal distribution of outcomes. This assumes that the decision maker trusts the model completely. But what should a decision maker do if the model cannot be trusted?
Lars Hansen and Thomas Sargent, two leading macroeconomists, push the field forward as they set about answering this question. They adapt robust control techniques and apply them to economics. By using this theory to let decision makers acknowledge misspecification in economic modeling, the authors develop applications to a variety of problems in dynamic macroeconomics.
Technical, rigorous, and self-contained, this book will be useful for macroeconomists who seek to improve the robustness of decision-making processes.
books-noted
economics
robust_control
game-theory
multiagent-systems
uncertainty
Lars Hansen and Thomas Sargent, two leading macroeconomists, push the field forward as they set about answering this question. They adapt robust control techniques and apply them to economics. By using this theory to let decision makers acknowledge misspecification in economic modeling, the authors develop applications to a variety of problems in dynamic macroeconomics.
Technical, rigorous, and self-contained, this book will be useful for macroeconomists who seek to improve the robustness of decision-making processes.
february 2012
[1201.2999] Spatially Coupled Ensembles Universally Achieve Capacity under Belief Propagation
february 2012
We investigate spatially coupled code ensembles. For transmission over the binary erasure channel, it was recently shown that spatial coupling increases the belief propagation threshold of the ensemble to essentially the maximum a-priori threshold of the underlying component ensemble. This explains why convolutional LDPC ensembles, originally introduced by Felstrom and Zigangirov, perform so well over this channel. We show that the equivalent result holds true for transmission over general binary-input memoryless output-symmetric channels. More precisely, given a desired error probability and a gap to capacity, we can construct a spatially coupled ensemble which fulfills these constraints universally on this class of channels under belief propagation decoding. In fact, most codes in that ensemble have that property. The quantifier universal refers to the single ensemble/code which is good for all channels but we assume that the channel is known at the receiver. The key technical result is a proof that under belief propagation decoding spatially coupled ensembles achieve essentially the area threshold of the underlying uncoupled ensemble. We conclude by discussing some interesting open problems.
papers
to-read
information-theory
channel-coding
february 2012
"Online sequential optimization with biased gradients"
february 2012
We study a class of stochastic optimization problems where though the objective functions
may not be convex, they satisfy a generalization of convexity, called the sequentially convex
property. We focus on a setting where the distribution of the underlying uncertainty is unknown and the manager must make a decision in real-time based on historical data. Since sequentially convex functions are not necessarily convex, they pose difficulties in applying standard adaptive methods for convex optimization. We propose a nonparametric algorithm based on a gradient descent method, and show that the T-season average expected cost differs from the minimum cost by at most O(1/sqrt(T)). Our analysis is based on a careful quantication of the bias that is inherent in gradient estimation due to the adaptive nature of the problem. We demonstrate the usefulness of the concept of sequential convexity by applying it to three canonical problems in inventory control, capacity allocation, and the lifetime buy decision, under the assumption that the manager does not know the demand distributions and has access only to historical sales (censored demand) data.
papers
to-read
online-learning
optimization
convex-programming
may not be convex, they satisfy a generalization of convexity, called the sequentially convex
property. We focus on a setting where the distribution of the underlying uncertainty is unknown and the manager must make a decision in real-time based on historical data. Since sequentially convex functions are not necessarily convex, they pose difficulties in applying standard adaptive methods for convex optimization. We propose a nonparametric algorithm based on a gradient descent method, and show that the T-season average expected cost differs from the minimum cost by at most O(1/sqrt(T)). Our analysis is based on a careful quantication of the bias that is inherent in gradient estimation due to the adaptive nature of the problem. We demonstrate the usefulness of the concept of sequential convexity by applying it to three canonical problems in inventory control, capacity allocation, and the lifetime buy decision, under the assumption that the manager does not know the demand distributions and has access only to historical sales (censored demand) data.
february 2012
[1105.2274] Data-Distributed Weighted Majority and Online Mirror Descent
february 2012
In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent algorithms, as special cases. We show, using both theoretical analysis and experiments, that compared to a single agent: given the same computation time, these distributed algorithms achieve smaller generalization errors; and given the same generalization errors, they can be $N$ times faster.
papers
to-read
machine-learning
online-learning
optimization
distributed-systems
february 2012
[1007.1033] A Theory of Network Equivalence, Parts I and II
february 2012
A family of equivalence tools for bounding network capacities is introduced. Part I treats networks of point-to-point channels. The main result is roughly as follows. Given a network of noisy, independent, memoryless point-to-point channels, a collection of communication demands can be met on the given network if and only if it can be met on another network where each noisy channel is replaced by a noiseless bit pipe with throughput equal to the noisy channel capacity. This result was known previously for the case of a single-source multicast demand. The result given here treats general demands -- including, for example, multiple unicast demands -- and applies even when the achievable rate region for the corresponding demands is unknown in the noiseless network. In part II, definitions of upper and lower bounding channel models for general channels are introduced. By these definitions, a collection of communication demands can be met on a network of independent channels if it can be met on a network where each channel is replaced by its lower bounding model andonly if it can be met on a network where each channel is replaced by its upper bounding model. This work derives general conditions under which a network of noiseless bit pipes is an upper or lower bounding model for a multiterminal channel. Example upper and lower bounding models for broadcast, multiple access, and interference channels are given. It is then shown that bounding the difference between the upper and lower bounding models for a given channel yields bounds on the accuracy of network capacity bounds derived using those models. By bounding the capacity of a network of independent noisy channels by the network coding capacity of a network of noiseless bit pipes, this approach represents one step towards the goal of building computational tools for bounding network capacities.
papers
to-read
information-theory
networks
multiagent-systems
heard-the-talk
february 2012
Natural selection and veridical perceptions 10.1016/j.jtbi.2010.07.020 : Journal of Theoretical Biology | ScienceDirect.com
january 2012
Does natural selection favor veridical perceptions, those that more accurately depict the objective environment? Students of perception often claim that it does. But this claim, though influential, has not been adequately tested. Here we formalize the claim and a few alternatives. To test them, we introduce “interface games,” a class of evolutionary games in which perceptual strategies compete. We explore, in closed-form solutions and Monte Carlo simulations, some simpler games that assume frequency-dependent selection and complete mixing in infinite populations. We find that veridical perceptions can be driven to extinction by non-veridical strategies that are tuned to utility rather than objective reality. This suggests that natural selection need not favor veridical perceptions, and that the effects of selection on sensory perception deserve further study.
papers
to-read
perception
evolution
game-theory
re:active_feature_selection_project
january 2012
Concentration inequalities for functions of independent variables - Maurer - 2005 - Random Structures & Algorithms - Wiley Online Library
january 2012
Following the entropy method this paper presents general concentration inequalities, which can be applied to combinatorial optimization and empirical processes. The inequalities give improved concentration results for optimal traveling salesmen tours, Steiner trees, and the eigenvalues of random symmetric matrices.
papers
to-read
measure-concentration
probability
january 2012
Dominated concentration : Statistics & Probability Letters | ScienceDirect.com
january 2012
The concentration properties of one random variable may be governed by the values of another random variable which is concentrated and more easily analyzed. We present a general concentration inequality to handle such cases and apply it to the eigenvalues of the Gram matrix for a sample of independent vectors distributed in the unit ball of a Hilbert space. For large samples the deviation of the eigenvalues from their mean is shown to scale with the largest eigenvalue.
papers
to-read
measure-concentration
probability
january 2012
[1201.2256] Empirical Processes of Markov Chains and Dynamical Systems Indexed by Classes of Functions
january 2012
We study weak convergence of empirical processes of dependent data, indexed by classes of functions. We obtain results that are especially suitable for data arising from dynamical systems and Markov chains, where the Central Limit Theorem for partial sums is commonly derived via the spectral gap technique. Our results apply, e.g. to the empirical process of ergodic torus automorphisms.
papers
to-read
empirical-processes
dynamical-systems
markov-chains
re:adaptive_control_project
january 2012
Gronwall's inequality (J. W. Robin)
january 2012
The _right_ way of proving Gronwall's inequality.
mathematics
analysis
january 2012
[1111.1977] On Refined Versions of the Azuma-Hoeffding Inequality with Applications in Information Theory
january 2012
This paper derives some refined versions of the Azuma-Hoeffding inequality for discrete-parameter martingales with uniformly bounded jumps, and it considers some of their potential applications in information theory and related topics. The first part of this paper derives these refined inequalities, followed by a discussion on their relations to some classical results in probability theory. It also considers a geometric interpretation of some of these inequalities, providing an insight on the inter-connections between them. The second part exemplifies the use of these refined inequalities in the context of hypothesis testing, information theory, and communication. The paper is concluded with a discussion on some directions for further research. This work is meant to stimulate the use of some refined versions of the Azuma-Hoeffding inequality in information-theoretic aspects.
papers
to-read
information-theory
measure-concentration
martingales
markov-chains
probability
january 2012
[1201.0559] Chernoff-Hoeffding Bounds for Markov Chains: Generalized and Simplified
january 2012
We prove the first Chernoff-Hoeffding bounds for general nonreversible finite-state Markov chains based on the standard L_1 (variation distance) mixing-time of the chain. Specifically, consider an ergodic Markov chain M and a weight function f: [n] -> [0,1] on the state space [n] of M with mean mu = E_{v <- pi}[f(v)], where pi is the stationary distribution of M. A t-step random walk (v_1,...,v_t) on M starting from the stationary distribution pi has expected total weight E[X] = mu t, where X = sum_{i=1}^t f(v_i). Let T be the L_1 mixing-time of M. We show that the probability of X deviating from its mean by a multiplicative factor of delta, i.e., Pr [ |X - mu t| >= delta mu t ], is at most exp(-Omega(delta^2 mu t / T)) for 0 <= delta <= 1, and exp(-Omega(delta mu t / T)) for delta > 1. In fact, the bounds hold even if the weight functions f_i's for i in [t] are distinct, provided that all of them have the same mean mu.
We also obtain a simplified proof for the Chernoff-Hoeffding bounds based on the spectral expansion lambda of M, which is the square root of the second largest eigenvalue (in absolute value) of M tilde{M}, where tilde{M} is the time-reversal Markov chain of M. We show that the probability Pr [ |X - mu t| >= delta mu t ] is at most exp(-Omega(delta^2 (1-lambda) mu t)) for 0 <= delta <= 1, and exp(-Omega(delta (1-lambda) mu t)) for delta > 1.
Both of our results extend to continuous-time Markov chains, and to the case where the walk starts from an arbitrary distribution x, at a price of a multiplicative factor depending on the distribution x in the concentration bounds
to-read
papers
markov-chains
measure-concentration
probability
We also obtain a simplified proof for the Chernoff-Hoeffding bounds based on the spectral expansion lambda of M, which is the square root of the second largest eigenvalue (in absolute value) of M tilde{M}, where tilde{M} is the time-reversal Markov chain of M. We show that the probability Pr [ |X - mu t| >= delta mu t ] is at most exp(-Omega(delta^2 (1-lambda) mu t)) for 0 <= delta <= 1, and exp(-Omega(delta (1-lambda) mu t)) for delta > 1.
Both of our results extend to continuous-time Markov chains, and to the case where the walk starts from an arbitrary distribution x, at a price of a multiplicative factor depending on the distribution x in the concentration bounds
january 2012
Concentration Inequalities for the Missing Mass and for Histogram Rule Error (McAllester and Ortiz)
december 2011
This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem. The concentration inequality for histogram rule error is motivated by the desire to construct a new class of bounds on the generalization error of decision trees.
papers
to-read
learning-theory
measure-concentration
information-theory
probability
re:erasures_and_concentration_project
december 2011
Sebastien Bubeck, "Introduction to Online Optimization"
december 2011
lecture notes for ORF570: Special Topics in Statistics and Operations Research, Prediction Games (Princeton)
lecture-notes
online-learning
optimization
prediction
december 2011
[1112.0708] Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing
december 2011
We study the compressed sensing reconstruction problem for a broad class of random, band-diagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. cite{KrzakalaEtAl}, message passing algorithms can effectively solve the reconstruction problem for spatially coupled measurements with undersampling rates close to the fraction of non-zero coordinates.
We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate $delta$ exceeds the (upper) R'enyi information dimension of the signal, $uRenyi(p_X)$. More precisely, for a sequence of signals of diverging dimension $n$ whose empirical distribution converges to $p_X$, reconstruction is with high probability successful from $uRenyi(p_X), n+o(n)$ measurements taken according to a band diagonal matrix.
For sparse signals, i.e. sequences of dimension $n$ and $k(n)$ non-zero entries, this implies reconstruction from $k(n)+o(n)$ measurements. For `discrete' signals, i.e. signals whose coordinates take a fixed finite set of values, this implies reconstruction from $o(n)$ measurements. The result is robust with respect to noise, does not apply uniquely to random signals, but requires the knowledge of the empirical distribution of the signal $p_X$.
papers
to-read
compressed-sensing
information-theory
message-passing
sparsity
analog-information-theory
Renyi-dimension
We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate $delta$ exceeds the (upper) R'enyi information dimension of the signal, $uRenyi(p_X)$. More precisely, for a sequence of signals of diverging dimension $n$ whose empirical distribution converges to $p_X$, reconstruction is with high probability successful from $uRenyi(p_X), n+o(n)$ measurements taken according to a band diagonal matrix.
For sparse signals, i.e. sequences of dimension $n$ and $k(n)$ non-zero entries, this implies reconstruction from $k(n)+o(n)$ measurements. For `discrete' signals, i.e. signals whose coordinates take a fixed finite set of values, this implies reconstruction from $o(n)$ measurements. The result is robust with respect to noise, does not apply uniquely to random signals, but requires the knowledge of the empirical distribution of the signal $p_X$.
december 2011
[1111.6822] Optimal Phase Transitions in Compressed Sensing
november 2011
Compressed sensing deals with efficient recovery of analog signals from linear measurements. This paper presents a statistical study of compressed sensing by modeling the input signal as random processes. Three classes of encoders are considered, namely optimal nonlinear, optimal linear and random linear encoders. Focusing on optimal decoders, we investigate the fundamental tradeoff between measurement rate and reconstruction fidelity gauged by error probability and noise sensitivity in the absence and presence of measurement noise, respectively. Optimal phase transition thresholds are determined as a functional of the input distribution and compared to suboptimal thresholds achieved by various popular reconstruction algorithms. In particular, we show that Gaussian sensing matrices incur no penalty on the phase transition threshold with respect to optimal nonlinear encoding. Our results also provide a rigorous justification of previous results based on replica heuristics in the weak-noise regime.
papers
to-read
information-theory
compressed-sensing
signal-processing
sparsity
november 2011
Le Cam Made Simple: No-N Asymptotics
november 2011
"If the log likelihood is approximately quadratic with constant Hessian, then the maximum likelihood estimator (MLE) is approximately normally distributed. No other assumptions are required. We do not need independent and identically distributed data. We do not need the law of large numbers (LLN) or the central limit theorem (CLT). We do not need sample size going to infinity or anything going to infinity.
The theory presented here is a combination of Le Cam style involving local asymptotic normality (LAN) and local asymptotic mixed normality (LAMN) and Cramér style involving derivatives and Fisher information. The main tool is convergence in law of the log likelihood function and its derivatives considered as random elements of a Polish space of continuous functions with the metric of uniform convergence on compact sets. We obtain results for both one-step-Newton estimators and Newton-iterated-to-convergence estimators."
have-read
statistics
estimation
Le_Cam-theory
asymptotic-normality
information-geometry
via:cshalizi
The theory presented here is a combination of Le Cam style involving local asymptotic normality (LAN) and local asymptotic mixed normality (LAMN) and Cramér style involving derivatives and Fisher information. The main tool is convergence in law of the log likelihood function and its derivatives considered as random elements of a Polish space of continuous functions with the metric of uniform convergence on compact sets. We obtain results for both one-step-Newton estimators and Newton-iterated-to-convergence estimators."
november 2011
Edsger W. Dijkstra Prize in Distributed Computing
november 2011
The Edsger W. Dijkstra Prize in Distributed Computing is named for Edsger Wybe Dijkstra (1930-2002), a pioneer in the area of distributed computing. His foundational work on concurrency primitives (such as the semaphore), concurrency problems (such as mutual exclusion and deadlock), reasoning about concurrent systems, and self-stabilization comprises one of the most important supports upon which the field of distributed computing is built. No other individual has had a larger influence on research in principles of distributed computing.
computation
distributed-computing
distributed-systems
reference
november 2011
[1111.2622] Optimal re-centering bounds, with applications to Rosenthal-type concentration of measure inequalities
november 2011
For any nonnegative Borel-measurable function f such that f(x)=0 if and only if x=0, the best constant c_f in the inequality E f(X-E X) leq c_f E f(X) for all random variables X with a finite mean is obtained. Properties of the constant c_f in the case when f=|.|^p are studied. Applications to concentration of measure in the form of Rosenthal-type bounds on the moments of separately Lipschitz functions on product spaces are given.
papers
to-read
probability
measure-concentration
november 2011
[1111.3029] Parametric estimation. Finite sample theory
november 2011
The paper aims at reconsidering the famous Le Cam LAN theory. The main features of the approach which make it different from the classical one are: (1) the study is non-asymptotic, that is, the sample size is fixed and does not tend to infinity; (2) the parametric assumption is possibly misspecified and the underlying data distribution can lie beyond the given parametric family.
The main results include a large deviation bounds for the (quasi) maximum likelihood and the local quadratic majorization of the log-likelihood process. The latter yields a number of important corollaries for statistical inference: concentration, confidence and risk bounds, expansion of the maximum likelihood estimate, etc. All these corollaries are stated in a non-classical way admitting a model misspecification and finite samples. However, the classical asymptotic results including the efficiency bounds can be easily derived as corollaries of the obtained non-asymptotic statements. The general results are illustrated for the i.i.d. set-up as well as for generalized linear and median estimation. The results apply for any dimension of the parameter space and provide a quantitative lower bound on the sample size yielding the root-n accuracy.
papers
to-read
statistics
Le_Cam-theory
asymptotic-normality
The main results include a large deviation bounds for the (quasi) maximum likelihood and the local quadratic majorization of the log-likelihood process. The latter yields a number of important corollaries for statistical inference: concentration, confidence and risk bounds, expansion of the maximum likelihood estimate, etc. All these corollaries are stated in a non-classical way admitting a model misspecification and finite samples. However, the classical asymptotic results including the efficiency bounds can be easily derived as corollaries of the obtained non-asymptotic statements. The general results are illustrated for the i.i.d. set-up as well as for generalized linear and median estimation. The results apply for any dimension of the parameter space and provide a quantitative lower bound on the sample size yielding the root-n accuracy.
november 2011
[1111.3054] Consistency under Sampling of Exponential Random Graph Models
november 2011
The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which is what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling, or, in terms of the theory of stochastic processes, that it defines a projective family. Focussing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits ERGM's expressive power. These results are actually special cases of more general ones about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses.
papers
to-read
statistics
exponential-families
graphical-models
networks
november 2011
[1111.3486] New Concentration Inequalities for Suprema of Empirical Processes
november 2011
While effective concentration inequalities for suprema of empirical processes exist under boundedness or strict tail assumptions, no comparable results have been available under considerably weaker assumptions. In this paper, we derive concentration inequalities assuming only low moments for an envelope of the empirical process. These concentration inequalities are beneficial even when the envelope is much larger than the single functions under consideration.
papers
to-read
probability
empirical-processes
measure-concentration
november 2011
academia
active-learning
adaptive-systems
ai
algorithms
alternative-fuels
analysis
anarchy
anthropology
art
Asia
assholes
asymptotic-normality
audio
avant-garde
bandit-problems
bands
banking
bayesian
bayesian-learning
biology
biophysics
biostatistics
blogs
book-reviews
books
books-noted
bounded-rationality
camp
cartoons
causality
cells
channel-coding
chapel-hill-durham
children-lit
chinese
Christianity
climate
clothing
coding-theory
coffee
cognition
cognitive-science
cold-war
combinatorics
comics
commentary
communication
communications
comparison-of-experiments
complex-systems
complexity
compressed-sensing
computation
computer-science
computer-vision
conferences
consensus-algorithms
constitution
continuing-crises
control-theory
convex-programming
cooking
copyright
creative-commons
creeping-authoritarianism
criticism
crypto
cthulhu
culture
current-events
cybernetics
data-analysis
data-mining
data-sets
decentralized-control
decision-making
decision-theory
democracy
diet
distributed-computing
distributed-systems
divergence
drinking
duke
dynamic-programming
dynamical-systems
economics
empirical-processes
energy
english
Enlightenment
environment
epistemology
epsilon-entropy
ergodic-theory
espresso-machine
essays
estimation
evisceration
evolution
experimental-design
exponential-families
fantasy
feedback
feedback-information-theory
filetype:pdf
film
films
filtering
finance
financial-regulation-reform
food
forensics
free-culture
French
frequentist
fun-stuff
gadgets
game-theory
geekery
genomics
geometric-functional-analysis
graph-theory
graphical-models
green
guitar
guitar-tabs
have-read
health
health-care
heard-the-talk
history
history_of_cybernetics
history_of_science
history_of_statistics
homepages
humor
impending_doom
industrial
information-geometry
information-structures
information-theory
innovation
institutions
interesting
internet
interviews
Japanese
journals
Judaism
labor-unions
language
LaTeX
law
Le_Cam-theory
learning-theory
lecture-notes
libertarian
libertarianism
liberty
linear-programming
linguistics
literature
lolcats
lovecraft
lyrics
mac
machine-learning
majorization
management
markets
markov-chains
martingales
mathematics
matlab
measure-concentration
media
media:document
memes
message-passing
methodology
model-selection
monetary_policy
monsters
mp3
multiagent-systems
music
network-data-analysis
networks
neuroscience
nonparametrics
nutrition
occult
omics
online-learning
optimal-search
optimal-transportation
optimization
organizations
papers
particle-filters
people
perception
philosophy
philosophy-of-science
physics
poetry
polemics
policy
politics
polymaths
pop-culture
prediction
prediction-markets
privacy
probability
productivity
profiles
progrock
psychoceramics
psychology
quantization
quantum-mechanics
random-matrices
rate-distortion
rationality
re:active_feature_selection_project
re:causality_project
re:distributed_decisions_project
re:erasures_and_concentration_project
re:knightian-uncertainty
re:posterior_matching_project
recipes
record-labels
record-stores
reference
reinforcement-learning
religion
Renaissance
research
restaurants
retro
reviews
robotics
robust_control
robust_statistics
russia
russian
satire
sci-fi
science
security
sensor-networks
Shannon
signal-processing
simpsons
simulation
slides
social-networks
sociology
software
source-coding
sparsity
spectral-graph-theory
speculation
spirits
statistical-learning
statistical-physics
statistics
steampunk
submodular-functions
subversive
system-identification
teaching
technology
tentacles
theremin
thermodynamics
to-read
toulmin
travel
trivia
uncertainty
universal-prediction
useful-tips
via:anand
via:arsyed
via:arthegall
via:bookslut
via:cshalizi
via:mreid
via:nielsen
via:shivak
via:vaguery
via:warren_ellis
want-this
web2.0
weird
Wikipedia
wisdom
workflow
workshops
Yiddish