Vaguery + nudge-targets   406

[1205.0349] Euclidean distance geometry and applications
"Euclidean distance geometry is the study of Euclidean geometry based on the concept of distance. This is useful in several applications where the input data consists of an incomplete set of distances, and the output is a set of points in Euclidean space that realizes the given distances. We survey some of the theory of Euclidean distance geometry and some of the most important applications: molecular conformation, localization of sensor networks and statics."
algorithms  nudge-targets  modeling  inverse-problems 
21 days ago by Vaguery
[1204.4200] Discrete Dynamical Genetic Programming in XCS
"A number of representation schemes have been presented for use within Learning Classifier Systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using a discrete dynamical system representation within the XCS Learning Classifier System. In particular, asynchronous random Boolean networks are used to represent the traditional condition-action production system rules. It is shown possible to use self-adaptive, open-ended evolution to design an ensemble of such discrete dynamical systems within XCS to solve a number of well-known test problems."
genetic-programming  learning-classifier-systems  representation-theory  design-patterns  boolean-networks  nudge-targets  nice 
5 weeks ago by Vaguery
Math Notes | Futility Closet
So for finite sequences of digits, which sequences are such that the most right-truncated substrings are prime? Which are such that the most right-repeating extensions are prime?
nudge-targets  number-theory  indirect-link 
5 weeks ago by Vaguery
[1204.3678] Crowd Memory: Learning in the Collective
"Crowd algorithms often assume workers are inexperienced and thus fail to adapt as workers in the crowd learn a task. These assumptions fundamentally limit the types of tasks that systems based on such algorithms can handle. This paper explores how the crowd learns and remembers over time in the context of human computation, and how more realistic assumptions of worker experience may be used when designing new systems. We first demonstrate that the crowd can recall information over time and discuss possible implications of crowd memory in the design of crowd algorithms. We then explore crowd learning during a continuous control task. Recent systems are able to disguise dynamic groups of workers as crowd agents to support continuous tasks, but have not yet considered how such agents are able to learn over time. We show, using a real-time gaming setting, that crowd agents can learn over time, and `remember' by passing strategies from one generation of workers to the next, despite high turnover rates in the workers comprising them. We conclude with a discussion of future research directions for crowd memory and learning."
crowdsourcing  learning  agent-based  collective-intelligence  memory  nudge-targets 
5 weeks ago by Vaguery
[0911.1582] Manipulating Tournaments in Cup and Round Robin Competitions
"In sports competitions, teams can manipulate the result by, for instance, throwing games. We show that we can decide how to manipulate round robin and cup competitions, two of the most popular types of sporting competitions in polynomial time. In addition, we show that finding the minimal number of games that need to be thrown to manipulate the result can also be determined in polynomial time. Finally, we show that there are several different variations of standard cup competitions where manipulation remains polynomial."
algorithms  economics  game-theory  nudge-targets 
5 weeks ago by Vaguery
[1005.4159] The Complexity of Manipulating $k$-Approval Elections
"An important problem in computational social choice theory is the complexity of undesirable behavior among agents, such as control, manipulation, and bribery in election systems. These kinds of voting strategies are often tempting at the individual level but disastrous for the agents as a whole. Creating election systems where the determination of such strategies is difficult is thus an important goal. …"
voting  game-theory  design-patterns  mechanism-design  nudge-targets 
5 weeks ago by Vaguery
[0903.1147] Tetravex is NP-complete
"Tetravex is a widely played one person computer game in which you are given $n^2$ unit tiles, each edge of which is labelled with a number. The objective is to place each tile within a $n$ by $n$ square such that all neighbouring edges are labelled with an identical number. Unfortunately, playing Tetravex is computationally hard. More precisely, we prove that deciding if there is a tiling of the Tetravex board is NP-complete. Deciding where to place the tiles is therefore NP-hard. This may help to explain why Tetravex is a good puzzle. This result compliments a number of similar results for one person games involving tiling. For example, NP-completeness results have been shown for: the offline version of Tetris, KPlumber (which involves rotating tiles containing drawings of pipes to make a connected network), and shortest sliding puzzle problems. It raises a number of open questions. For example, is the infinite version Turing-complete? How do we generate Tetravex problems which are truly puzzling as random NP-complete problems are often surprising easy to solve? Can we observe phase transition behaviour? What about the complexity of the problem when it is guaranteed to have an unique solution? How do we generate puzzles with unique solutions?"
mathematical-recreations  computational-complexity  algorithms  nudge-targets 
5 weeks ago by Vaguery
[1204.4374] Higher Order City Voronoi Diagrams
"We investigate higher-order Voronoi diagrams in the city metric. This metric is induced by quickest paths in the L1 metric in the presence of an accelerating transportation network of axis-parallel line segments. …"
computational-geometry  algorithms  voronoi-diagrams  diversity  network-theory  nudge-targets 
5 weeks ago by Vaguery
[1204.4366] Multipath-dominant, pulsed doppler analysis of rotating blades
"We present a novel angular fingerprinting algorithm for detecting changes in the direction of rotation of a target with a monostatic, stationary sonar platform. Unlike other approaches, we assume that the target's centroid is stationary, and exploit doppler multipath signals to resolve the otherwise unavoidable ambiguities that arise. Since the algorithm is based on an underlying differential topological theory, it is highly robust to distortions in the collected data. We demonstrate performance of this algorithm experimentally, by exhibiting a pulsed doppler sonar collection system that runs on a smartphone. The performance of this system is sufficiently good to both detect changes in target rotation direction using angular fingerprints, and also to form high-resolution inverse synthetic aperature images of the target."
signal-processing  algorithms  radar  nudge-targets  the-imperial-we 
5 weeks ago by Vaguery
[1204.3850] Simple Agents Learn to Find Their Way: An Introduction on Mapping Polygons
"This paper gives an introduction to the problem of mapping simple polygons with autonomous agents. We focus on minimalistic agents that move from vertex to vertex along straight lines inside a polygon, using their sensors to gather local observations at each vertex. Our attention revolves around the question whether a given configuration of sensors and movement capabilities of the agents allows them to capture enough data in order to draw conclusions regarding the global layout of the polygon. In particular, we study the problem of reconstructing the visibility graph of a simple polygon by an agent moving either inside or on the boundary of the polygon. Our aim is to provide insight about the algorithmic challenges faced by an agent trying to map a polygon. We present an overview of techniques for solving this problem with agents that are equipped with simple sensorial capabilities. We illustrate these techniques on examples with sensors that mea- sure angles between lines of sight or identify the previous location. We give an overview over related problems in combinatorial geometry as well as graph exploration."
agent-based  algorithms  nudge-targets 
5 weeks ago by Vaguery
[1204.4202] Fuzzy Dynamical Genetic Programming in XCSF
"A number of representation schemes have been presented for use within Learning Classifier Systems, ranging from binary encodings to Neural Networks, and more recently Dynamical Genetic Programming (DGP). This paper presents results from an investigation into using a fuzzy DGP representation within the XCSF Learning Classifier System. In particular, asynchronous Fuzzy Logic Networks are used to represent the traditional condition-action production system rules. It is shown possible to use self-adaptive, open-ended evolution to design an ensemble of such fuzzy dynamical systems within XCSF to solve several well-known continuous-valued test problems."
learning-classifier-systems  genetic-programming  fuzzy-math  dynamical-control  rules-learning  nudge-targets 
5 weeks ago by Vaguery
[1204.3293] Efficiently decoding strings from their shingles
"Determining whether an unordered collection of overlapping substrings (called shingles) can be uniquely decoded into a consistent string is a problem that lies within the foundation of a broad assortment of disciplines ranging from networking and information theory through cryptography and even genetic engineering and linguistics. We present three perspectives on this problem: a graph theoretic framework due to Pevzner, an automata theoretic approach from our previous work, and a new insight that yields a time-optimal streaming algorithm for determining whether a string of $n$ characters over the alphabet $Sigma$ can be uniquely decoded from its two-character shingles. Our algorithm achieves an overall time complexity $Theta(n)$ and space complexity $O(|Sigma|)$. As an application, we demonstrate how this algorithm can be extended to larger shingles for efficient string reconciliation."
strings  algorithms  computational-complexity  nudge-targets 
5 weeks ago by Vaguery
Cerebral Mastication
"There’s a charming little brain teaser that’s going around the Interwebs. It’s got various forms, but they all look something like this:…"
nudge-targets  mathematical-recreations 
5 weeks ago by Vaguery
Tanya Khovanova’s Math Blog » Blog Archive » Interlocking Polyominoes
"A set of polyominoes is interlocked if no subset can be moved far away from the rest. It was known that polyominoes that are built from four or fewer squares do not interlock. The project of Dhawan and his mentor was to investigate the interlockedness of larger polyominoes. And they totally delivered.

They quickly proved that you can interlock polyominoes with eight or more squares. Then they proved that pentominoes can’t interlock. This left them with a gray area: what happens with polyominoes with six or seven squares? After drawing many beautiful pictures, they finally found the structure presented in our accompanying image. The system consists of 12 hexominoes and 5 pentominoes, and it is rigid. You cannot move a thing. That means that hexominoes can be interlocked and thus the gray area was resolved."
polyominoes  mathematical-recreations  nudge-targets 
5 weeks ago by Vaguery
How fast is bit packing?
On my macbook air (Intel core i7), I get that the unpacking speed is not very sensitive to the specific number of bits: generally, the smaller the bit width, the faster the unpacking. The packing speed is much faster when the bit width is 8 or 16. Even so, the difference is only by a factor of two or so. The results are presented in the next figure. On the y axis, you have the time (smaller is better). On the the x axis, we have the number of bits we packed to. For example, when bit is 1, we pack 32 integers into a single 32-bit word. When the number of bits is set to 32 bits, we have a regular copy.
algorithms  nudge-targets 
7 weeks ago by Vaguery
One instruction set computer - Wikipedia, the free encyclopedia
"A one instruction set computer (OISC), sometimes called an ultimate reduced instruction set computer (URISC), is an abstract machine that uses only one instruction – obviating the need for a machine language opcode.[1][2][3] With a judicious choice for the single instruction and given infinite resources, an OISC is capable of being a universal computer in the same manner as traditional computers that have multiple instructions.[2]:55 OISCs have been recommended as aids in teaching computer architecture[1]:327[2]:2 and have been used as computational models in structural computing research.[3]"
computer-science  mathematical-recreations  one-hand-tied  nudge-targets  i-had-no-idea 
8 weeks ago by Vaguery
[1203.0856] Online Discriminative Dictionary Learning for Image Classification Based on Block-Coordinate Descent Method
"Previous researches have demonstrated that the framework of dictionary learning with sparse coding, in which signals are decomposed as linear combinations of a few atoms of a learned dictionary, is well adept to reconstruction issues. This framework has also been used for discrimination tasks such as image classification. To achieve better performances of classification, experts develop several methods to learn a discriminative dictionary in a supervised manner. However, another issue is that when the data become extremely large in scale, these methods will be no longer effective as they are all batch-oriented approaches. For this reason, we propose a novel online algorithm for discriminative dictionary learning, dubbed textbf{ODDL} in this paper. First, we introduce a linear classifier into the conventional dictionary learning formulation and derive a discriminative dictionary learning problem. Then, we exploit an online algorithm to solve the derived problem. Unlike the most existing approaches which update dictionary and classifier alternately via iteratively solving sub-problems, our approach directly explores them jointly. Meanwhile, it can largely shorten the runtime for training and is also particularly suitable for large-scale classification issues. To evaluate the performance of the proposed ODDL approach in image recognition, we conduct some experiments on three well-known benchmarks, and the experimental results demonstrate ODDL is fairly promising for image classification tasks."
image-analysis  image-segmentation  algorithms  nudge-targets 
9 weeks ago by Vaguery
[1112.3307] Polytope Codes Against Adversaries in Networks
"Network coding is studied when an adversary controls a subset of nodes in the network of limited quantity but unknown location. This problem is shown to be more difficult than when the adversary controls a given number of edges in the network, in that linear codes are insufficient. To solve the node problem, the class of Polytope Codes is introduced. Polytope Codes are constant composition codes operating over bounded polytopes in integer vector fields. The polytope structure creates additional complexity, but it induces properties on marginal distributions of code vectors so that validities of codewords can be checked by internal nodes of the network. It is shown that Polytope Codes achieve a cut-set bound for a class of planar networks. It is also shown that this cut-set bound is not always tight, and a tighter bound is given for an example network."
cryptography  privacy  algorithms  nudge-targets  network-theory  communication 
9 weeks ago by Vaguery
[1203.3353] Solving Structure with Sparse, Randomly-Oriented X-ray Data
"Single-particle imaging experiments of biomolecules at x-ray free-electron lasers (XFELs) require processing of hundreds of thousands (or more) of images that contain very few x-rays. Each low-flux image of the diffraction pattern is produced by a single, randomly oriented particle, such as a protein. We demonstrate the feasibility of collecting data at these extremes, averaging only 2.5 photons per frame, where it seems doubtful there could be information about the state of rotation, let alone the image contrast. This is accomplished with an expectation maximization algorithm that processes the low-flux data in aggregate, and without any prior knowledge of the object or its orientation. The versatility of the method promises, more generally, to redefine what measurement scenarios can provide useful signal in the high-noise regime."
structural-biology  image-analysis  crystallography  algorithms  inverse-problems  nudge-targets  statistics 
9 weeks ago by Vaguery
[1203.3341] A Comparison of Multi-Parametric Programming, Mixed-Integer Programming, Gradient Descent Based, and the Embedding Approach on Four Published Hybrid Optimal Control Examples
"...Common misconceptions regarding the embedding approach are addressed including whether or not it results in an average value control model (no), is necessary to "tweak" the algorithm to get bang-bang solutions (no), requires infinite switching (no), has real-time capability (yes), or reduction to a classical nonlinear optimization problem (a desirable yes)."
control-theory  operations-research  algorithms  numerical-methods  philosophy-of-engineering  design-patterns  nudge-targets 
9 weeks ago by Vaguery
[1203.3270] Extraction of Facial Feature Points Using Cumulative Histogram
"This paper proposes a novel adaptive algorithm to extract facial feature points automatically such as eyebrows corners, eyes corners, nostrils, nose tip, and mouth corners in frontal view faces, which is based on cumulative histogram approach by varying different threshold values. At first, the method adopts the Viola-Jones face detector to detect the location of face and also crops the face region in an image. From the concept of the human face structure, the six relevant regions such as right eyebrow, left eyebrow, right eye, left eye, nose, and mouth areas are cropped in a face image. Then the histogram of each cropped relevant region is computed and its cumulative histogram value is employed by varying different threshold values to create a new filtering image in an adaptive way. The connected component of interested area for each relevant filtering image is indicated our respective feature region. A simple linear search algorithm for eyebrows, eyes and mouth filtering images and contour algorithm for nose filtering image are applied to extract our desired corner points automatically. The method was tested on a large BioID frontal face database in different illuminations, expressions and lighting conditions and the experimental results have achieved average success rates of 95.27%."
image-segmentation  image-analysis  face-recognition  algorithms  nudge-targets 
9 weeks ago by Vaguery
[1203.3284] Efficient Enumeration of the Directed Binary Perfect Phylogenies from Incomplete Data
"We study a character-based phylogeny reconstruction problem when an incomplete set of data is given. More specifically, we consider the situation under the directed perfect phylogeny assumption with binary characters in which for some species the states of some characters are missing. Our main object is to give an efficient algorithm to enumerate (or list) all perfect phylogenies that can be obtained when the missing entries are completed. While a simple branch-and-bound algorithm (B&B) shows a theoretically good performance, we propose another approach based on a zero-suppressed binary decision diagram (ZDD). Experimental results on randomly generated data exhibit that the ZDD approach outperforms B&B. We also prove that counting the number of phylogenetic trees consistent with a given data is #P-complete, thus providing an evidence that an efficient random sampling seems hard."
phylogenetics  inverse-problems  genetics  algorithms  statistics  nudge-targets 
9 weeks ago by Vaguery
[1203.0879] Designing and using prior knowledge for phase retrieval
"In this work we develop an algorithm for signal reconstruction from the magnitude of its Fourier transform in a situation where some (non-zero) parts of the sought signal are known. Although our method does not assume that the known part comprises the boundary of the sought signal, this is often the case in microscopy: a specimen is placed inside a known mask, which can be thought of as a known light source that surrounds the unknown signal. Therefore, in the past, several algorithms were suggested that solve the phase retrieval problem assuming known boundary values. Unlike our method, these methods do rely on the fact that the known part is on the boundary. Besides the reconstruction method we give an explanation of the phenomena observed in previous work: the reconstruction is much faster when there is more energy concentrated in the known part. Quite surprisingly, this can be explained using our previous results on phase retrieval with approximately known Fourier phase."
image-analysis  image-processing  learning  inverse-problems  algorithms  nudge-targets 
9 weeks ago by Vaguery
[1203.3415] A New Approach to Count Pattern Motifs Using Combinatorial Techniches
"We proposed two new exact algorithms to detect network motifs of size 3 and 4. Considering that motifs need to count the isomorphic patterns in the original graph $G(V,E)$ and in a set of randomized graphs, the following complexities concern about count isomorphic patterns in a single graph. Let $m=|E|$ and let $a(G)$ be the arboricity of $G$. Assume $|E|geq|V|$. We describe a $O(a(G)m)$ time complexity algorithm to count isomorphic patterns of size 3. The complexity is a $O({msqrt{m}})$ in the worst graph. The second algorithm is a $O(m^2)$ complexity algorithm to count isomorphic patterns of size 4. The final result was expressive faster when compared with other implemented algorithms."
network-theory  graph-theory  algorithms  nudge-targets 
9 weeks ago by Vaguery
[1203.3249] Revisiting the Complexity of And/Or Graph Solution
"This paper presents a study on two data structures that have been used to model several problems in computer science: and/or graphs and x-y graphs. An and/or graph is an acyclic digraph containing a source, such that every vertex v has a label f(v) in {and,or} and edges represent dependency relations between vertices: a vertex labeled and depends on all of its out-neighbors, while a vertex labeled or depends on only one of its out-neighbors. X-y graphs are defined as a natural generalization of and/or graphs: every vertex vi of an x-y graph has a label xi-yi to mean that vi depends on xi of its yi out-neighbors. We analyze the complexity of the optimization problems Min-and/or and Min-x-y, which consist of finding solution subgraphs of optimal weight for and/or and x-y graphs, respectively. Motivated by the large applicability as well as the hardness of Min-and/or and Min-x-y, we study new complexity aspects of such problems, both from a classical and a parameterized point of view. …"
graph-theory  algorithms  operations-research  computational-complexity  nudge-targets 
10 weeks ago by Vaguery
[1202.3186] Two variants of Wythoff's game preserving its P-positions
"We present two variants of Wythoff's game. The first game is a restriction of Wythoff's game in which removing tokens from the smaller pile is not allowed if the two entries are not equal. The second game is an extension of Wythoff's game obtained by adjoining a move allowing players to remove k tokens from the smaller pile and l tokens from the other pile provided l < k. We show that both games preserve the P-positions of Wythoff's game. This resolves a question raised by Duchene, Fraenkel, Nowakowski and Rigo. We give formulas for those positions which have Sprague-Grundy value 1. We also prove several results on the Sprague-Grundy functions."
mathematical-recreations  game-theory  nudge-targets 
10 weeks ago by Vaguery
[1203.1900] Consensus on Moving Neighborhood Model of Peterson Graph
"In this paper, we study the consensus problem of multiple agents on a kind of famous graph, Peterson graph. It is an undirected graph with 10 vertices and 15 edges. Each agent randomly walks on this graph and communicates with each other if and only if they coincide on a node at the same time. We conduct numerical study on the consensus problem in this framework and show that global consensus can be achieved."
discrete-mathematics  graph-theory  network-theory  agent-based  nudge-targets  probably-not-the-same-hannah-arendt 
10 weeks ago by Vaguery
[1203.0100] Cake Cutting Mechanisms
"We examine the history of cake cutting mechanisms and discuss the efficiency of their allocations. In the case of piecewise uniform preferences, we define a game that in the presence of strategic agents has equilibria that are not dominated by the allocations of any mechanism. We identify that the equilibria of this game coincide with the allocations of an existing cake cutting mechanism."
algorithms  economics  nudge-targets  optimization 
10 weeks ago by Vaguery
[1203.1644] The B36/S125 "2x2" Life-Like Cellular Automaton
"The B36/S125 (or "2x2") cellular automaton is one that takes place on a 2D square lattice much like Conway's Game of Life. Although it exhibits high-level behaviour that is similar to Life, such as chaotic but eventually stable evolution and the existence of a natural diagonal glider, the individual objects that the rule contains generally look very different from their Life counterparts. In this article, a history of notable discoveries in the 2x2 rule is provided, and the fundamental patterns of the automaton are described. Some theoretical results are derived along the way, including a proof that the speed limits for diagonal and orthogonal spaceships in this rule are c/3 and c/2, respectively. A Margolus block cellular automaton that 2x2 emulates is investigated, and in particular a family of oscillators made up entirely of 2 x 2 blocks are analyzed and used to show that there exist oscillators with period 2^m(2^k - 1) for any integers m,k geq 1."
cellular-automata  artificial-life  discrete-mathematics  emergence  mathematical-recreations  nudge-targets 
10 weeks ago by Vaguery
[1203.1034] General Complex Polynomial Root Solver and Its Further Optimization for Binary Microlenses
"We present a new algorithm to solve polynomial equations, and publish its code, which is 1.6-3 times faster than the ZROOTS subroutine that is commercially available from Numerical Recipes, depending on application. The largest improvement, when compared to naive solvers, comes from a fail-safe procedure that permits us to skip the majority of the calculations in the great majority of cases, without risking catastrophic failure in the few cases that these are actually required. Second, we identify a discriminant that enables a rational choice between Laguerre's Method and Newton's Method (or a new intermediate method) on a case-by-case basis. We briefly review the history of root solving and demonstrate that "Newton's Method" was discovered neither by Newton (1671) nor by Raphson (1690), but only by Simpson (1740). Some of the arguments leading to this conclusion were first given by the British historian of science Nick Kollerstrom in 1992, but these do not appear to have penetrated the astronomical community. Finally, we argue that Numerical Recipes should voluntarily surrender its copyright protection for non-profit applications, despite the fact that, in this particular case, such protection was the major stimulant for developing our improved algorithm."
algorithms  numerical-methods  optics  nudge-targets 
10 weeks ago by Vaguery
[1203.1067] Cortical free association dynamics: distinct phases of a latching network
"... The occurrence and duration of latching dynamics is found through simulations to depend critically on the strength of local attractor states, expressed in the Potts model by a parameter w. Here we describe with simulations and then analytically the boundaries between distinct phases of no latching, of transient and sustained latching, deriving a phase diagram in the plane w-T, where T parametrizes thermal noise effects. Implications for real cortical dynamics are briefly reviewed in the conclusions."
neural-networks  biologically-inspired  dynamical-systems  emergent-design  nudge-targets 
10 weeks ago by Vaguery
[1201.6054] Attainability in Repeated Games with Vector Payoffs
"We introduce the concept of attainable sets of payoffs in two-player repeated games with vector payoffs. A set of payoff vectors is called {em attainable} if player 1 can ensure that there is a finite horizon $T$ such that after time $T$ the distance between the set and the cumulative payoff is arbitrarily small, regardless of what strategy player 2 is using. This paper focuses on the case where the attainable set consists of one payoff vector. In this case the vector is called an attainable vector. We study properties of the set of attainable vectors, and characterize when a specific vector is attainable and when every vector is attainable."
game-theory  agent-based  multiobjective-optimization  nudge-targets 
10 weeks ago by Vaguery
[1203.1080] Data Structure Lower Bounds on Random Access to Grammar-Compressed Strings
"In this paper we investigate the problem of building a static data structure that represents a string s using space close to its compressed size, and allows fast access to individual characters of s. ..."
grammars  algorithms  nudge-targets 
10 weeks ago by Vaguery
[1111.3304] Eigenvector Synchronization, Graph Rigidity and the Molecule Problem
"The graph realization problem has received a great deal of attention in recent years, due to its importance in applications such as wireless sensor networks and structural biology.…"
algorithms  statistics  structure  learning-from-data  nudge-targets 
11 weeks ago by Vaguery
[1003.5956] Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
"…In this paper, we introduce a replay method- ology for contextual bandit algorithm evaluation. Different from simulator-based approaches, our method is completely data-driven and very easy to adapt to different applications. More importantly, our method can provide provably unbi- ased evaluations. Our empirical results on a large-scale news article recommendation dataset collected from Yahoo! Front Page conform well with our theoretical results. Furthermore, comparisons between our offline replay and online bucket evaluation of several contextual bandit algorithms show ac- curacy and effectiveness of our offline evaluation method."
classification  recommendations  algorithms  machine-learning  crowdsourcing  nudge-targets  statistics 
11 weeks ago by Vaguery
[1110.0892] On Approximability of Block Sorting
"Block Sorting is a well studied problem, motivated by its applications in Optical Character Recognition (OCR), and Computational Biology. Block Sorting has been shown to be NP-Hard, and two separate polynomial time 2-approximation algorithms have been designed for the problem. But questions like whether a better approximation algorithm can be designed, and whether the problem is APX-Hard have been open for quite a while now.
In this work we answer the latter question by proving Block Sorting to be Max-SNP-Hard (APX-Hard). The APX-Hardness result is based on a linear reduction of Max-3SAT to Block Sorting. We also provide a new lower bound for the problem via a new parametrized problem k-Block Merging."
algorithms  nudge-targets  operations-research  OCR 
february 2012 by Vaguery
[1202.5074] Solving Single-digit Sudoku Subproblems
'We show that single-digit "Nishio" subproblems in nxn Sudoku puzzles may be solved in time o(2^n), faster than previous solutions such as the pattern overlay method. We also show that single-digit deduction in Sudoku is NP-hard.'
combinatorics  games  recreational-mathematics  nudge-targets  algorithms 
february 2012 by Vaguery
[1202.0001] Vector-based model of elastic bonds for DEM simulation of solids
"A new model for computer simulation of solids, composed of bonded particles, is proposed. Vectors rigidly connected with particles are used for description of deformation of a single bond. The expression for potential energy of the bond and corresponding expressions for forces and moments are proposed. Formulas, connecting parameters of the model with longitudinal, shear, bending and torsional stiffnesses of the bond, are derived. It is shown that the model allows to describe any values of the bond stiffnesses exactly. Two different calibration procedures depending on bond length/thickness ratio are proposed. It is shown that parameters of model can be chosen so that under small deformations the bond is equivalent to either Bernoulli-Euler or Timoshenko rod or short cylinder connecting particles. Simple expressions, connecting parameters of V-model with geometrical and mechanical characteristics of the bond, are derived. Computer simulation of dynamical buckling of the straight discrete rod and discrete half-spherical shell is carried out."
modeling  mechanical-systems  materials-science  computational-methods  algorithms  nudge-targets 
february 2012 by Vaguery
[1202.0253] High-speed Flight in an Ergodic Forest
"Inspired by birds flying through cluttered environments such as dense forests, this paper studies the theoretical foundations of a novel motion planning problem: high-speed navigation through a randomly-generated obstacle field when only the statistics of the obstacle generating process are known a priori. Resembling a planar forest environment, the obstacle generating process is assumed to determine the locations and sizes of disk-shaped obstacles. When this process is ergodic, and under mild technical conditions on the dynamics of the bird, it is shown that the existence of an infinite collision-free trajectory through the forest exhibits a phase transition. On one hand, if the bird flies faster than a certain critical speed, then, with probability one, there is no infinite collision-free trajectory, i.e., the bird will eventually collide with some tree, almost surely, regardless of the planning algorithm governing the bird's motion. On the other hand, if the bird flies slower than this critical speed, then there exists at least one infinite collision-free trajectory, almost surely. Lower and upper bounds on the critical speed are derived for the special case of a homogeneous Poisson forest considering a simple model for the bird's dynamics. For the same case, an equivalent percolation model is provided. Using this model, the phase diagram is approximated in Monte-Carlo simulations. This paper also establishes novel connections between robot motion planning and statistical physics through ergodic theory and percolation theory, which may be of independent interest."
robotics  planning  algorithms  nudge-targets 
february 2012 by Vaguery
[1202.0077] An Interacting Particle Model for Clustering Euclidean Datasets
"In this paper we propose a method based on interacting particle physics, devised for clustering Euclidean datasets without initial constraints or conditions. We model any dataset as an interacting particle system, whose elements correspond to particles that interact through a simplified version of Lennard-Jones potentials. In so doing, mutual attractive interactions allow to identify groups of proximal particles. The main outcome of this modeling task is an adjacency matrix, taken as input by a community detection algorithm aimed to identify different partitions. The underlying conjecture is that, using a multiresolution analysis, the adopted model allows to find the right number of clusters for any given dataset. Experimental results, performed in comparison with a classical clustering algorithm, confirm this assumption."
clustering  data-analysis  algorithms  nudge-targets  distributed-processing 
february 2012 by Vaguery
[1201.6655] Learning Performance of Prediction Markets with Kelly Bettors
"In evaluating prediction markets (and other crowd-prediction mechanisms), investigators have repeatedly observed a so-called "wisdom of crowds" effect, which roughly says that the average of participants performs much better than the average participant. The market price---an average or at least aggregate of traders' beliefs---offers a better estimate than most any individual trader's opinion. In this paper, we ask a stronger question: how does the market price compare to the best trader's belief, not just the average trader. We measure the market's worst-case log regret, a notion common in machine learning theory. To arrive at a meaningful answer, we need to assume something about how traders behave. We suppose that every trader optimizes according to the Kelly criteria, a strategy that provably maximizes the compound growth of wealth over an (infinite) sequence of market interactions. We show several consequences.…"
prediction  performance-measure  agent-based  simulation  nudge-targets  wisdom-of-crowds 
february 2012 by Vaguery
[1201.5604] Discrete and Fuzzy Dynamical Genetic Programming in the XCSF Learning Classifier System
"A number of representation schemes have been presented for use within Learning Classifier Systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using discrete and fuzzy dynamical system representations within the XCSF Learning Classifier System. In particular, asynchronous Random Boolean Networks are used to represent the traditional condition-action production system rules in the discrete case and asynchronous Fuzzy Logic Networks in the continuous-valued case. It is shown possible to use self-adaptive, open-ended evolution to design an ensemble of such dynamical systems within XCSF to solve a number of well-known test problems."
Kauffman-networks  learning-classifier-systems  genetic-programming  nudge-targets  interesting 
january 2012 by Vaguery
[1201.4899] I Like Her more than You: Self-determined Communities
"In this paper we define what we call an affinity system, which is a set of individuals, each with a vector characterizing its preference for all other individuals in the set. The preference of a member can be given either by a ranking of all members or by a weighted vector that defines the degrees of its affinity to others. Affinity systems are useful for modeling social systems as well as general data sets, as social interactions are often determined by affinities among the members. We also define a natural notion of (potentially overlapping) communities in an affinity system, in which the members of a given community collectively prefer each other to anyone else outside the community. Thus these communities are "self-determined" or "self-certified" by the affinity system. We provide a tight polynomial bound on the number of self-determined communities as a function of the robustness of the community. Moreover, we present a polynomial-time algorithm for enumerating these communities, as well as a local algorithm with a strong stochastic performance guarantee that can find a community in time nearly linear in the of size the community.…"
network-theory  social-capital  social-dynamics  self-assembly  agent-based  graph-theory  algorithms  complexology  nudge-targets 
january 2012 by Vaguery
[1201.5076] Technical Report #SEHIR-IE-VA-12-1: Optimal Obstacle Placement with Disambiguations
"We introduce the optimal obstacle placement with disambiguations problem wherein the goal is to place true obstacles in an environment cluttered with false obstacles so as to maximize the total traversal length of a navigating agent (NAVA). Prior to the traversal, NAVA is given location information and probabilistic estimates of each disk-shaped hindrance (hereinafter referred to as disk) being a true obstacle. The NAVA can disambiguate a disk's status only when situated on its boundary. There exists an obstacle placing agent (OPA) that locates obstacles prior to NAVA's traversal. The goal of OPA is to place true obstacles in between the clutter in such a way that NAVA's traversal length is maximized in a game-theoretic sense.…"
agent-based  game-theory  robotics  disambiguation-design  nudge-targets  military-applications  algorithms 
january 2012 by Vaguery
[1010.5017] Collective motion
"We review the observations and the basic laws describing the essential aspects of collective motion -- being one of the most common and spectacular manifestation of coordinated behavior. Our aim is to provide a balanced discussion of the various facets of this highly multidisciplinary field, including experiments, mathematical methods and models for simulations, so that readers with a variety of background could get both the basics and a broader, more detailed picture of the field. The observations we report on include systems consisting of units ranging from macromolecules through metallic rods and robots to groups of animals and people. Some emphasis is put on models that are simple and realistic enough to reproduce the numerous related observations and are useful for developing concepts for a better understanding of the complexity of systems consisting of many simultaneously moving entities. As such, these models allow the establishing of a few fundamental principles of flocking. In particular, it is demonstrated, that in spite of considerable differences, a number of deep analogies exist between equilibrium statistical physics systems and those made of self-propelled (in most cases living) units. In both cases only a few well defined macroscopic/collective states occur and the transitions between these states follow a similar scenario, involving discontinuity and algebraic divergences."
emergence  emergent-design  biology  ethology  complexology  models  artificial-life  nudge-targets 
january 2012 by Vaguery
[1201.4955] Coordination, Differentiation and Fairness in a population of cooperating agents
"In a recent paper, we analyzed the self-assembly of a complex cooperation network. The network was shown to approach a state, where every agent invests the same amount of resources. Nevertheless, highly-connected agents arise that extract extra-ordinarily high payoffs while contributing comparably little to any of their cooperations. Here, we investigate a variant of the model, in which highly-connected agents have access to additional resources. We study analytically and numerically whether these resources are invested in existing collaborations, leading to a fairer load distribution, or in establishing new collaborations, leading to an even less fair distribution of loads and payoffs."
collaboration  social-capital  agent-based  network-theory  complexology  nudge-targets 
january 2012 by Vaguery
[1201.4459] An efficient parallel algorithm for the longest path problem in meshes
"In this paper, first we give a sequential linear-time algorithm for the longest path problem in meshes. This algorithm can be considered as an improvement of [13]. Then based on this sequential algorithm, we present a constant-time parallel algorithm for the problem which can be run on every parallel machine."
algorithms  graph-theory  computational-complexity  nudge-targets 
january 2012 by Vaguery
[1201.4417] Instabilities and Patterns in Coupled Reaction-Diffusion Layers
"We study instabilities and pattern formation in reaction-diffusion layers that are diffusively coupled. For two-layer systems of identical two-component reactions, we analyze the stability of homogeneous steady states by exploiting the block symmetric structure of the linear problem. There are eight possible primary bifurcation scenarios, including a Turing-Turing bifurcation that involves two disparate length scales whose ratio may be tuned via the inter-layer coupling. For systems of $n$-component layers and non-identical layers, the linear problem's block form allows approximate decomposition into lower-dimensional linear problems if the coupling is sufficiently weak. As an example, we apply these results to a two-layer Brusselator system. The competing length scales engineered within the linear problem are readily apparent in numerical simulations of the full system. Selecting a $sqrt{2}$:1 length scale ratio produces an unusual steady square pattern."
cute  emergent-design  pattern-formation  complexology  nudge-targets  nonlinear-dynamics 
january 2012 by Vaguery
[1107.0056] Fixed parameter algorithms for restricted coloring problems
In this paper, we obtain polynomial time algorithms to determine the acyclic chromatic number, the star chromatic number, the Thue chromatic number, the harmonious chromatic number and the clique chromatic number of $P_4$-tidy graphs and $(q,q-4)$-graphs, for every fixed $q$. These classes include cographs, $P_4$-sparse and $P_4$-lite graphs. All these coloring problems are known to be NP-hard for general graphs. These algorithms are fixed parameter tractable on the parameter $q(G)$, which is the minimum $q$ such that $G$ is a $(q,q-4)$-graph. We also prove that every connected $(q,q-4)$-graph with at least $q$ vertices is 2-clique-colorable and that every acyclic coloring of a cograph is also nonrepetitive.
algorithms  graph-theory  discrete-mathematics  nudge-targets 
january 2012 by Vaguery
[1112.6045] Comparing intermittency and network measurements of words and their dependency on authorship
Many features from texts and languages can now be inferred from statistical analyses using concepts from complex networks and dynamical systems. In this paper we quantify how topological properties of word co-occurrence networks and intermittency (or burstiness) in word distribution depend on the style of authors. Our database contains 40 books from 8 authors who lived in the 19th and 20th centuries, for which the following network measurements were obtained: clustering coefficient, average shortest path lengths, and betweenness. We found that the two factors with stronger dependency on the authors were the skewness in the distribution of word intermittency and the average shortest paths. Other factors such as the betweeness and the Zipf's law exponent show only weak dependency on authorship. Also assessed was the contribution from each measurement to authorship recognition using three machine learning methods. The best performance was a ca. 65 % accuracy upon combining complex network and intermittency features with the nearest neighbor algorithm. From a detailed analysis of the interdependence of the various metrics it is concluded that the methods used here are complementary for providing short- and long-scale perspectives of texts, which are useful for applications such as identification of topical words and information retrieval.
natural-language-processing  document-clustering  clustering  feature-selection  algorithms  nudge-targets 
january 2012 by Vaguery
[1108.1170] Convex Optimization without Projection Steps
For the general problem of minimizing a convex function over a compact convex domain, we will investigate a simple iterative approximation algorithm based on the method by Frank & Wolfe 1956, that does not need projection steps in order to stay inside the optimization domain. Instead of a projection step, the linearized problem defined by a current subgradient is solved, which gives a step direction that will naturally stay in the domain. Our framework generalizes the sparse greedy algorithm of Frank & Wolfe and its primal-dual analysis by Clarkson 2010 (and the low-rank SDP approach by Hazan 2008) to arbitrary convex domains. We give a convergence proof guaranteeing {epsilon}-small duality gap after O(1/{epsilon}) iterations.

The method allows us to understand the sparsity of approximate solutions for any l1-regularized convex optimization problem (and for optimization over the simplex), expressed as a function of the approximation quality. We obtain matching upper and lower bounds of {Theta}(1/{epsilon}) for the sparsity for l1-problems. The same bounds apply to low-rank semidefinite optimization with bounded trace, showing that rank O(1/{epsilon}) is best possible here as well. As another application, we obtain sparse matrices of O(1/{epsilon}) non-zero entries as {epsilon}-approximate solutions when optimizing any convex function over a class of diagonally dominant symmetric matrices.

We show that our proposed first-order method also applies to nuclear norm and max-norm matrix optimization problems. For nuclear norm regularized optimization, such as matrix completion and low-rank recovery, we demonstrate the practical efficiency and scalability of our algorithm for large matrix problems, as e.g. the Netflix dataset. For general convex optimization over bounded matrix max-norm, our algorithm is the first with a convergence guarantee, to the best of our knowledge.
operations-research  optimization  algorithms  nudge-targets 
january 2012 by Vaguery
[1112.6235] Detecting a Vector Based on Linear Measurements
We consider a situation where the state of a system is represented by a real-valued vector. Under normal circumstances, the vector is zero, while an event manifests as non-zero entries in this vector, possibly few. Our interest is in the design of algorithms that can reliably detect events (i.e., test whether the vector is zero or not) with the least amount of information. We place ourselves in a situation, now common in the signal processing literature, where information about the vector comes in the form of noisy linear measurements. We derive information bounds in an active learning setup and exhibit some simple near-optimal algorithms. In particular, our results show that the task of detection within this setting is at once much easier, simpler and different than the tasks of estimation and support recovery.
signal-processing  statistics  algorithms  nudge-targets 
january 2012 by Vaguery
[1109.2215] Finding missing edges and communities in incomplete networks
Many algorithms have been proposed for predicting missing edges in networks, but they do not usually take account of which edges are missing. We focus on networks which have missing edges of the form that is likely to occur in real networks, and compare algorithms that find these missing edges. We also investigate the effect of this kind of missing data on community detection algorithms.
network-theory  algorithms  inference  statistics  nudge-targets 
january 2012 by Vaguery
[1109.2618] Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning
We introduce a machine learning model to predict atomization energies of a diverse set of organic molecules, based on nuclear charges and atomic positions only. The problem of solving the molecular Schr"odinger equation is mapped onto a non-linear statistical regression problem of reduced complexity. Regression models are trained on and compared to atomization energies computed with hybrid density-functional theory. Cross-validation over more than seven thousand small organic molecules yields a mean absolute error of ~10 kcal/mol. Applicability is demonstrated for the prediction of molecular atomization potential energy curves.
machine-learning  learning-from-data  biochemistry  computational-science  nudge-targets 
january 2012 by Vaguery
[1109.3248] Reconstruction of sequential data with density models
We introduce the problem of reconstructing a sequence of multidimensional real vectors where some of the data are missing. This problem contains regression and mapping inversion as particular cases where the pattern of missing data is independent of the sequence index. The problem is hard because it involves possibly multivalued mappings at each vector in the sequence, where the missing variables can take more than one value given the present variables; and the set of missing variables can vary from one vector to the next. To solve this problem, we propose an algorithm based on two redundancy assumptions: vector redundancy (the data live in a low-dimensional manifold), so that the present variables constrain the missing ones; and sequence redundancy (e.g. continuity), so that consecutive vectors constrain each other. We capture the low-dimensional nature of the data in a probabilistic way with a joint density model, here the generative topographic mapping, which results in a Gaussian mixture. Candidate reconstructions at each vector are obtained as all the modes of the conditional distribution of missing variables given present variables. The reconstructed sequence is obtained by minimising a global constraint, here the sequence length, by dynamic programming. We present experimental results for a toy problem and for inverse kinematics of a robot arm.
inverse-problems  statistics  algorithms  learning-from-data  nudge-targets 
january 2012 by Vaguery
[1110.5063] Recovering a Clipped Signal in Sparseland
In many data acquisition systems it is common to observe signals whose amplitudes have been clipped. We present two new algorithms for recovering a clipped signal by leveraging the model assumption that the underlying signal is sparse in the frequency domain. Both algorithms employ ideas commonly used in the field of Compressive Sensing; the first is a modified version of Reweighted $ell_1$ minimization, and the second is a modification of a simple greedy algorithm known as Trivial Pursuit. An empirical investigation shows that both approaches can recover signals with significant levels of clipping
signal-processing  inference  compressive-sensing  algorithms  nudge-targets 
january 2012 by Vaguery
[1112.2316] Complexity-entropy causality plane: a useful approach for distinguishing songs
Nowadays we are often faced with huge databases resulting from the rapid growth of data storage technologies. This is particularly true when dealing with music databases. In this context, it is essential to have techniques and tools able to discriminate properties from these massive sets. In this work, we report on a statistical analysis of more than ten thousand songs aiming to obtain a complexity hierarchy. Our approach is based on the estimation of the permutation entropy combined with an intensive complexity measure, building up the complexity-entropy causality plane. The results obtained indicate that this representation space is very promising to discriminate songs as well as to allow a relative quantitative comparison among songs. Additionally, we believe that the here-reported method may be applied in practical situations since it is simple, robust and has a fast numerical implementation.
signal-processing  classification  data-analysis  clustering  representation  music  nudge-targets 
january 2012 by Vaguery
[1112.6178] A general framework for online audio source separation
We consider the problem of online audio source separation. Existing algorithms adopt either a sliding block approach or a stochastic gradient approach, which is faster but less accurate. Also, they rely either on spatial cues or on spectral cues and cannot separate certain mixtures. In this paper, we design a general online audio source separation framework that combines both approaches and both types of cues. The model parameters are estimated in the Maximum Likelihood (ML) sense using a Generalised Expectation Maximisation (GEM) algorithm with multiplicative updates. The separation performance is evaluated as a function of the block size and the step size and compared to that of an offline algorithm.
signal-processing  audio-segmentation  statistics  algorithms  metaheuristics  nudge-targets 
january 2012 by Vaguery
[1112.6272] A Majorize-Minimize subspace approach for l2-l0 image regularization
In this work, we have considered a class of smooth nonconvex regularization functions and we have proposed an efficient minimization stategy for solving the associated variational problems in imaging applications. Connections with l0 penalized problems have been shown asymptoti- cally. In addition, a novel convergence proof of the proposed subspace MM algorithm relying on the Kurdyka-L􏰄 ojasiewicz inequality has been given. Numerical experiments have been carried out to compare the proposed approach with other state-of-the art continuous optimization meth- ods (both for nonconvex and convex penalizations) and with discrete optimization approaches dealing with a truncated quadratic penalization. In the four presented image processing exam- ples, we argue that the proposed approach constitutes an appealing alternative to the existing methods in terms of recovered image quality and computational time.
image-processing  image-segmentation  algorithms  to-understand  nudge-targets 
january 2012 by Vaguery
[1112.6075] A semidefinite programming approach for solving Multiobjective Linear Programming
Several algorithms are available in the literature for finding the entire set of Pareto-optimal solutions in MultiObjective Linear Programming (MOLP). However, it has not been proposed so far an interior point algorithm that finds all Pareto-optimal solutions of MOLP. We present an explicit construction, based on a transformation of any MOLP into a finite sequence of SemiDefinite Programs (SDP), the solutions of which give the entire set of Pareto-optimal extreme points solutions of MOLP. These SDP problems are solved by interior point methods; thus our approach provides a pseudo-polynomial interior point methodology to find the set of Pareto-optimal solutions of MOLP.
linear-programming  algorithms  multiobjective-optimization  nudge-targets  operations-research 
january 2012 by Vaguery
[1112.0826] Clustering under Perturbation Resilience
Recently, Bilu and Linial formalized an implicit assumption often made when choosing a clustering objective: that the optimum clustering to the objective should be preserved under small multiplicative perturbations to distances between points. They showed that for max-cut clustering it is possible to circumvent NP-hardness and obtain polynomial-time algorithms for instances resilient to large (factor $O(sqrt{n})$) perturbations, and subsequently Awasthi et al. considered center-based objectives, giving algorithms for instances resilient to O(1) factor perturbations.
In this paper, we greatly advance this line of work. For center-based objectives, we present an algorithm that can optimally cluster instances resilient to $(1 + sqrt{2})$-factor perturbations, solving an open problem of Awasthi et al. For a commonly used center-based objective $k$-median, we additionally give algorithms for a more relaxed assumption in which we allow the optimal solution to change in a small $epsilon$ fraction of the points after perturbation. We give the first bounds known for this more realistic and more general setting. We also provide positive results for min-sum clustering which is a generally much harder objective than $k$-median (and also non-center-based). Our algorithms are based on new linkage criteria that may be of independent interest.
Additionally, we give sublinear-time algorithms, showing algorithms that can return an implicit clustering from only access to a small random sample.
clustering  statistics  nonparametric-methods  robustness  resilience  algorithms  nudge-targets 
january 2012 by Vaguery
[1112.5794] BATMAN-an R package for the automated quantification of metabolites from NMR spectra using a Bayesian Model
Motivation: NMR spectra are widely used in metabolomics to obtain metabolite profiles in complex biological mixtures. Common methods used to assign and estimate concentrations of metabolite involve either an expert manual peak fitting or extra pre-processing steps, such as peak alignment and binning. Peak fitting is very time consuming and is subject to human error. Conversely, alignment and binning can introduce artifacts and limit immediate biological interpretation of models. Results: We present the Bayesian AuTomated Metabolite Analyser for NMR spectra (BATMAN), an R package which deconvolves peaks from 1-dimensional NMR spectra, automatically assigns them to specific metabolites and obtains concentration estimates. The Bayesian model incorporates information on characteristic peak patterns of metabolites and is able to account for shifts in the position of peaks commonly seen in NMR spectra of biological samples. It applies a Markov Chain Monte Carlo (MCMC) algorithm to sample from a joint posterior distribution of the model parameters and obtains concentration estimates with reduced mean estimation error compared with conventional numerical integration methods.
learning-from-data  statistics  modeling  biochemistry  nudge-targets  image-segmentation 
january 2012 by Vaguery
[1111.1797] Analysis of Thompson Sampling for the multi-armed bandit problem
e multi-armed bandit problem is a popular model for studying exploration/exploitation trade-off in sequential decision problems. Many algorithms are now available for this well-studied problem. One of the earliest algorithms, given by W. R. Thompson, dates back to 1933. This algorithm, referred to as Thompson Sampling, is a natural Bayesian algorithm. The basic idea is to choose an arm to play according to its probability of being the best arm. Thompson Sampling algorithm has experimentally been shown to be close to optimal. In addition, it is efficient to implement and exhibits several desirable properties such as small regret for delayed feedback. However, theoretical understanding of this algorithm was quite limited. In this paper, for the first time, we show that Thompson Sampling algorithm achieves logarithmic expected regret for the multi-armed bandit problem. More precisely, for the two-armed bandit problem, the expected regret in time $T$ is $O(frac{ln T}{Delta} + frac{1}{Delta^3})$. And, for the $N$-armed bandit problem, the expected regret in time $T$ is $O([(sum_{i=2}^N frac{1}{Delta_i^2})^2] ln T)$. Our bounds are optimal but for the dependence on $Delta_i$ and the constant factors in big-Oh.
probability-theory  machine-learning  exploitation-exploration  nudge-targets  game-theory 
january 2012 by Vaguery
[1112.6209] Building high-level features using large scale unsupervised learning
We consider the problem of building detectors for high-level concepts using only unsupervised feature learning. For example, we would like to understand if it is possible to learn a face detector using only unlabeled images downloaded from the internet. To answer this question, we trained a simple feature learning algorithm on a large dataset of images (10 million images, each image is 200x200). The simulation is performed on a cluster of 1000 machines with fast network hardware for one week. Extensive experimental results reveal surprising evidence that such high-level concepts can indeed be learned using only unlabeled data and a simple learning algorithm.
image-analysis  image-segmentation  unsupervised-learning  learning-by-doing  feature-extraction  nudge-targets 
january 2012 by Vaguery
[1105.0158] Detecting emergent processes in cellular automata with excess information
Many natural processes occur over characteristic spatial and temporal scales. This paper presents tools for (i) flexibly and scalably coarse-graining cellular automata and (ii) identifying which coarse-grainings express an automaton's dynamics well, and which express its dynamics badly. We apply the tools to investigate a range of examples in Conway's Game of Life and Hopfield networks and demonstrate that they capture some basic intuitions about emergent processes. Finally, we formalize the notion that a process is emergent if it is better expressed at a coarser granularity.
emergence  complexology  cellular-automata  signal-processing  nudge-targets 
january 2012 by Vaguery
[1008.0901] Convergence to global consensus in opinion dynamics under a nonlinear voter model
We propose a nonlinear voter model to study the emergence of global consensus in opinion dynamics. In our model, agent $i$ agrees with one of binary opinions with the probability that is a power function of the number of agents holding this opinion among agent $i$ and its nearest neighbors, where an adjustable parameter $alpha$ controls the effect of herd behavior on consensus. We find that there exists an optimal value of $alpha$ leading to the fastest consensus for lattices, random graphs, small-world networks and scale-free networks. Qualitative insights are obtained by examining the spatiotemporal evolution of the opinion clusters.
agent-based  social-dynamics  network-theory  complexology  nudge-targets 
january 2012 by Vaguery
[1110.4876] REBOUND: An open-source multi-purpose N-body code for collisional dynamics
REBOUND is a new multi-purpose N-body code which is freely available under an open-source license. It was designed for collisional dynamics such as planetary rings but can also solve the classical N-body problem. It is highly modular and can be customized easily to work on a wide variety of different problems in astrophysics and beyond.
simulation  computational-science  astrophysics  numerical-methods  simulator  library  open-source  nudge-targets 
january 2012 by Vaguery
[1112.5908] Query Answering under Matching Dependencies for Data Cleaning: Complexity and Algorithms
Matching dependencies (MDs) have been recently introduced as declarative rules for entity resolution (ER), i.e. for identifying and resolving duplicates in relational instance $D$. A set of MDs can be used as the basis for a possibly non-deterministic mechanism that computes a duplicate-free instance from $D$. The possible results of this process are the clean, "minimally resolved instances" (MRIs). There might be several MRIs for $D$, and the "resolved answers" to a query are those that are shared by all the MRIs. We investigate the problem of computing resolved answers. We look at various sets of MDs, developing syntactic criteria for determining (in)tractability of the resolved answer problem, including a dichotomy result. For some tractable classes of MDs and conjunctive queries, we present a query rewriting methodology that can be used to retrieve the resolved answers. We also investigate connections with "consistent query answering", deriving further tractability results for MD-based ER.
databases  graph-theory  algorithms  nudge-targets 
january 2012 by Vaguery
The Washroom Game by Jan Heufer :: SSRN
This article analyses a game where players sequentially choose either to become insiders and pick one of finitely many locations or to remain outsiders. They will only become insiders if a minimum distance to the next player can be assured; their secondary objective is to maximize the minimal distance to other players. This is illustrated by considering the strategic behavior of men choosing from a set of urinals in a public lavatory. However, besides very similar situations (e.g. settling of residents in a newly developed area, the selection of food patches by foraging animals, choosing seats in waiting rooms or lines in a swimming pool), the game might also relevant to the problem of placing billboards attempting to catch the attention of passers-by or similar economic situations. In the non-cooperative equilibrium, all insiders behave as if they cooperated with each other and minimized the total number of insiders. It is shown that strategic behavior leads to an equilibrium with substantial under utilization of available locations. Increasing the number of locations tends to decrease utilization. The removal of some locations which leads to gaps can not only increase relative utilization but even absolute maximum capacity.
game-theory  agent-based  complexology  economics  nudge-targets 
january 2012 by Vaguery
[1109.0777] Efficient and Correct Stencil Computation via Pattern Matching and Static Typing
Stencil computations, involving operations over the elements of an array, are a common programming pattern in scientific computing, games, and image processing. As a programming pattern, stencil computations are highly regular and amenable to optimisation and parallelisation. However, general-purpose languages obscure this regular pattern from the compiler, and even the programmer, preventing optimisation and obfuscating (in)correctness. This paper furthers our work on the Ypnos domain-specific language for stencil computations embedded in Haskell. Ypnos allows declarative, abstract specification of stencil computations, exposing the structure of a problem to the compiler and to the programmer via specialised syntax. In this paper we show the decidable safety guarantee that well-formed, well-typed Ypnos programs cannot index outside of array boundaries. Thus indexing in Ypnos is safe and run-time bounds checking can be eliminated. Program information is encoded as types, using the advanced type-system features of the Glasgow Haskell Compiler, with the safe-indexing invariant enforced at compile time via type checking.
domain-specific-language  algorithms  grid-computing  nudge-targets 
january 2012 by Vaguery
[1108.4135] Complex-Valued Autoencoders
"Autoencoders are unsupervised machine learning circuits whose learning goal is to minimize a distortion measure between inputs and outputs. Linear autoencoders can be defined over any field and only real-valued linear autoencoder have been studied so far. Here we study complex-valued linear autoencoders where the components of the training vectors and adjustable matrices are defined over the complex field with the $L_2$ norm. We provide simpler and more general proofs that unify the real-valued and complex-valued cases, showing that in both cases the landscape of the error function is invariant under certain groups of transformations. The landscape has no local minima, a family of global minima associated with Principal Component Analysis, and many families of saddle points associated with orthogonal projections onto sub-space spanned by sub-optimal subsets of eigenvectors of the covariance matrix. The theory yields several iterative, convergent, learning algorithms, a clear understanding of the generalization properties of the trained autoencoders, and can equally be applied to the hetero-associative case when external targets are provided. Partial results on deep architecture as well as the differential geometry of autoencoders are also presented. The general framework described here is useful to classify autoencoders and identify general common properties that ought to be investigated for each class, illuminating some of the connections between information theory, unsupervised learning, clustering, Hebbian learning, and auto encoders."
neural-networks  machine-learning  classification  encoding  algorithms  nudge-targets 
december 2011 by Vaguery
[1108.5685] Predicting flow reversals in chaotic natural convection using data assimilation
"A simplified model of natural convection, similar to the Lorenz (1963) system, is compared to computational fluid dynamics simulations in order to test data assimilation methods and better understand the dynamics of convection. The thermosyphon is represented by a long time flow simulation, which serves as a reference "truth". Forecasts are then made using the Lorenz-like model and synchronized to noisy and limited observations of the truth using data assimilation. The resulting analysis is observed to infer dynamics absent from the model when using short assimilation windows.

Furthermore, chaotic flow reversal occurrence and residency times in each rotational state are forecast using analysis data. Flow reversals have been successfully forecast in the related Lorenz system, as part of a perfect model experiment, but never in the presence of significant model error or unobserved variables. Finally, we provide new details concerning the fluid dynamical processes present in the thermosyphon during these flow reversals."
chaos  dynamical-systems  experiment  prediction  numerical-methods  algorithms  nudge-targets 
december 2011 by Vaguery
[1108.1320] Compressed Matrix Multiplication
"Motivated by the problems of computing sample covariance matrices, and of transforming a collection of vectors to a basis where they are sparse, we present a simple algorithm that computes an approximation of the product of two n-by-n real matrices A and B.…"
approximation  algorithms  nudge-targets 
december 2011 by Vaguery
[1110.5296] Computing a Longest Common Palindromic Subsequence
"The {em longest common subsequence (LCS)} problem is a classic and well-studied problem in computer science. Palindrome is a word which reads the same forward as it does backward. The {em longest common palindromic subsequence (LCPS)} problem is an interesting variant of the classic LCS problem which finds the longest common subsequence between two given strings such that the computed subsequence is also a palindrome. In this paper, we study the LCPS problem and give efficient algorithms to solve this problem. To the best of our knowledge, this is the first attempt to study and solve this interesting problem."
combinatorics  strings  algorithms  nudge-targets 
december 2011 by Vaguery
[1109.5664] Deterministic Feature Selection for $k$-means Clustering
"We study feature selection for $k$-means clustering. Although the literature contains many methods with good empirical performance, algorithms with provable theoretical behavior have only recently been developed. Unfortunately, these algorithms are randomized and fail with, say, a constant probability. We address this issue by presenting a emph{deterministic} feature selection algorithm for $k$-means with theoretical guarantees. At the heart of our algorithm lies a deterministic method for decompositions of the identity."
clustering  statistics  algorithms  nudge-targets 
december 2011 by Vaguery
[1110.5190] Constant-factor approximation of domination number in sparse graphs
"The k-domination number of a graph is the minimum size of a set X such that every vertex of G is in distance at most k from X. We give a linear time constant-factor approximation algorithm for k-domination number in classes of graphs with bounded expansion, which include e.g. proper minor-closed graph classes, classes closed on topological minors or classes of graphs that can be drawn on a fixed surface with bounded number of crossings on each edge.

The algorithm is based on the following approximate min-max characterization. A subset A of vertices of a graph G is d-independent if the distance between each pair of vertices in A is greater than d. Note that the size of the largest 2k-independent set is a lower bound for the k-domination number. We show that every graph from a fixed class with bounded expansion contains a 2k-independent set A and a k-dominating set D such that |D|=O(|A|), and these sets can be found in linear time. For domination number (k=1) the assumptions can be relaxed, and the result holds for all graph classes with arrangeability bounded by a constant."
operations-research  combinatorics  graph-theory  algorithms  nudge-targets 
december 2011 by Vaguery
[1112.1945] Approximation Algorithms for Edge Partitioned Vertex Cover Problems
"In the Partial Vertex Cover (PVC) problem we are given an undirected graph G = (V, E), a positive cost associated with each vertex and a positive integer k and the goal is to find a minimum cost subset of vertices S such that atleast k edges of the graph are covered. In this paper we consider two new generalization of the PVC problem. In the first variation which we call Partition Vertex Cover (Partition-VC) problem, the edges of the graph G are divided into n disjoint partitions $P_1, P_2... P_n$ and we have to select a minimum cost subset of vertices S such that atleast $k_i$ edges are covered from partition $P_i$. In the second variation which we call Knapsack Partition Vertex Cover (KPVC) problem, in addition to the previous conditions, each edge e has a profit $pi_{e}$ associated with it and we have an added knapsack constraint that the total profit of the covered edges in partition $P_i$ should be atleast $Pi_i$. We give an $O(log n)$ approximation for both the problems using a combination of deterministic rounding and randomized rounding approach that operates on the LP strengthened by adding Knapsack Cover inequalities as proposed by Carr, Fleischer, Leung & Phillips. We also show that these bounds can not be further improved by reducing the set cover problem to the Partition-VC problem in polynomial time. We also give an $O(f)$ approximation for the Partition-VC problem using a primal dual schema where f is the maximum number of edges in any partition."
operations-research  graph-theory  graph-partitioning  linear-programming  nudge-targets 
december 2011 by Vaguery
« earlier      

related tags

acoustics  adaptive-control  aesthetics  agent-based  algorithms  amusement  amusing  analog-circuits  analog-design  analogies  analytical-results  analytics  ant-colony-optimization  antennas  applied-mathematics  approximation  archive  archives  artificial-collaboration  artificial-intelligence  artificial-life  artificial-materials  astronomy  astrophysics  audio-segmentation  autocatalysis  automata  autonomous  benchmarking  Bezier-curves  bin-packing  biochemistry  bioengineering  bioinformatics  biological-engineering  biologically-inspired  biology  biomechanics  biometrics  biomolecules  boids  boolean-networks  calculus  cellular-automata  cellular-automata-plus-many  challenge  challenges  chaos  ciliates  circuits  cladistics  classification  cloud-computing  clustering  cognition  cognitive-networks  collaboration  collective-intelligence  combinatorics  communication  communication-infrastructure  communication-theory  comparison  competition  complex-systems  complexity  complexology  composition  compressed-sensing  compression  compressive-sensing  computation  computational-complexity  computational-geometry  computational-linguistics  computational-mechanics  computational-methods  computational-paradigms  computational-science  computer-graphics  computer-science  computer-vision  computing  condensed-matter  constraint-satisfaction  contest  contests  context-is-a-feature-not-a-bug  control  control-systems  control-theory  coordination  counterintuitive  crowdsourcing  cryptography  crystallography  cultural-dynamics  cultural-norms  cute  cybernetics  data  data-access  data-analysis  data-driven  data-mining  data-pageant  data-preparation  database-administration  databases  dataflow  dataset  datasets  dba  DDOS  decision-making  decomposition  definitely-nudge-targets  demonstration  design-automation  design-optimization  design-patterns  design-theory  diagnosis  diagnostics  differential-equations  digital-logic  digitization  disambiguation-design  discrete-event-simulation  discrete-mathematics  distributed-processing  diversity  DNA-computing  document-clustering  domain-specific-language  dynamic-control-prospects  dynamic-programming  dynamical-control  dynamical-systems  dynamics  easy-pickins  economic-crisis  economics  edge-cases  edge-conditions  edge-of-chaos  electricity  electromagnetism  electronics  emergence  emergent-design  encoding  engineering-design  enviable-toys  epistasis  error-correction  estimation  ethology  evolutionary-algorithms  evolutionary-economics  exact-form  exercises  experiment  experimental-design  experimental-math  experimental-psychology  explanatory-power  exploitation-exploration  exploration-exploitation  extreme-values  face-recognition  faces  factorization  fairness  feature-extraction  feature-selection  filtering  finance  financial-engineering  finite-elements  finite-state-machine  firehose-drinking  first-principles  FLOSS  fluid-mechanics  folksonomy  forecasts  formalization  fractals  frequency-space  fuzzy-logic  fuzzy-math  galaxy-zoo  game-theory  games  gene-regulatory-networks  genetic-algorithm  genetic-programming  genetic-programming-target  genetic-regularory-networks  genetics  genomics  geology  geometry  good-candidate-for-agent-based-modeling  government2.0  grammar  grammars  grammatical-evolution  graph-layout  graph-partitioning  graph-theory  graphic-design  graphics  graphics-processing-unit  grid-computing  GRN  group-theory  hacking  hardware  heuristics  high-hanging-fruit  horse-races  how-to  huh-that's-weird-I-wonder  hyperheuristics  hyperresolution  i-could-do-that  I-guess  i-had-no-idea  iamsidd2k7  image-analysis  image-compression  image-processing  image-segmentation  image-synthesis  imputation  indirect-link  inference  information-theory  infrastructure  innovation  inspirational-computing  integer-programming  intellectual-property  intelligence-gathering  interactivity  interesting  interesting-as-a-TSP-mixin  interesting-problems  interpreters  intrusion  inverse-problems  ising-model  it's-got-quantums-innit  it's-more-complicated-than-you-think  John-Horton-Conway  Kalman-filters  Kauffman-networks  Kauffmania  L-systems  LaBean  labean-tiles  language  languages  law  learning  learning-by-doing  learning-by-watching  learning-classifier-systems  learning-from-data  Levenshtein-distance  library  lindenmayer-systems  linear-algebra  linear-programming  linear-regression  linguistics  Linux  load-balancing  locomotion  logical-operators  machine-learning  magic-squares  manfred-macx-approves  manufacturing  marketing  materials-science  Mathematica  mathematical-modeling  mathematical-programming  mathematical-recreations  mathematics  matrices  mechanical-systems  mechanism  mechanism-design  medical-technology  medicine  memory  memristors  Mersenne-primes  meta-algorithms  meta-optimization  metaheuristics  metaphor  methodologies  microbiology  microeconomics  micropragmatism  military-applications  mobile-sensor-networks  model  model-discovery  modeling  modeling-is-not-mathematics  models  models-and-modes  molecular-design  molecular-machinery  Monte-Carlo-methods  Monte-Carlo-simulation  multi-agent-systems  multiobjective-optimization  multiscale  music  musicians!?!  musicians?!?  nanotechnology  natural-language-processing  Navier-Stokes  network-design  network-engineering  network-theory  networks  neural-networks  neutral-networks  nice  noise-reduction  nonlinear-dynamics  nonparametric-methods  nonphotorealistic  NP-complete  NP-hard-things  nudge  nudge-targets  number-theory  numerical-methods  Occam's-double-edged-razor  ocr  one-hand-tied  online-learning  ontology  ontology-discovery  open-access  open-problem  open-questions  open-source  operations-research  optical-illusions  optics  optimization  origin-of-life  oscillator-networks  PageRank  paper-and-pencil-games  parallel  parallel-computing  parallelism  parsimony  patents  pattern-formation  pattern-recognition  PCA  pedagogy  performance-measure  performance-space-analysis  phenotype-genotype-stuff  philosophy-of-engineering  phononics(?!)  photonics  phylogenetics  physics  physics-envy  physics-is-fun  physiology  plane-geometry  planning  polymer-models  polyominoes  pragmatism-it-ain't  prediction  preferential-attachment  primality  printing  prisoner's-dilemma  prisoners'-dilemma  privacy  probability-theory  probably-not-the-same-hannah-arendt  problem-solving  programmable-matter  programming  proof  protein-folding  protocol  protoctista  pseudorandom-numbers  psychology  public-data  puzzles  quality-of-service  quasirandom-numbers  queueing  quorum-sensing  R-language  radar  radiation-therapy  radio  radiology  raw-data-now  recognition-problems  recommendations  recreational-mathematics  reference  regular-expression  reliability  representation  representation-theory  resilience  results  review  reviews  rewriting  rewriting-systems  RNA  RNA-folding  robotics  robustness  routing  rules-learning  safety  satisfiability  scheduling  scientific-computing  search-algorithms  search-engines  security  segmentation  self-assembly  self-similarity  semantics  semiconductors  sensor-networks  sensors  sentiment  sequences  Shannon  signal-processing  simulable  simulation  simulator  slic-representation  slime-mold  small-world  social-capital  social-dynamics  social-networks  sociology  software  software-architecture  space-exploration  sparse-coding  specification  statistical-tests  statistics  stock-picking  string-editing  strings  structural-biology  structure  Stuart-Kauffman  supervised-learning  swarms  symbolic-math  symmetry  system-administration  system-identification  systems-biology  systems-engineering  systems-thinking  tagging  taxonomy  technical-analysis  text-classification  text-mining  the-imperial-we  theoretical-biology  theory-and-practice-sitting-in-a-tree  tiling  time-series  to-understand  tomography  tools  topology  toy-problems  trading  traveling-salesman  trees  undecodability  unsupervised-learning  updated  URL  user-centric-design  utilities  via:  via:Jason-H-Moore  via:twitter  video-processing  virus  visual-effects  visualization  von-Neumann  voronoi  voronoi-diagrams  voting  Wang-tiles  want-want  wavelets  weather  why-does-it-take-26-pages-of-maths-before-we-try-it?  wireless  wisdom-of-crowds 

Copy this bookmark:



description:


tags: