arthegall + arxiv   174

Zhu, et al. "Inferring Taxi Status Using GPS Trajectories"
If I had a million free hours, I'd spend some of them applying this to MBTA buses ("Inferring Bus Fullness Using GPS Trajectories"). Maybe someday.
mbta  idea  gps  arxiv  research-article  inference 
3 days ago by arthegall
Frasconi et al. "kLog: A Language for Logical and Relational Learning with Kernels" (arXiv)
Also reading this morning -- unfortunately, the refs in their PDF are messed up, and their source is missing some import, so I can't recompile it (from the .tex source) unaided. Grrrr.
machinelearning  arxiv  research-article  learning  programminglanguage  to-read 
9 days ago by arthegall
Hastings, Batchelor, Neuhaus, Steinbeck, "What's in an `is about' link? Chemical diagrams and the Information Artifact Ontology" (arXiv)
"We therefore propose the definition of structural diagrams such as chemical diagrams based on their syntaxes. Any diagram expressed in an interpreted diagrammatic syntax is a valid information content entity regardless of the existence of instances that the diagram is about; although the existence of such an instance may be an interesting property depending on the application scenario." -- That's about right, but it pretty much exposes one particular problem with "realist" ontologies as such.
ontologies  obo  realist-ontologies  iao  is-about  obiwtf  article  arxiv 
4 weeks ago by arthegall
[1203.2570] Differential Privacy for Functions and Functional Data
"Previous work has focused mainly on methods for which the output is a finite dimensional vector, or an element of some discrete set. We develop methods for releasing functions while preserving differential privacy. Specifically, we show that adding an appropriate Gaussian process to the function of interest yields differential privacy."
larry-wasserman  differential-privacy  databases  arxiv  research-article  gaussian-processes 
7 weeks ago by arthegall
[1203.0697] Learning High-Dimensional Mixtures of Graphical Models
"We now propose a method for learning the mixture components given n i.i.d. samples y_n drawn from a graphical mixture model P(y). Our method proceeds in two stages. First, we estimate the graph G_∪ := ∪_{h=1}^{r} G_h, which is the union of the Markov graphs of the mixture. This is accomplished via a series of rank tests. Note that in the special case when G_h ≡ G_∪, this also gives the graph estimates of the component models. We then use the graph estimate \hat{G}_∪ to obtain the pairwise marginals of the respective mixture components via a spectral decomposition method. Finally, we use the Chow-Liu algorithm to obtain tree approximations {T_h}_h of the individual mixture components." -- To do: review how this works in the context of gene expression experiments for transcription factor regulatory relationships, which are (presumably) mixtures of a couple of different underlying models or modes.
gene-expression  bioinformatics  research-article  arxiv  via:cshalizi  graphical-models  mixture-models  machinelearning 
11 weeks ago by arthegall
[1112.6045] Comparing intermittency and network measurements of words and their dependency on authorship
Other generic text features that can be used to determine authorship. 5th-grade science-fair project on steroids.
clustering  machinelearning  writing  authorship  classification  arxiv  research-article  nlp 
january 2012 by arthegall
[1104.1605] Efficient Top-K Retrieval in Online Social Tagging Networks
"We first consider a key aspect of the problem, which is accessing the closest or most relevant users for a given seeker. We describe how this can be done on the fly (without any pre-computations) for several possible choices - arguably the most natural ones - of proximity computation in a user network."
social-networks  tagging  folksonomies  research-article  arxiv 
december 2011 by arthegall
Doyle & Snell, "Random Walks and Electric Networks"
"Here’s the plan of the work: In Section 1 we will restrict ourselves to the study of random walks on finite networks. Here we will establish the connection between the electrical concepts of current and voltage and corresponding descriptive quantities of random walks regarded as finite state Markov chains. In Section 2 we will consider random walks on infinite networks. Polya’s theorem will be proved using Rayleigh’s method, and the proof will be compared with the classical proof using probabilistic methods. We will then discuss walks on more general infinite graphs, and use Rayleigh’s method to derive certain extensions of Polya’s theorem." -- still to read.
diffusion  electricity  random-walks  tutorial  physics  probability  markov-chain  arxiv 
december 2011 by arthegall
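A note to self on the Doyle & Snell entry above: the current/voltage connection for finite networks can be checked numerically. This is only a rough sketch, assuming a network of unit resistors given as an adjacency dict. The function name `voltages`, its parameters, and the relaxation solver are my own choices for illustration, not anything taken from the text. Fix the voltage at 0 on one node and 1 on another. At every other node the voltage is the average of its neighbours' voltages, which is the same equation satisfied by the probability that a simple random walk started there reaches the 1-volt node before the grounded one.

```python
def voltages(neighbors, ground, source, sweeps=10_000, tol=1e-12):
    """Voltages on a unit-resistor network with v[ground] = 0, v[source] = 1.

    Interior voltages are found by repeated averaging (Gauss-Seidel
    relaxation). The same numbers are the probabilities that a simple
    random walk started at each node hits `source` before `ground`.
    """
    v = {node: 0.0 for node in neighbors}
    v[source] = 1.0
    for _ in range(sweeps):
        change = 0.0
        for node, nbrs in neighbors.items():
            if node in (ground, source):
                continue  # boundary voltages stay fixed
            new = sum(v[n] for n in nbrs) / len(nbrs)
            change = max(change, abs(new - v[node]))
            v[node] = new
        if change < tol:
            break
    return v


# Hypothetical example network: five nodes in a line, 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
v = voltages(path, ground=0, source=4)
```

On the line graph the voltages come out evenly spaced between 0 and 1, matching the gambler's-ruin probabilities for a fair walk.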
[1011.5287] Distributed Storage Allocations
"By using an appropriate code, successful recovery can be achieved whenever the total amount of data accessed is at least the size of the original data object. The goal is to find an optimal storage allocation that maximizes the probability of successful recovery."
coding  arxiv  research-article  genomics  idea  storage  data 
december 2011 by arthegall
Goodrich, Mitzenmacher, "Invertible Bloom Lookup Tables"
"We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of key-value pairs, but also allows a complete listing of its contents with high probability, as long as the number of key-value pairs is below a designed threshold."
arxiv  research-article  computerscience  bloom-filters  data-structures  invertible-bloom-lookup-tables 
december 2011 by arthegall
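A toy version of the structure described in the Goodrich and Mitzenmacher entry above, to make the "listing" idea concrete. This is a minimal sketch under my own assumptions, not the paper's implementation: integer keys and values, SHA-256 as the hash, cells split into `k` partitions, and no extra checksum fields. The names `IBLT`, `m`, `k`, and `list_entries` are hypothetical. Each cell stores a count, a sum of keys, and a sum of values. Listing repeatedly "peels" any cell that holds exactly one pair.

```python
import hashlib


class IBLT:
    """Minimal invertible Bloom lookup table for integer keys and values."""

    def __init__(self, m=30, k=3):
        # m cells split into k equal partitions, so a key maps to k distinct cells.
        assert m % k == 0
        self.m, self.k = m, k
        self.count = [0] * m
        self.key_sum = [0] * m
        self.val_sum = [0] * m

    def _cells(self, key):
        part = self.m // self.k
        cells = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            cells.append(i * part + int.from_bytes(digest[:8], "big") % part)
        return cells

    def insert(self, key, value):
        for c in self._cells(key):
            self.count[c] += 1
            self.key_sum[c] += key
            self.val_sum[c] += value

    def delete(self, key, value):
        # Assumes the pair was inserted earlier.
        for c in self._cells(key):
            self.count[c] -= 1
            self.key_sum[c] -= key
            self.val_sum[c] -= value

    def get(self, key):
        """Return the value, or None if the key is absent or the lookup is inconclusive."""
        for c in self._cells(key):
            if self.count[c] == 0:
                return None  # an empty cell means the key is not stored
            if self.count[c] == 1:
                return self.val_sum[c] if self.key_sum[c] == key else None
        return None  # every cell is shared with other keys

    def list_entries(self):
        """Peel cells holding exactly one pair. Returns (pairs, complete)."""
        count, ks, vs = self.count[:], self.key_sum[:], self.val_sum[:]
        pairs = []
        queue = [c for c in range(self.m) if count[c] == 1]
        while queue:
            c = queue.pop()
            if count[c] != 1:
                continue  # this cell changed since it was queued
            key, value = ks[c], vs[c]
            pairs.append((key, value))
            for d in self._cells(key):
                count[d] -= 1
                ks[d] -= key
                vs[d] -= value
                if count[d] == 1:
                    queue.append(d)
        return pairs, all(n == 0 for n in count)
```

Listing works on a copy of the table, so it does not destroy the contents. `complete` is False when peeling gets stuck, which is the failure case the abstract's "with high probability" refers to once the table is loaded past its threshold.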
Bento, Ibrahimi, Montanari, "Learning Networks of Stochastic Differential Equations" (arXiv)
"We consider linear models for stochastic dynamics. To any such model can be associated a network (namely a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval $T$."
symbolic-methods  stochastic-processes  differential-equations  graphs  research-article  learning  arxiv  nips  statistics  information-theory  via:ded_maxim 
november 2010 by arthegall
David Mumford, "Pattern theory: the mathematics of perception"
Yes, *that* David Mumford. Oddly enough, this reminds me how I met Mumford's niece at a soccer game this summer.
david-mumford  trivia  arxiv  research-article  patterns  perception  learning 
august 2010 by arthegall
Conway and Bromage, "Succinct Data Structures for Assembling Large Genomes"
"Unfortunately, improvements in the computational feasibility for de novo assembly have not matched the improvements in the gathering of sequence data. This is for two reasons: the inherent computational complexity of the problem, and the in-practice memory requirements of tools. ... In this paper we use entropy compressed or succinct data structures to create a practical representation of the de Bruijn assembly graph, which requires at least a factor of 10 less storage than the kinds of structures used by deployed methods."
sequence-analysis  assembly  arxiv  via:vaguery  genomics  de-bruijn-graph  compression  data-structures 
august 2010 by arthegall
Kennefick, "Not Only Because of Theory: Dyson, Eddington and the Competing Myths of the 1919 Eclipse Expedition" (arXiv)
Rudolf Moritz writing to Phillip Cowell, in 1918: "I can well understand the compatriots of Riemann and Christoffel burning Louvain and sinking the Lusitania. In other words, the atrocity of inventing the tensor-based mechanisms of differential geometry which underpin general relativity is quite on a par, morally speaking, with the most notorious (to Englishmen) war crimes of World War I."
eddington  einstein  measurement  history  science  quote  arxiv  astronomy  relativity  physics 
july 2010 by arthegall
Cheney, "Causality and the Semantics of Provenance," arXiv.
"In this paper we review a theory of causality based on structural models that has been developed in artificial intelligence, and describe work in progress on using causality to give a semantics to provenance graphs."
causality  provenance  research-article  graphical-models  arxiv  data 
july 2010 by arthegall
Harrington, Hero, "Spatio-Temporal Graphical Model Selection" (arXiv)
"We consider the problem of estimating the topology of spatial interactions in a discrete state, discrete time spatio-temporal graphical model where the interactions affect the temporal evolution of each agent in a network."
via:cshalizi  lasso  graphical-models  research-article  machinelearning  spatial-data  temporal-data  arxiv 
july 2010 by arthegall
Bodirsky, Jonsson, von Oertzen, "Horn versus full first-order: complexity dichotomies in algebraic constraint satisfaction" (arXiv)
"The results imply that several families of constraint satisfaction problems exhibit a complexity dichotomy: the problems are in P or NP-hard, depending on the choice of the allowed relations."
arxiv  research-article  logic  horn-clauses  computational-complexity  constraint-satisfaction  computerscience  via:Vaguery 
july 2010 by arthegall
Dominguez-Montes, "Solution to the Counterfeit Coin Problem and its Generalization" (arXiv)
'This work deals with a classic problem: "Given a set of coins among which there is a counterfeit coin of a different weight, find this counterfeit coin using ordinary balance scales, with the minimum number of weighings possible, and indicate whether it weighs less or more than the rest".'
counterfeiting  coins  inference  detection  experiment  research-article  arxiv  counterfeit-coin-problem 
july 2010 by arthegall
McKinley, "Proof nets for Herbrand's Theorem" (arXiv)
"This paper explores the connection between two central results in the proof theory of classical logic: Gentzen's cut-elimination for the sequent calculus and Herbrand's "fundamental theorem". Starting from Miller's expansion-tree-proofs, a highly structured presentation of Herbrand's theorem, we define a calculus of weakening-free proof nets for (prenex) first-order classical logic, and give a weakly-normalizing cut-elimination procedure."
proofs  proof-nets  herbrands-theorem  logic  arxiv  research-article  mathematics  sequent-calculus 
june 2010 by arthegall
Allman et al. "Parameter identifiability in a class of random graph mixture models" (arXiv)
"We prove identifiability of parameters for a broad class of random graph mixture models. These models are characterized by a partition of the set of graph nodes into latent (unobservable) groups. The connectivities between nodes are independent random variables when conditioned on the groups of the nodes being connected. In the binary random graph case, in which edges are either present or absent, these models are known as stochastic blockmodels and have been widely used in the social sciences and, more recently, in biology." -- To read, in the context of the blockmodeling paper from a few weeks back.
blockmodeling  graph  statistics  arxiv  research-article  parameters  identifiability  via:cshalizi 
june 2010 by arthegall
Ackerman, Freer, Roy, "On the computability of conditional probability" (arXiv)
"In the discrete or dominated setting, under suitable computability hypotheses, conditional probabilities are computable. However, we show that in general one cannot compute conditional probabilities. We do this by constructing a pair of computable random variables in the unit interval whose conditional distribution encodes the halting problem at almost every point. We show that this result is tight, in the sense that given an oracle for the halting problem, one can compute this conditional distribution."
research-article  computerscience  probability  conditional-probability  computability  theory  arxiv  daniel-roy 
june 2010 by arthegall
[1005.4274] This is SPIRAL-TAP: Sparse Poisson Intensity Reconstruction ALgorithms - Theory and Practice
This looks like the kind of thing I wanted, oh so long ago (pre-graduation), for thinking about delicious link frequencies. Also: great paper title.
via:Vaguery  poisson-process  statistics  estimation  images  image-analysis  optimization  arxiv  research-article 
may 2010 by arthegall
Konur, "A Survey on Temporal Logics" (arXiv)
To read, in prep for thinking about some of the OWL-OBI-BFO-time stuff.
logic  temporal-logic  time  arxiv  survey  bfo  work  obi 
may 2010 by arthegall
"The arXiv in your pocket"
Complete contents of arXiv available as a torrent.
data  papers  torrent  arxiv  via:pskomoroch 
april 2010 by arthegall
Bose, Devroye, et al. "Odds-On Trees" (arXiv)
No, I'm not dead. Yes, it would be nice if delicious had a Chrome plug-in. (Do they?)
delicious  arxiv  research-article  trees  data-structure  luc-devroye  question  via:shivak 
february 2010 by arthegall
Spivak, "Simplicial Databases"
"In this paper, we define a category DB, called the category of simplicial databases, whose objects are databases and whose morphisms are data-preserving maps. Along the way we give a precise formulation of the category of relational databases, and prove that it is a full subcategory of DB. We also prove that limits and colimits always exist in DB and that they correspond to queries such as select, join, union, etc."
category-theory  database  mathematics  arxiv  research-article  sql  relational-data 
january 2010 by arthegall
Malmgren, Hofman, Amaral, and Watts. "Characterizing Individual Communication Patterns" (arXiv)
"Here, we propose a model of individual e-mail communication that is sufficiently rich to capture meaningful variability across individuals, while remaining simple enough to be interpretable. We show that the model, a cascading non-homogeneous Poisson process, can be formulated as a double-chain hidden Markov model..."
poisson-process  arxiv  research-article  communication  project  duncan-watts  email  markov-models  network  via:cshalizi  rediscovering-what-chl-already-knew 
december 2009 by arthegall
Ailon, Chazelle, Clarkson, Liu, Mulzer, and Seshadri, "Self-Improving Algorithms"
Why do I feel like I should recognize the name Kenneth Clarkson? (I don't know. Maybe I shouldn't.)
research  algorithms  arxiv  research-article 
november 2009 by arthegall
Lee & Wasserman, "Spectral Connectivity Analysis"
A reasonable tutorial? I still think that some of the local-linear-embedding techniques would work well in the context of disk or storage layout in a database setting.
arxiv  local-linear-embedding  research-article  spectral-graph-theory  datamining  via:cshalizi 
october 2009 by arthegall