arthegall + machinelearning   286

Frasconi et al. "kLog: A Language for Logical and Relational Learning with Kernels" (arXiv)
Also reading this morning -- unforch, the refs in their PDF are messed up, and their source-code is missing some import, so I can't recompile it (from .tex source) unaided. GRrrrr.
machinelearning  arxiv  research-article  learning  programminglanguage  to-read 
9 days ago by arthegall
[1203.0697] Learning High-Dimensional Mixtures of Graphical Models
"We now propose a method for learning the mixture components given n i.i.d. samples y_n drawn from a graphical mixture model P(y). Our method proceeds in two stages. First, we estimate the graph G_∪ := ∪_{h=1}^{r} G_h, which is the union of the Markov graphs of the mixture. This is accomplished via a series of rank tests. Note that in the special case when G_h ≡ G_∪, this also gives the graph estimates of the component models. We then use the graph estimate Ĝ_∪ to obtain the pairwise marginals of the respective mixture components via a spectral decomposition method. Finally, we use the Chow-Liu algorithm to obtain tree approximations {T_h}_h of the individual mixture components." -- To do: review how this works in the context of gene expression experiments for transcription factor regulatory relationships, which are (presumably) mixtures of a couple different underlying models or modes.
gene-expression  bioinformatics  research-article  arxiv  via:cshalizi  graphical-models  mixture-models  machinelearning 
11 weeks ago by arthegall
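Side note on the last step quoted in the mixtures-of-graphical-models entry above: a minimal sketch of the Chow-Liu idea for discrete data, assuming pairwise mutual information is estimated directly from sample counts. It covers only the tree-building step (a maximum-weight spanning tree over mutual information), not the paper's rank tests or spectral decomposition, and the names `mutual_information` and `chow_liu_tree` are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(samples)
    count_i = Counter(s[i] for s in samples)
    count_j = Counter(s[j] for s in samples)
    count_ij = Counter((s[i], s[j]) for s in samples)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_tree(samples):
    """Return the edges of a maximum-weight spanning tree over pairwise MI."""
    d = len(samples[0])
    edges = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest for Kruskal's algorithm

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


# Toy data: variables 0 and 1 always agree, variable 2 is independent of both.
toy_samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
toy_tree = chow_liu_tree(toy_samples)
```

On the toy data the strongly dependent pair (0, 1) is picked first; the remaining edge is chosen among zero-weight ties, so which of them appears is arbitrary.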
[1112.6045] Comparing intermittency and network measurements of words and their dependency on authorship
Other generic text features that can be used to determine authorship. 5th-grade science-fair project on steroids.
clustering  machinelearning  writing  authorship  classification  arxiv  research-article  nlp 
january 2012 by arthegall
RDKit
Whoa -- did *not* realize that Greg Landrum was (also) "The RDKit guy."
cheminformatics  machinelearning  greg-landrum  software  library  python  opensource 
august 2011 by arthegall
rdkit (Google Code)
"A toolkit for cheminformatics and machine learning." -- By one of the NIBR guys.
nibr  work  cheminformatics  informatics  bioinformatics  python  google-code  machinelearning  c++  from delicious
january 2011 by arthegall
"Experiment in GP based on ImageMagick" (Notional Slurry)
"If you skip this step—even with a downloaded library—you’re a baaaaaad genetic programmer. Turn in your Jaws and go back to machine learning land."
humor  genetic-programming  machinelearning  joke  via:cshalizi  by:Vaguery  imagemagick  testing 
september 2010 by arthegall
Harrington, Hero, "Spatio-Temporal Graphical Model Selection" (arXiv)
"We consider the problem of estimating the topology of spatial interactions in a discrete state, discrete time spatio-temporal graphical model where the interactions affect the temporal evolution of each agent in a network."
via:cshalizi  lasso  graphical-models  research-article  machinelearning  spatial-data  temporal-data  arxiv 
july 2010 by arthegall
SLI | Projects / Belief Propagation
"It turns out that the SAW [self-avoiding walk] tree is also closely connected to the behavior of belief propagation on G. Weitz (2006) and Jung and Shah (2007) showed that marginalization in a binary, pairwise Markov random field defined on G can be performed exactly on the SAW tree, which suggested various approximations to exact inference corresponding to early termination of the SAW tree." -- Time to crack out that old Alan Sokal book (no, not *that* one).
alan-sokal  self-avoiding-walk  belief-propagation  machinelearning  graphical-models  inference  trees  graphs 
july 2010 by arthegall
Chaudhuri, Monteleoni, & Sarwate, "Differentially Private Empirical Risk Minimization" (arXiv)
"We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the ε-differential privacy definition due to Dwork et al. (2006)."
privacy  security  machinelearning  asarwate  claire-monteleoni  arvix  research-article  optimization  perturbation  to-read 
june 2010 by arthegall
Pitch Identification Tutorial
MLB.com apparently uses a custom-built neural network for doing real-time pitch identifications. But these graphs seem to imply that there might be simpler or more interpretable (offline) approaches to the same task...
data-mining  data  pitching  baseball  sports  neural-networks  clustering  machinelearning  pitchfx 
april 2010 by arthegall
Viola-Jones object detection framework - Wikipedia
Boosted networks of weak axis-aligned classifiers for face recognition (among other things).
wikipedia  face-recognition  algorithm  boosting  machinelearning  computer-vision 
march 2010 by arthegall
The `Bow' Toolkit
"The old text classifier system that won't go away..." (Oh, Rain-*bows*...)
text-classification  nlp  software  library  andrew-mccallum  classification  machinelearning 
december 2009 by arthegall
SpringerLink - Book
Machine Learning and Knowledge Discovery in Databases
European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part II
book  springer  research  machinelearning  knowledge-base  database 
november 2009 by arthegall
SpringerLink - Book
Machine Learning and Knowledge Discovery in Databases
European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I
springer  book  research  knowledge-base  machinelearning 
november 2009 by arthegall
CRF Project Page
A reimplementation of some of the CRF papers (including the original one).
software  machinelearning  conditional-random-fields  research 
november 2009 by arthegall
"What’s Wrong with Probability Notation?" (LingPipe Blog)
It's a reasonable explication -- but I feel like every first-year CS graduate student has a similar personal revelation about notation when he or she first comes into contact with the machine learning or probabilistic AI literature. "This notation is horrible, but lambda calculus [or, more generally, a functional language, or some other technique I learned in my proglang course] will come to the rescue and clear all this up!" And then they work it out in the same way, have exactly the same realization (this is hard, people have tried and failed at it before), and move on. Or maybe they're Avi Pfeffer and they actually do something about it. But either way, it's a worthwhile exercise!
machinelearning  notation  programminglanguages  statistics  probability  computerscience 
october 2009 by arthegall
Pennock, Wellman, "Toward a Market Model for Bayesian Inference," (UAI 1996)
"We present a methodology for representing probabilistic relationships in a general-equilibrium economic model. Specifically, we define a precise mapping from a Bayesian network with binary nodes to a market price system where consumers and producers trade in uncertain propositions. We demonstrate the correspondence between the equilibrium prices of goods in this economy and the probabilities represented by the Bayesian network." -- (Noted in a link at this blog post: http://mblog.lib.umich.edu/strategic/archives/2009/09/market_reductio.html)
david-pennock  bayesian-networks  markets  uai  research-article  machinelearning 
september 2009 by arthegall
"Log Sum of Exponentials" (LingPipe Blog)
I was literally *just* writing up the portion of my thesis where I'm using this math... and I realize, I'm not checking for over/underflow correctly. Gotta go back and revise that, now. Thanks, LingPipe!!!
thesis  timely  logarithms  numerical-techniques  stability  floating-point-calculations  research  machinelearning 
june 2009 by arthegall
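Side note on the over/underflow check mentioned in the "Log Sum of Exponentials" entry above: a minimal sketch of the usual stabilization, which subtracts the maximum before exponentiating. This is the standard trick rather than a copy of the LingPipe code, and the function name `log_sum_exp` is just an illustrative choice.

```python
import math


def log_sum_exp(log_values):
    """Compute log(sum(exp(v) for v in log_values)) without overflow."""
    if not log_values:
        return float("-inf")  # log of an empty sum, i.e. log(0)
    m = max(log_values)
    if math.isinf(m):
        # All inputs are -inf (result -inf), or some input is +inf (result +inf).
        return m
    # After shifting by m every exponent is <= 0, so exp() cannot overflow,
    # and the largest term is exp(0) = 1, so the sum cannot underflow to 0.
    return m + math.log(sum(math.exp(v - m) for v in log_values))


big = log_sum_exp([1000.0, 1000.0])      # naive math.exp(1000.0) would overflow
small = log_sum_exp([-1000.0, -1000.0])  # naive math.exp(-1000.0) underflows to 0.0
```

Both calls stay finite, whereas the naive `math.log(sum(math.exp(v) ...))` either raises `OverflowError` or tries to take the log of zero.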
Veronising...: Cooking for nerds: ingredient polyhedron and convex hull
Time for: machine learning on recipes and ingredient lists. (Probably need to go through and code the 'labels' first: this one's a bread, this one's a cookie, this cookie is softer than that one, etc.)
machinelearning  cooking  recipes  data 
may 2009 by arthegall
"Magnatagatune - a new research data set for MIR" (Music Machinery)
"It contains:

* Human annotations collected by Edith Law’s TagATune game.
* The corresponding sound clips from magnatune.com, encoded in 16 kHz, 32kbps, mono mp3. (generously contributed by John Buckman, the founder of every MIR researcher’s favorite label Magnatune)
* A detailed analysis from The Echo Nest of the track’s structure and musical content, including rhythm, pitch and timbre.
* All the source code for generating the dataset distribution."
dataset  music  magnatagatune  research  machinelearning  luis-von-ahn 
april 2009 by arthegall
Robot scientist becomes first machine to discover new scientific knowledge
I literally cannot wait to see what Wired does with this story... (Oh, no, wait. That's wrong: I totally *can* wait. In fact, I completely dread it.)
technology  futurism  journamalism  science  robot-scientist  machinelearning 
april 2009 by arthegall
Shalizi, Camperi, and Klinkner, "Discovering Functional Communities in Dynamical Networks" (arXiv)
"In this paper, we lay out the problem of discovering _functional communities_, and describe an approach to doing so. This method combines recent work on measuring information sharing across stochastic networks with an existing and successful community-discovery algorithm for weighted networks. We illustrate it with an application to a large biophysical model of the transition from beta to gamma rhythms in the hippocampus."
via:cshalizi  arxiv  research-article  social-networks  networks  communities  machinelearning 
march 2009 by arthegall
Learning Experiment Databases
"An experiment database is a database designed to store learning experiments in full detail, aimed at providing a convenient platform for the study of learning algorithms." -- So, learning about learning algorithms, is it?
research  database  machinelearning  science  meta-learning  hyper-learning  results 
march 2009 by arthegall