chl + paper   139

novikoff, kleinberg, strogatz: education of a model student
"a dilemma faced by teachers, and increasingly by designers of educational software, is the trade-off between teaching new material and reviewing what has already been taught. complicating matters, review is useful only if it is neither too soon nor too late. moreover, different students need to review at different rates. we present a mathematical model that captures these issues in idealized form."
by:timothy-novikoff  by:steven-strogatz  by:jon-kleinberg  learning  paper  spaced-rep  from delicious
january 2012 by chl
a simpler approach to matrix completion
"this paper provides the best bounds to date on the number of randomly sampled entries required to reconstruct an unknown low-rank matrix. [...] the reconstruction is accomplished by minimizing the nuclear norm, or sum of the singular values, of the hidden matrix subject to agreement with the provided entries. if the underlying matrix satisfies a certain incoherence condition, then the number of entries required is equal to a quadratic logarithmic factor times the number of parameters in the singular value decomposition. the proof of this assertion is short, self contained, and uses very elementary analysis. the novel techniques herein are based on recent work in quantum information theory."
quantum-info-theory  via:shivak  paper  matrix-completion  from delicious
january 2012 by chl
[1107.5728v2] the network of global corporate control
'the structure of the control network of transnational corporations affects global market competition and financial stability. so far, only small national samples were studied and there was no appropriate methodology to assess control globally. we present the first investigation of the architecture of the international ownership network, along with the computation of the control held by each global player. we find that transnational corporations form a giant bow-tie structure and that a large portion of control flows to a small tightly-knit core of financial institutions. this core can be seen as an economic "super-entity" that raises new important issues both for researchers and policy makers.'
na  interlocking-directorates  via:cshalizi  paper  from delicious
october 2011 by chl
[1108.1791v1] why philosophers should care about computational complexity
"one might think that, once we know something is computable, how efficiently it can be computed is a practical question with little further philosophical importance. in this essay, I offer a detailed case that one would be wrong."
complexity  paper  by:scott-aaronson  later  via:llimllib  computation  from delicious
august 2011 by chl
[1106.2429] efficient online learning via randomized rounding
"most online algorithms used in machine learning today are based on variants of mirror descent or follow-the-leader. in this paper, we present an online algorithm based on a completely different approach, which combines "random playout" and randomized rounding of loss subgradients."
ml  online-learning  paper  later  via:shivak  from delicious
june 2011 by chl
[1105.0902] modeling network evolution using graph motifs
Modeling Network Evolution Using Graph Motifs - new paper, with Python code from @drewconway
na  paper  by:drew-conway  later  from delicious
may 2011 by chl
citeseerx — stop word location and identification for adaptive text recognition
150 words cover 50% of typical english text. by scanning page images for likely occurrences of such words (using only word image width [!]) and then determining their identities through a word shape classifier, character prototypes can be extracted and fonts be learned.
font-learning  ocr  img-proc  by:tin-kam-ho  paper  from delicious
january 2011 by chl
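note on the entry above: the "150 words cover 50%" figure is a statement about cumulative token coverage by the most frequent words. a minimal sketch of how such a coverage number could be computed for any tokenized text; the function name `top_k_coverage` and the toy sentence are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def top_k_coverage(tokens, k):
    """Return the fraction of all tokens accounted for by the k most frequent words.

    Hypothetical helper for illustration; not the paper's method.
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # most_common(k) yields the k (word, count) pairs with the highest counts
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(tokens)


# toy input (made up for the example, not real corpus data)
toy_tokens = "the cat and the dog and the bird".split()
coverage_top2 = top_k_coverage(toy_tokens, 2)
```

on a real corpus one would pass the full token list and k=150 to check the claim; the result depends on the corpus and on the tokenization.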
[1011.3854] a probabilistic and ripless theory of compressed sensing
"this paper introduces a simple and very general theory of compressive sensing. in this theory, the sensing mechanism simply selects sensing vectors independently at random from a probability distribution f [...]. we prove that if [f] obeys a simple incoherence property and an isotropy property, one can faithfully recover approximately sparse signals from a minimal number of noisy measurements. the novelty is that our recovery results do not require the restricted isometry property (rip) - they make use of a much weaker notion - or a random model for the signal."
compressed-sensing  via:ded_maxim  rip  paper  later  riplessness  from delicious
november 2010 by chl
[1008.4686] data analysis recipes: fitting a model to data
"we go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. standard weighted least-squares fitting is only appropriate when there is a dimension along which the data points have negligible uncertainties, and another along which all the uncertainties can be described by gaussians of known variance; these conditions are rarely met in practice. we consider cases of general, heterogeneous, and arbitrarily covariant two-dimensional uncertainties, and situations in which there are bad data (large outliers), unknown uncertainties, and unknown but expected intrinsic scatter in the linear relationship being fit. above all we emphasize the importance of having a "generative model" for the data [...]"
data-analysis  line-fitting  paper  by:david-hogg  by:dustin-lang  by:jo-bovy  from delicious
september 2010 by chl
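note on the entry above: the "standard weighted least-squares" baseline that the abstract mentions has a closed form for a straight line y = m*x + b when only y carries gaussian uncertainty of known standard deviation. a minimal sketch under exactly those assumptions; the function name `weighted_line_fit` and the sample points are illustrative and not from the paper, which goes on to treat the harder cases (outliers, uncertainties in both dimensions, intrinsic scatter).

```python
def weighted_line_fit(x, y, sigma_y):
    """Fit y = m*x + b by weighted least squares with weights 1/sigma_y**2.

    Assumes x is known exactly and each y has independent gaussian noise
    with known standard deviation sigma_y. Returns (m, b).
    """
    w = [1.0 / s ** 2 for s in sigma_y]
    s_w = sum(w)
    s_x = sum(wi * xi for wi, xi in zip(w, x))
    s_y = sum(wi * yi for wi, yi in zip(w, y))
    s_xx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # determinant of the 2x2 normal equations; zero if all x values coincide
    delta = s_w * s_xx - s_x ** 2
    if delta == 0:
        raise ValueError("x values must not all be identical")
    m = (s_w * s_xy - s_x * s_y) / delta
    b = (s_xx * s_y - s_x * s_xy) / delta
    return m, b


# made-up points lying exactly on y = 2x + 1, with equal uncertainties
slope, intercept = weighted_line_fit([0, 1, 2, 3], [1, 3, 5, 7], [1, 1, 1, 1])
```

points with larger sigma_y get smaller weights and so pull the line less; this is the only place the uncertainties enter, which is why the method breaks down when the noise model is wrong.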
[1006.3868] philosophy and the practice of bayesian statistics
"a substantial school in the philosophy of science identifies bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of bayesian statistics. we argue that the most successful forms of bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism."
hypothetico-deductivism  paper  by:andrew-gelman  by:cshalizi  stat  bayes  from delicious
june 2010 by chl
pregel
The long-awaited Pregel paper from #SIGMOD2010 has been posted in ACM DL: #Hadoop #MapReduce
pregel  paper  google  na  #Hadoop  #MapReduce  #SIGMOD2010  Hadoop  MapReduce  SIGMOD2010  from delicious
june 2010 by chl
[0911.1824] community structure in time-dependent, multiscale, and multiplex networks
"we developed a generalized framework of network quality functions that allowed us to study the community structure of arbitrary multislice networks, which are combinations of individual networks coupled through links that connect each node in one network slice to itself in other slices. this framework allows one to study community structure in a very general setting encompassing networks that evolve over time, have multiple types of links (multiplexity), and have multiple scales."
na  paper  later  from delicious
may 2010 by chl
[1003.5474] angle tree: nearest neighbor search in high dimensions with low intrinsic dimensionality
'the key idea of our approach is to store the angle (the "dihedral angle") between the data region (which is a low dimensional manifold) and the random hyperplane that splits the region (the "splitter"). we show that the dihedral angle can be used to obtain a tight lower bound on the distance between the query point and any point on the opposite side of the splitter. this in turn can be used to efficiently prune the search space.' / '[...] the angle tree is the most efficient known indexing structure for nearest neighbor queries in terms of preprocessing and space usage while achieving high accuracy and fast search time.'
angle-tree  knn  via:shivak  paper  later  from delicious
march 2010 by chl
a comparison of user-generated and automatic graph layouts - microsoft research
"our results demonstrate that the best of the user-generated layouts performed as well as or better than the physics-based layout. orthogonal and circular automatic layouts were found to be considerably less effective than either the physics-based layout or the best of the user-generated layouts." / cool idea.
graph-layout  graph-viz  paper  by:tim-dwyer 
november 2009 by chl
[0908.2284] classification by set cover: the prototype vector machine
"the method is compatible with any dissimilarity measure, making it amenable to situations in which the data are not embedded in an underlying feature space or in which using a non-euclidean metric is desirable. indeed, we demonstrate on the much studied zip code data how the pvm can reap the benefits of a problem-specific metric. in this example, the pvm outperforms the highly successful 1-nn with tangent distance, and does so retaining fewer than half of the data points."
ml  pvm  knn  paper  via:agray0 
august 2009 by chl
learning compressed sensing
"in this paper we ask: given a training set typical of the signals we wish to measure, what are the optimal set of linear projections for compressed sensing? we show that the optimal projections are in general not the principal components nor the independent components of the data, but rather a seemingly novel set of projections that capture what is still uncertain about the signal, given the training set. we also show that the projections onto the learned uncertain components may far outperform random projections. this is particularly true in the case of natural images, where random projections have vanishingly small signal to noise ratio as the number of pixels becomes large."
compressed-sensing  random-projections  uncertain-components  paper  by:yair-weiss  by:hyun-sung-chang  by:bill-freeman  filetype:pdf  media:document 
august 2009 by chl