Van Durme, Lall, "Probabilistic Counting with Randomized Storage," IJCAI, 2009.
4 days ago by arthegall
I like the "TOMB Counter" name -- this is a reasonably important technique, and this is the first place I've found it referenced in the literature.
probabilistic-methods
morris-counter
bloom-filters
research-article
big-data
to-re-read
"Sergey Brin’s Search for a Parkinson’s Cure" (Wired)
june 2010 by arthegall
"Grove disagrees somewhat with Brin’s emphasis on patterns over hypothesis. “You have to be looking for something,” he says. But the two compare notes on the disease from time to time; both are enthusiastic and active investors in the Michael J. Fox Foundation. (Grove is even known to show up on the online discussion forums.)" --- Hmm. ("All Wired articles are wrong, and increasingly...." etc etc).
wired
sergey-brin
michael-j-fox-foundation
parkinsons
health
medicine
science
big-data
obscurely-referential
"Beyond the Data Deluge" -- Bell et al. 323 (5919): 1297 -- Science
march 2009 by arthegall
A better take (overview) on "Google Science" and "Big Data" -- "The urgency for new tools and technologies to enable data-intensive research has been building for a decade or more (2, 7). In 2007, Jim Gray laid out his vision for a fourth research paradigm--data-intensive science--which he described as collaborative, networked, and data-driven (1, 10). He defined eScience as the synthesis of information technology and science that enables challenges on previously unimaginable scales to be tackled.
Despite the enormous potential of this approach, data-intensive science has been slow to develop due to the subtleties of databases, schemas, and ontologies, and a general lack of understanding of these topics by the scientific community. ... Indeed, many areas of science lag commercial use and understanding of data analytics by at least a decade."
science
via:creeder
big-data
jim-gray
Unwin, Theus, and Hofmann, "Graphics of Large Datasets: Visualizing a Million"
january 2009 by arthegall
Via Andrew Gelman's blog... I probably should pick up a copy of this book. ($90, gah.)
visualization
book
graphics
big-data