datawrangling's trendingtopics at master - GitHub
june 2009 by plindberg
This repository contains the full source code for Trendingtopics.org, built by Data Wrangling to demonstrate how Hadoop and EC2 can power a data-driven website.
rubyonrails
trending
statistics
wikipedia
github
hadoop
amazonec2
Modernista!
january 2009 by plindberg
(This is the company whose website is simply a tiny menu rendered atop Flickr, Wikipedia, and Facebook.)
company
facebook
wikipedia
flickr
Jaiku | Try http://www.foody.se:4011/admin/input_text. Automatic classification of SWEDISH texts. Test it and report the results!
september 2008 by plindberg
"So... the categories consist of all the articles on Swedish Wikipedia."
tomaswennström
textmining
categorisation
wikipedia
kategorisering
ferret
ruby
BBC - Radio Labs - Wikipedia + Lucene's MoreLikeThis = useful bits about the bits?
september 2008 by plindberg
"My proof-of-concept is based on vacuuming every Wikipedia article into the Lucene open source search engine to build a text categorisation tool prototype."
bbcradiolabs
textmining
lucene
categories
categorisation
wikipedia
bbc
related tags
amazonec2, bbc, bbcradiolabs, categories, categorisation, company, criticism, digg, english, facebook, ferret, flickr, github, hadoop, informationextraction, japan, japanese, kategorisering, language, lucene, naturallanguageprocessing, onepercentrule, ruby, rubyonrails, slate, socialmedia, statistics, textmining, toconsume, tomaswennström, trending, web2.0, wikipedia