threedaymonk + text 9
Boilerpipe
march 2010 by threedaymonk
‘Boilerplate Removal and Fulltext Extraction from HTML pages’
java
library
text
html
web
Yoshikoder
october 2008 by threedaymonk
‘[A] cross-platform multilingual content analysis program.’
text
linguistics
analysis
The Xapian Project
july 2008 by threedaymonk
‘Xapian is an Open Source Search Engine Library, released under the GPL.’
search
c++
library
text
open-source