Apache OpenNLP - Welcome to Apache OpenNLP
november 2011 by donturn
OpenNLP is an organizational center for open source projects related to natural language processing. Its primary role is to encourage and facilitate the collaboration of researchers and developers on such projects.
OpenNLP also hosts a variety of Java-based NLP tools which perform sentence detection, tokenization, POS tagging, chunking and parsing, named-entity detection, and coreference resolution using the OpenNLP Maxent machine learning package.
apache
java
nlp
opensource
Apache Tika - Apache Tika
november 2011 by donturn
The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries.
apache
java
lucene
metadata
parser
Apache UIMA - Apache UIMA
november 2011 by donturn
Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.
UIMA enables applications to be decomposed into components, for example "language identification" => "language specific segmentation" => "sentence boundary detection" => "entity detection (person/place names etc.)". Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages.
apache
framework
java
nlp
opensource
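The component decomposition described in the UIMA note above can be sketched in plain Java. This is only an illustration of the idea of chaining stages over one shared document; it does not use the UIMA API, and every name in it (`PipelineSketch`, `Doc`, `analyze`, the three toy stages) is hypothetical. The stage logic is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only: none of these types belong to Apache UIMA.
final class PipelineSketch {

    /** Shared document that every stage reads from and annotates. */
    static final class Doc {
        final String text;
        String language = "unknown";
        final List<String> sentences = new ArrayList<>();
        final List<String> entities = new ArrayList<>();

        Doc(String text) {
            this.text = text;
        }
    }

    /** Runs each stage, in order, over the same document. */
    static Doc run(String text, List<Consumer<Doc>> stages) {
        Doc doc = new Doc(text);
        for (Consumer<Doc> stage : stages) {
            stage.accept(doc);
        }
        return doc;
    }

    /** Toy language identification: looks for one common English word. */
    static void identifyLanguage(Doc doc) {
        doc.language = doc.text.contains(" the ") ? "en" : "unknown";
    }

    /** Toy sentence boundary detection: splits after '.', '!' or '?'. */
    static void splitSentences(Doc doc) {
        for (String s : doc.text.split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                doc.sentences.add(s);
            }
        }
    }

    /** Toy entity detection: capitalized words that do not start a sentence. */
    static void detectEntities(Doc doc) {
        for (String sentence : doc.sentences) {
            String[] words = sentence.split("\\s+");
            for (int i = 1; i < words.length; i++) {
                String w = words[i].replaceAll("[^A-Za-z]", "");
                if (!w.isEmpty() && Character.isUpperCase(w.charAt(0))) {
                    doc.entities.add(w);
                }
            }
        }
    }

    /** Assembles the three stages in the order given in the note. */
    static Doc analyze(String text) {
        List<Consumer<Doc>> stages = new ArrayList<>();
        stages.add(PipelineSketch::identifyLanguage);
        stages.add(PipelineSketch::splitSentences);
        stages.add(PipelineSketch::detectEntities);
        return run(text, stages);
    }

    public static void main(String[] args) {
        Doc doc = analyze("Alice works for Acme. She moved to the Boston office.");
        System.out.println(doc.language + " " + doc.sentences + " " + doc.entities);
    }
}
```

In the real framework, per the note, components implement framework-defined interfaces and describe themselves through XML descriptor files; the sketch replaces both with a plain list of method references.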
Artificial Intelligence: A Modern Approach
december 2008 by donturn
includes instructor's resource page - wow, this would be a fun book to teach a course around
research
syllabi
academic
ai
lisp
python
programming
code
java
algorithm
book
cs
textbook
education
science
software
design
algorithms
intelligence
http://secondstring.sourceforge.net/
march 2006 by donturn
Cohen from CMU
java
grep
regex
open_source
http://minorthird.sourceforge.net/
march 2006 by donturn
Cohen from CMU
kdd
classification
text
java
http://www.borland.com/downloads/download_jbuilder.html
february 2006 by donturn
Links to download free and trial versions of the JBuilder Java IDE from Borland.
java
dev
ide
Quick tips on using Lucene
january 2006 by donturn
Lucene is the open source text IR (information retrieval) tool.
index
information_retrieval
java
lucene
open_source