berberich + ocr   3

Doc⚡split
Docsplit is a command-line utility and Ruby library for splitting apart documents into their component parts: searchable UTF-8 plain text via OCR if necessary, page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages...).
ruby  ocr  library  document  image 
september 2010 by berberich
ocropus - Google Code
open source document analysis and OCR system
ocr  google  opensource  software  code  tools 
june 2007 by berberich
