docsplit   8

Doc⚡split
Docsplit is a command-line utility and Ruby library for splitting apart documents into their component parts: searchable UTF-8 plain text via OCR if necessary, page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages...).
gems  docsplit  pdf  ruby 
25 days ago by subosito
anderser's pydocsplit at master - GitHub
Python implementation of DocumentCloud's Docsplit utility
pdf  data  documentcloud  docsplit 
january 2010 by bycoffe
