docsplit 8
Doc⚡split
25 days ago by subosito
Docsplit is a command-line utility and Ruby library for splitting documents into their component parts: searchable UTF-8 plain text (via OCR if necessary), page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages, and so on).
gems
docsplit
pdf
ruby
anderser's pydocsplit at master - GitHub
january 2010 by bycoffe
Python implementation of DocumentCloud's Docsplit utility
pdf
data
documentcloud
docsplit