jonty + document   3

Doc⚡split
"Docsplit is a command-line utility and Ruby library for splitting apart documents into their component parts: searchable UTF-8 plain text via OCR if necessary, page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages...)"
ruby  pdf  document  parsing  ocr  documents  data  processing  split  from delicious
december 2010 by jonty
Pandoc
Pandoc is a tool for converting from one markup format to another. It can read markdown and (subsets of) reStructuredText, HTML, and LaTeX, and it can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, PDF, RTF, DocBook XML, OpenDocument XML, ODT, GNU Texinfo, MediaWiki markup, groff man pages, and S5 HTML slide shows.
convert  document  documentation  html  markdown  pdf  text  restructuredtext  writing  latex  conversion  docbook  haskell  markup  pandoc  publishing  tex  converter 
march 2010 by jonty
WaveNZ Development: Introduction to Operational Transformation
OT is a technology for supporting concurrent editing of a single shared document by a number of parties. It provides a way of transmitting the edits that one party makes to the document to all other parties and, more importantly, a way to resolve the conflicts that arise when edits are made concurrently.
algorithms  google  text  realtime  distributedsystems  distributed  editing  editor  document  concurrent 
august 2009 by jonty
