
Scanned texts are most efficiently stored in the DJVU format, provided lossy compression is acceptable (if it is not, use a multi-page format such as TIFF).
If you convert scans to DJVU with OCR enabled, you can extract the recognized text and use it for EPUB generation.

On Linux you can do this by extracting the text with djvutxt and then converting that text to EPUB.
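
As a rough sketch of that two-step route, the snippet below builds and runs the two commands from Python. It assumes djvutxt (from DjVuLibre) and Calibre's ebook-convert are both on the PATH; using ebook-convert for the text-to-EPUB step is my assumption here, any text-to-EPUB converter would do. The function names and file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(djvu_path, epub_path):
    """Return the two command lines: DjVu -> plain text -> EPUB."""
    # The intermediate text file sits next to the input, with a .txt suffix.
    txt_path = Path(djvu_path).with_suffix(".txt")
    # djvutxt writes the OCR text layer to the given output file.
    extract = ["djvutxt", str(djvu_path), str(txt_path)]
    # Assumption: Calibre's ebook-convert handles the text -> EPUB step.
    convert = ["ebook-convert", str(txt_path), str(epub_path)]
    return extract, convert


def djvu_to_epub(djvu_path, epub_path):
    """Run both steps; raises CalledProcessError if either tool fails."""
    for cmd in build_commands(djvu_path, epub_path):
        subprocess.run(cmd, check=True)
```

Calling djvu_to_epub("scan.djvu", "scan.epub") would leave scan.txt behind as well, which is handy for checking the OCR quality before trusting the EPUB.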

A more comfortable way to do the extraction and conversion is to let Calibre convert the text in the DJVU file to EPUB directly. This works on both Linux and Windows. The Linux version uses djvutxt to extract the text if it is available; if not, it falls back to Python-based extraction of the (non-standard compressed) text stream. Windows always uses the slower Python-based extraction.
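
A minimal sketch of the Calibre route, again assuming ebook-convert is on the PATH. The expected_extractor helper is hypothetical: it only restates the fallback rule described above so you can guess which path a given machine will take, and it is not Calibre's actual code.

```python
import shutil
import subprocess
import sys


def expected_extractor(platform=sys.platform, which=shutil.which):
    """Guess which extraction path applies, per the rule described above."""
    if platform.startswith("win"):
        return "python"  # Windows always uses the Python-based extraction
    # Elsewhere, djvutxt is used when it can be found on the PATH.
    return "djvutxt" if which("djvutxt") else "python"


def calibre_command(djvu_path, epub_path):
    """Command line that asks Calibre to convert DJVU text straight to EPUB."""
    return ["ebook-convert", str(djvu_path), str(epub_path)]


def convert_with_calibre(djvu_path, epub_path):
    """Run the conversion; raises CalledProcessError if Calibre fails."""
    subprocess.run(calibre_command(djvu_path, epub_path), check=True)
```

If expected_extractor() reports "python" on Linux, installing DjVuLibre should get you the faster path.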

(This is a shameless plug for the calibre plug-in that I wrote a few years ago for exactly this purpose).

