
: Different size pages in DJVU book
I have a DjVu book whose pages have different sizes.
How can I fix this by making all pages the same size? I am familiar with Python programming and I'm ready to try it. Any advice is appreciated.
More posts by @Twilah

2 Comments
If you want to read the book on a Kindle reader, there is another, easier way around this: try K2pdfopt. It will help you resize all pages to a specific screen size.
Since all pages have the same or very similar width and height, this seems to be a "simple" problem of some pages having the wrong resolution. Most pages have metadata that specifies 600 DPI, others only 96 DPI. The latter are then of course displayed much larger.
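To see why the 96 DPI pages come out so much larger, here is a small sketch of the arithmetic. The pixel dimensions (5100 x 6600) are made up for illustration and are not taken from the book, and `displayed_inches` is a hypothetical helper name.

```python
def displayed_inches(width_px, height_px, dpi):
    """Physical page size implied by the pixel dimensions and a DPI value."""
    return width_px / dpi, height_px / dpi

# Hypothetical page of 5100 x 6600 pixels (an assumption, not from the book).
correct = displayed_inches(5100, 6600, 600)  # size with the intended 600 DPI
wrong = displayed_inches(5100, 6600, 96)     # size with the wrong 96 DPI
scale = 600 / 96                             # linear enlargement factor

print(correct, wrong, scale)
```

With identical pixel dimensions, the page tagged 96 DPI is treated as 6.25 times wider and taller than the same page tagged 600 DPI.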
My Linux distribution comes with djvutoxml and the corresponding djvuxmlparser (from the package djvulibre-bin), which can extract the metadata and merge it back in, respectively. Those should be available for Windows as well (http://djvu.sourceforge.net/, make sure the executables are in your PATH). That metadata includes the DPI information from the file. Actually changing the XML is fast, but extracting and merging take a long time (several minutes).
Make sure you have a copy of your book, in case the merging breaks something, and run python program.py book.djvu with this program.py:
    import sys
    import subprocess

    book = sys.argv[1]
    xml_in = 'in.xml'
    xml_out = 'out.xml'

    # Dump the book's metadata (including per-page DPI) to XML.
    print('extracting XML')
    subprocess.check_output(['djvutoxml', book, xml_in])

    # Copy the XML line by line, rewriting 96 DPI entries to 600 DPI.
    print('converting XML')
    with open(xml_in) as inf, open(xml_out, 'w') as outf:
        for line in inf:
            if line.startswith('<PARAM name="DPI" value="96" />'):
                line = line.replace('value="96"', 'value="600"')
            outf.write(line)

    # Merge the corrected metadata back into the book (modifies it in place).
    print('merging XML')
    subprocess.check_output(['djvuxmlparser', '-o', book, xml_out])
    print('done')
In general I am against parsing XML without a real parser, but you don't need regex or anything that easily breaks to get this information fixed.
The intermediate XML (two files) is of the same order of size as the DjVu file itself. The XML contains no image information; it is just inefficient. Make sure you have enough room (and run this program on a fast, local drive).
There are 367 incorrect pages out of 1201. You might be able to speed up the process by only including the incorrect pages in the output XML, but then you should use an XML parser. If this is a one-off conversion, I would not bother with such an optimisation.
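If you want to check how many pages are affected before converting, a real XML parser can tally the DPI values. This is only a rough sketch: it assumes each page in the djvutoxml output is an OBJECT element containing a PARAM element named DPI (which the line matched in the program above suggests, but I have not verified the full layout). The embedded sample is made up, and `count_dpi` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def count_dpi(xml_text):
    """Tally the DPI values found in a DjVu XML document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    # Assumed layout: one <OBJECT> per page with a <PARAM name="DPI" .../> inside.
    for obj in root.iter('OBJECT'):
        for param in obj.iter('PARAM'):
            if param.get('name') == 'DPI':
                counts[param.get('value')] += 1
    return counts

# Made-up sample standing in for the real in.xml.
SAMPLE = """<DjVuXML><BODY>
<OBJECT><PARAM name="DPI" value="600" /></OBJECT>
<OBJECT><PARAM name="DPI" value="96" /></OBJECT>
<OBJECT><PARAM name="DPI" value="96" /></OBJECT>
</BODY></DjVuXML>"""

print(count_dpi(SAMPLE))
```

For the real book you would pass in the contents of in.xml instead of the sample. Whether the standard library parser copes well with the size of that file is something you would have to try.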