

The size of a PDF file depends on its content. A PDF file is a bundle of streams, most of which hold compressed data.

If you generate a PDF file from e.g. a Word or OpenOffice document, the resulting files tend to be relatively small, especially if you do not embed font information and rely on system-provided fonts or font substitution.
Adding images to your text will make the files much larger.

Since PDF is one of the few file formats that can hold multiple images in a single file, it is often (mis-)used to store a batch of images, e.g. the pages of a scan. Those scans are usually JPEG images that are already compressed, and for them the PDF file works only as a container: little or no further compression is possible. Such PDF files can be very large, depending on the pixel dimensions of the images (scan resolution x paper format) and, in the case of lossy compression (JPEG), the quality setting.
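As a rough illustration of why a container gains little on data that is already compressed, the sketch below runs zlib (the Deflate algorithm behind PDF's Flate streams) over two inputs. The random bytes are only a stand-in for JPEG data, on the assumption that real JPEG data has similarly little redundancy left; the names `text_like`, `jpeg_like` and `deflate_ratio` are made up for this example.

```python
import os
import zlib

# Highly repetitive text, standing in for the page content of a text document.
text_like = b"The quick brown fox jumps over the lazy dog. " * 2000

# Random bytes, standing in for JPEG data that is already compressed.
# (Assumption: real JPEG data behaves similarly, with little redundancy left.)
jpeg_like = os.urandom(len(text_like))


def deflate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size, using zlib (Deflate)."""
    return len(zlib.compress(data, 9)) / len(data)


text_ratio = deflate_ratio(text_like)
jpeg_ratio = deflate_ratio(jpeg_like)

print(f"text-like data shrinks to {text_ratio:.1%} of its size")
print(f"jpeg-like data shrinks to {jpeg_ratio:.1%} of its size")
```

The text-like input should shrink to a small fraction of its size, while the already-dense input stays at roughly 100%.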

Extracting such lossy images to a lossless format like PNG immediately blows up each image, often by an order of magnitude. So your results are not surprising.
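To get a feel for the numbers, it can help to compute the raw pixel data that a lossless format has to encode for one scanned page. The sketch below assumes an A4 page (210 x 297 mm) scanned at 300 dpi in 24-bit colour; these values and the function name `raw_image_bytes` are illustrative and not taken from the question.

```python
MM_PER_INCH = 25.4


def raw_image_bytes(width_mm: float, height_mm: float, dpi: int,
                    bytes_per_pixel: int = 3) -> int:
    """Size of the uncompressed pixel data for a page scanned at `dpi`."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed example: A4 page, 300 dpi, 3 bytes per pixel (24-bit RGB).
a4_300dpi = raw_image_bytes(210, 297, 300)
print(f"{a4_300dpi / 1_000_000:.1f} MB of raw pixel data per page")
```

PNG compresses this raw data losslessly, but scanned pages contain noise that limits how far that goes, so the result usually stays far above what a lossy JPEG of the same page needs. Doubling the resolution roughly quadruples the raw size.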

It would be much better to extract the individual pages of the file into separate PDF files and recombine only the pages that you need. A program like pdftk can do this without decompressing the streams that contain the imagery. If you pick half of the pages of a book, you can expect the resulting document to be about half the size (on average).
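A minimal sketch of the page selection with pdftk, driven from Python: pdftk's `cat` operation copies the listed pages into a new file. The file names and page ranges are placeholders, the helper names are hypothetical, and pdftk is assumed to be installed and on the PATH.

```python
import subprocess
from typing import List


def pdftk_cat_command(src: str, page_ranges: List[str], dst: str) -> List[str]:
    """Build the argument list for: pdftk SRC cat RANGES... output DST."""
    return ["pdftk", src, "cat", *page_ranges, "output", dst]


def extract_pages(src: str, page_ranges: List[str], dst: str) -> None:
    """Run pdftk to copy the selected pages into a new PDF file."""
    subprocess.run(pdftk_cat_command(src, page_ranges, dst), check=True)


# Placeholder names: keep pages 1-10, page 25, and page 40 to the end.
cmd = pdftk_cat_command("book.pdf", ["1-10", "25", "40-end"], "excerpt.pdf")
print(" ".join(cmd))
```

Calling `extract_pages("book.pdf", ["1-10", "25", "40-end"], "excerpt.pdf")` would then write the reduced file.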



More posts by @Lorraine
