
: Identify if a PDF document is digital native
Is it possible to know whether a given PDF is digital native or has gone through some kind of OCR or digitization process?
If yes, is it possible to do so programmatically?
More posts by @Debbie

1 Comments
PDF documents typically embed their fonts within the document. You can retrieve information about these embedded fonts programmatically with most PDF libraries.
If there are no embedded fonts, then the PDF at hand is a scanned one with no text layer.
If there are only one or two fonts, then the document has probably been OCRed.
If there are three or more fonts, then the document is probably digital native.
A digital-native PDF can have only one or two fonts and would be misclassified by this rule, but luckily for me this is an acceptable error rate.
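The font-count heuristic above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: it uses the third-party pypdf library, the function names (`normalize_font_name`, `classify_by_font_count`, `collect_font_names`) are made up for this example, and it counts the fonts listed in each page's resource dictionary, which is an approximation of "embedded fonts" (it does not check whether a font program is actually embedded, and it does not look inside form XObjects).

```python
import re

# Subset fonts carry a six-uppercase-letter tag such as "ABCDEF+Arial".
# Stripping it (a choice made for this sketch) keeps two subsets of the
# same font from being counted as two different fonts.
_SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def normalize_font_name(raw: str) -> str:
    """Drop the leading '/' of a PDF name and any subset tag."""
    return _SUBSET_TAG.sub("", raw.lstrip("/"))


def classify_by_font_count(n_fonts: int) -> str:
    """Apply the thresholds from the answer: 0, 1-2, or 3+ fonts."""
    if n_fonts == 0:
        return "scanned"
    if n_fonts <= 2:
        return "ocr"
    return "digital native"


def collect_font_names(path: str) -> set:
    """Collect distinct font names from the page resources of a PDF."""
    from pypdf import PdfReader  # third-party; assumed to be installed

    names = set()
    for page in PdfReader(path).pages:
        if "/Resources" not in page:
            continue
        resources = page["/Resources"]
        if "/Font" not in resources:
            continue
        fonts = resources["/Font"]
        for key in fonts:
            font = fonts[key]
            # Fall back to the resource key if the font has no /BaseFont.
            raw = font["/BaseFont"] if "/BaseFont" in font else key
            names.add(normalize_font_name(str(raw)))
    return names


def classify_pdf(path: str) -> str:
    return classify_by_font_count(len(collect_font_names(path)))
```

Calling `classify_pdf("some_file.pdf")` (the file name is a placeholder) returns one of the three labels. The thresholds are exactly the ones given above, so the same misclassification of digital-native PDFs with only one or two fonts applies.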