
How to remove hard hyphens?
Is there any reasonably easy way to remove "hard" hyphens from an eBook? For example, I am reading a book by Ted Dekker where there are many places where a word has a hy-phen in the middle. (Not picking on Ted here -- lots of books seem to have the problem.)
Each one acts like a visual "speed-bump" to me. It seems that someone at some point manually hyphenated line endings, forgetting that a different screen size, font choice/size, etc. would require different hyphenation.
Ideas?
More posts by @Sara

2 Comments
If the book has no DRM and you are free to edit or convert it, you can use Calibre to convert it to the same format (for example, take an .epub as the source file and convert it to .epub output).
In the Heuristic processing tab, make sure to enable both heuristic processing itself and Remove unnecessary hyphenation; you can also enable other options there if you need them.
This method uses the source document as a dictionary to check whether the hyphenated word also appears spelled as a single word. Of course, if a given hyphenated word occurs only once, it won't be corrected. From the Calibre user manual:
calibre will analyze all hyphenated content in the document when this
option is enabled. The document itself is used as a dictionary for
analysis. This allows calibre to accurately remove hyphens for any
words in the document in any language, along with made-up and obscure
scientific words. The primary drawback is words appearing only a
single time in the document will not be changed. Analysis happens in
two passes, the first pass analyzes line endings. Lines are only
unwrapped if the word exists with or without a hyphen in the document.
The second pass analyzes all hyphenated words throughout the document,
hyphens are removed if the word exists elsewhere in the document
without a match.
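To make the "document as dictionary" idea concrete, here is a minimal Python sketch. It is not Calibre's actual implementation: the function name dehyphenate and the single-pass regex are my own simplifications, and it only handles plain ASCII words. It joins a hyphenated word only when the joined form appears somewhere else in the same text.

```python
import re


def dehyphenate(text: str) -> str:
    """Join hyphenated words whose unhyphenated form occurs elsewhere in text.

    Hypothetical helper for illustration; Calibre's real heuristic runs two
    passes and works on the book's markup, not on plain text.
    """
    # Build the "dictionary" from the document itself: every run of letters.
    vocabulary = {word.lower() for word in re.findall(r"[A-Za-z]+", text)}

    def join_if_known(match: re.Match) -> str:
        joined = match.group(1) + match.group(2)
        # Only drop the hyphen when the joined word is attested in the document.
        if joined.lower() in vocabulary:
            return joined
        return match.group(0)

    # Two letter runs separated by a hyphen, optionally split across a line break.
    return re.sub(r"([A-Za-z]+)-\n?([A-Za-z]+)", join_if_known, text)


sample = "A hyphen is a mark. This hy-phen is wrong. A speed-bump stays."
print(dehyphenate(sample))
```

In the sample, "hy-phen" is joined because "hyphen" occurs elsewhere, while "speed-bump" is left alone because "speedbump" never appears. That is the same limitation the manual describes: a word that occurs only once, in hyphenated form, will not be fixed.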
If you happen to have the source .pdf, where all hyphenated words sit at the end of their respective lines, you can use the Search & Replace tab to add another tweak (I've personally tested and used it many times).
Enter -<br> in the Search box, leave the Replacement box completely blank, and be sure to click Add to, well, add the rule to the conversion. As I've said, this only works if the hyphens are placed at the ends of their lines, so in most cases you need access to the source file. You can also click the Wizard button to open a preview of the file, where you can see what your search regex (in this case -<br>) will select and delete (delete, because the replacement box is blank).
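If you want to try the same rule outside Calibre first, the search-and-replace above can be approximated in Python. This is a rough sketch under my own assumptions: the function name join_line_end_hyphens is made up, and the pattern is slightly more tolerant than the literal -<br> (it also accepts <br/> and <br />, and swallows whitespace after the tag).

```python
import re


def join_line_end_hyphens(html: str) -> str:
    """Delete a hyphen that sits directly before a <br> tag, plus the tag itself.

    Hypothetical helper; mirrors searching for -<br> with an empty replacement.
    """
    # "-" followed by <br>, <br/> or <br />, then any trailing whitespace/newline.
    return re.sub(r"-<br\s*/?>\s*", "", html)


sample = "a hy-<br>\nphen in a well-known word<br>next line"
print(join_line_end_hyphens(sample))
```

Hyphens in the middle of a line (such as "well-known" above) and <br> tags without a preceding hyphen are left untouched. A legitimately hyphenated compound that happens to break at the line end would lose its hyphen too, so check the result in a preview before converting.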
Such hard hyphens are usually the result of a bad conversion from pdf to epub/mobi. I have found them in books I bought legitimately, so it's not a piracy problem: my impression is that even today publishers start from a pdf file (made with InDesign or similar) and forget that some words were hyphenated at line ends.
When the book had no DRM, I sometimes used Sigil to painstakingly search for all occurrences, but this is time-consuming.