Sigil's regex engine is kind of fussy; I'm not sure that the anonymous answer above would work.

I answered this on Quora, but I'll post the answer again here, because it seems like that would be helpful:

Ah. Let me guess. You're converting from a PDF. You have my sympathy. :-)

I've done this. Here's the search expression I used:

([a-z]|,|;)</p>\s+<p>

That finds paragraphs that end in a lowercase letter, a comma, or a semicolon, in other words, paragraphs that end without a period, exclamation point, question mark, or right double quotation mark. (Since I'm assuming you're working from a PDF of a print document, I didn't include all of the other possible marks in the query; a left double quotation mark, for instance, is extremely unlikely to come at the end of a text line.)

The replace expression was simply this:

\1 

Note that there's a space after the \1 backreference (which puts back the letter, comma, or semicolon captured by the parentheses), so that you don't end up smooshing words together.

That should get rid of the unwanted paragraph breaks and splice your text together properly.
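If you'd rather try the expression outside Sigil first, here's a minimal Python sketch of the same search and replace. The function name and the sample text are made up for illustration, and Python's re module is not the engine Sigil uses, so treat it as a way to see what the pattern does rather than as a guarantee of how Sigil will behave.

```python
import re

# Same pattern as the search expression above: a lowercase letter, comma,
# or semicolon right before </p>, then whitespace, then the next <p>.
BROKEN_BREAK = re.compile(r"([a-z]|,|;)</p>\s+<p>")


def join_broken_paragraphs(html: str) -> str:
    """Splice paragraphs that were split in the middle of a sentence."""
    # \1 puts the captured character back; the trailing space keeps the
    # last word of one line from running into the first word of the next.
    return BROKEN_BREAK.sub(r"\1 ", html)


# Hypothetical sample: the first break is mid-sentence, the second is real.
sample = (
    "<p>The quick brown fox jumped over</p>\n"
    "<p>the lazy dog.</p>\n"
    "<p>A new paragraph.</p>"
)
print(join_broken_paragraphs(sample))
```

Only the first break is removed here, because the second paragraph ends in a period and so doesn't match.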

Unfortunately, if the book reproduces letters (correspondence, that is) or some other kind of text in which paragraphs legitimately end in commas, semicolons, or lowercase letters, you'll have to use Replace/Find and go case by case rather than Replace All.
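For the case-by-case situation, one option is to list every candidate break with some surrounding text before changing anything. This is only a sketch under the same assumptions as before: Python's re module stands in for Sigil's engine, and the function name, the context width, and the sample letter are all hypothetical.

```python
import re

# Same search expression as above.
BROKEN_BREAK = re.compile(r"([a-z]|,|;)</p>\s+<p>")


def list_candidates(html: str, context: int = 30) -> list[str]:
    """Return a snippet around each suspected bad break, for manual review."""
    snippets = []
    for match in BROKEN_BREAK.finditer(html):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(html))
        snippets.append(html[start:end])
    return snippets


# Hypothetical sample: a salutation that is supposed to end in a comma.
letter = "<p>Dear Anna,</p>\n<p>I hope you are well.</p>"
for snippet in list_candidates(letter):
    print(repr(snippet))
```

The salutation gets flagged even though its paragraph break is intentional, which is exactly the kind of match you'd want to skip by hand.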

