
I am generating HTML output for ePub from LaTeX source, but I am having difficulty eliminating the "Chapter N" label at the start of each chapter. Is there an easy way to tell the HTML reader to treat it as some form of comment, so that it is not displayed in the output?
<span class="titlemark">Chapter&#x00A0;1</span>

Ideally, I'd like to eliminate the <br /> that follows it as well, but I'm not sure that would be easy.

Note: the command-line tool I am using is htlatex, which takes a .tex file and produces HTML.





More posts by @RJ

4 Comments


You can try to suppress both of those tags using CSS, either in an included .css file or inserted into the HTML between <style> and </style>:

span.titlemark, span.titlemark + br {
    display: none;
}

But you would have to test that on all devices to see if their renderers correctly handle it.¹

If you don't want to go to the effort of testing this, it is better to remove both nodes altogether by parsing the generated HTML. Using Python 3 and the BeautifulSoup package², you can do:

import sys
from bs4 import BeautifulSoup

path = sys.argv[1]

# Parse the htlatex output; html.parser ships with Python
with open(path) as fp:
    soup = BeautifulSoup(fp, "html.parser")

for node in soup.select("span.titlemark"):
    print(node.get_text())  # show what is being removed
    # Drop the <br /> that follows the span, if there is one
    sibling = node.find_next_sibling()
    if sibling and sibling.name == "br" and not sibling.get_text():
        sibling.extract()
    node.extract()

# Overwrite the input file with the cleaned markup
with open(path, "w") as fp:
    fp.write(str(soup))

to get rid of both.³ BeautifulSoup supports several HTML/XML parsers; depending on the type and quality of the htlatex output, you might need to experiment with the alternatives to get better or faster results.

htlatex is a shell script, so you could make a copy of it (for example /usr/local/bin/htlatexstrip) and add a call to the Python script at the end as a postprocessing step.

¹ The X + Y selector matches a Y element that immediately follows an X element, so it suppresses the sibling <br /> node
² Install with pip install beautifulsoup4
³ I am sure you can do something like this easily in Perl (or Ruby) as well, I just don't know how
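The postprocessing idea above could also be sketched as a small Python wrapper instead of a modified copy of the shell script. This is only a sketch under assumptions: strip_titlemark.py is a hypothetical file name for the BeautifulSoup script, and the wrapper assumes htlatex writes an .html file with the same base name as the .tex file into the current directory.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(tex_file, strip_script="strip_titlemark.py"):
    """Return the two commands to run, in order, for one .tex file.

    strip_script is a hypothetical name for the BeautifulSoup script.
    Assumption: htlatex writes <name>.html into the current directory.
    """
    html_file = Path(tex_file).with_suffix(".html").name
    return [
        ["htlatex", str(tex_file)],
        [sys.executable, strip_script, html_file],
    ]


def run_pipeline(tex_file):
    """Run htlatex, then the strip script on the HTML it produced."""
    for command in build_commands(tex_file):
        # check=True stops the pipeline if either step fails
        subprocess.run(command, check=True)


# Example call (requires htlatex to be installed):
# run_pipeline("book.tex")
```

Keeping the command list in its own function makes it easy to inspect what would be run before actually invoking htlatex.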



If you want them to remain in the source code and simply hide them in the browser output, you can add the following rule to your style sheet:

.titlemark {
    display: none;
}
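If editing the style sheet by hand for every generated file is inconvenient, the rule above could be injected into the HTML automatically. The following is a minimal sketch, not part of htlatex: the function name add_style is made up for illustration, and it assumes the page contains a lowercase </head> tag.

```python
CSS_RULE = ".titlemark { display: none; }"


def add_style(html, css=CSS_RULE):
    """Insert a <style> block just before the closing </head> tag.

    Assumption: the page has a lowercase </head> tag; otherwise
    a ValueError is raised instead of guessing where to insert.
    """
    if "</head>" not in html:
        raise ValueError("no </head> tag found")
    style_block = "<style>" + css + "</style>"
    # Replace only the first occurrence
    return html.replace("</head>", style_block + "</head>", 1)


page = "<html><head><title>Book</title></head><body></body></html>"
print(add_style(page))
```

The elements stay in the source, as in the CSS approach above, so the result still depends on the reader honoring display: none.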



 

@Katie


If you have an HTML editor that supports wildcard or regex search and replace, such as Adobe Dreamweaver, you could delete these from the source code very quickly. See image below.

You can also use a regex tester site such as regex101.com/ to work out the most effective search expression for your needs.
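The same search and replace can be scripted. Below is a rough sketch using Python's re module; the pattern is an assumption based on the markup shown in the question (a span with exactly class="titlemark", no nested spans, optionally followed by a <br />), so test it against your real output first.

```python
import re

# Matches the titlemark span and, if present, the <br /> right after it.
# Assumption: the span has exactly class="titlemark" and no nested <span>.
TITLEMARK_RE = re.compile(
    r'<span\s+class="titlemark">.*?</span>\s*(?:<br\s*/?>)?',
    re.DOTALL,
)


def strip_titlemarks(html):
    """Return html with every titlemark span (and trailing <br />) removed."""
    return TITLEMARK_RE.sub("", html)


sample = '<span class="titlemark">Chapter&#x00A0;1</span><br />Introduction'
print(strip_titlemarks(sample))
```

A regex is fine for markup this regular, but an HTML parser is the safer choice if the generated structure varies.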



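You could try using CSS to style the titlemark class with "visibility: hidden" or "display: none". Note that visibility: hidden hides the text but still reserves its space, whereas display: none removes it from the layout entirely.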

