
I'm converting an RTF document to epub (and mobi) and am trying to extract a table of contents.

All of my chapters start with the word "CHAPTER" in all caps, but they are not tagged in any special way. I'm using the following XPath expression to find the chapters:

//*[re:test(., "CHAPTER")]

This is working except that my TOC looks like this:

As you can see, the table of contents includes the cover page text as several chapters that all point to the first page.

Also, I'm getting duplicate entries for each chapter.

I was able to fix this with the manual TOC editor, but I'm wondering if I'm doing something wrong with my detection approach.
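
One possible explanation, offered as a sketch rather than a diagnosis: inside the predicate, `.` is the element's XPath string value, which is the concatenated text of the element and all of its descendants. So `//*` with a bare "CHAPTER" test matches not only the element holding the heading but every ancestor that contains it, up to the root element. The outermost matches start with the cover text, which would be consistent with cover entries pointing at the first page, and the nested matches would account for the duplicates. The Python below emulates that behaviour using only the standard library. The markup in `SAMPLE` and the helper names (`string_value`, `broad_matches`, `narrow_matches`) are hypothetical, and the real structure produced from the RTF file may differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the converted book.
SAMPLE = """
<html><body>
  <div class="cover"><p>My Book Title</p></div>
  <div><p><b>CHAPTER 1</b></p><p>It began quietly.</p></div>
  <div><p><b>CHAPTER 2</b></p><p>It ended loudly.</p></div>
</body></html>
"""


def string_value(el):
    """XPath-style string value: the element's text plus all descendant text."""
    return "".join(el.itertext())


def broad_matches(root, pattern):
    """Emulates //*[re:test(., pattern)]: any element whose string value matches."""
    return [el for el in root.iter() if re.search(pattern, string_value(el))]


def narrow_matches(root):
    """Only <p> elements whose text starts with CHAPTER followed by a number."""
    heading = re.compile(r"\s*CHAPTER\s+\d+")
    return [el for el in root.iter("p") if heading.match(string_value(el))]


root = ET.fromstring(SAMPLE.strip())

# The broad test also hits the root, the body, and each wrapper element.
for el in broad_matches(root, "CHAPTER"):
    print("broad :", el.tag, "->", " ".join(string_value(el).split())[:30])

# The narrow test keeps one element per chapter.
for el in narrow_matches(root):
    print("narrow:", el.tag, "->", string_value(el).strip())
```

In this sample the broad test matches eight elements for two chapters, while the narrow test matches two. Translated back to an XPath expression, the same idea would be to restrict the match to one element type and anchor the pattern, for example something along the lines of `//h:p[re:test(., "^\s*CHAPTER\s+\d+")]`. That expression is an untested assumption; the right element name depends on what the conversion actually produces.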

