
Please define what you mean by "fix it at the first place".

If you want to fix wrong output after the OCR analysis, you will never find a simple solution that works on the effectively infinite set of possible TOC layouts.
You can never cover all the variations by hand. You would have to create a machine learning algorithm that analyzes each TOC variant.

Or, for a simple TOC, count the substrings that share the same characteristics:

Chapter number
Chapter number
Chapter number
Chapter number
Chapter number
...

= 5

Chapter title
Chapter title
Chapter title
Chapter title
Chapter title
...

= 5
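
One possible reading of the counting idea above, as a minimal Python sketch: if the OCR output contains as many number-like lines as title-like lines, they can be paired in order. The function name `pair_toc_lines` and the pattern for what counts as a chapter number are assumptions made for illustration; a real TOC will likely need a different pattern.

```python
import re

# Assumption: a chapter number is a line consisting only of digits or
# Roman numerals, optionally followed by a period.
NUMBER_RE = re.compile(r"^(?:\d+|[IVXLCDM]+)\.?$")


def pair_toc_lines(lines):
    """Pair chapter numbers with chapter titles when their counts match.

    Returns a list of (number, title) tuples, or None when the counts
    differ and the lines cannot be paired reliably.
    """
    cleaned = [line.strip() for line in lines if line.strip()]
    numbers = [line for line in cleaned if NUMBER_RE.match(line)]
    titles = [line for line in cleaned if not NUMBER_RE.match(line)]
    if len(numbers) != len(titles):
        return None
    return list(zip(numbers, titles))


# Hypothetical OCR output in which the number column was read before the title column.
ocr_lines = ["1", "2", "3", "Introduction", "Methods", "Results"]
print(pair_toc_lines(ocr_lines))
# [('1', 'Introduction'), ('2', 'Methods'), ('3', 'Results')]
```

When the counts differ, the sketch returns None instead of guessing, which is where the simple approach stops working.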

If you want to fix the OCR analysis itself, it would help to know:
Which OCR tool do you use?

For example, in Tesseract you can configure the text to be processed by rows instead of columns.
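
A minimal sketch of what that could look like, assuming the setting meant here is Tesseract's page segmentation mode. Mode 6 (`--psm 6`) tells Tesseract to assume a single uniform block of text, which often makes it read each line across the full page width. The helper names and the file name `toc.png` are hypothetical, and which mode works best depends on the scan.

```python
import subprocess


def tesseract_command(image_path, psm=6):
    """Build a Tesseract CLI call that writes the recognized text to stdout.

    Assumption: psm=6 (single uniform block of text) is the mode that
    gives row-wise reading for this kind of TOC page.
    """
    return ["tesseract", image_path, "stdout", "--psm", str(psm)]


def run_ocr(image_path, psm=6):
    """Run Tesseract (must be installed and on PATH) and return the text."""
    result = subprocess.run(
        tesseract_command(image_path, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only builds the command; call run_ocr("toc.png") to actually run the OCR.
print(tesseract_command("toc.png"))
```

If mode 6 still splits the page into columns, trying another mode such as 4 (a single column of text of variable sizes) is a reasonable next step.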


More posts by @Megan