
Are there ways to automatically detect what book an e-text is?
I know that these days, there are services (I think Google provides one) that can take a "digital fingerprint" of a song, and help you catalogue that song, e.g. composer/band/genre/year.
Is there a similar service for books, where it would, based on text snippets, automatically tell you the author/title/year/ISBN?
I don't care whether the service is packaged as a "book cataloging" app or merely as a per-book API that I can use in my own software.
More posts by @RJ
1 Comments
I am not aware of any such tool, service, or API. Publishers generally don't offer APIs, IMHO mainly because copyright-infringing sites or competing businesses might use them.
So you need a custom approach: most of these sites run their queries through GET requests, so you can build the query URLs yourself and mine the results with scripts (wget, Selenium, etc.).
You could do it like this:
1. Search Google for an exact piece of the text, in quotes. Example search:
   "Numerical boundaries take many forms but are always applied in finite games. Persons are selected for finite play."
2. Look for the ISBN, or for the title and author, in the result pages, using regular expressions, CSS selectors, XPath, etc.
3. Run an advanced query on Amazon or another site: www.amazon.com/gp/search/ref=sr_adv_b/?search-alias=stripbooks&unfiltered=1&field-isbn=1476731713
   Notice &field-isbn=1476731713; the same works with &field-author= or &field-title=.
4. Use regular expressions to extract all the book data.
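To make the steps above a bit more concrete, here is a rough Python sketch of the two mechanical parts: building the query URLs and pulling ISBN candidates out of page text with a regular expression. It is only an illustration under some assumptions: the function names are made up, the `https://` scheme and the Google `/search?q=` endpoint are my additions, and fetching and parsing the actual result pages is left out entirely. The checksum test is there because a plain digit pattern also matches phone numbers and the like.

```python
import re
from urllib.parse import urlencode

# Loose pattern for ISBN-10 / ISBN-13 candidates, allowing hyphens or spaces.
ISBN_CANDIDATE = re.compile(r"\b(?:97[89][- ]?)?\d(?:[- ]?\d){8}[- ]?[\dXx]\b")


def exact_phrase_search_url(snippet: str) -> str:
    """Google URL for an exact-phrase search (endpoint is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": f'"{snippet}"'})


def amazon_book_search_url(**fields: str) -> str:
    """Advanced book search URL; keyword names become field-<name> parameters."""
    params = {"search-alias": "stripbooks", "unfiltered": "1"}
    for name, value in fields.items():
        params[f"field-{name}"] = value
    return "https://www.amazon.com/gp/search/ref=sr_adv_b/?" + urlencode(params)


def is_valid_isbn(code: str) -> bool:
    """Check the ISBN-10 (mod 11) or ISBN-13 (mod 10) check digit."""
    chars = re.sub(r"[- ]", "", code).upper()
    if re.fullmatch(r"\d{9}[\dX]", chars):
        total = sum((10 - i) * (10 if c == "X" else int(c))
                    for i, c in enumerate(chars))
        return total % 11 == 0
    if re.fullmatch(r"\d{13}", chars):
        total = sum(int(c) * (1 if i % 2 == 0 else 3)
                    for i, c in enumerate(chars))
        return total % 10 == 0
    return False


def extract_isbns(text: str) -> list[str]:
    """Return the distinct, checksum-valid ISBNs found in text, in order."""
    found: list[str] = []
    for match in ISBN_CANDIDATE.finditer(text):
        code = re.sub(r"[- ]", "", match.group()).upper()
        if is_valid_isbn(code) and code not in found:
            found.append(code)
    return found


if __name__ == "__main__":
    print(exact_phrase_search_url("Persons are selected for finite play."))
    print(amazon_book_search_url(isbn="1476731713"))
    print(extract_isbns("ISBN-10: 1476731713, call 1234567890"))
```

You would feed the HTML (or extracted text) of each search result page into `extract_isbns`, then plug the most frequent ISBN into `amazon_book_search_url` to look up the title, author, and year.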
This would be my approach.