
Command line extraction of metadata (title, author) from epub file
I would like to use a command line to extract the title of a book
(possibly also other metadata) from its epub file and return it as a
string.
I am using Linux, but my guess is that the question makes sense in any
other environment.
I was expecting to easily find a clear and simple answer by searching
the web. But that is apparently not the case, or I am still too
ignorant to recognize the answers.
The command could look like: getbookinfo -m title myfile.epub
and would yield the title of the book found in the epub file.
6 Comments
OK, so since almost nobody has any info on this online and this is the top search result, here is what you are looking for.
exiftool -a -u -g1 nameofthe.epub
This line gives you a giant output of all the metadata tags you can extract using exiftool, organized by group. To pull specific tags, name them:
exiftool -t -Title -Creator -PublicationDate -CreatorFile-as nameofthe.epub
With lowercase -t, each line shows the description of the tag followed by its value, separated by a tab. With uppercase -T, only the values are printed. You're welcome. Enjoy.
Edit: Also, since exiftool is a GIANT PITA and its help is semi-useless, here are a few scripts I wrote for it that might help someone out. These took me forever... They are formatted for Windows batch files, with %% and ". You may need to edit
This moves any epubs in folder DIR into folders named after their Creator (author).
for %%f in (DIR\*.epub) do (exiftool "%%f" "-Directory<Creator")
This renames any epubs in a folder to the title and author.
for %%f in (01-input\*.epub) do (exiftool "-filename<$title - $creator.%%e" "%%f")
To make them recurse into subdirectories, you need to change the for syntax. Example:
for /R "01-input" %%f in (*.epub) do (exiftool "-filename<$title - $creator.%%e" "%%f")
Edit2: On the off chance your tags contain "illegal" characters, outlined here (https://exiftool.org/filename.html), place the tag inside {} and put a ; at the end of the tag. This strips those out so it works. Example script.
for /R "01-input" %%f in (*.epub) do (exiftool "-filename<${title;} - ${creator;}.%%e" "%%f")
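The loops above use Windows batch syntax, while the question is about Linux. A rough bash equivalent is sketched below; it leans on exiftool's own -r (recurse) and -ext (extension filter) options instead of a shell loop. The function name rename_epubs and the default directory 01-input are illustrative only, and the sketch assumes exiftool is on the PATH. The single quotes stop the shell from expanding $title and $creator before exiftool sees them.

```shell
#!/bin/bash
# Sketch: rename every .epub under a directory to "Title - Author.epub".
# rename_epubs is a hypothetical helper name; exiftool must be on the PATH.
rename_epubs() {
    local dir="${1:-01-input}"
    # -r         recurse into subdirectories
    # -ext epub  only process files with the .epub extension
    # %e         the original file extension
    # ${tag;}    strip characters that are illegal in file names
    exiftool -r -ext epub '-filename<${title;} - ${creator;}.%e' "$dir"
}

# Only act when a directory is given on the command line.
if [ "$#" -gt 0 ]; then
    rename_epubs "$1"
fi
```

Replacing -filename with -testname should, as far as I know, make exiftool print the planned renames without touching any file, which is worth trying before a bulk run.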
To print out epub metadata,
exiftool yourbook.epub
To edit the metadata, you need to unzip the file and then zip it back up again. As others have mentioned, an epub is just a zip file.
mkdir tempfolder
mv yourbook.epub tempfolder
cd tempfolder
unzip yourbook.epub
mv yourbook.epub ../youroldbook.epub
Find the .opf file and edit the metadata in it:
find . -iname '*.opf'
The title will most probably be in the <dc:title> tag:
<dc:title id="pub-title">TITLE OF YOUR BOOK</dc:title>
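Zip it back up after editing. The mimetype file should be the first entry in the archive and stored uncompressed, so add it on its own first:
zip -X0 ../YOUR_BOOK.epub mimetype
zip -rX9 ../YOUR_BOOK.epub * -x mimetype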
Expanding on @pheon's answer, I created this shell script:
#!/bin/bash
# Rename an epub in place, using the book title as the new file name.
exiftool="/usr/local/bin/exiftool"
filename=$(basename "$1")
extension="${filename##*.}"
filename="${filename%.*}"
directory=$(dirname "$1")
# -T prints only the value of the requested tag
newfilename=$("${exiftool}" -T -Title "$1")
echo mv "$1" "${directory}/${newfilename}.${extension}"
mv "$1" "${directory}/${newfilename}.${extension}"
Just run the script, passing the epub filename, and it will rename the file in place to the title of the eBook. Note that the new filename is likely to contain spaces.
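Spaces are not the only hazard: a title containing a slash would be read as a directory separator by mv. A minimal sketch of a clean-up step that such a script could apply to the title first is shown below. The helper name sanitize_title and the choice of replacements are assumptions, not part of the script above.

```shell
#!/bin/bash
# sanitize_title: hypothetical helper that turns a book title into a safer file name.
# Steps, in pipeline order:
#   1. replace each "/" with "-" (a slash would act as a directory separator)
#   2. delete control characters such as tabs and newlines
#   3. trim leading and trailing whitespace
sanitize_title() {
    printf '%s' "$1" |
        tr '/' '-' |
        tr -d '\000-\037' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Print the cleaned title when one is passed on the command line.
if [ "$#" -gt 0 ]; then
    sanitize_title "$1"
    echo
fi
```

In the rename script this could be used as newfilename=$(sanitize_title "$newfilename") just before the mv line.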
exiftool can read (but not write) epub metadata. For example:
exiftool -T -Title main.epub
Here's a quick bash script (with no error-checking whatsoever) to do what you want:
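#!/bin/bash
#
# Usage: minfo -m <meta-type> <epub-file>
if [ $# -lt 3 ]
then
    echo
    echo "Usage: minfo -m <meta-type> <epub-file>"
    echo
else
    # Locate the .opf file inside the archive
    fileloc=`unzip -l "$3" | grep -Po '\b[^\s-]*\.opf\b'`
    # Find the line holding the requested <dc:...> element
    metafound=`zipgrep '<dc:'$2'>(.*)</dc:'$2'>' "$3" $fileloc`
    # Strip the surrounding tags
    echo `expr "$metafound" : '.*<dc:'$2'>\(.*\)</dc:'$2'>.*'`
fi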
This uses unzip -l to find out where the .opf file is in the .epub (normally OEBPS/content.opf, but it can be named anything as long as it has the .opf extension). Then it uses zipgrep to find occurrences of the desired metadata type in that file. Finally, it uses expr to strip off the tags, leaving just the metadata.
And here's a test run:
beaker$ ./minfo -m title Make_Electronics.epub
Make: Electronics
beaker$ ./minfo -m publisher Make_Electronics.epub
O'Reilly Media, Inc.
beaker$ ./minfo -m subject Make_Electronics.epub
beaker$
That last line is blank because the metadata entry for subject in the opf file is:
<dc:subject/>
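One limitation of the script: its pattern expects a bare opening tag such as <dc:title>, so an element that carries attributes, like the <dc:title id="pub-title"> shown in another answer on this page, would not match. A hedged sketch of a more tolerant extraction step follows. The helper name opf_field is hypothetical, and it still assumes that each element sits on a single line of the OPF file.

```shell
#!/bin/bash
# opf_field: hypothetical helper. Reads OPF XML on stdin and prints the text of
# each <dc:NAME ...>text</dc:NAME> element that fits on a single line.
opf_field() {
    local name="$1"
    # [^>]* skips any attributes on the opening tag; \([^<]*\) captures the text.
    sed -n "s|.*<dc:${name}[^>]*>\([^<]*\)</dc:${name}>.*|\1|p"
}

# Usage sketch: script EPUB OPF-PATH FIELD
# unzip -p streams one archive member to stdout without extracting it to disk.
if [ "$#" -ge 3 ]; then
    unzip -p "$1" "$2" | opf_field "$3"
fi
```

An empty element such as <dc:subject/> still produces no output, matching the behaviour in the test run above.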
An EPUB file is just a zip file, and the book's metadata is contained in the OPF (Open Packaging Format) file, which is an XML file. The title is located in the /package/metadata/dc:title element. The other info you're looking for is probably also in children of the metadata element. Here's a good intro to the EPUB format: www.ibm.com/developerworks/xml/tutorials/x-epubtut/.
I'm not very familiar with Linux, so I don't know exactly how you'd do all this from the command line. I suspect it would involve writing or finding a script that would uncompress or dig around in the EPUB file, find the OPF file, parse its XML, and locate the element with the metadata you specify, perhaps using XPath.
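For what it's worth, the steps described above might be sketched in bash roughly as follows. This assumes unzip and xmllint (from libxml2) are installed. It relies on the EPUB convention that META-INF/container.xml records the path of the OPF file in a full-path attribute. The function names are made up for illustration, and the argument order differs from the getbookinfo -m title form proposed in the question.

```shell
#!/bin/bash
# Sketch of a "getbookinfo"-style lookup. Assumes unzip and xmllint are installed.

# Read META-INF/container.xml on stdin and print the OPF path it records.
# Assumes the full-path attribute is written with double quotes.
opf_path_from_container() {
    sed -n 's/.*full-path="\([^"]*\)".*/\1/p' | head -n 1
}

# getbookinfo EPUB FIELD: print the first dc:FIELD value, for example FIELD=title.
getbookinfo() {
    local epub="$1" field="$2" opf
    opf=$(unzip -p "$epub" META-INF/container.xml | opf_path_from_container)
    [ -n "$opf" ] || return 1
    # local-name() sidesteps the XML namespace prefix (dc:) in the XPath query.
    unzip -p "$epub" "$opf" |
        xmllint --xpath "string(//*[local-name()='${field}'])" -
}

# Run only when called as: script EPUB FIELD
if [ "$#" -eq 2 ]; then
    getbookinfo "$1" "$2"
fi
```

Saved as an executable script, this would be called as ./getbookinfo myfile.epub title. Using a real XML parser avoids the one-element-per-line assumption that grep or sed approaches depend on.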