I remember reading (I no longer recall where) that an EPUB file is a valid zip file containing, among other files, the chapters as *.xhtml files, a manifest file content.opf, and a table-of-contents file toc.ncx.

What is the file and directory structure of an unpacked EPUB?
Which files must be present (with fixed names?), and which files are optional?
What information is stored in each file?

I am asking about a very basic EPUB file, so the version should not matter, but assume EPUB 2 in case it makes a difference.

PS: I will probably ask about the content and structure of those files later. For now, to keep the question specific and the answers short, I only want to know which files are mandatory or optional, with an overview of what they contain.



More posts by @Mike

3 Comments


 

@Candy


An EPUB file is just a zip file. Copy mybook.epub to mybook.zip and open the copy with your favorite zip tool. You should see some XHTML files and images somewhere in there, along with an OPF file (which holds the manifest) and some other control files. Windows 7 and later natively treat zip files as folders.

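XHTML is similar to HTML but stricter: it must be well-formed XML, so every element has to be closed and properly nested. Bold and italic are commonly marked up with <strong> and <em> rather than <b> and <i>.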

Below is a screenshot of a test2.epub that I copied to test2.zip. The screenshot shows only the root directory of the EPUB.
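If you would rather not rename the file, the same listing can be produced with a few lines of Python. This is only a minimal sketch using the standard zipfile module; the function name list_epub_entries is made up for illustration.

```python
import zipfile


def list_epub_entries(epub_path):
    """Return the names of all entries in the EPUB's ZIP archive, in archive order.

    epub_path may be a file name or an open binary file object.
    No renaming to .zip is needed: zipfile ignores the extension.
    """
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()
```

Calling list_epub_entries("test2.epub") and printing each name gives the full tree, including subdirectories, not just the root directory.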



10% popularity   0 Reactions

The file and directory structure of an EPUB file is specified in the OCF (Open Container Format). The two most relevant versions are 2.0.1 and 3.0.1. Both specify only one required file in a specific subdirectory:

META-INF/container.xml

There are some optional files that can go in that directory as well (signatures.xml, encryption.xml, metadata.xml, rights.xml), and a file named manifest.xml is also allowed there.

The container.xml gives the full path of one or more files; their names, and the directory structure they live in, are essentially free.

Of course, some programs always generate EPUB files with the same structure. That is why it might seem that you need a content.opf in the root of the EPUB (zip) file structure, but that name is valid in a particular EPUB if and only if it is named in a <rootfile> element in container.xml.
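To see which file a particular EPUB actually names as its package document, you can read the <rootfile> elements yourself. The following is a rough sketch using only the Python standard library; the function name find_rootfiles is hypothetical, and the namespace URI is the one used by OCF container.xml files.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace of the <container> and <rootfile> elements in container.xml.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def find_rootfiles(epub_path):
    """Return the full-path attribute of every <rootfile> in META-INF/container.xml."""
    with zipfile.ZipFile(epub_path) as archive:
        root = ET.fromstring(archive.read("META-INF/container.xml"))
    tag = f"{{{CONTAINER_NS}}}rootfile"
    return [element.get("full-path") for element in root.iter(tag)]
```

Whatever paths this returns are the real locations of the OPF file(s) for that book, regardless of what other tools happen to call them.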

The contents file (with references to the individual HTML files that together form the e-book) could be:

TOC/TableOfContents.opf

and the HTML files could be

LOTR/The_Fellowship_of_the_Ring.htm
LOTR/The_Two_Towers.htm
LOTR/The_Return_of_the_King.htm

as long as the file paths specified internally, starting from container.xml, are correct.

As Mark pointed out, a mimetype file needs to be present. In fact, according to the 2.0.1 spec (page 7, bottom), that file has to be the first file in the EPUB file's ZIP structure.

The only names in the root directory reserved by the 2.0.1 spec are mimetype and META-INF. Using a specific folder (LOTR in the example) is recommended, to prevent collisions when there are multiple renditions, but not required.
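The container-level requirements above are easy to check mechanically. Here is a minimal sketch, not a full validator; the function name check_ocf_basics is invented for this example. Beyond what is stated above, it also assumes two further OCF rules: that the mimetype entry is stored uncompressed, and that its content is exactly application/epub+zip.

```python
import zipfile

# Expected content of the mimetype entry (assumption stated in the text above).
EPUB_MIMETYPE = b"application/epub+zip"


def check_ocf_basics(epub_path):
    """Return a list of problems found with the container-level files."""
    problems = []
    with zipfile.ZipFile(epub_path) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
        else:
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("mimetype is compressed")
            if archive.read("mimetype") != EPUB_MIMETYPE:
                problems.append("mimetype has unexpected content")
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```

An empty list means the archive passes these basic checks; it says nothing about whether the files referenced from container.xml exist or are valid.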



 

@Katie


EPUB is an open format so you can find the standard specifications online. Wikipedia has a good article on the EPUB format.

If you want a brief description of the characteristics you mentioned, you can find one in this article.

Directory structure:

EpubFolderYouWant
    META-INF
        container.xml
    mimetype
    content.opf
    toc.ncx

Needed files:

mimetype
container.xml
content.opf
toc.ncx

The information contained in each file is described in the article mentioned above.
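Once a folder like the one above exists, it has to be zipped back into an .epub. Below is a rough sketch of how that could be done in Python; pack_epub is a made-up name, and the code assumes the usual container rule that mimetype is written first and uncompressed. Write the output file somewhere outside the source folder so it does not get packed into itself.

```python
import os
import zipfile


def pack_epub(source_dir, epub_path):
    """Zip an unpacked EPUB folder, writing mimetype first and uncompressed."""
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype entry goes in first, without compression.
        archive.write(
            os.path.join(source_dir, "mimetype"),
            "mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # Every other file keeps its path relative to the folder root.
        for folder, _subdirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                rel_path = os.path.relpath(full_path, source_dir)
                rel_path = rel_path.replace(os.sep, "/")
                if rel_path == "mimetype":
                    continue  # already written above
                archive.write(
                    full_path, rel_path, compress_type=zipfile.ZIP_DEFLATED
                )
```

For example, pack_epub("EpubFolderYouWant", "mybook.epub") would produce an archive whose entry names match the tree shown above, minus the top-level folder name.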

