
How can I convert an HTML site into an ebook?
I'd like to create a .mobi file from a website like the Python Tutorial, to read on my Kindle.
Is it possible to do it on Ubuntu?
More posts by @Sarah
8 Comments
bookn.net is an option. It can generate an EPUB, which you can later convert into a MOBI with Calibre.
Another option is dotepub.com, which does not require the installation of any additional programs, works across all platforms and is free.
I went nuts figuring this out. I developed mile-long XPath strings and whatnot. Then it turned out to be ridiculously simple. I am ignoring operating systems in this answer...
(a) For the conversion to EPUB, use Calibre. It imports the site as a zip file, so there is no need to make a zip file beforehand. You then convert the zip to an ebook in Calibre.
(b) Spider the website: Cyotek WebCopy or HTTrack Website Copier will do just fine. This creates a browsable mirror of the website on your local device.
(If the site has an index page or site map listing the pages, your job is easy.)
(c) Take the site map, index, or whatever page best lists what is on the site. Copy it, naming the copy "Table Of Contents". Edit it and change the title to Table Of Contents. Change the first header (or add one if there is none) to <h1>Table Of Contents</h1>. If needed, add your own links to this file. (I am assuming this file is the parent of all links. If other files refer back to this file, see if you can use the original parent file as the "Table Of Contents" file without changing its file name. Just change its HTML title and header and see what happens.)
(d) To set up Calibre, go to "Preferences" ==> "Common Options". Untick everything on the "Heuristic processing" and "Structure detection" pages. On the "Table Of Contents" page, tick "Force use of auto-generated Table Of Contents" and "Do not add detected chapters". The ToC level fields should be blank as well. That is it.
(e) Click the "Add books" button in Calibre and select your "Table Of Contents" file. Calibre will create the book in zip format. Then select your book and click "Convert books". You have your ebook!
(f) If you want to make changes to your ToC after making the EPUB, it is easier with Sigil. Load your book in Sigil and go to Tools ==> Table Of Contents ==> Edit Table Of Contents. The Table Of Contents editor probably has what it takes to make the changes.
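If the mirrored site has no usable index page, step (c) can be approximated with a small script. This is only a sketch: make_toc is a made-up helper name, it assumes all pages sit directly in one folder, and it links them in alphabetical order, which may not match the site's reading order.

```shell
#!/bin/sh
# Sketch: write a "Table Of Contents" HTML page that links to every .html
# file directly inside a mirrored site folder.
# make_toc is a hypothetical helper, not part of Calibre or HTTrack.
make_toc() {
    mirror_dir=$1   # folder created by the site copier
    toc_file=$2     # page to write, e.g. "$mirror_dir/Table Of Contents.html"
    {
        printf '<html><head><title>Table Of Contents</title></head><body>\n'
        printf '<h1>Table Of Contents</h1>\n<ul>\n'
        for page in "$mirror_dir"/*.html; do
            [ -e "$page" ] || continue              # folder had no .html files
            [ "$page" = "$toc_file" ] && continue   # do not link the ToC to itself
            name=$(basename "$page")
            # Assumes file names need no URL escaping (no spaces, quotes, etc.)
            printf '<li><a href="%s">%s</a></li>\n' "$name" "${name%.html}"
        done
        printf '</ul>\n</body></html>\n'
    } > "$toc_file"
}
```

Called as make_toc ./mirror "./mirror/Table Of Contents.html", it produces a page you can then add to Calibre as in step (e). Reorder the generated list by hand if the chapter order matters.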
The tutorial site is very well organized, so here is another way to go about it.
First you need to download the web contents with an app like HTTrack.
You can download httrack from apps.ubuntu.com/cat/applications/precise/webhttrack/
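For those who prefer a terminal, the download can also be sketched with the command-line httrack client, assuming it is installed (it is packaged separately from the WebHTTrack GUI). The function name mirror_site and the HTTRACK override variable are made up for this example; only the basic "URL -O folder" form of the httrack command is used.

```shell
#!/bin/sh
# Sketch: mirror a site into a local folder with the httrack CLI.
# mirror_site and the HTTRACK variable are hypothetical names for this example.
mirror_site() {
    url=$1    # start page, e.g. https://docs.python.org/3/tutorial/index.html
    dest=$2   # folder that will hold the mirror
    tool=${HTTRACK:-httrack}   # override allows a dry run, e.g. HTTRACK=echo
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found; install HTTrack first" >&2
        return 1
    fi
    # -O sets the output path for the mirror
    "$tool" "$url" -O "$dest"
}
```

Running HTTRACK=echo mirror_site https://docs.python.org/3/tutorial/index.html ./mirror prints the command instead of starting a download, which is a cheap way to check the arguments first.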
The best way to go may be to use Calibre to create an EPUB,
and then convert it to MOBI, again with Calibre.
You can download Calibre for Ubuntu from calibre-ebook.com/download_linux or apps.ubuntu.com/cat/applications/lucid/calibre/
The tutorial at docs.python.org/3/tutorial/index.html is already well organized, so it is doable.
Here is one way of going about it...
After you download it, set up Calibre:
click Preferences -> Common Options
select Structure Detection
in the "Detect chapters at (XPath expression)" box
paste: //*[((name()='h1' or name()='h2' or name()='h3') and re:test(., '\s*((chapter|book|section|part)\s+)|((prolog|prologue|epilogue)(\s+|$))', 'i')) or @class = 'chapter']
select Table of Contents
check Force use of auto-generated Table of Contents
for ToC level 1 select h1 with magic wand
for ToC level 2 select h2 with magic wand
for ToC level 3 select h3 with magic wand
That's it for Calibre setup
Now you can click the add books button
go to your download folder
go all the way in to the tutorial folder
and select index.html
calibre will create a book in zip format
select the book and click convert books button
you will end up with an epub with an acceptable ToC
to clean up the ToC
you can click edit book -> select table of contents and edit
then you can convert to mobi
sorry about the bold. I could not get rid of it.
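The Table of Contents settings above have a rough command-line equivalent in ebook-convert, the tool that ships with Calibre. The sketch below is an approximation, not the exact GUI configuration: convert_with_toc and the EBOOK_CONVERT override variable are made-up names, the file names are placeholders, and chapter detection is left at Calibre's default.

```shell
#!/bin/sh
# Sketch: convert a mirrored index.html using heading-based ToC levels.
# convert_with_toc and EBOOK_CONVERT are hypothetical names for this example.
convert_with_toc() {
    input=$1    # e.g. index.html inside the downloaded tutorial folder
    output=$2   # e.g. tutorial.epub (the extension selects the format)
    converter=${EBOOK_CONVERT:-ebook-convert}   # override allows a dry run
    if ! command -v "$converter" >/dev/null 2>&1; then
        echo "$converter not found; install Calibre first" >&2
        return 1
    fi
    # --use-auto-toc: always use the auto-generated Table of Contents
    # --levelN-toc:   XPath for the entries at ToC level N
    "$converter" "$input" "$output" \
        --use-auto-toc \
        --level1-toc '//h:h1' \
        --level2-toc '//h:h2' \
        --level3-toc '//h:h3'
}
```

For example, convert_with_toc index.html tutorial.epub followed by convert_with_toc tutorial.epub tutorial.mobi mirrors the EPUB-then-MOBI route described above. Setting EBOOK_CONVERT=echo prints the arguments without converting anything.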
In step 1, you can use Website to PDF to save the entire website as one HTML file.
The HTML pages must be merged. That can be done online at websitetopdf.net/. The result has to be one large HTML file. At the end, you just have to close the print program.
If you work in Firefox, you can use the PrintEdit add-on to remove menus, advertising banners, and more.
You have to save the HTML file on your computer.
Using Calibre, you can convert the HTML file into the required ebook format (MOBI).
I had the exact same problem for a long time. You can produce good results with Calibre, but I found the process was a bit involved.
Instead I created a way to do this much more easily. It's a browser extension called EpubPress (https://epub.press).
All you do is:
Open the webpages you want to save in different tabs.
Open EpubPress
Select all the pages you want in your ebook
Download
The content from the pages gets extracted and stitched together into an ebook.
Hope that helps!
I prefer the Calibre solution. The Debian Calibre package comes with the ebook-convert utility.
Grab the HTML files from the site with:
$ wget -r -np -nc -k -c .../.../..
Locate your main HTML file (usually book.html or index.html) and convert to MOBI:
$ cd dir-with-index
$ ebook-convert index.html book.mobi
$ ebook-convert index.html book.fb2
$ ebook-convert index.html book.epub
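The three conversions above can be wrapped in a loop if you want several formats in one go. This is a minimal sketch: convert_formats and the EBOOK_CONVERT override variable are made-up names, and the function simply calls ebook-convert once per requested extension.

```shell
#!/bin/sh
# Sketch: run ebook-convert once per target format.
# convert_formats and EBOOK_CONVERT are hypothetical names for this example.
convert_formats() {
    index=$1   # main HTML file, e.g. index.html
    stem=$2    # output name without extension, e.g. book
    shift 2    # remaining arguments are the formats: mobi fb2 epub ...
    converter=${EBOOK_CONVERT:-ebook-convert}   # override allows a dry run
    for fmt in "$@"; do
        # Stop at the first failed conversion
        "$converter" "$index" "$stem.$fmt" || return 1
    done
}
```

From the folder that holds index.html, convert_formats index.html book mobi fb2 epub gives the same three files as the commands above.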
One thought is the Sigil e-book-building software. It's available for Linux and I believe it would do what you need although there is some manual manipulation involved.
Of course you can. The easiest method is
save the whole site into a folder (e.g. mirror it with some command line tool like wget - it is available on Windows too - or Httrack)
zip the folder
send the zip to your device via Amazon's e-mail delivery service.
Here's the official Amazon page documenting this feature.
If it's not what you want, you can use Calibre to convert HTML to various ebook formats, here is the official documentation.
What's more, you can download the official Kindlegen Linux CLI tool from Amazon, hand-tailor the downloaded HTML with your favourite editor, and then convert it.
(You can also download the official Python docs in EPUB format, which Amazon will happily convert for you via the mailing method, and every other conversion option I mentioned above will work with the EPUB too.)