Abhay Rana aka Nemo


A fun side project that I often pick up is generating semantically correct ebooks from various online books and content.


Here’s a list of the interesting books I’ve covered:

Non Fiction

Never Say You Can’t Survive by Charlie Jane Anders [source]
Ebook generated from Never Say You Can’t Survive, by Charlie Jane Anders. A how-to book about the storytelling craft, but it’s also full of memoir, personal anecdote, and insight about how to flourish in the present emergency. Originally serialized at Tor.com. Now available at Amazon.com, Indiebound, Barnes and Noble, and Audible.
Site Reliability Engineering [source]
Google’s book on Site Reliability Engineering. Published by Google as an online read under CC BY-NC-ND 4.0.
The Site Reliability Workbook [source]
The Site Reliability Workbook is the hands-on companion to the bestselling Site Reliability Engineering book and uses concrete examples to show how to put SRE principles and practices to work. Published by Google as an online read under CC BY-NC-ND 4.0.
Security Engineering — Third Edition by Ross Anderson [source]
Composed of the third-edition PDFs that are currently under review. Ebook available via Wiley and Amazon.

Fiction


The Ickabog [source]
Ebook generated from The Ickabog, written by J.K. Rowling and published online in serialized format.
Hoshruba (Translated by Musharraf Ali Farooqi) [source]
Tor’s serialized publication of Hoshruba: The Land and the Tilism. This is just Vol 1 of the planned translation of the famous Persian epic, Hamzanama.
Magic Muggle [source]
Harry Potter fan-fiction about a muggle going to Hogwarts. (3rd book abandoned)

I also went around converting all of the original fiction published on Tor.com using url-to-epub. If you’d like a copy, reach out.

Brandon Sanderson

The Lost Metal [source]
Initial chapters (1-19) from The Lost Metal, Book 4 in Mistborn Era 2, being published by Tor in serialized format.
Rhythm of War [source]
Initial chapters (1-19) from Rhythm of War, Book 4 in the Stormlight Archive, being published by Tor in serialized format.
Oathbringer [source]
Chapters 0-32 from Oathbringer, Book 3 in the Stormlight Archive, being published by Tor in serialized format.
Warbreaker Prime: Mythwalker [source]
Initial draft of Warbreaker by Brandon Sanderson, before various parts of it were incorporated into Mistborn.
Way of Kings Reread [source]
Tor re-read of Way of Kings, Book 1 in the Stormlight Archive.
Words of Radiance Reread [source]
Tor re-read of Words of Radiance, Book 2 in the Stormlight Archive.
Oathbringer Reread [source]
Tor re-read of Oathbringer, Book 3 in the Stormlight Archive.
Edgedancer Reread [source]
Tor re-read of Edgedancer, a novella between Books 2 & 3 in the Stormlight Archive.
Defending Elysium by Brandon Sanderson [source]
Brandon Sanderson’s short story, part of the Skyward universe. Published on his website.
Skyward by Brandon Sanderson [source]
Chapters 1-15 of the book. Published at getunderlined.com.

Things next on my list:

I only provide the scripts for the generation on GitHub. If you’d like a copy without having to run the script, get in touch and ask.


To help me in these various projects, I’ve ended up writing some tooling:

pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative markdown file as input.
A command line application to support downloading and stitching of books from Project MUSE. Project MUSE is a leading provider of digital humanities and social science content for the scholarly community around the world, run by the Johns Hopkins University. This generates well-polished PDFs with proper metadata.
Generates a metadata.xml file for an EPUB from various online sources. Can be used with pandoc.
A simple zero-config script that generates a standards-compliant EPUB from a webpage. Requires pandoc.
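
To give a rough idea of what the pandoc-based tools in this list boil down to, here is a minimal sketch of a webpage-to-EPUB conversion driven from Python. It is not the actual script: the function names, the example URL, and the file names are all hypothetical, and it assumes pandoc is installed and that the page converts cleanly from HTML. The only pandoc options used are `--from`, `--to`, `--metadata`, `--output`, and `--epub-metadata`.

```python
import shutil
import subprocess
from typing import List, Optional


def build_pandoc_command(url: str, output: str, title: str,
                         metadata_xml: Optional[str] = None) -> List[str]:
    """Build the pandoc argument list for turning a webpage into an EPUB.

    Hypothetical helper for illustration; not taken from any of the tools above.
    """
    cmd = [
        "pandoc",
        "--from", "html",
        "--to", "epub3",
        "--metadata", f"title={title}",
        "--output", output,
    ]
    if metadata_xml:
        # Optional XML file with Dublin Core elements for the EPUB metadata.
        cmd.append(f"--epub-metadata={metadata_xml}")
    # pandoc accepts an http(s) URL as its input and fetches it itself.
    cmd.append(url)
    return cmd


def convert(url: str, output: str, title: str,
            metadata_xml: Optional[str] = None) -> None:
    """Run pandoc, failing early with a clear error if it is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(build_pandoc_command(url, output, title, metadata_xml),
                   check=True)
```

Calling `convert("https://example.com/story", "story.epub", "Example Story", "metadata.xml")` would then write `story.epub` to the current directory, picking up extra metadata from the (assumed) `metadata.xml` file. Keeping the command construction separate from the `subprocess` call makes it easy to inspect what will be run before running it.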


what-to-read (dead)
Peppers your Goodreads to-read list with Amazon links. Sadly dead now, because Amazon broke its API.
Minerva (dead)
Minerva is a simple ebook scanning system, which uses Amazon’s Product Search API along with Google Book Search to generate metadata for each book. No longer maintained, but might be useful for someone wanting to do full-text indexing in PHP.


Here are links to other similar tools that you might like:

As a side effect, I often fall down the rabbit hole of generating the perfect EPUB instead of actually reading the books.