A fun side project that I often pick up is generating semantically correct ebooks from books and other content published online.
Here’s a list of the interesting books I’ve covered:
- Never Say You Can’t Survive [source]
- Ebook generated from Never Say You Can’t Survive, by Charlie Jane Anders. Originally serialized at Tor.com.
- Site Reliability Engineering [source]
- Google’s book on Site Reliability Engineering. Published by Google as an online read under CC BY-NC-ND 4.0.
- The Site Reliability Workbook [source]
- The Site Reliability Workbook is the hands-on companion to the bestselling Site Reliability Engineering book and uses concrete examples to show how to put SRE principles and practices to work. Published by Google as an online read under CC BY-NC-ND 4.0.
- Security Engineering — Third Edition by Ross Anderson [source]
- Compiled from the third-edition PDFs that are currently under review. Only available until the book goes to press.
- The Ickabog [source]
- Ebook generated from The Ickabog, written by J.K. Rowling and published online in serialized format.
- Hoshruba (Translated by Musharraf Ali Farooqi) [source]
- Tor’s serialized publication of Hoshruba: The Land and the Tilism. This is just Vol. 1 of the planned translation of the famous Persian epic, Hamzanama.
- Magic Muggle [source]
- Harry Potter fan-fiction about a muggle going to Hogwarts. (3rd book
I also went around converting all of the original fiction published on Tor.com using url-to-epub. If you’d like a copy, reach out.
- Rhythm of War [source]
- Initial chapters (1-19) from Rhythm of War, Book 4 in the Stormlight Archive, being published by Tor in serialized format.
- Oathbringer [source]
- Chapters 0-32 from Oathbringer, Book 3 in the Stormlight Archive, being published by Tor in serialized format.
- Warbreaker Prime: Mythwalker [source]
- Initial draft of Warbreaker by Brandon Sanderson, before various parts of it were incorporated into Mistborn.
- Way of Kings Reread [source]
- Tor re-read of The Way of Kings, Book 1 in the Stormlight Archive.
- Words of Radiance Reread [source]
- Tor re-read of Words of Radiance, Book 2 in the Stormlight Archive.
- Oathbringer Reread [source]
- Tor re-read of Oathbringer, Book 3 in the Stormlight Archive.
- Edgedancer Reread [source]
- Tor re-read of Edgedancer, a novella between Books 2 & 3 in the Stormlight Archive.
- Defending Elysium by Brandon Sanderson [source]
- Brandon Sanderson’s short story, part of the Skyward universe. Published on his website.
- Skyward by Brandon Sanderson [source]
- Chapters 1-15 of the book. Published at getunderlined.com.
Things next on my list:
I only provide the scripts for the generation on GitHub. If you’d like a copy without having to run the script, get in touch and ask.
To help me with these projects, I’ve ended up writing some tooling:
- pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative markdown file as input.
- A Crystal app that downloads and stitches together books from Project MUSE. Project MUSE is a leading provider of digital humanities and social science content for the scholarly community around the world, run by Johns Hopkins University. The app generates well-polished PDFs with proper metadata.
- Generates a metadata.xml file for an EPUB from various online sources. Can be used with pandoc.
- A simple zero-config script that generates a standards-compliant EPUB from a webpage. Requires pandoc.
- what-to-read (dead)
- Peppers your Goodreads to-read list with Amazon links. Sadly dead now, because Amazon broke its API.
- Minerva (dead)
- Minerva is a simple ebook scanning system that uses Amazon’s Product Search API along with Google Book Search to generate metadata for each book. No longer maintained, but it might be useful for someone wanting to do full-text indexing in PHP.
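To give a rough idea of how the two pandoc-based tools above could fit together, here is a minimal sketch in Python. It is not the actual script: the function name `build_pandoc_command`, the example URL, and the file names are all made up for illustration. It only relies on pandoc accepting a URL as input and on its `--epub-metadata` option, which reads Dublin Core elements from an XML file.

```python
import shlex
from typing import List, Optional


def build_pandoc_command(
    url: str,
    output: str,
    metadata_xml: Optional[str] = None,
    title: Optional[str] = None,
) -> List[str]:
    """Build a pandoc command line that converts a webpage to an EPUB.

    Hypothetical helper for illustration only; the real scripts may work
    differently.
    """
    # Read the page as HTML and write EPUB 3 to the given output file.
    cmd = ["pandoc", "--from", "html", "--to", "epub3", "--output", output]
    if metadata_xml:
        # Dublin Core metadata, e.g. a generated metadata.xml file.
        cmd.append(f"--epub-metadata={metadata_xml}")
    if title:
        cmd += ["--metadata", f"title={title}"]
    # pandoc fetches the input itself when it is given a URL.
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    cmd = build_pandoc_command(
        "https://example.com/chapter-1",  # placeholder URL
        "book.epub",
        metadata_xml="metadata.xml",
    )
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to run it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when a title or file name contains spaces.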
Here are links to other similar tools that you might like:
As a side effect, I often fall down the rabbit hole of generating the perfect EPUB instead of actually reading the books.