DEV Community

Sean Brunnock

EPubbing

I've been an avid reader for as long as I can remember. When I bought my first house, I made sure there would be a dedicated library with lots of bookshelves.

Since then, I've become addicted to ebooks and can't abide paper books.

I feel terrible confessing this. Paper books have been a part of my life for so long, and I used to think the idea of reading a book on a smartphone was ridiculous. But my eyesight is deteriorating and ebooks make it so easy to set a comfortable font. Another advantage: since my phone only displays a paragraph at a time, it has forced me to slow down and savor the words. I can also instantly look up any words I'm unfamiliar with.

But, I've got a lot of books in my library that aren't available as ebooks. Time to learn to create them.

So, the EPUB format is an open format that is supported by almost all ereaders. It was started by the Open eBook Forum, but it's now maintained by the W3C.

EPUB is over 10 years old, and there have been some changes over time, so there's a lot of outdated docs out there. But of course, nobody ever points out that their docs are outdated. It's left as an exercise for the reader.

Most of the docs that I've read insist that you should use existing software to convert text to EPUB, but I found that Apple's Pages app injects useless JavaScript. So I decided to use my own technique.

I created a simple guide to EPUB's structure. It doesn't go into excessive detail. It just shows how to create a very simple ebook that complies with EPUB v3.2.
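
To give a rough idea of what that structure looks like, here is a minimal sketch in Python. It is not taken from my guide or my app. The function name `write_minimal_epub` and the file names under `OEBPS/` are assumptions chosen for illustration. What the container rules do require is a `mimetype` file stored first and uncompressed, a `META-INF/container.xml` that points at the package document, and a package document that lists a navigation file. Treat the output as a starting point and run it through a validator such as EPUBCheck before trusting it.

```python
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

# Tells the reading system where the package document lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest (every file), spine (reading order).
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">{identifier}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

# Navigation document (the table of contents).
NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en">
<head><title>Contents</title></head>
<body>
  <nav epub:type="toc">
    <h1>Contents</h1>
    <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
  </nav>
</body>
</html>
"""


def write_minimal_epub(path, title, identifier, chapter_xhtml):
    """Write a one-chapter EPUB to `path` (hypothetical helper, English only)."""
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    opf = PACKAGE_OPF.format(
        identifier=escape(identifier), title=escape(title), modified=modified
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as z:
        # The mimetype entry must come first and must not be compressed.
        z.writestr("mimetype", "application/epub+zip",
                   compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", CONTAINER_XML)
        z.writestr("OEBPS/package.opf", opf)
        z.writestr("OEBPS/nav.xhtml", NAV_XHTML)
        z.writestr("OEBPS/chapter1.xhtml", chapter_xhtml)
```

The language is hard-coded to `en` and there is only one chapter, which keeps the sketch short. A real book would add one manifest item, one spine entry, and one nav link per chapter.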

To create an ebook, I use the "Scan Text" feature on my iPhone's Notes app to convert the page to ASCII text. The Notes app syncs with my Apple account, so the text shows up on my desktop as well.

Next, I use my own software to convert the raw text into XHTML. I copy the text into my app, compare it to the original book, and insert a mark (three backticks) at the start of each paragraph. My app can then delete all of the existing newlines and replace the backticks with newlines. After that, I can triple-click entire paragraphs and mark them up.
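
The reflow step can be sketched in a few lines of Python. This is an illustration of the idea, not the code from my app. The function name `reflow` is hypothetical, and I'm assuming that each deleted newline should become a single space so that the words on either side don't run together.

```python
import re

# Three backticks, the paragraph marker described above.
MARK = "`" * 3


def reflow(text, mark=MARK):
    """Collapse scanned line breaks, then start a new line at each marker."""
    # Turn every run of whitespace (including newlines) into one space.
    flat = re.sub(r"\s+", " ", text)
    # Each marker begins a paragraph; drop the empty piece before the first one.
    paragraphs = [p.strip() for p in flat.split(mark)]
    return "\n".join(p for p in paragraphs if p)
```

After this, every paragraph sits on its own line, which is what makes the triple-click selection work.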

Notes is great at scanning words, but not so great with punctuation. All em dashes get turned into hyphens, for example. My app can highlight all of the punctuation so you can hopefully eyeball any mistakes.
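
One way to do that kind of highlighting, sketched in Python rather than lifted from my app: wrap every punctuation character in a `<mark>` element so it stands out when the result is rendered as HTML. The function name `highlight_punctuation` is hypothetical, and "punctuation" here is assumed to mean any character that is neither a word character nor whitespace.

```python
import html
import re

# Anything that is not a letter, digit, underscore, or whitespace.
PUNCT = re.compile(r"([^\w\s])")


def highlight_punctuation(text):
    """Return HTML in which each punctuation character is wrapped in <mark>."""
    # With a capture group, re.split keeps the separators at the odd indexes.
    parts = PUNCT.split(text)
    out = []
    for i, part in enumerate(parts):
        escaped = html.escape(part)  # escape after splitting, so "&" and "<" stay single tokens
        out.append(f"<mark>{escaped}</mark>" if i % 2 else escaped)
    return "".join(out)
```

A hyphen with a space on each side is a good hint that the original had an em dash, and it is much easier to spot once the hyphen is highlighted.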

My app removes hyphens from the ends of lines and combines word parts into single words. Sometimes, the hyphen at the end of a line was meant to be part of a hyphenated word. Chrome's textarea has a spellcheck feature that can highlight those mistakes.
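
Here is a minimal Python sketch of that hyphen handling, under the assumption that a hyphen only counts as a line-break hyphen when it sits between two word characters with a newline after it. The function name is hypothetical, and it has to run before the newlines are deleted. It also shows the failure case: a genuinely hyphenated word that happens to break at the hyphen gets glued together.

```python
import re


def join_hyphenated_line_breaks(text):
    """Join word halves that were split across lines with a hyphen."""
    # word char, hyphen, optional trailing spaces, newline, word char
    return re.sub(r"(\w)-[ \t]*\n(\w)", r"\1\2", text)
```

For example, `"inter-\nnational"` becomes `"international"`, which is right, but `"well-\nknown"` becomes `"wellknown"`, which is the kind of mistake a spellchecker then has to catch.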

My app doesn't create an entire XHTML file. It just marks up the raw text and I copy that into a template.
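
The markup-plus-template step might look something like the following Python sketch. My app is more interactive than this, so take it as an assumption-laden illustration: the names `TEMPLATE`, `mark_up_paragraphs`, and `fill_template` are made up, every non-empty line is treated as one paragraph, and the template is the bare minimum for an XHTML content document.

```python
import html

# A bare-bones XHTML content document with two slots to fill in.
TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head><title>{title}</title></head>
<body>
{body}
</body>
</html>
"""


def mark_up_paragraphs(text):
    """Wrap each non-empty line in <p>, escaping &, < and > for XML."""
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(
        f"<p>{html.escape(line, quote=False)}</p>" for line in lines if line
    )


def fill_template(title, text):
    """Drop the marked-up paragraphs into the XHTML template."""
    return TEMPLATE.format(
        title=html.escape(title, quote=False), body=mark_up_paragraphs(text)
    )
```

Escaping matters here because XHTML is parsed as XML, so a stray `&` or `<` in the scanned text would make the whole chapter fail to load.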

So far, I've managed to turn Weaving the Web into an ebook, which I'm now finally reading comfortably. Very strange that a book by the guy who invented the Web isn't available in electronic format.
