DEV Community

Convert any .pdf file 📚 into an audio 🔈 book with Python

Mustafa Anas on January 07, 2020

(Edit: I am glad you all liked this project! It became the top Python article of the week!) A while ago I was messing around with Google's Text...

Steve (Gadget) Barnes • Jan 14 '20

I would suggest adding two lines to save the MP3 file to the same location and name as the PDF file.

from os.path import splitext

outname = splitext(filelocation)[0] + '.mp3'

then use:

final_file.save(outname)

Mustafa Anas • Jan 14 '20

That would be a nice add!

sadorect • Jan 14 '20

Oh, fantastic! I was looking to add this by myself but I don't know python coding. Thanks for bringing it up!

Ashwanth • Jan 10 '20

I am really intrigued by this article. I tried everything to install the pdftotext lib on my Mac but was unsuccessful. I keep getting this error --> " error: command 'gcc' failed with exit status 1"
I installed the OS dependencies and Poppler using brew, but it didn't work. Can anyone help me?

Mustafa Anas • Jan 10 '20

make sure you have these two installed:
python-dev
libevent-dev

Ashwanth • Jan 10 '20

Yup, I installed them. No matter what I do, I keep getting this error --> "ERROR: Command errored out with exit status 1"
And I installed gcc too!

Kelvin Thompson • Jan 10 '20 • Edited

I just started getting the same thing on my system (Ubuntu). After a lot of Google/StackExchange, this worked (copy from my annotations):

For whatever reason, in order to install the following two, I had to install some stuff on my Ubuntu Mate system-wide to get rid of compile errors:

sudo apt-get install python3-setuptools python3-dev libpython3-dev
sudo apt-get update
sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python-dev

I'm using PyCharmCE. After the above, I could use this in the PyCharm terminal:

pip3 install pdftotext
pip3 install gtts

After I did all of that, successful! Program works like a charm (hehe).

Cheers!

Mustafa Anas • Jan 11 '20

Thanks for sharing your solution!

Kelvin Thompson • Jan 11 '20

A pleasure to finally be able to give back a little!

Ashwanth • Jan 11 '20

I have a Mac, brother. Can't use apt-get. What should I do now?

David Souza • Jan 14 '20

Are you using the default Python 2.7?? You may need to use Python 3.x

David Souza • Jan 14 '20

I got this working on the Mac with Python 3.7.4, using a virtual env and brew. Works fine.

Jogesh • Jan 14 '20

I am using docker with my Macbook without any issue. And it is a great alternative to start working on any environment, stack, etc.

Rohit Prasad • Jan 16 '20

The dependencies to install for each OS are listed here: pypi.org/project/pdftotext/

Harald Nezbeda • Jan 25 '20

Have you tried to install the OS dependencies as specified in the docs? github.com/jalan/pdftotext#macos

schwepmo • Jan 9 '20

Really cool and quick project! One thing I would suggest is to use Python's join() method instead of looping over the list of strings. I think that's the more "pythonic" way, and it should also perform a little better.

Mustafa Anas • Jan 9 '20

Thanks for the tip!
I sure will start using that
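
A minimal sketch of the join() suggestion, assuming the extracted pages are available as an iterable of strings (the pdftotext PDF object should behave that way). The helper names and the sample page strings are made up for illustration.

```python
def combine_pages_with_loop(pages):
    """Concatenate page strings one by one (the looping approach)."""
    text = ""
    for page in pages:
        text += page
    return text


def combine_pages(pages, separator=""):
    """Concatenate page strings in a single call with str.join()."""
    return separator.join(pages)


# Sample data standing in for the text of three PDF pages
sample_pages = ["First page. ", "Second page. ", "Third page."]

# Both approaches build the same string; join() does it in one pass
assert combine_pages_with_loop(sample_pages) == combine_pages(sample_pages)
```

Passing a separator such as "\n\n" to combine_pages keeps a visible break between pages, which the loop version would need extra code for.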

Narendra Kumar Vadapalli • Jan 14 '20

I am on Fedora and had to install the following dependencies before I could pip install pdftotext.

The sequence would be:

sudo dnf install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
pip install pdftotext gtts

Kristina Gocheva • Jan 7 '20 • Edited

My favorite part is (if I am not mistaken) that this would work for PDFs in any language, as long as Google Text-to-Speech supports the language.

Mustafa Anas • Jan 7 '20

hahaha omg how could I not think about doing the research.
You're right.
check this out
cloud.google.com/text-to-speech/
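
A rough sketch of what switching languages could look like with gTTS, assuming the lang parameter takes a language code such as 'fr' and that gtts.lang.tts_langs() returns a dict of supported codes. The helper names check_language and speak_in are hypothetical, and speak_in needs gTTS installed and a network connection.

```python
def check_language(code, supported):
    """Return code if it is among the supported codes, else raise ValueError."""
    if code not in supported:
        raise ValueError(f"Unsupported language code: {code!r}")
    return code


def speak_in(text, code, out_path):
    """Convert text to speech in the given language and save it as an MP3."""
    # Imported lazily so check_language() can be used without gTTS installed
    from gtts import gTTS
    from gtts.lang import tts_langs

    lang = check_language(code, tts_langs())
    gTTS(text=text, lang=lang).save(out_path)


# Example call (not executed here):
# speak_in("Bonjour tout le monde", "fr", "bonjour.mp3")
```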

SURAJ BRANWAL • Jan 8 '20

Thanks a lot for the article. I spent a long time looking for something like this, and now I am able to read (listen to) all my untouched PDFs.

Mustafa Anas • Jan 8 '20

That was my intention.
Glad you liked it :)

SURAJ BRANWAL • Jan 15 '20

I tried this on Win10, but was unable to install the pdftotext package in Python 3.8.
Hence, I did this another way:

github.com/suryabranwal/TIL/blob/m...

sadorect • Jan 14 '20 • Edited

An observation here (I'm sure this has to do with the gTTS engine, though):

The reader sometimes spells out words instead of pronouncing them, which is a bit strange. In one conversion, the word "first" was spelled out rather than pronounced. Initially, I thought this happens when words are not properly written and the text recognition engine is affected. "Five" was pronounced fai-vee-e, and there were other spellings like that.

Overall though, it is manageable and one can make good sense out of the readings. Now I can "read" my e-books faster with this ingenious solution.

Thanks again, @mustapha

Deepak Raj • Oct 25 '20

It will not work offline. Try AudioBook to listen offline.

Documentation: audiobook.readthedocs.io/

sadorect • Jan 14 '20

This is a life-saving procedure you shared. I tried it and it works like a charm. Thank you so very much.

I have a question though...
I know this is a simplistic approach to just explain the basics (and it's awesome). Please, is it possible to change the reader's voice and reading speed?

Mustafa Anas • Jan 15 '20

I am glad you liked it!
The intention of all my writings is to be as simple as possible so readers of all levels can understand.
If you wish to know more about customizing this API, please check this page:
gtts.readthedocs.io/en/latest/
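
As far as I know, gTTS itself only offers a slow=True flag for speed and no choice of voice. A hedged alternative is the offline pyttsx3 engine, which exposes 'rate' and 'voice' properties. The sketch below assumes pyttsx3 is installed; configure_engine and read_aloud are hypothetical helper names, and the default rate of 150 is an arbitrary choice.

```python
def configure_engine(engine, rate=150, voice_index=0):
    """Set the speaking rate and pick a voice by its position in the voice list."""
    engine.setProperty("rate", rate)
    voices = engine.getProperty("voices")
    if voices:
        # Fall back to the first voice if the index is out of range
        index = voice_index if 0 <= voice_index < len(voices) else 0
        engine.setProperty("voice", voices[index].id)
    return engine


def read_aloud(text, rate=150, voice_index=0):
    """Speak text through the system's speech engine at the chosen rate and voice."""
    import pyttsx3  # imported lazily; requires `pip install pyttsx3`

    engine = configure_engine(pyttsx3.init(), rate, voice_index)
    engine.say(text)
    engine.runAndWait()


# Example call (not executed here):
# read_aloud("Demo String", rate=120, voice_index=1)
```

Which voices are available depends on the operating system, so the index that gives a particular voice will differ between machines.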

dennisboscodemello1989 • Aug 13 '21

Is there any way to pop up an option for choosing the page from which the reading will start? The option for choosing the PDF file is already there. I am pasting the code:

import pyttsx3 as py
import PyPDF2 as pd

pdfReader = pd.PdfFileReader(open('Excel-eBook.pdf', 'rb'))

from tkinter.filedialog import *

speaker = py.init()

voices = speaker.getProperty('voices')

for voice in voices:
    speaker.setProperty('voice', voice.id)

book = askopenfilename()
pdfreader = pd.PdfFileReader(book)
pages = pdfreader.numPages

for num in range(0, pages):  # 0 is the number from where the reading will start
    page = pdfreader.getPage(num)
    text = page.extractText()
    player = py.init()
    player.say(text)
    player.runAndWait()
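
One possible way to ask for the start page is tkinter's simpledialog.askinteger, shown after the file dialog. This is only a sketch: it keeps the older PyPDF2 calls used in the code above (PdfFileReader, numPages, getPage, extractText), which newer PyPDF2 releases have renamed, and page_indices and read_from_chosen_page are hypothetical names.

```python
def page_indices(start_page, total_pages):
    """Zero-based page indices from a 1-based start page to the end of the file."""
    if total_pages <= 0:
        return range(0)
    # Clamp the start page into the valid range 1..total_pages
    first = min(max(start_page, 1), total_pages) - 1
    return range(first, total_pages)


def read_from_chosen_page():
    """Let the user pick a PDF and a start page, then read it aloud."""
    # Imported lazily so page_indices() works without these packages
    import tkinter as tk
    from tkinter import filedialog, simpledialog
    import PyPDF2
    import pyttsx3

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only the dialogs are shown

    path = filedialog.askopenfilename(
        title="Choose a PDF", filetypes=[("PDF files", "*.pdf")]
    )
    if not path:
        return  # the user cancelled the file dialog

    with open(path, "rb") as handle:
        reader = PyPDF2.PdfFileReader(handle)
        total = reader.numPages
        if total == 0:
            return
        start = simpledialog.askinteger(
            "Start page", f"Start reading at page (1-{total}):",
            minvalue=1, maxvalue=total,
        )
        if start is None:
            return  # the user cancelled the page dialog

        speaker = pyttsx3.init()
        for index in page_indices(start, total):
            speaker.say(reader.getPage(index).extractText())
        speaker.runAndWait()
```

Calling read_from_chosen_page() shows both dialogs in turn and then starts reading.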

Abhinav Kumar Srivastava • Jan 14 '20

Really cool!
However, when I tried to convert a decent-sized PDF file (3.0 MB), I got the following error:

"gtts.tts.gTTSError: 500 (Internal Server Error) from TTS API. Probable
cause: Uptream API error. Try again later."

Is gTTS blocking me from using their API? How shall I resolve this?
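
It is hard to tell from the message alone whether this is a block or a temporary server-side failure on a large job. One thing that might help is converting the text in smaller pieces and retrying a piece when it fails. The sketch below assumes gTTS's write_to_fp() method and the gTTSError exception named in the traceback; split_into_chunks, save_with_retries, and the 3000-character limit are arbitrary choices for illustration.

```python
import io
import time


def split_into_chunks(text, max_chars=3000):
    """Split text on whitespace into pieces of at most max_chars characters.

    A single word longer than max_chars is kept whole in its own piece.
    """
    chunks, current, length = [], [], 0
    for word in text.split():
        # One extra character for the space that joins this word to the piece
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
            extra = len(word)
        current.append(word)
        length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


def save_with_retries(text, out_path, attempts=3, wait_seconds=5):
    """Convert text piece by piece, retrying a failed piece before giving up."""
    # Imported lazily so split_into_chunks() works without gTTS installed
    from gtts import gTTS
    from gtts.tts import gTTSError

    with open(out_path, "wb") as out_file:
        for chunk in split_into_chunks(text):
            for attempt in range(1, attempts + 1):
                buffer = io.BytesIO()  # fresh buffer so a failed try leaves no partial audio
                try:
                    gTTS(text=chunk, lang="en").write_to_fp(buffer)
                except gTTSError:
                    if attempt == attempts:
                        raise
                    time.sleep(wait_seconds * attempt)  # wait longer after each failure
                else:
                    out_file.write(buffer.getvalue())
                    break
```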

Dima Naboka • Jan 15 '20 • Edited

I have a problem running [vagrant@centos8 ~]$ sudo pip3 install pdftotext on CentOS 8:
error: command 'gcc' failed with exit status 1
Command "/usr/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-7_3v7vuh/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-ac0irxfy-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-7_3v7vuh/pdftotext/

I'm running Python 3.6.8, do I have to use Python 3.8 explicitly?

abragred • Jan 2 '21 • Edited

This seems like a very nice idea. I might get my friend who knows Python to set this up for me. I'm not yet at a level where I can do this alone, but I'd like to learn. One thing I'm proud of is that I can help my family with PDF forms. I especially help my mom a lot, because she has many forms to fill out for her job but is not comfortable with technology. I found this site pdfliner.com/alternative/sejda_alt... that lets me edit anything I want.

Kushagra0347 • Nov 20 '20

This code gets stuck after I add a PDF. Can anyone provide a solution to this?

from tkinter import *
import pygame
import PyPDF2
from gtts import gTTS
from tkinter import filedialog
from os.path import splitext

root = Tk()
root.title('PDF Audio Player')
root.geometry("500x300")

# Initialise Pygame Mixer

pygame.mixer.init()

# Add PDF Function

def addPDF():
    PDF = filedialog.askopenfilename(title="Choose a PDF", filetypes=(("PDF Files", "*.PDF"), ))
    PDF_dir = PDF

    # Strip Out the Directory Info and .pdf extension
    # So That Only the Title Shows Up
    PDF = PDF.replace('C:/Users/kusha/Downloads/', '')
    PDF = PDF.replace(".pdf", '')

    audioBookBox.insert(END, PDF)
    PDFtoAudio(PDF_dir)

def PDFtoAudio(PDF_dir):
    file = open(PDF_dir, 'rb')
    reader = PyPDF2.PdfFileReader(file)
    totalPages = reader.numPages
    string = ""

    for i in range(0, totalPages):
        page = reader.getPage(i)
        text = page.extractText()
        string += text

    outName = splitext(PDF_dir)[0] + '.mp3'

    audioFile = gTTS(text=string, lang='en')  # store file in variable
    audioFile.save(outName)  # save file to computer

# Play Selected PDF Function

def play():
    audio = audioBookBox.get(ACTIVE)
    audio = f'C:/Users/kusha/Downloads/{audio}.mp3'

    pygame.mixer.music.load(audio)
    pygame.mixer.music.play(loops=0)

# Create Playlist Box

audioBookBox = Listbox(root, bg="black", fg="red", width = 70, selectbackground="gray", selectforeground="black")
audioBookBox.pack(pady=20)

# Define Player Control Button Images

backBtnImg = PhotoImage(file='Project Pics/back50.png')
forwardBtnImg = PhotoImage(file='Project Pics/forward50.png')
playBtnImg = PhotoImage(file='Project Pics/play50.png')
pauseBtnImg = PhotoImage(file='Project Pics/pause50.png')
stopBtnImg = PhotoImage(file='Project Pics/stop50.png')

# Create Player Control Frame

controlsFrame = Frame(root)
controlsFrame.pack()

# Create Player Control Buttons

backBtn = Button(controlsFrame, image=backBtnImg, borderwidth=0)
forwardBtn = Button(controlsFrame, image=forwardBtnImg, borderwidth=0)
playBtn = Button(controlsFrame, image=playBtnImg, borderwidth=0, command=play)
pauseBtn = Button(controlsFrame, image=pauseBtnImg, borderwidth=0)
stopBtn = Button(controlsFrame, image=stopBtnImg, borderwidth=0)

backBtn.grid(row=0, column=0, padx=10)
forwardBtn.grid(row=0, column=1, padx=10)
playBtn.grid(row=0, column=2, padx=10)
pauseBtn.grid(row=0, column=3, padx=10)
stopBtn.grid(row=0, column=4, padx=10)

# Create Menu

myMenu = Menu(root)
root.config(menu=myMenu)

# Add the converted audio file in the menu

addAudioMenu = Menu(myMenu)
myMenu.add_cascade(label="Add PDF", menu=addAudioMenu)
addAudioMenu.add_command(label="Add One PDF", command=addPDF)

root.mainloop()
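
One likely cause, assuming the hang happens inside PDFtoAudio: the gTTS conversion runs on the Tk main thread, so the window cannot redraw or react until the whole PDF has been converted, which can take a long time for a large file. A common workaround is to run the slow step in a worker thread and let the main loop poll for the result. The sketch below uses only the standard library; start_worker and poll are hypothetical helper names.

```python
import queue
import threading


def start_worker(task, *args):
    """Run task(*args) in a daemon thread and return a queue that receives the outcome."""
    results = queue.Queue()

    def runner():
        try:
            results.put(("ok", task(*args)))
        except Exception as exc:  # hand the error to the GUI thread instead of losing it
            results.put(("error", exc))

    threading.Thread(target=runner, daemon=True).start()
    return results


def poll(root, results, on_done, interval_ms=200):
    """Check the queue from the Tk main loop without blocking it."""
    try:
        status, value = results.get_nowait()
    except queue.Empty:
        # Nothing yet: ask Tk to call poll() again after interval_ms
        root.after(interval_ms, poll, root, results, on_done, interval_ms)
    else:
        on_done(status, value)


# Inside addPDF(), the direct call PDFtoAudio(PDF_dir) could then become:
#     results = start_worker(PDFtoAudio, PDF_dir)
#     poll(root, results, lambda status, value: print("Conversion:", status))
```

Widgets should only be touched from the main thread, which is why the worker reports back through a queue rather than updating the list box itself.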

Priyanshu Kumar • Oct 10 '20

Will it also read page numbers, footers, or any extra garbage text?

Vaibhav Kaushik • Nov 5 '20 • Edited

Yes, of course, as they are also text.

Priyanshu Kumar • Nov 5 '20

Using machine learning, you can avoid those things.
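
Short of machine learning, a simple heuristic may already remove much of this text. The sketch below assumes the PDF has been extracted into one string per page; it drops lines that look like page numbers and lines that repeat on most pages (typical for headers and footers). The function name, the regular expression, and the 0.6 threshold are illustrative guesses, and a heuristic like this can also drop legitimate lines, such as a year standing alone on a line.

```python
import re
from collections import Counter

# Matches lines such as "7", "Page 7", "7 of 20" or "7 / 20"
PAGE_NUMBER = re.compile(r"^\s*(page\s+)?\d+(\s*(of|/)\s*\d+)?\s*$", re.IGNORECASE)


def strip_page_furniture(pages, repeat_ratio=0.6):
    """Remove page-number lines and lines repeated across most pages."""
    line_counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page
        line_counts.update({line.strip() for line in page.splitlines() if line.strip()})

    # A line counts as a header/footer if it appears on at least this many pages
    threshold = max(2, int(len(pages) * repeat_ratio))
    repeated = {line for line, count in line_counts.items() if count >= threshold}

    cleaned = []
    for page in pages:
        kept = [
            line for line in page.splitlines()
            if line.strip()
            and not PAGE_NUMBER.match(line)
            and line.strip() not in repeated
        ]
        cleaned.append("\n".join(kept))
    return cleaned
```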

Belkin • Jan 7 '20

Do you have any demo audio files? I'm really interested to hear it. :)

Mustafa Anas • Jan 7 '20

Run this code and hear the result:

from gtts import gTTS
final_file = gTTS(text='Demo String', lang='en')  # store file in variable
final_file.save("Generated Speech.mp3")  # save file to computer