DEV Community

Convert any .pdf file 📚 into an audio 🔈 book with Python

Mustafa Anas on January 07, 2020

(edit: I am glad you all liked this project! It got to be the top Python article of the week!) A while ago I was messing around with google's Text...
Steve (Gadget) Barnes

I would suggest adding two lines to save the MP3 file to the same location and name as the PDF file.

from os.path import splitext

outname = splitext(filelocation)[0] + '.mp3'

then use:

final_file.save(outname)

Mustafa Anas Author

That would be a nice add!

sadorect

Oh, fantastic! I was looking to add this myself, but I don't know Python. Thanks for bringing it up!

Ashwanth

I am really intrigued by this article. I tried everything to install the pdftotext library on my Mac but was unsuccessful. I keep getting this error: "error: command 'gcc' failed with exit status 1".
I installed the OS dependencies and Poppler using brew, but it didn't work. Can anyone help me?

Mustafa Anas Author

make sure you have these two installed:
python-dev
libevent-dev

Ashwanth

Yup, I installed them. No matter what I do, I keep getting this error: "ERROR: Command errored out with exit status 1",
and I installed gcc too!

Kelvin Thompson • Edited

I just started getting the same thing on my system (Ubuntu). After a lot of Google/StackExchange, this worked (copy from my annotations):

For whatever reason, in order to install the following two, I had to install some stuff system-wide on my Ubuntu Mate to get rid of compile errors:

sudo apt-get install python3-setuptools python3-dev libpython3-dev
sudo apt-get update
sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python-dev

I'm using PyCharmCE. After the above, I could use this in the PyCharm terminal:

pip3 install pdftotext
pip3 install gtts

After I did all of that, success! The program works like a charm (hehe).

Cheers!

Mustafa Anas Author

Thanks for sharing your solution!

Kelvin Thompson

A pleasure to finally be able to give back a little!

Ashwanth

I have a Mac, brother. I can't use apt-get. What should I do now?

David Souza

Are you using the default Python 2.7? You may need to use Python 3.x.

David Souza

I got this working on the Mac with Python 3.7.4, using a virtual env and brew. Works fine.

Jogesh

I am using Docker on my MacBook without any issue. It is a great alternative for getting started with any environment, stack, etc.

Rohit Prasad

They list what has to be installed for the various operating systems here: pypi.org/project/pdftotext/

Harald Nezbeda

Have you tried to install the OS dependencies as specified in the docs? github.com/jalan/pdftotext#macos
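
Before running pip again, a quick check for the usual culprits might look like the sketch below. It assumes the build needs a C/C++ compiler, pkg-config, and the poppler-cpp development files, which is what the package lists in this thread suggest; the function names are made up for illustration.

```python
import shutil
import subprocess


def missing_build_tools(tools=("gcc", "g++", "pkg-config")):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def has_poppler_cpp():
    """Ask pkg-config whether the poppler-cpp development files are visible."""
    if shutil.which("pkg-config") is None:
        return False
    result = subprocess.run(["pkg-config", "--exists", "poppler-cpp"], check=False)
    return result.returncode == 0


print("Missing tools:", missing_build_tools())
print("poppler-cpp found:", has_poppler_cpp())
```

If anything is reported missing, the OS-specific instructions linked above are the place to look for the matching package names.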

schwepmo

Really cool and quick project! One thing I would suggest is to use Python's str.join() method instead of looping over the list of strings. I think that's the more "Pythonic" way, and it should also perform a little better.
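
A small sketch of the difference, assuming the script builds one big string from a list of page strings (the `pages` list and function names here are illustrative, not taken from the article):

```python
def concat_with_loop(pages):
    """Build the text by repeated concatenation, one page at a time."""
    text = ""
    for page in pages:
        text += page
    return text


def concat_with_join(pages):
    """Build the same text in a single call to str.join()."""
    return "".join(pages)


pages = ["Chapter one. ", "Chapter two. ", "Chapter three."]
print(concat_with_join(pages) == concat_with_loop(pages))
```

Both give the same string; join() just avoids creating a new intermediate string on every iteration. A separator such as "\n\n".join(pages) can be used if a gap between pages is wanted.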

Mustafa Anas Author

Thanks for the tip!
I sure will start using that

Narendra Kumar Vadapalli

I am on Fedora and had to install the following dependencies before I could pip install pdftotext.

The sequence would be:

sudo dnf install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
pip install pdftotext gtts
Kristina Gocheva • Edited

My favorite part is (if I am not mistaken) that this would work for a PDF in any language, as long as Google text-to-speech supports the language.
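
A minimal sketch of how that could look with gTTS, assuming the language code is passed through the `lang` argument and checked against the mapping returned by `gtts.lang.tts_langs()`. The helper names `check_language` and `speak` are hypothetical.

```python
def check_language(code, supported):
    """Return `code` if it is a key of `supported`, otherwise raise ValueError."""
    if code not in supported:
        known = ", ".join(sorted(supported))
        raise ValueError(f"unsupported language {code!r}; known codes: {known}")
    return code


def speak(text, lang, out_path):
    """Synthesize `text` in `lang` with gTTS and save it as an MP3 file."""
    # Imported here so check_language() can be used without gTTS installed.
    from gtts import gTTS
    from gtts.lang import tts_langs

    gTTS(text=text, lang=check_language(lang, tts_langs())).save(out_path)
    return out_path
```

For example, speak("Bonjour tout le monde", "fr", "bonjour.mp3") would request French speech, provided "fr" is in the supported list.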

Mustafa Anas Author

Hahaha, omg, how could I not think about doing the research.
You're right.
Check this out:
cloud.google.com/text-to-speech/

SURAJ BRANWAL

Thanks a lot for the article. I tried hard to find something like this, and now I am able to read (listen to) all my untouched PDFs.

Mustafa Anas Author

That was my intention.
Glad you liked it :)

SURAJ BRANWAL

I tried this on Win10, but was unable to install the pdftotext package in Python 3.8.
So I did it another way:

github.com/suryabranwal/TIL/blob/m...

sadorect

This is a life-saving procedure you shared. I tried it and it works like a charm. Thank you so very much.

I have a question though...
I know this is a simplified approach meant to explain the basics (and it's awesome). Is it possible to change the reader's voice and reading speed?

Mustafa Anas Author

I am glad you liked it!
The intention of all my writing is to be as simple as possible so readers of all levels can understand.
If you wish to know more about customizing this API, please check this page:
gtts.readthedocs.io/en/latest/
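
As far as the gTTS documentation goes, there does not seem to be a parameter for picking an arbitrary voice or an exact speed. The closest options are `slow` (a slower reading speed) and `tld` (the Google Translate domain, which can change the accent for some languages). A sketch under that assumption, with hypothetical helper names:

```python
def gtts_kwargs(lang="en", slow=False, tld="com"):
    """Collect the gTTS keyword arguments that affect how the text is read."""
    return {"lang": lang, "slow": slow, "tld": tld}


def save_speech(text, out_path, **options):
    """Synthesize `text` with the given options and save it as an MP3 file."""
    # Imported here so gtts_kwargs() can be used without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, **gtts_kwargs(**options)).save(out_path)
    return out_path
```

For example, save_speech("Hello there", "hello.mp3", slow=True, tld="co.uk") would ask for slower speech through the UK domain.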

sadorect • Edited

An observation here (I'm sure this has to do with the gTTS engine though):

The reader sometimes spells words out instead of pronouncing them, which is a bit strange. I did a conversion where the word "first" was spelled out rather than pronounced. Initially, I thought this happens when words are not properly written and the text recognition engine is affected. "Five" was pronounced fai-vee-e, and there were other spellings like that.

Overall though, it is manageable and one can make good sense of the readings. Now I can "read" my e-books faster with this ingenious solution.

Thanks again, @mustapha

Muhammad Abdur Rofi

Can I use a custom voice?

Abhinav Kumar Srivastava

Really cool!
However, when I tried to convert a decent-sized PDF file (3.0 MB), I got the following error:

"gtts.tts.gTTSError: 500 (Internal Server Error) from TTS API. Probable
cause: Uptream API error. Try again later."

Is gTTS blocking me from using their API? How should I resolve this?
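
The cause of the 500 error is not clear from the message alone. One thing that might help with large files is sending the text in smaller pieces and retrying a piece when it fails. The sketch below assumes that appending the MP3 data of each piece to one file gives a playable result; `split_text` and `save_in_chunks` are hypothetical helpers, and the chunk size and retry counts are arbitrary.

```python
import io
import time


def split_text(text, max_chars=3000):
    """Split text on whitespace into pieces of at most max_chars characters.

    A single word longer than max_chars is kept whole in its own piece.
    """
    chunks, current = [], ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        chunks.append(current)
    return chunks


def save_in_chunks(text, out_path, lang="en", retries=3, wait_seconds=5):
    """Convert text piece by piece, retrying each piece a few times."""
    # Imported here so split_text() can be used without gTTS installed.
    from gtts.tts import gTTS, gTTSError

    with open(out_path, "wb") as mp3:
        for chunk in split_text(text):
            for attempt in range(1, retries + 1):
                buffer = io.BytesIO()
                try:
                    gTTS(text=chunk, lang=lang).write_to_fp(buffer)
                except gTTSError:
                    if attempt == retries:
                        raise
                    time.sleep(wait_seconds)
                else:
                    # Only write audio for pieces that converted completely.
                    mp3.write(buffer.getvalue())
                    break
```

Each piece is buffered in memory first, so a failed attempt does not leave partial audio in the output file.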

VeryUtils

Thank you for recommending these good text-to-speech software solutions. I am from the VeryUtils software company. VeryUtils has DocVoicer (Text-To-Speech) software, which can easily convert PDF files to MP3 audio. I would like to see our product featured in articles like this. Would it be possible for you to write something for us? If so, please let me know. Thank you.

Frank Xue
CEO VeryUtils.com
veryutils.com
frank@veryutils.com

Kushagra0347

This code gets stuck after I add a PDF. Can anyone provide a solution?

from tkinter import *
import pygame
import PyPDF2
from gtts import gTTS
from tkinter import filedialog
from os.path import splitext

root = Tk()
root.title('PDF Audio Player')
root.geometry("500x300")

# Initialise Pygame Mixer
pygame.mixer.init()

# Add PDF Function
def addPDF():
    PDF = filedialog.askopenfilename(title="Choose a PDF", filetypes=(("PDF Files", "*.PDF"), ))
    PDF_dir = PDF

    # Strip Out the Directory Info and .pdf extension
    # So That Only the Title Shows Up
    PDF = PDF.replace('C:/Users/kusha/Downloads/', '')
    PDF = PDF.replace(".pdf", '')

    audioBookBox.insert(END, PDF)
    PDFtoAudio(PDF_dir)

def PDFtoAudio(PDF_dir):
    file = open(PDF_dir, 'rb')
    reader = PyPDF2.PdfFileReader(file)
    totalPages = reader.numPages
    string = ""

    for i in range(0, totalPages):
        page = reader.getPage(i)
        text = page.extractText()
        string += text

    outName = splitext(PDF_dir)[0] + '.mp3'
    audioFile = gTTS(text=string, lang='en')  # store file in variable
    audioFile.save(outName)  # save file to computer

# Play Selected PDF Function
def play():
    audio = audioBookBox.get(ACTIVE)
    audio = f'C:/Users/kusha/Downloads/{audio}.mp3'

    pygame.mixer.music.load(audio)
    pygame.mixer.music.play(loops=0)

# Create Playlist Box
audioBookBox = Listbox(root, bg="black", fg="red", width=70, selectbackground="gray", selectforeground="black")
audioBookBox.pack(pady=20)

# Define Player Control Button Images
backBtnImg = PhotoImage(file='Project Pics/back50.png')
forwardBtnImg = PhotoImage(file='Project Pics/forward50.png')
playBtnImg = PhotoImage(file='Project Pics/play50.png')
pauseBtnImg = PhotoImage(file='Project Pics/pause50.png')
stopBtnImg = PhotoImage(file='Project Pics/stop50.png')

# Create Player Control Frame
controlsFrame = Frame(root)
controlsFrame.pack()

# Create Player Control Buttons
backBtn = Button(controlsFrame, image=backBtnImg, borderwidth=0)
forwardBtn = Button(controlsFrame, image=forwardBtnImg, borderwidth=0)
playBtn = Button(controlsFrame, image=playBtnImg, borderwidth=0, command=play)
pauseBtn = Button(controlsFrame, image=pauseBtnImg, borderwidth=0)
stopBtn = Button(controlsFrame, image=stopBtnImg, borderwidth=0)

backBtn.grid(row=0, column=0, padx=10)
forwardBtn.grid(row=0, column=1, padx=10)
playBtn.grid(row=0, column=2, padx=10)
pauseBtn.grid(row=0, column=3, padx=10)
stopBtn.grid(row=0, column=4, padx=10)

# Create Menu
myMenu = Menu(root)
root.config(menu=myMenu)

# Add the converted audio file in the menu
addAudioMenu = Menu(myMenu)
myMenu.add_cascade(label="Add PDF", menu=addAudioMenu)
addAudioMenu.add_command(label="Add One PDF", command=addPDF)

root.mainloop()
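
One plausible reason for the window getting stuck (not verified against this exact script) is that the whole conversion, including the network request to the TTS service, runs on the Tk main thread, so the event loop cannot redraw until it finishes. A common workaround is to run the slow part on a worker thread and poll for the result from the main loop. The sketch below uses only the standard library; `start_conversion` and `poll_results` are hypothetical helpers, and `convert` stands in for a function like PDFtoAudio that returns the output path.

```python
import queue
import threading


def start_conversion(convert, pdf_path, results):
    """Run convert(pdf_path) on a worker thread and put the outcome on `results`."""
    def worker():
        try:
            results.put(("ok", convert(pdf_path)))
        except Exception as exc:  # report any failure back to the GUI thread
            results.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll_results(root, results, on_done, interval_ms=200):
    """Check the queue from the Tk main loop without blocking it."""
    try:
        status, value = results.get_nowait()
    except queue.Empty:
        # Nothing yet: ask Tk to call this function again a little later.
        root.after(interval_ms, poll_results, root, results, on_done, interval_ms)
    else:
        on_done(status, value)
```

In addPDF, the direct call would then become something like start_conversion(PDFtoAudio, PDF_dir, results) followed by poll_results(root, results, on_done), with all widget updates kept inside on_done so they still happen on the main thread.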

abragred • Edited

This seems like a very nice idea. I might get my friend who knows Python to do this for me. I'm not yet at a high enough level with technology to do this alone, but I'd like to learn. One thing I'm proud of is that I can help my family with PDF forms. I especially help my mom a lot because she has many forms to fill out for her job but isn't comfortable with technology. I found this site pdfliner.com/alternative/sejda_alt... that lets me edit anything I want.

Dima Naboka • Edited

I have a problem running [vagrant@centos8 ~]$ sudo pip3 install pdftotext on CentOS 8:
error: command 'gcc' failed with exit status 1
Command "/usr/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-7_3v7vuh/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-ac0irxfy-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-7_3v7vuh/pdftotext/

I'm running Python 3.6.8, do I have to use Python 3.8 explicitly?

Cristian Carvajal 👽

Great!!
Does it work in any language?

Priyanshu Kumar

Will it also read page numbers, footers, or any other extraneous text?

Vaibhav Kaushik • Edited

Yes, of course, since they are also text.

Priyanshu Kumar

Using machine learning, you can avoid those things.
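
Short of machine learning, a plain heuristic can already catch a lot of this. The sketch below is not an ML approach: it assumes the text is available as one string per page, drops lines that are only a page number, and drops lines that repeat on most pages (typical for running headers and footers). The function name and the 60% threshold are arbitrary choices.

```python
import re
from collections import Counter

# Matches lines such as "12" or "Page 12".
PAGE_NUMBER = re.compile(r"^\s*(page\s+)?\d+\s*$", re.IGNORECASE)


def strip_page_furniture(pages, min_share=0.6):
    """Remove page-number lines and lines repeated on most pages."""
    # Count on how many pages each distinct non-empty line appears.
    seen = Counter()
    for page in pages:
        seen.update({line.strip() for line in page.splitlines() if line.strip()})
    threshold = max(2, int(len(pages) * min_share))
    repeated = {line for line, count in seen.items() if count >= threshold}

    cleaned = []
    for page in pages:
        kept = [
            line for line in page.splitlines()
            if line.strip() not in repeated and not PAGE_NUMBER.match(line)
        ]
        cleaned.append("\n".join(kept))
    return cleaned
```

It will also remove body lines that happen to be just a number, so the output is worth a quick look before converting a whole book.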

Rishabh Aggarwal

Hey, this is really cool.

Mustafa Anas Author

hey thanks buddy!
glad you liked it

Belkin

Do you have any demo audio files? I'm really interested to hear it. :)

Mustafa Anas Author

Run this code and hear the result

from gtts import gTTS
final_file = gTTS(text='Demo String', lang='en')  # store file in variable
final_file.save("Generated Speech.mp3")  # save file to computer
Harlin Seritt

Good stuff, Mustafa! I created a GitHub project for this in case anyone wants to see and get an idea of how this is set up on an Ubuntu 18.04 workstation.

github.com/hseritt/pdf2voice

Mustafa Anas Author

Thank you for sharing the repo Harlin!

malraharsh

This idea is great. But if you just want to listen, use the Moon+ Reader app. It converts text to speech.

Blake Stansell

Awesome, awesome, awesome! I'm guessing they're ok to listen to?

Mustafa Anas Author

Yea they get the job done

Ankur Tiwari

Cool stuff!

bga

Really useful article.

Usman Kamal

Nice one Mustafa!

I'm curious what would happen if the PDF has images or mathematical equations?

Abhinav Kumar Srivastava

Suggestion: display the status of the conversion.
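
A minimal sketch of what that could look like, assuming the conversion can be done one page at a time. `convert_page` is a stand-in for whatever per-page work the script does, and `report` defaults to print.

```python
def convert_with_status(pages, convert_page, report=print):
    """Convert pages one by one and report progress after each page."""
    total = len(pages)
    results = []
    for number, page in enumerate(pages, start=1):
        results.append(convert_page(page))
        report(f"Converted page {number} of {total} ({number * 100 // total}%)")
    return results
```

Passing a different `report` callable (for example one that updates a label in a GUI) keeps the loop itself unchanged.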

Oyeladun Rapheal Kunle

I copied this code and pasted it into Python 3 (Anaconda), and nothing was displayed: no error, no output. Why is that? Thanks.

Mustafa Anas Author

I do not use Anaconda, so I can't guess what the problem is.
Just make sure you have all the needed packages installed and it should run smoothly.
