Extracting Text from an Image with Python Is Pretty Easy
For one good reason or another, you might want to extract text from images, and the usual question is: how do I do that? Well, if you happen to be among those asking, here comes the cavalry: Python is the answer.
We only need a few lines of code and just two Python modules.
Without further ado, let's dive into the code. We'll begin by importing the relevant Python modules for this program.
We are going to use Pytesseract and Pillow (PIL). If this is your first time using these modules, you might want to skim their documentation first.
Installing both modules is easy too: just use the usual pip install [module] (the package names are pytesseract and Pillow). Keep in mind that pytesseract is only a wrapper around the Tesseract OCR engine, so the engine itself has to be installed as well. If you are on a Linux distro like Ubuntu and get a "not found" error from pytesseract, try installing the engine first with sudo apt install tesseract-ocr, then install pytesseract with pip afterwards.
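Before running any OCR code, it can help to confirm that the Tesseract program is actually reachable. The snippet below is a minimal sketch that uses only the standard library; the helper name tesseract_available is made up for this post, and it assumes the executable is called tesseract and sits on your PATH.

```python
import shutil


def tesseract_available(binary_name: str = "tesseract") -> bool:
    """Return True if an executable with this name is found on PATH.

    Assumption: the OCR engine is installed under the name "tesseract".
    """
    return shutil.which(binary_name) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract found on PATH.")
    else:
        # On Ubuntu, the post suggests: sudo apt install tesseract-ocr
        print("Tesseract not found; install the engine before using pytesseract.")
```

If this prints the "not found" message, installing pytesseract with pip alone will not be enough.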
Here is the code:
# Import the necessary modules
import pytesseract as pt
from PIL import Image

# Convert the image to text
img = Image.open('/[image path]/image.jpg')  # replace with the path to your image
extracted = pt.image_to_string(img)

print(extracted)
print(type(extracted))  # the result is a plain Python string (str)
That's all! You now have whatever readable text was on your image. Easy peasy, huh? Here we used a .jpg image, but a .png works just as well, depending on your image's extension.
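Since the extension does not really matter, the same idea stretches to a whole folder of mixed .jpg and .png files. This is only a rough sketch: find_images, extract_all, and the IMAGE_EXTENSIONS set are names invented for this example, and the only pytesseract and Pillow calls used are the two already shown in this post.

```python
from pathlib import Path

# Assumption: these are the extensions you care about; extend the set as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images(folder, extensions=IMAGE_EXTENSIONS):
    """Return the image files directly inside `folder`, sorted by path.

    The extension check is case-insensitive, so "photo.PNG" also matches.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def extract_all(folder):
    """Run OCR on every image in `folder`; returns {file name: extracted text}."""
    # Imported here so find_images() still works without the OCR packages installed.
    import pytesseract as pt
    from PIL import Image

    results = {}
    for path in find_images(folder):
        with Image.open(path) as img:
            results[path.name] = pt.image_to_string(img)
    return results
```

Calling extract_all on a folder of scans would then give you one string per image, keyed by file name.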
From here, you can do whatever you want with your extracted text. Now that I have shown you that extracting text from an image with Python is super easy, feel free to also check out this image manipulation tutorial and see if it could come in handy in your next Python project.
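As one example of "whatever you want", you could tidy the text up and save it to a file. The sketch below uses a made-up sample string in place of real OCR output, and the helpers clean_ocr_text and save_text are hypothetical names, not part of pytesseract.

```python
from pathlib import Path


def clean_ocr_text(text: str) -> str:
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def save_text(text: str, output_path) -> Path:
    """Write `text` to `output_path` as UTF-8 and return the path."""
    output_path = Path(output_path)
    output_path.write_text(text, encoding="utf-8")
    return output_path


if __name__ == "__main__":
    # Stand-in for the string returned by pt.image_to_string(img)
    extracted = "  Hello World  \n\n   second line \n"
    cleaned = clean_ocr_text(extracted)
    print(cleaned)
    save_text(cleaned, "extracted.txt")  # hypothetical output file name
```

Since the OCR result is just a str, anything you normally do with strings (searching, splitting, writing to disk) works on it.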