Jeremy Morgan

Posted on Oct 23, 2023 • Originally published at jeremymorgan.com

How to Read Text From an Image with Python

#python #tutorials #programming #beginners

If you want to read text from an image with a simple Python script, this tutorial is for you. Thanks to the work of many great people over the last few decades, you can read the text from an image with a few lines of code. Really! Let's jump in.

What is OCR? Tesseract?

Optical Character Recognition, or OCR, has been around for a long time. It's a technique that "reads" different types of documents into editable and searchable text. It works by recognizing characters in the image and converting them into machine-readable text. It's a lot of magic, but it works well.

Tesseract is an open-source OCR engine that was originally developed at Hewlett-Packard and later sponsored by Google. It's accurate on clean, printed text and supports more than 100 languages. This library will do all the heavy lifting for us. We'll use it in this tutorial to quickly read the text in some images.

Step 1: Set up your Python Environment

First, you'll need to make sure Python is installed. We're going to create a virtual environment.

I'm using Linux, so I'll create a virtual environment in a directory named textreader by typing

python -m venv textreader

Then activate it:

source textreader/bin/activate
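If you're not sure the environment is active, here's a minimal sketch of a check you can run with the activated interpreter. The function name in_virtualenv is my own, and the check assumes an environment created with python -m venv as above.

```python
import sys


def in_virtualenv():
    """Return True when Python is running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Using Python from:", sys.prefix)
```

If it reports False, run the activate command again in the same terminal before installing anything.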

Step 2: Install the Required Libraries

First, we'll need to install Tesseract on your system. Follow the official Tesseract installation instructions for your chosen operating system.

Make sure Tesseract is installed by typing:

tesseract -v

You should see the Tesseract version number, followed by details about the libraries it was built with.

Then, we'll install a couple of Python libraries.

Pytesseract is a Python wrapper for the Tesseract OCR engine, which makes Tesseract easy to use from Python applications. We'll install that and Pillow.

Pillow is a fork of PIL, the Python Imaging Library. It's used for image processing and manipulation, including pre-processing images before OCR. Steps like thresholding can improve the accuracy of the reading.

Next, we'll install Pytesseract and Pillow together for our first application:

pip install pytesseract
pip install pillow

pip will report each package as it installs. In some cases, it may say the requirement is already satisfied for Pillow.

And we're ready to go.
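Before moving on, a quick sanity check can save some head-scratching later. This is a rough sketch using only the standard library; check_setup is a name I made up, and it only confirms that the tesseract command is on your PATH and that both packages can be found, not that they work together.

```python
import importlib.util
import shutil


def check_setup():
    """Report whether the Tesseract binary and both Python packages can be found."""
    return {
        # shutil.which returns the executable's full path, or None if it is not on PATH
        "tesseract": shutil.which("tesseract") is not None,
        # find_spec returns None when a top-level module cannot be located
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # Pillow is installed as "pillow" but imported as "PIL"
        "PIL": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, found in check_setup().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything marked MISSING points back to the matching install step above.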

Step 3: Select your Image

To start out, I'm going to choose something easy. I'll use a screenshot from my website. This will be clear, easy-to-read text that should work great.

I'll save that as image-1.jpg in my folder.

Step 4: Write the Script

Now, we're ready to build our Python script to read the text from that image and output it to the screen.

First, we'll import the libraries:

import pytesseract
from PIL import Image

Then open the image:

image = Image.open('image-1.jpg')

And then, we'll use Tesseract to convert the text in the image to a string. Didn't I say this library does all the heavy lifting for us?

text = pytesseract.image_to_string(image)

Finally, we'll print it out:

print(text)
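If you'd rather have those four lines as something reusable, here's one way it might look. Treat it as a sketch: read_text is my own helper name, and passing lang="eng" just spells out Tesseract's usual default language rather than anything this tutorial requires.

```python
from pathlib import Path


def read_text(image_path, lang="eng"):
    """Return the text Tesseract recognizes in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such image: {path}")

    # Imported here so a bad path is reported before the OCR libraries load.
    import pytesseract
    from PIL import Image

    # The with-block closes the image file once OCR is done.
    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

Calling print(read_text('image-1.jpg')) from the folder that holds the image should give the same result as the step-by-step version.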

Let's run it and see what it looks like.

Step 5: Watch the Magic Happen

We run our script, and the text from the screenshot prints to the terminal.

Awesome! It's not perfect, but it's pretty darn good. You can read the text from the image we sent, and it's formatted roughly the way it is in the image.
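If you want to reuse that text, the raw string can be a little untidy. Here's a minimal sketch of one way to clean it up. The helper name tidy_ocr_text is mine, not part of pytesseract, and it only trims trailing spaces and squeezes repeated blank lines; it doesn't correct misread characters.

```python
def tidy_ocr_text(text):
    """Strip trailing whitespace and collapse runs of blank lines in OCR output."""
    tidy_lines = []
    previous_blank = True  # treat the start as blank so leading empty lines are dropped
    for line in text.splitlines():
        line = line.rstrip()  # keep leading indentation, drop trailing spaces
        is_blank = not line
        if is_blank and previous_blank:
            continue  # skip the second and later blank lines in a run
        tidy_lines.append(line)
        previous_blank = is_blank
    # Drop a blank line left over at the very end
    while tidy_lines and not tidy_lines[-1]:
        tidy_lines.pop()
    return "\n".join(tidy_lines)
```

You'd call it on the string from the script, for example print(tidy_ocr_text(text)).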

Congrats! You can now read the text from images in Python. Next, we'll look at some more advanced stuff.

Learning the Limitations

In our first example, we had a very clear image. The text is formatted and crisp in that image, so it's easy to read. Let's step it up a bit.

I picked a more challenging image, one from Pexels, that isn't quite so easy.

Let's see what the output is when reading this image:

Oof. Nothing. I included this because it's important to know the limitations of this process. Unusual fonts and different angles will affect how well this works. There isn't much we can do to read this image without some extensive work.
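One way to spot these failures in a script is to look at how confident Tesseract is about each word. Pytesseract's image_to_data function, called with output_type=pytesseract.Output.DICT, returns a dictionary with parallel "text" and "conf" lists. The sketch below filters on that; confident_words is a hypothetical helper and the cutoff of 60 is an arbitrary assumption, so tune it for your own images.

```python
def confident_words(data, min_conf=60.0):
    """Keep words whose Tesseract confidence is at least min_conf.

    `data` is expected to look like the dictionary returned by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT).
    """
    words = []
    for word, conf in zip(data["text"], data["conf"]):
        # Layout rows (blocks, lines) carry a confidence of -1 and empty text.
        # float() copes with confidences reported as ints, floats, or strings.
        if word.strip() and float(conf) >= min_conf:
            words.append(word)
    return words


# Hand-written sample in the same shape, so the sketch runs without an image.
sample = {"text": ["", "Hello", "wor1d", " "], "conf": ["-1", "96", "31", "95"]}
print(confident_words(sample))  # prints ['Hello']
```

An empty list back from a real image is a good hint that it needs pre-processing, or that it's simply beyond what this setup can read.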

Conclusion

In this tutorial, we learned how to use Tesseract to read text from an image and put it into a machine-readable form. We can read many other things with OCR, and we'll deep dive into some of this stuff in future articles.

Feel free to play around with this and see what you can come up with! In a future tutorial, we'll use OpenCV to refine things and do more pre-processing of the images we'll read from. It will be fun.

Bookmark this blog and come back for more cool Python tutorials.

Questions? Comments? Yell at me!
