DEV Community

Abanoub Hanna
What I Learned From Developing an Arabic OCR Android App

I was looking for an Android app that extracts Arabic text from images. At first I couldn't find a single one that works without Internet access. After some days of searching I found two apps, but their OCR accuracy was horrible!

I decided to create an Arabic OCR app for Android. I started a year ago, on Oct 6, 2018. It was a great yet sad decision, because I faced too many challenges.

The first challenge was JNI on Android, which I find hard to write and deal with, so I had to familiarize myself with combining C++ and Java through JNI. The next obstacle was the poor accuracy of the available Arabic trained models for the Tesseract OCR library.

I searched for a Tesseract 4.0 build for Android and for the LSTM trained data with the highest accuracy. After too many iterations and failures over one loooooong year, I succeeded!
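For readers wondering what "works offline" involves in practice, here is a minimal sketch of one step: copying a bundled trained-data file into the folder layout the OCR engine reads from. It is not the code from my app. The class name `TrainedDataInstaller` and the method `install` are hypothetical, the `tessdata` subfolder is an assumption about the Android wrapper in use (tess-two, for example, expects it), and `ara` is Tesseract's language code for Arabic. The sketch uses `java.nio.file`, which on Android needs API level 26 or newer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Tesseract or any Android wrapper.
public class TrainedDataInstaller {

    /**
     * Copies a trained-data stream to <dataRoot>/tessdata/<lang>.traineddata
     * and returns the path of the written file.
     * Assumption: the OCR wrapper looks for a "tessdata" subfolder under dataRoot.
     */
    public static Path install(InputStream model, Path dataRoot, String lang)
            throws IOException {
        Path tessdata = dataRoot.resolve("tessdata");
        Files.createDirectories(tessdata); // no-op if the folder already exists
        Path target = tessdata.resolve(lang + ".traineddata");
        // Overwrite any older copy so a model update takes effect.
        Files.copy(model, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes stand in for the real LSTM model file.
        byte[] placeholder = {1, 2, 3};
        Path root = Files.createTempDirectory("ocr-demo");
        Path written = install(new ByteArrayInputStream(placeholder), root, "ara");
        System.out.println("Model staged at " + written);
    }
}
```

In a real app the stream would come from the bundled assets and `dataRoot` would be the app's private files directory. That same directory is then the data path handed to the OCR engine.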

Finally, I managed to make the OCR app accurate, fully offline, and easy to use. You can use it! It's here.

I learned that developing software is an iterative process, not an overnight success. Research is your first step, so don't get stuck experimenting without it. Use libraries, and don't reinvent the wheel (stand on the shoulders of giants).

If you have any suggestions or advice, tell me in the comments. Thanks.
