DEV Community

villival

Need DEVELOPER (PYTHON & OCR TOOLS EXPERIENCE)

JOB DESCRIPTION – DEVELOPER (PYTHON & OCR TOOLS EXPERIENCE)

A Chennai-based BPO company is converting PDF, JPG and PNG statements to .CSV using Python and OCR technology. It is looking for a developer with experience in Python, Machine Learning, OCR tools, Azure Storage Queue and Linux for a period of 3-4 months to develop an OCR tool that extracts statements from PDF, JPG and PNG files to .CSV format, per the scope below.

Scope of work:

Input Documents: Bank statements, credit card statements and payroll summaries (various non-standard templates) in PDF, JPG and PNG formats

Output Expectations:

  1. The OCR tool should identify the type of document after extraction (bank statement, credit card statement or payroll summary) based on keyword criteria
  2. The OCR tool should extract the statements from PDF and image formats to .CSV (keywords will be provided by the client where applicable)
  3. Integration with client application – the OCR tool should be integrated with the client's in-house application (DB / storage queue)
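For candidates gauging the scope, the keyword-based classification and the .CSV output described above could be sketched roughly as follows. This is only an illustration, assuming the text has already been extracted by an OCR step: the keyword lists, the function names `classify_document` and `rows_to_csv`, and the column names in the usage note are hypothetical, since the listing says the real keywords come from the client.

```python
import csv
import io

# Hypothetical keyword lists; the listing says the client supplies the real ones.
KEYWORDS = {
    "bank_statement": ["account number", "opening balance", "closing balance"],
    "credit_card_statement": ["credit limit", "minimum payment due", "card number"],
    "payroll_summary": ["gross pay", "net pay", "pay period"],
}


def classify_document(text):
    """Return the document type with the most matching keywords, or "unknown"."""
    lowered = text.lower()
    # Count how many keywords of each document type appear in the OCR text.
    scores = {
        doc_type: sum(1 for keyword in keywords if keyword in lowered)
        for doc_type, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def rows_to_csv(rows, header):
    """Serialize already-extracted rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `classify_document("Opening Balance 100, Closing Balance 250")` matches two of the hypothetical bank statement keywords, and `rows_to_csv([["2021-01-01", "Deposit", "100.00"]], ["date", "description", "amount"])` produces a header line followed by one data line. Real statements with non-standard templates would likely need more robust matching than plain substring checks.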

Project requirements:

  1. The candidate will be responsible for developing the OCR tool and should provide support until the project goes live (full-time availability)
  2. The company will provide support for integrating the developed OCR tool with the in-house application
  3. Work from home is feasible

Technical skills required:
2-3 years of experience in Python, Machine Learning applications and OCR tools (AWS Textract or Google Tesseract). Experience in AWS Textract is an added advantage.
Minimum one year of experience in Azure cloud. Experience in Azure Kubernetes Service (AKS) is an added advantage.
Knowledge of Azure Storage Queue and cron jobs would be an added advantage.

Top comments (1)

Gracie Gregory (she/her)

Hi there, this post might fit better as a DEV Listing. It’s a dedicated area of the platform where community members and organizations are encouraged to publish information related to events, products, services, job listings, and everything in between.