Professors at my college had great difficulty organizing all student data in one place. For my final project, I built a PDF scraper that extracts the unstructured data from the college's semester marksheet PDF files and converts it into well-defined CSV files. The project saves a considerable amount of time for the teachers and office staff at my college, who previously had to enter student data manually. It also analyzes students' results from the current and previous semesters and, in turn, predicts their future grades using machine learning.
I built the project mainly in Python. One of the main issues I ran into was choosing the most suitable Python library for scraping the PDF files. I tried many different libraries and finally selected Camelot.
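The PDF-to-CSV step can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the output naming scheme, and the choice of the "lattice" flavor (which suits tables drawn with ruled lines) are all assumptions made for the example. It relies only on Camelot's `read_pdf` function and the pandas DataFrame that each extracted table exposes as `df`.

```python
from pathlib import Path


def csv_name_for(pdf_path, index):
    """Build an output file name such as 'sem5_table1.csv'.

    The naming scheme is hypothetical, chosen only for this sketch.
    """
    return f"{Path(pdf_path).stem}_table{index}.csv"


def export_tables(pdf_path, out_dir, pages="all", flavor="lattice"):
    """Extract every table Camelot detects in the PDF and write one CSV per table.

    Returns the list of CSV paths that were written.
    """
    # Imported here so csv_name_for stays usable without Camelot installed.
    import camelot

    # flavor="lattice" is an assumption; "stream" is the alternative for
    # tables separated by whitespace instead of ruled lines.
    tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor=flavor)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for i, table in enumerate(tables, start=1):
        target = out_dir / csv_name_for(pdf_path, i)
        # table.df is a pandas DataFrame with integer column labels,
        # so the header row and index are left out of the CSV.
        table.df.to_csv(target, index=False, header=False)
        written.append(target)
    return written
```

With a hypothetical file named `sem5_marksheet.pdf`, calling `export_tables("sem5_marksheet.pdf", "csv_out")` would write one CSV per detected table into `csv_out`. Real marksheets would likely need further cleanup of the extracted rows before analysis.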
This project is very close to my heart, as I spent a huge amount of time researching and developing it. My main motive for building it was to help my teachers, whom I saw doing manual data entry of student details (which took weeks to complete). My happiest moment was when one of my teachers used the project for her college work and praised me for building it.