Hi again. In my previous post I had a problem with searching a large PDF file for a keyword that can appear on multiple pages of the file, and in some cases more than once on a single page.
I've used PyPDF2 to open a given PDF file and extract its text page by page. I then search that text for the given keyword, check which pages it was found on and how many times per page, and finally split those pages from the original file and merge them into one final file, so that it can be printed with only the useful data and none of the irrelevant data from the original file.
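For reference, the workflow described above could be sketched roughly as follows. This is a minimal illustration, not the exact code in use: the function names `count_keyword_per_page` and `extract_matching_pages` are hypothetical, and it assumes a PyPDF2 release that provides `PdfReader`, `PdfWriter`, `extract_text()` and `add_page()` (older releases use different names for these).

```python
def count_keyword_per_page(page_texts, keyword):
    """Return {page_index: occurrences} for every page that contains the keyword.

    Hypothetical helper: `page_texts` is a list with one extracted string per page.
    Matching is case-sensitive and counts non-overlapping occurrences.
    """
    hits = {}
    for index, text in enumerate(page_texts):
        # Treat a missing extraction result (None) as an empty page.
        occurrences = (text or "").count(keyword)
        if occurrences:
            hits[index] = occurrences
    return hits


def extract_matching_pages(src_path, dst_path, keyword):
    """Copy every page of src_path that contains the keyword into dst_path."""
    # Imported here so the counting helper above works without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    # Extract the text of each page, in page order.
    page_texts = [page.extract_text() for page in reader.pages]
    hits = count_keyword_per_page(page_texts, keyword)

    # Merge the matching pages into a single output file.
    writer = PdfWriter()
    for index in sorted(hits):
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out_file:
        writer.write(out_file)

    return hits
```

Keeping the counting step separate from the PDF handling makes it easy to check the search logic on plain strings, independently of whatever the extraction step returns.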
Everything works fine with test/dummy data in English characters, but the original file is in Greek, and the text-extraction function of PyPDF2 returns an empty string.
So how would you approach this problem?