PDF file manipulation with Python 3 (Problem)

#python #help #question

Hi, I have a problem I would like help with. I'm working on a Python 3 script that manipulates a large PDF file (50-60+ pages long). I want to find a specific keyword in that file; the keyword is repeated multiple times, and each occurrence refers to a different data set. I then want to record how many times the keyword was found and on which pages, split those pages out of the original file, and merge them together into a single file.
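To make the goal concrete, here is a minimal sketch of that workflow. It assumes the third-party `pypdf` package and PDFs that contain real (not scanned) text; the function names `find_keyword_pages` and `extract_keyword_pages` are hypothetical, chosen only for illustration. Because every page is scanned, it does not matter on which pages the keyword appears in a given file.

```python
def find_keyword_pages(page_texts, keyword, case_sensitive=False):
    """Return {zero_based_page_index: occurrences} for pages containing keyword."""
    if not keyword:
        raise ValueError("keyword must be a non-empty string")
    needle = keyword if case_sensitive else keyword.lower()
    hits = {}
    for index, text in enumerate(page_texts):
        haystack = text if case_sensitive else text.lower()
        count = haystack.count(needle)
        if count:
            hits[index] = count
    return hits


def extract_keyword_pages(src_path, dst_path, keyword):
    """Copy every page of src_path that mentions keyword into dst_path."""
    # Assumption: pypdf is installed (pip install pypdf). It is imported
    # here so the pure search helper above works without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    # extract_text() may yield nothing for image-only pages, hence `or ""`.
    texts = [(page.extract_text() or "") for page in reader.pages]
    hits = find_keyword_pages(texts, keyword)

    if hits:
        writer = PdfWriter()
        for index in hits:  # dict keeps insertion order, so page order is kept
            writer.add_page(reader.pages[index])
        with open(dst_path, "wb") as fh:
            writer.write(fh)
    return hits
```

The returned dictionary gives both answers at once: its keys are the matching pages (zero-based) and `sum(hits.values())` is the total number of occurrences. Scanned PDFs would need OCR first, which this sketch does not cover.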

I will use multithreading, of course, because this script will run alongside others on a small in-house server that is already running quite a lot of other scripts.
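A rough sketch of how several files could be processed with a small, capped thread pool so the server is not overloaded. `run_batch` and its `worker` argument are hypothetical names; `worker` stands for whatever function handles one PDF path. Note that text extraction in pure Python is mostly CPU-bound, so threads may bring little speed-up; `concurrent.futures.ProcessPoolExecutor` exposes the same interface if that turns out to be the case.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batch(paths, worker, max_workers=2):
    """Run worker(path) for each path; return {path: result_or_exception}."""
    results = {}
    # A low max_workers keeps the load modest on a shared machine.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # Keep the error so one bad file does not stop the batch.
                results[path] = exc
    return results
```

A call such as `run_batch(["a.pdf", "b.pdf"], some_function)` would then return one entry per file, holding either the worker's result or the exception it raised.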

I found some things online, but nothing that matches my problem, apart from some Python libraries that seem able to do what I'm looking for. I still have no idea how to find this keyword in the file, because the keyword isn't on the same pages from one file to the next; it's different in every file!
