DEV Community

RyanSmoak

Data preprocessing for extractive QA

I want to start a new project doing extractive QA over a text corpus that is hundreds of pages long, but I don't know how to preprocess the data. I was planning on training BERT on the corpus, which looks like this:
[Image: excerpt from the text corpus]
How can I turn this into something that BERT can learn from? If you need me to clarify anything, just ask. All help is appreciated.
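
To make the question concrete, here is a minimal sketch of one common preprocessing shape for extractive QA, assuming the corpus has already been extracted to plain text. It splits the text into overlapping word windows and wraps each one in a SQuAD-style record. The helper names `chunk_corpus` and `make_example`, the window sizes, and the record fields are illustrative assumptions, not part of any library. Note that extractive QA models are usually fine-tuned on question/answer pairs, so the questions and answer spans would still have to be written or sourced separately.

```python
def chunk_corpus(text, window=200, stride=100):
    """Split text into overlapping word windows (hypothetical helper).

    window and stride are counted in whitespace-separated words; the
    defaults are arbitrary assumptions, not tuned values.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        piece = words[start:start + window]
        chunks.append({"id": f"chunk-{len(chunks)}", "context": " ".join(piece)})
        if start + window >= len(words):
            break  # the last window already reaches the end of the text
        start += stride
    return chunks


def make_example(chunk, question, answer_text):
    """Build a SQuAD-style record, or None if the answer is not in the chunk.

    The answer must appear verbatim in the context, because extractive QA
    predicts a character span inside the context.
    """
    answer_start = chunk["context"].find(answer_text)
    if answer_start == -1:
        return None
    return {
        "id": chunk["id"],
        "context": chunk["context"],
        "question": question,
        "answers": {"text": [answer_text], "answer_start": [answer_start]},
    }
```

The overlap between windows is there so that an answer sitting on a chunk boundary still appears whole in at least one chunk. Records in this shape could then be passed to a tokenizer and a pretrained QA model for fine-tuning.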
