I've been banging my head against NLP concepts for my BSc thesis. After a day of frustration with the never-ending load times of a CSV file, I switched to Pandas and everything started working fast as hell.


That's great, tell us more about it.


I had this huge CSV with two columns, "filename" and "email". I had to extract the author and body from each email and group the texts by author. With a regular Python dict it took a very long time, so I switched to Pandas with the C engine, and using the groupby function I managed to handle this in a reasonable time.
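
For anyone curious, here is a rough sketch of what that kind of pipeline could look like. It is not my exact code: the sample rows are made up, and it assumes each "email" cell holds a raw message with a `From:` header followed by a blank line and the body, parsed with the standard library's `email` module. The tiny in-memory CSV stands in for the real file.

```python
import io
from email import message_from_string

import pandas as pd

# Made-up sample data standing in for the real CSV (columns as in the post).
sample = pd.DataFrame({
    "filename": ["a.txt", "b.txt", "c.txt"],
    "email": [
        "From: alice@example.com\nSubject: hi\n\nfirst text",
        "From: bob@example.com\nSubject: yo\n\nsecond text",
        "From: alice@example.com\nSubject: re\n\nthird text",
    ],
})
buffer = io.StringIO(sample.to_csv(index=False))

# engine="c" selects pandas' C parser explicitly.
df = pd.read_csv(buffer, engine="c")

# Parse each raw message once, then pull out the author and the body.
# Assumption: single-part messages, so get_payload() returns a string.
parsed = df["email"].map(message_from_string)
df["author"] = parsed.map(lambda msg: msg["From"])
df["body"] = parsed.map(lambda msg: msg.get_payload())

# Group by author and join all of that author's bodies into one text.
texts_by_author = df.groupby("author")["body"].agg("\n".join)
print(texts_by_author)
```

With a real file you would pass its path to `pd.read_csv` instead of the buffer. Multipart emails would need extra handling, since `get_payload()` returns a list of parts for those.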

