javi santana


4 keys to analyzing data fast

  1. Don't store the data you don't need. It sounds silly, but a lot of the data you end up reading is not useful.

  2. Don't read the data you don't need. Skip it using indices or any other tool your database/framework provides.

  3. Run heavy operations later. For example, filtering data is cheaper than aggregating it, so when processing data always filter first and do the heavy things (joins, aggregations and so on) afterwards, on fewer rows.

  4. Sort your data before storing it. Sorted data compresses much better and lets you use the full power of current hardware (sequential reads can be 100x faster than random access).
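
To make rule 3 concrete, here is a minimal sketch in plain Python. The `orders` rows, the `users_by_id` lookup and the helper names are all made up for illustration; the "join" is just a dictionary lookup standing in for whatever heavy step your database or framework runs. Both versions return the same totals, but the filter-first one pushes far fewer rows through the heavy step.

```python
from collections import defaultdict

# Hypothetical data: 1000 orders spread over 10 days and 3 users.
orders = [{"user_id": i % 3, "day": i % 10, "amount": i} for i in range(1000)]
users_by_id = {0: "ES", 1: "US", 2: "FR"}  # user_id -> country


def join(rows, users):
    """Stand-in for a heavy step: attach the user's country to every row."""
    return [{**row, "country": users[row["user_id"]]} for row in rows]


def totals_by_country(rows):
    """Aggregate the amount per country."""
    out = defaultdict(int)
    for row in rows:
        out[row["country"]] += row["amount"]
    return dict(out)


def heavy_first(rows, users, day):
    joined = join(rows, users)                      # joins every row
    kept = [r for r in joined if r["day"] == day]   # filters afterwards
    return totals_by_country(kept), len(joined)


def filter_first(rows, users, day):
    kept = [r for r in rows if r["day"] == day]     # filters before the join
    joined = join(kept, users)                      # joins only what is left
    return totals_by_country(joined), len(joined)


slow_totals, slow_joined = heavy_first(orders, users_by_id, day=3)
fast_totals, fast_joined = filter_first(orders, users_by_id, day=3)

print("same result:", slow_totals == fast_totals)
print("rows joined, heavy first:", slow_joined)
print("rows joined, filter first:", fast_joined)
```

Most databases have a query planner that tries to do this reordering for you, but it does not always succeed, so it is worth checking where the filter actually runs.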

Following these 4 rules I process large datasets 100-1000x faster than I used to.
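
The compression part of rule 4 is easy to check yourself. The sketch below uses only the Python standard library and synthetic data (random integers with many repeated values, an assumption chosen to mimic a low-cardinality column); real gains depend on your data and on the codec your storage engine uses.

```python
import random
import zlib
from array import array

# Synthetic column: 100,000 integers drawn from only 1,000 distinct values.
rng = random.Random(42)
values = [rng.randrange(1000) for _ in range(100_000)]


def compressed_size(numbers):
    """Pack the numbers as fixed-width unsigned ints and zlib-compress them."""
    raw = array("I", numbers).tobytes()
    return len(zlib.compress(raw))


unsorted_size = compressed_size(values)
sorted_size = compressed_size(sorted(values))

print("compressed bytes, unsorted:", unsorted_size)
print("compressed bytes, sorted:  ", sorted_size)
```

Sorting places equal values next to each other, which gives the compressor long repeated runs to work with. It also changes the row order, so this only applies when the order you store the data in is yours to choose.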

(image from craiyon.com generated with "f1 going fast")
