Discussion on: Beyond CSV files: using Apache Parquet columnar files with Dask to reduce storage and increase performance. Try it now!

paddy3118

Sadly I don't use Dask, but in the past I have used zcat on a Linux command line to stream a compressed CSV into a script's stdin, so the script could process it without the whole of the data ever being uncompressed in memory or on disk.
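
For anyone curious, here is a minimal sketch of that pattern in Python. It is an illustration only, not the exact script described above: the file name `data.csv.gz`, the script name `process.py`, the function `count_rows_and_sum`, and the column name passed on the command line are all hypothetical, and it assumes the CSV has a header row and the chosen column is numeric.

```python
import csv
import sys
from typing import TextIO, Tuple


def count_rows_and_sum(stream: TextIO, column: str) -> Tuple[int, float]:
    """Read CSV rows one at a time from a text stream.

    Only the current row is held in memory, so the input can be
    far larger than RAM. `column` is assumed to hold numbers.
    """
    reader = csv.DictReader(stream)  # first line is treated as the header
    rows = 0
    total = 0.0
    for row in reader:
        rows += 1
        total += float(row[column])
    return rows, total


# Hypothetical invocation: zcat data.csv.gz | python process.py amount
if __name__ == "__main__" and len(sys.argv) == 2:
    n_rows, col_sum = count_rows_and_sum(sys.stdin, sys.argv[1])
    print(f"{n_rows} rows, sum of {sys.argv[1]} = {col_sum}")
```

Because `zcat` decompresses to a pipe and `csv.DictReader` pulls lines lazily, neither side ever needs the full uncompressed file at once.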

zom-pro Author

Cool, I can totally see a use case for that: streaming into something like Apache Kafka. I will prototype a couple of things and see if it can become another little article. Thanks!