DEV Community

Pacharapol Withayasakpunt


Please ELI5 what Parquet is for, and NOT for

I am trying to understand how good Apache Parquet is as a:

  • Data storage format (when you do NOT have Hadoop, only your local computer)
    • How big are the files?
    • How reliable is it?
  • Query-able format
    • Do I have to build an index first? (Unique indices are probably not possible?)
    • Speed?
    • Resource usage?

As far as I understand, Parquet may not be good for frequent writes or updates, but is it good enough for a static database?

You can use the ever-popular SQLite as the benchmark for comparison, disregarding SQLite-only features such as foreign keys, unique indices, full-text search and multiple tables.

BTW, I have seen an SQLite file grow to 700 MB for data that is only a few megabytes as a final CSV, and I am not sure it is reliable as storage anymore...
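
For anyone who wants to reproduce that kind of size comparison, here is a minimal sketch using only the Python standard library. It writes the same rows to a CSV file and to an SQLite file, deletes most of the rows, and then runs `VACUUM`. The table name `items`, the column layout and the row count are made up for illustration. The idea that my 700 MB file came from pages freed by deletes or rewrites is only an assumption, one possible cause among others.

```python
import csv
import os
import sqlite3
import tempfile


def compare_sizes(n_rows=50_000):
    """Write identical rows to CSV and SQLite; return file sizes in bytes."""
    # Hypothetical sample data: (id, name, value)
    rows = [(i, f"name-{i}", i * 0.5) for i in range(n_rows)]

    with tempfile.TemporaryDirectory() as workdir:
        csv_path = os.path.join(workdir, "data.csv")
        db_path = os.path.join(workdir, "data.sqlite")

        with open(csv_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["id", "name", "value"])
            writer.writerows(rows)
        csv_size = os.path.getsize(csv_path)

        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, value REAL)"
        )
        conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
        conn.commit()
        size_full = os.path.getsize(db_path)

        # Delete about 90% of the rows. With SQLite's default settings the
        # freed pages normally stay inside the file for later reuse.
        conn.execute("DELETE FROM items WHERE id >= ?", (n_rows // 10,))
        conn.commit()
        size_after_delete = os.path.getsize(db_path)

        # VACUUM rebuilds the database file and drops the free pages.
        conn.execute("VACUUM")
        size_after_vacuum = os.path.getsize(db_path)
        conn.close()

    return {
        "csv": csv_size,
        "sqlite_full": size_full,
        "sqlite_after_delete": size_after_delete,
        "sqlite_after_vacuum": size_after_vacuum,
    }


if __name__ == "__main__":
    for label, size in compare_sizes().items():
        print(f"{label}: {size} bytes")
```

If the file shrinks a lot after `VACUUM`, the bloat was free pages rather than real data, which would make the raw file size an unfair number to compare against Parquet.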
