DEV Community

Kemal Cholovich


#005 How to Convert JSON to NDJSON?

NDJSON is a convenient format for storing or streaming structured data that may be processed one record at a time.

How to convert JSON to NDJSON using npm:

Step 1.

npm install -g json-to-ndjson

Node.js and npm are required!

Step 2.

cat file.json | json-to-ndjson

How to convert JSON to NDJSON using jq:

jq is a lightweight and flexible command-line JSON processor.

Step 1.

Using this simple line of code, you can convert and save files in NDJSON format:

cat test.json | jq -c '.[]' > testNDJSON.json
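To make the result concrete, here is a minimal sketch of the jq conversion on a tiny input. The file names sample.json and sample.ndjson and the two records are made up for illustration; only the jq -c '.[]' filter comes from the command above.

```shell
# Hypothetical sample input: a pretty-printed JSON array with two records.
cat > sample.json <<'EOF'
[
  {"id": 1, "name": "alpha"},
  {"id": 2, "name": "beta"}
]
EOF

# '.[]' emits each array element; -c prints each one compactly on its own line.
jq -c '.[]' sample.json > sample.ndjson

cat sample.ndjson
# {"id":1,"name":"alpha"}
# {"id":2,"name":"beta"}
```

Each line of the output is a complete JSON object, which is what makes the file NDJSON.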

What do we get afterwards?

We've just converted the JSON array in file.json (or a JSON object containing a JSON array) to NDJSON. Please note that when you use the npm method, the same file we've just processed (file.json) becomes formatted as NDJSON! If you need to keep the original JSON file, create a backup first!
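When the array sits inside a JSON object, the jq filter has to point at the key that holds it. The sketch below assumes a hypothetical key named rows and hypothetical file names; it also makes a backup copy first, as suggested above.

```shell
# Hypothetical input: a JSON object whose "rows" key holds the array.
cat > wrapped.json <<'EOF'
{"rows": [{"id": 1}, {"id": 2}, {"id": 3}]}
EOF

# Keep a copy of the original before converting anything.
cp wrapped.json wrapped.json.bak

# '.rows[]' emits each element of the nested array, one per line with -c.
jq -c '.rows[]' wrapped.json > wrapped.ndjson

cat wrapped.ndjson
# {"id":1}
# {"id":2}
# {"id":3}
```

Writing the result to a separate file leaves the source JSON untouched, so the backup is only a safety net here.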

Why is NDJSON so important?

* It works well with UNIX-style text processing tools and shell pipelines.
* It's a great format for log files.
* It's a flexible format for passing messages between cooperating processes.
* It's also the JSON format that Google BigQuery uses for loading and exporting data.

BigQuery loves NDJSON

These days I am working with BigQuery. Before creating a data pipeline, I wanted to test the process by loading large, multi-gigabyte tables from a PostgreSQL data warehouse into BigQuery.

I completed the export jobs, exported a few tables from the database to JSON, saved them locally, and started pushing them to BigQuery.

Because they are larger than 4 GB (a BigQuery limit!), we must first upload the data to Google Cloud Storage. As a second step, after creating a dataset in BigQuery, we create the table from the JSON file. (This is one of the basic options for ingesting data into BigQuery.)

At this step I got the error: “Error while reading data, error message: Failed to parse JSON: Unexpected end of string; Unexpected end of string; Expected key”

Why did I get the error?

Because:

BigQuery only accepts newline-delimited JSON, which means one complete JSON object per line.

When you import data from JSON files into BigQuery, you first have to format the JSON file as NDJSON; otherwise, you will get the error shown above!
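Before uploading, it can help to check locally that a file really has one JSON object per line. The following is a rough sketch using jq, not an official BigQuery validator; the function name is_ndjson and the file names are made up for illustration.

```shell
# Hypothetical helper: succeeds only if every non-empty line is one JSON object.
is_ndjson() {
  # -s slurps all values into one array; -e makes jq exit non-zero when the
  # result is false. jq also exits non-zero if the file is not valid JSON.
  jq -e -s 'all(.[]; type == "object")' "$1" > /dev/null 2>&1 || return 1

  # The number of parsed JSON values must match the number of non-empty lines;
  # a pretty-printed object spans several lines and fails this check.
  values=$(jq -c . "$1" | grep -c .)
  lines=$(grep -c . "$1")
  [ "$values" -eq "$lines" ]
}

# Usage with two small made-up files.
printf '{"a":1}\n{"a":2}\n' > good.ndjson
printf '[\n{"a":1},\n{"a":2}\n]\n' > bad.json

is_ndjson good.ndjson && echo "good.ndjson looks like NDJSON"
is_ndjson bad.json || echo "bad.json is not NDJSON"
```

The check is deliberately strict about objects because the rule quoted above asks for one complete JSON object per line.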

Resources:

ndjson.org
Why NDJSON
package - json-to-ndjson
BigQuery: Loading JSON data into a new table
