Load a JSON file to Google BigQuery using Python

I'm starting to learn Python to update a data pipeline and had to upload some JSON files to Google BigQuery along the way. Hope this helps anyone who needs to do the same!

See the GCP documentation (it walks through a CSV example, but the process is the same).

Steps before running the script:

  • Create a Google service account with BigQuery permissions.
  • Download the JSON key. Do not commit it into git! Add it to .gitignore if needed.
  • Add the key's path to your .env file. It will get loaded via the python-dotenv library (an alternative that builds the credentials explicitly is sketched after this list). Again, do not commit .env into git!

Example of your .env file:

GOOGLE_APPLICATION_CREDENTIALS=your-gcp-project-name-aaa333111aaa.json
  • Create the dataset via the GCP Console, in the BigQuery section.
  • Run the script!
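
By default, bigquery.Client() picks up the key automatically once GOOGLE_APPLICATION_CREDENTIALS is set, which is all the script below relies on. If you would rather build the credentials object explicitly from the key file, here is a minimal optional sketch of that alternative, reading the same variable from your .env:

import os

from google.cloud import bigquery
from google.oauth2 import service_account
from dotenv import load_dotenv

load_dotenv()

# Read the key file path from the .env variable and build credentials from it.
key_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
credentials = service_account.Credentials.from_service_account_file(key_path)

# Pass the credentials explicitly instead of relying on the default lookup.
client = bigquery.Client(credentials=credentials, project=credentials.project_id)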

upload_json_to_bq.py script:

from google.cloud import bigquery
from dotenv import load_dotenv

# Load GOOGLE_APPLICATION_CREDENTIALS from .env so the client can find the key.
load_dotenv()

client = bigquery.Client()
filename = '/path/to/file/in/nd-format.json'
dataset_id = 'DatasetName'
table_id = 'TableName'

dataset_ref = bigquery.DatasetReference(client.project, dataset_id)
table_ref = dataset_ref.table(table_id)

job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
job_config.autodetect = True  # Let BigQuery infer the schema from the data.

with open(filename, "rb") as source_file:
    job = client.load_table_from_file(
        source_file,
        table_ref,
        location="europe-west1",  # Must match the destination dataset location.
        job_config=job_config,
    )  # API request

job.result()  # Waits for table load to complete.

print("Loaded {} rows into {}:{}.".format(job.output_rows, dataset_id, table_id))

Note: the file must be newline-delimited JSON (NDJSON), i.e. one JSON object per line, not a single JSON array. If yours is a regular JSON array, you can convert it first, as in the sketch below.
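
A small standard-library sketch for that conversion (the file names are just placeholders):

import json

# Convert a file containing a JSON array into newline-delimited JSON.
with open("input.json") as f:
    records = json.load(f)  # Expects a top-level JSON array of objects.

with open("nd-format.json", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")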
