Many people have seen the nifty dashboard that Johns Hopkins University put out where you can see the number of COVID-19 cases, etc. throughout the world. It’s really nice and all, but what if you wanted to slice and dice the data yourself? Well, as it turns out, they are also publishing all the underlying data in a GitHub repository! It’s all published as daily CSV (comma-separated values) files. That makes it super easy to import into Excel spreadsheets, but spreadsheets are so over. All the cool kids are visualizing their data in InfluxDB.
Since I work at InfluxData, I figured I should make it easy to read the data into InfluxDB 2.0. In order to do that, I had to process each of the CSV files in the dataset, transform the data into a format that InfluxDB could ingest efficiently, and then send it to a database. The easiest way, at least for me, was to use one of the provided InfluxData client libraries, so I chose the Go one.
I won’t go into the specifics of what the program does, as it’s super simple, but I’ll point you to the GitHub repository where the code resides: https://github.com/davidgs/covid-data
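To give a rough feel for the "transform" step mentioned above, here is a minimal sketch of turning one CSV row into a line of InfluxDB line protocol. This is not the code from the repository: the function name `toLineProtocol`, the four-column layout (province, country, confirmed, deaths), the tag and field names, and the sample rows are all assumptions made up for illustration, and the sketch builds the line by hand rather than going through the client library.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// tagEscaper escapes the characters line protocol treats as special in tag values.
var tagEscaper = strings.NewReplacer(",", `\,`, "=", `\=`, " ", `\ `)

// toLineProtocol converts one CSV record into one line-protocol point.
// Assumed (hypothetical) column order: province, country, confirmed, deaths.
// The measurement name is assumed to contain no commas or spaces.
func toLineProtocol(measurement string, rec []string, unixNano int64) (string, error) {
	if len(rec) < 4 {
		return "", fmt.Errorf("expected 4 columns, got %d", len(rec))
	}
	confirmed, err := strconv.Atoi(rec[2])
	if err != nil {
		return "", fmt.Errorf("confirmed: %w", err)
	}
	deaths, err := strconv.Atoi(rec[3])
	if err != nil {
		return "", fmt.Errorf("deaths: %w", err)
	}

	var b strings.Builder
	b.WriteString(measurement)
	b.WriteString(",country=" + tagEscaper.Replace(rec[1]))
	// Line protocol does not allow empty tag values, so skip a blank province.
	if rec[0] != "" {
		b.WriteString(",province=" + tagEscaper.Replace(rec[0]))
	}
	// The trailing "i" marks each field as an integer.
	fmt.Fprintf(&b, " confirmed=%di,deaths=%di %d", confirmed, deaths, unixNano)
	return b.String(), nil
}

func main() {
	// Made-up sample rows in the assumed layout.
	sample := "Province/State,Country/Region,Confirmed,Deaths\n" +
		"New York,US,10,1\n" +
		",Japan,2,0\n"
	rows, err := csv.NewReader(strings.NewReader(sample)).ReadAll()
	if err != nil {
		panic(err)
	}
	ts := time.Date(2020, time.January, 22, 0, 0, 0, 0, time.UTC).UnixNano()
	for _, rec := range rows[1:] { // skip the header row
		line, err := toLineProtocol("covid", rec, ts)
		if err != nil {
			fmt.Println("skipping row:", err)
			continue
		}
		fmt.Println(line)
	}
}
```

Putting the place names in tags and the counts in fields is one reasonable choice for this kind of data, since tags are indexed and make it easy to group by country later. The real program may well organize things differently.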
The program accepts the following command-line flags:

-dir: Path to where the .csv data files live. Default is . (current directory)
-url: URL of your InfluxDB server, including port. (default: http://localhost:9999)
-bucket: Bucket name. No default, REQUIRED
-organization: Organization name. No default, REQUIRED
-measurement: Measurement name. No default, REQUIRED
-token: InfluxDB Token. No default, REQUIRED
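If you're curious how flags like these are typically wired up in Go, here is a small sketch using the standard `flag` package. It is an illustration only, not a copy of covid.go: the `config` struct, the `parseFlags` helper, and the way required flags are checked are my own assumptions, and the example arguments in `main` are placeholders.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds the command-line options (hypothetical struct for this sketch).
type config struct {
	Dir, URL, Bucket, Org, Measurement, Token string
}

// parseFlags parses args and rejects a missing required flag.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("covid", flag.ContinueOnError)
	fs.StringVar(&c.Dir, "dir", ".", "path to the .csv data files")
	fs.StringVar(&c.URL, "url", "http://localhost:9999", "URL of the InfluxDB server, including port")
	fs.StringVar(&c.Bucket, "bucket", "", "bucket name (required)")
	fs.StringVar(&c.Org, "organization", "", "organization name (required)")
	fs.StringVar(&c.Measurement, "measurement", "", "measurement name (required)")
	fs.StringVar(&c.Token, "token", "", "InfluxDB token (required)")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	// The flag package has no built-in notion of "required", so check by hand.
	required := []struct{ name, value string }{
		{"bucket", c.Bucket},
		{"organization", c.Org},
		{"measurement", c.Measurement},
		{"token", c.Token},
	}
	for _, r := range required {
		if r.value == "" {
			return c, fmt.Errorf("missing required flag -%s", r.name)
		}
	}
	return c, nil
}

func main() {
	// A real program would pass os.Args[1:]; fixed placeholder arguments
	// keep this sketch self-contained.
	cfg, err := parseFlags([]string{
		"-bucket", "my_bucket",
		"-organization", "my_org",
		"-measurement", "covid",
		"-token", "secret",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("reading %s, writing to %s\n", cfg.Dir, cfg.URL)
}
```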
So all you have to do is build it, and then run it:
$ go build covid.go
$ ./covid -dir path/to/data -bucket bucket_name -organization org_name -measurement measure_name -url http://your.server.com:9999 -token yourToken
You will see output as it runs:
Scanning Data Directory: ../../COVID-19/csse_covid_19_data/csse_covid_19_daily_reports
Processing File: ../../COVID-19/csse_covid_19_data/csse_covid_19_daily_reports/01-22-2020.csv
Processing File: ../../COVID-19/csse_covid_19_data/csse_covid_19_daily_reports/01-23-2020.csv
Processing File: ../../COVID-19/csse_covid_19_data/csse_covid_19_daily_reports/01-24-2020.csv
Processing File: ../../COVID-19/csse_covid_19_data/csse_covid_19_daily_reports/01-25-2020.csv
Processing File: ../../COVID-19/csse_covid_19_data/csse_covid_19_daily_reports/01-26-2020.csv
…
And you should see data flowing into your InfluxDB instance as well.
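The daily report files are named by date (01-22-2020.csv and so on), so one plausible way to walk the directory is to glob for .csv files and parse a date out of each file name. The sketch below does just that. Whether covid.go takes its timestamps from the file name or from a column inside the file is something I'm not asserting here; the `reportDate` helper and the "." directory are assumptions for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"time"
)

// reportDate extracts the date from a file name such as 01-22-2020.csv.
func reportDate(path string) (time.Time, error) {
	name := strings.TrimSuffix(filepath.Base(path), ".csv")
	// "01-02-2006" is Go's reference layout for MM-DD-YYYY.
	return time.Parse("01-02-2006", name)
}

func main() {
	dir := "." // hypothetical data directory
	fmt.Println("Scanning Data Directory:", dir)
	files, err := filepath.Glob(filepath.Join(dir, "*.csv"))
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		day, err := reportDate(f)
		if err != nil {
			// Not every .csv file has to be a daily report.
			fmt.Println("Skipping (no date in name):", f)
			continue
		}
		fmt.Println("Processing File:", f, "dated", day.Format("2006-01-02"))
	}
}
```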
Note: These videos don't embed and play here, but you can watch them in the original post.
And here it is coming into a scatter plot:
Feel free to play around with it and let me know what you think!