DEV Community

Ghoshan J

Hong Kong Covid-19 Data Sourcing (2020)

Hong Kong was one of the first places to report a patient positive for Covid-19, on January 23rd 2020. Hong Kong also experienced the SARS outbreak in 2003, which is still remembered as a dark chapter in the region's post-colonial memory. The protests against the government, and by extension against Chinese authority, reached their peak in the middle of 2019, when more than a million protesters were out on the streets, and continued until a security law was passed to curb the havoc and the political and financial instability they were causing. As the city was repairing the damage and assuring the financial sector that Hong Kong was still the free, open economy that had once attracted foreign companies to open offices, the unexpected pandemic came as another stress test for this "developed" Asian city.

Nevertheless, I think Hong Kong has so far in 2020 done a good job of damage control and of prioritising public safety.

GitHub Python notebook: HK-Covid-analysis

Data sourcing

The popular Covid data repository Our World in Data (OWID) reports numbers from Hong Kong under China, so I didn't feel like using it. Anyway, we can make use of other official sources.

Data sourcing from data.gov.hk

The government has an abundance of open datasets available for download. The Covid-related datasets include a list of cases, a list of homes under quarantine, a list of transportation modes taken by infected people, daily testing statistics, occupancy of quarantine centres, etc. Seriously, huge props to the government for making all this available through an API and updating it daily (I've seen it being used by my company's HR to monitor anything related to staff residences). I used the "Details of probable/confirmed cases of COVID-19 infection in Hong Kong" dataset.
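As a rough sketch of how the case-details file can be turned into a daily count once it has been downloaded as CSV: the column name `Report date`, the `DD/MM/YYYY` date format and the sample rows below are assumptions for illustration, so check them against the header of the actual file.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed column name and date format; verify against the real CSV header.
DATE_COLUMN = "Report date"
DATE_FORMAT = "%d/%m/%Y"


def daily_case_counts(csv_file):
    """Count reported cases per day from an open CSV file object."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        day = datetime.strptime(row[DATE_COLUMN].strip(), DATE_FORMAT).date()
        counts[day] += 1
    # Return (date, count) pairs in chronological order.
    return sorted(counts.items())


# Made-up sample rows standing in for the downloaded file.
sample = io.StringIO(
    "Case no.,Report date,Gender,Age\n"
    "1,23/01/2020,M,39\n"
    "2,23/01/2020,M,56\n"
    "3,24/01/2020,F,62\n"
)
daily = daily_case_counts(sample)
for day, count in daily:
    print(day.isoformat(), count)
```

With a real download, the same function can be given a file opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.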

Data sourcing from Google big data

No surprises here. Google also collects and consolidates Covid-19 data that is updated regularly. It is offered through their big data platform, where you can use BigQuery to narrow down the dataset. What's interesting is that this dataset also includes complementary information such as changes in mobility, weather, testing, etc.

I thought there might be differences in the data, since how America records a patient as positive versus the rest of the world is a point of contention. But I guess Google is just using the official data submitted by each country.
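For the BigQuery route, a minimal sketch of narrowing the dataset down to Hong Kong could look like the following. The table name, the column names and the `'HK'` country code are assumptions about the public dataset's schema and should be checked in the BigQuery console. Running `fetch_rows` also needs the `google-cloud-bigquery` package and Google Cloud credentials.

```python
import datetime

# Assumed public table; confirm the name and schema before relying on it.
TABLE = "bigquery-public-data.covid19_open_data.covid19_open_data"

# Column names below are assumptions about the table's schema.
QUERY = f"""
SELECT date, new_confirmed, new_tested, mobility_grocery_and_pharmacy
FROM `{TABLE}`
WHERE country_code = @country
  AND date BETWEEN @start_date AND @end_date
ORDER BY date
"""


def fetch_rows(country="HK",
               start_date=datetime.date(2020, 1, 1),
               end_date=datetime.date(2020, 12, 31)):
    """Run the query and return the result rows as a list of dicts."""
    # Imported lazily so the query can be inspected without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up the default Google Cloud credentials
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("country", "STRING", country),
            bigquery.ScalarQueryParameter("start_date", "DATE", start_date),
            bigquery.ScalarQueryParameter("end_date", "DATE", end_date),
        ]
    )
    rows = client.query(QUERY, job_config=job_config).result()
    return [dict(row) for row in rows]
```

Query parameters are used instead of string formatting so the country and date range can be changed without editing the SQL text.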

covid 19 plots

High level analysis

Do restrictions work?

The government was quick to impose restrictions on social gatherings and to declare a state of urgency when, for example, more than 20 cases were seen for 3 days in a row. In late 2020, mandatory testing for entire buildings or areas was also imposed. Let's see if social distancing had an impact. I also used Google's mobility reports (data retrieved from location sharing) to see if residents did in fact follow the government guidelines to reduce going out.
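The kind of rule mentioned above (more than 20 cases on 3 consecutive days) is easy to check against a daily series. This is only an illustrative sketch: the function name, the default values and the sample numbers are made up, and it is not the government's actual criterion.

```python
def first_trigger_day(daily_cases, threshold=20, run_length=3):
    """Return the index of the day that completes the first run of
    `run_length` consecutive days above `threshold`, or None."""
    streak = 0
    for i, cases in enumerate(daily_cases):
        # Extend the streak on a high day, reset it otherwise.
        streak = streak + 1 if cases > threshold else 0
        if streak >= run_length:
            return i
    return None


# Made-up daily counts: the run 22, 30, 41 completes on index 6.
cases = [5, 21, 25, 18, 22, 30, 41, 12]
trigger = first_trigger_day(cases)
print(trigger)  # 6
```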

Restrictions

The initial social distancing restrictions reduced mobility severely and were also able to contain the 1st wave. But we can see that people are visiting grocery stores up to 40% more than last year. This might be why the 4th wave continued for a prolonged time, as there was no dip in grocery mobility (holiday season to blame).
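Daily mobility figures swing a lot between weekdays and weekends, so a trailing 7-day average makes a dip (or the lack of one) easier to spot. The sketch below assumes the mobility values are daily percent changes; the numbers are made up for illustration.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Trailing mean over `window` points; None until the window is full."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out


# Made-up daily percent changes in grocery mobility.
grocery = [10, 40, 5, 20, 35, 15, 15, 45]
smoothed = rolling_mean(grocery)
print(smoothed)
```

The same smoothing can be applied to the daily case counts so that both series are compared on the same footing.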

Third wave analysis

Let's look at the third wave closely. As the number of cases increased in June and July of 2020, the government blamed it mainly on residents not adhering to the social distancing rules and on the influx of residents returning from abroad. Let's have a look at the timeline of events and air traffic. The drop in air travel due to the pandemic is absolutely crazy (bad times for Cathay Pacific 😞). But other than that, I wasn't able to infer much from this.

third wave
