COVID-19 Open-Source Helpdesk

Pandas JHU json load

I am doing some policy analysis to inform ethics.harvard.edu/covid-roadmap. The code is at https://colab.research.google.com/drive/10BSqU-NqCJ7m-dcijssJUS97E-XAKoK0#scrollTo=anZiWValBKVl

I am trying to pull the JHU data in JSON format into pandas, and the automatic format detection isn't quite working:

county_jhu = pd.read_json('https://pomber.github.io/covid19/timeseries.json')

Probably not the most efficient, but here’s a relatively straightforward approach:

import requests
import pandas as pd
timeseries_json = requests.get('https://pomber.github.io/covid19/timeseries.json').json()
pd.concat(pd.DataFrame([dict(record, country=country) for record in data]) for country, data in timeseries_json.items())
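To see what the one-liner above is doing without hitting the network, here is a rough sketch on a tiny made-up sample. It assumes the feed is a JSON object mapping each country name to a list of daily records with `date`, `confirmed`, `deaths` and `recovered` keys; the numbers and the names `sample`, `rows` and `long_df` are purely illustrative. It builds one flat list of rows instead of concatenating per-country frames, which also gives a clean 0..n-1 index:

```python
import pandas as pd

# Hypothetical sample in the assumed layout: {country: [daily records]}.
# The values are invented for illustration, not real JHU figures.
sample = {
    "US": [
        {"date": "2020-1-22", "confirmed": 1, "deaths": 0, "recovered": 0},
        {"date": "2020-1-23", "confirmed": 1, "deaths": 0, "recovered": 0},
    ],
    "Italy": [
        {"date": "2020-1-22", "confirmed": 0, "deaths": 0, "recovered": 0},
        {"date": "2020-1-23", "confirmed": 0, "deaths": 0, "recovered": 0},
    ],
}

# Tag every record with its country, then build a single long-format frame.
rows = [
    dict(record, country=country)
    for country, records in sample.items()
    for record in records
]
long_df = pd.DataFrame(rows)

print(long_df)
```

With the real feed you would swap `sample` for `timeseries_json` from the snippet above.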

It looks like you might only want the data for the US. If so, here’s a slight simplification of @philippjfr’s answer:

import requests
import pandas as pd
timeseries_json = requests.get('https://pomber.github.io/covid19/timeseries.json').json()
county_jhu = pd.DataFrame(timeseries_json['US'])

You can then optionally set the date column as the index (row labels):

county_jhu = county_jhu.set_index('date')
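If you want to slice by date afterwards, it probably helps to turn the date strings into real timestamps before setting the index. A minimal sketch, assuming the dates arrive as non-zero-padded strings such as `2020-1-22` (the sample records and the names `us_records`, `us` and `february` are made up for illustration). Python's `strptime` accepts the unpadded month and day, so the parsing step does not depend on pandas guessing the format:

```python
from datetime import datetime

import pandas as pd

# Hypothetical US records in the assumed layout; values are illustrative only.
us_records = [
    {"date": "2020-1-22", "confirmed": 1, "deaths": 0, "recovered": 0},
    {"date": "2020-1-23", "confirmed": 1, "deaths": 0, "recovered": 0},
    {"date": "2020-2-1", "confirmed": 8, "deaths": 0, "recovered": 0},
]
us = pd.DataFrame(us_records)

# Parse each string explicitly, then store the result as a datetime column.
us["date"] = pd.to_datetime(
    [datetime.strptime(s, "%Y-%m-%d") for s in us["date"]]
)

# A DatetimeIndex allows partial-string selection such as a whole month.
us = us.set_index("date").sort_index()
february = us.loc["2020-02"]

print(february)
```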

Thanks for the help. Sadly, that endpoint didn't have the disaggregated county-level time series I was after.