Unified COVID-19 Dataset

Copyright © 2020 Johns Hopkins University (JHU). Citation: Badr et al. 2020. License: GPL v3.

This is an all-in-one unified COVID-19 dataset to fulfil the following objectives:

  • Merging data from all credible sources at all levels
  • Unifying variable names, types, and categories
  • Standardizing Country/Province/State/County names
  • Standardizing Country/Province/State/County codes
  • Standardizing dates and data types/formats
  • Cleaning the data and fixing confusing entries
  • Optimizing the data for machine learning applications

Data Structure

| Column | Type | Description |
| --- | --- | --- |
| Date | Date | Date of data record |
| Year | Integer | Year of data record |
| Month | Integer | Month of data record |
| Day | Integer | Day of data record |
| DoW | Character | Day of the week |
| Cases | Integer | Number of cumulative cases |
| Cases_Max | Integer | Maximum number of cumulative cases |
| Cases_New | Integer | Number of new daily cases |
| Type | Character | Type of the reported cases |
| Source | Character | Source of the data: CTP, JHU, NYT, DPC, RKI |
| Level | Character | Geographic level: County, Jurisdiction, State, Province, Cruise Ship, Country |
| Longitude | Double | Geographic coordinate (centroid), east–west |
| Latitude | Double | Geographic coordinate (centroid), north–south |
| Population | Integer | Population for each group and geographic unit |
| ISO3166_1_3N | Character | ISO 3166-1 numeric code, 3-digit, country/region |
| ISO3166_1_3C | Character | ISO 3166-1 alpha-3 code, 3-letter, country/region |
| ISO3166_1_2C | Character | ISO 3166-1 alpha-2 code, 2-letter, country/region |
| ISO3166_2 | Character | ISO 3166-2 code, province/state |
| ISO3166_2_UID | Character | ISO 3166-2 code, province/state, full/unique |
| NUTS | Character | Nomenclature of Territorial Units for Statistics (NUTS), Europe |
| FIPS | Character | Federal Information Processing Standard Publication, United States |
| County | Character | Standard county name |
| Province_State | Character | Standard province/state name |
| Country_Region | Character | Standard country/region name |
| Full_Name | Character | Full combined name of geographic unit, unique ID |
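
The column types above can be applied when reading the data from a text export. The following is a minimal sketch, not the dataset's own loader: it assumes each row arrives as a dictionary of strings (for example from `csv.DictReader`), that dates are written as ISO 8601 (`YYYY-MM-DD`), and that missing values are empty strings. The function name `parse_record` is hypothetical.

```python
from datetime import date

# Column types taken from the Data Structure table. Every column not listed
# here is Character and is kept as a string.
INTEGER_COLUMNS = {"Year", "Month", "Day", "Cases", "Cases_Max", "Cases_New", "Population"}
DOUBLE_COLUMNS = {"Longitude", "Latitude"}


def parse_record(raw):
    """Convert one row of strings into typed values.

    Assumptions (not confirmed by the dataset): dates are ISO 8601 strings
    and missing values are empty strings, which are mapped to None.
    """
    record = {}
    for column, text in raw.items():
        if text == "":
            record[column] = None
        elif column == "Date":
            record[column] = date.fromisoformat(text)
        elif column in INTEGER_COLUMNS:
            record[column] = int(text)
        elif column in DOUBLE_COLUMNS:
            record[column] = float(text)
        else:
            # Codes such as FIPS and ISO3166_1_3N stay strings so that
            # leading zeros are preserved.
            record[column] = text
    return record
```

Keeping the code columns as Character matters in practice: a FIPS code such as `01001` would lose its leading zero if it were parsed as an integer.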

Case Types

| Type | Description |
| --- | --- |
| Active | Active cases |
| Confirmed | Confirmed cases |
| Deaths | Deaths |
| Home_Isolation | Home isolation |
| Hospitalized | Total hospitalized cases excluding intensive care units |
| Hospitalized_Now | Currently hospitalized cases excluding intensive care units |
| Hospitalized_Sym | Symptomatic hospitalized cases excluding intensive care units |
| ICU | Total cases in intensive care units |
| ICU_Now | Currently in intensive care units |
| Negative | Negative tests |
| Pending | Pending tests |
| Positive | Positive tests |
| Recovered | Recovered cases |
| Tested | Cases tested |
| Tests | Total performed tests |
| Ventilator | Total cases receiving mechanical ventilation |
| Ventilator_Now | Currently receiving mechanical ventilation |
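
Because Type is a column, each record appears to hold the count for a single case type, which means the data is in long format. A common first step is to pivot it so that each case type becomes its own field. The sketch below assumes records are dictionaries keyed by the column names listed under Data Structure. The function name `pivot_by_type` is hypothetical and the sample values in the usage lines are made up for illustration.

```python
from collections import defaultdict


def pivot_by_type(records, value="Cases"):
    """Group long-format records into one entry per (Date, Full_Name, Source).

    Each entry maps a case Type (e.g. "Confirmed", "Deaths") to the chosen
    value column. Source is part of the key so that counts reported by
    different sources for the same place are never mixed.
    """
    wide = defaultdict(dict)
    for record in records:
        key = (record["Date"], record["Full_Name"], record["Source"])
        wide[key][record["Type"]] = record[value]
    return dict(wide)


# Illustrative records only; the names and numbers are placeholders.
sample = [
    {"Date": "2020-04-01", "Full_Name": "Example Region", "Source": "JHU",
     "Type": "Confirmed", "Cases": 100},
    {"Date": "2020-04-01", "Full_Name": "Example Region", "Source": "JHU",
     "Type": "Deaths", "Cases": 3},
]
wide = pivot_by_type(sample)
```

Passing `value="Cases_New"` would pivot the daily counts instead of the cumulative ones.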

Data Sources

| Source | Description | Level |
| --- | --- | --- |
| JHU | The Johns Hopkins University Center for Systems Science and Engineering | Global & County/State, United States |
| CTP | The COVID Tracking Project | State, United States |
| NYT | The New York Times | County/State, United States |
| DPC | Italian Civil Protection Department | NUTS3/NUTS2, Italy |
| RKI | Robert Koch-Institut, Germany | NUTS3/NUTS2, Germany |
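
Several sources cover the same geography (JHU, CTP, and NYT all report United States states, for instance), so summing across sources would likely count the same cases more than once. A minimal sketch of selecting a single source and, optionally, a single geographic level is shown below. It assumes records are dictionaries keyed by the column names listed under Data Structure; the function name `select_source` is hypothetical.

```python
# Source codes and descriptions as listed in the Data Sources table.
SOURCES = {
    "JHU": "The Johns Hopkins University Center for Systems Science and Engineering",
    "CTP": "The COVID Tracking Project",
    "NYT": "The New York Times",
    "DPC": "Italian Civil Protection Department",
    "RKI": "Robert Koch-Institut, Germany",
}


def select_source(records, source, level=None):
    """Return the records reported by one source, optionally at one Level."""
    if source not in SOURCES:
        raise ValueError(f"unknown source code: {source!r}")
    return [
        record
        for record in records
        if record["Source"] == source
        and (level is None or record["Level"] == level)
    ]
```

For example, `select_source(records, "NYT", level="County")` keeps only the county-level records from The New York Times.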

Citation

To cite this dataset:

Badr, H. S., 2020: Unified COVID-19 Dataset. Available on: https://github.com/hsbadr/COVID-19.
