The COVID-19 pandemic has generated an enormous amount of data, providing a unique opportunity for modeling and analysis. In this paper, we present a data-informed approach for building stochastic compartmental models that is grounded in the Markovian processes underlying these models. Our initial data analyses reveal that the SIRD model, with susceptible (S), infected (I), recovered (R), and death (D) states, is not consistent with the data. In particular, the transition times observed in the dataset do not follow exponential distributions, implying that there exist unmodeled (hidden) states. We use the available epidemiological data to inform the location of these hidden states, which allows us to develop an augmented compartmental model that includes states for hospitalization (H) and end of infectious viral shedding (V). Using the proposed model, we characterize delay distributions analytically and match model parameters to empirical quantities in the data to obtain a good model fit. Insights from an epidemiological perspective are presented, as well as their implications for mitigation and control strategies.

Competing Interest Statement: The authors have declared no competing interest.

Funding Statement: Research supported in part by the C3.ai Digital Transformation Institute sponsored by C3.ai Inc. and the Microsoft Corporation, in part by the Jump ARCHES endowment through the Health Care Engineering Systems Center of the University of Illinois at Urbana-Champaign, and in part by the National Science Foundation grant NSF-ECCS 20-32321.

Author Declarations: Only publicly available data were used in this study; therefore, an IRB review was not required.

Data Availability: The data referred to in this study were collected from publicly available resources:

  • Centers for Disease Control and Prevention, COVID-19 Case Surveillance Public Use Data, https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf/data, 2020.
  • M. Kraemer, Epidemiological data from the nCoV-2019 outbreak: Early descriptions from publicly available data, https://virological.org/t/epidemiological-data-from-the-ncov-2019-outbreak-early-descriptions-from-publicly-available-data/337, 2020.
  • midas-network, COVID-19, https://github.com/midas-network/COVID-19/blob/master/data/cases/global/line_listings_imperial_college/international_cases_2020_08_02.csv, 2020.
  • Y. Xu, COVID19 inpatient cases data, https://doi.org/10.6084/m9.figshare.12195735.v3, 2020.
  • ThisIsIsaac, Data-Science-for-COVID-19, https://github.com/ThisIsIsaac/Data-Science-for-COVID-19/blob/master/Covid19_Dataset/patients.csv, 2020.
  • mrc-ide, COVID19_CFR_submission, https://github.com/mrc-ide/COVID19_CFR_submission/blob/master/data/deaths_integrated_with_linelist_17feb.csv, 2020.
  • Public line list and summaries of the COVID-19 outbreak in South Korea, https://github.com/parksw3/COVID19-Korea/blob/master/COVID19-Korea-2020-04-06.xlsx, 2020.
  • Novel Coronavirus 2019 time series data on cases, https://github.com/datasets/covid-19, 2020.
  • I. Dorigatti, L. Okell, A. Cori, N. Imai, M. Baguelin, S. Bhatia, A. Boonyasiri, Z. Cucunuba, G. Cuomo-Dannenburg, R. FitzJohn et al., Report 4: Severity of 2019-novel coronavirus (nCoV), Imperial College London, London, 2020.
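
The abstract argues that transition times which are not exponentially distributed point to hidden states in a Markovian compartmental model. The following is a minimal Python sketch of that reasoning, not the authors' method: it compares the delay of a single Markov transition with the delay of a path that passes through one unobserved intermediate state. The rates, sample size, and function names are hypothetical and chosen only for illustration. The coefficient of variation (CV, standard deviation divided by mean) serves as a simple diagnostic, since an exponential distribution has a CV of exactly 1.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by sample mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def direct_delay(rng, rate):
    """Delay of a single Markov transition: exponential with the given rate."""
    return rng.expovariate(rate)


def delay_via_hidden_state(rng, rate_in, rate_out):
    """Delay when the path crosses one unobserved intermediate state:
    the sum of two independent exponential sojourn times."""
    return rng.expovariate(rate_in) + rng.expovariate(rate_out)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 50_000

# Hypothetical rates (per day), chosen so both paths have a mean delay of 10 days.
direct = [direct_delay(rng, 0.1) for _ in range(n)]
hidden = [delay_via_hidden_state(rng, 0.2, 0.2) for _ in range(n)]

cv_direct = coefficient_of_variation(direct)
cv_hidden = coefficient_of_variation(hidden)
print(f"CV, direct transition: {cv_direct:.2f}")
print(f"CV, via hidden state:  {cv_hidden:.2f}")
```

Under these assumptions the direct transition should give a CV near 1, while the two-stage path should give a CV near 1/sqrt(2), roughly 0.71, even though both have the same mean delay. An empirical delay whose CV is clearly below 1 is therefore one sign that a single exponential transition is too simple a description.
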
Original language: English (US)
Publisher: Cold Spring Harbor Laboratory Press
Number of pages: 8
State: Published - Oct 6 2020



  • Coronavirus
  • COVID-19
  • severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
  • Novel coronavirus
  • 2019-nCoV
  • Pandemic

Title: A Data-Informed Approach for Analysis, Validation, and Identification of COVID-19 Models
