The COVID-19 pandemic has generated an enormous amount of data, providing a unique opportunity for modeling and analysis. In this paper, we present a data-informed approach for building stochastic compartmental models that is grounded in the Markov processes underlying these models. Our initial data analyses reveal that the SIRD model, with susceptible (S), infected (I), recovered (R), and deceased (D) compartments, is not consistent with the data. In particular, the transition times observed in the dataset are not exponentially distributed, implying that there exist unmodeled (hidden) states. We use the available epidemiological data to inform the location of these hidden states, which allows us to develop an augmented compartmental model that includes states for hospitalization (H) and end of infectious viral shedding (V). Using the proposed model, we characterize the delay distributions analytically and match model parameters to empirical quantities in the data to obtain a good model fit. We present insights from an epidemiological perspective, together with their implications for mitigation and control strategies.
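The link between non-exponential transition times and hidden states can be illustrated with a minimal simulation sketch. In a continuous-time Markov chain, the time spent in a single state is exponentially distributed, so its coefficient of variation (standard deviation divided by mean) equals 1. If a transition actually passes through an unmodeled intermediate state, the observed delay is a sum of two exponential times and its coefficient of variation falls below 1. The rates, sample size, and function names below are illustrative assumptions and are not taken from the paper or its dataset.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by sample mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def direct_delay(rng, rate):
    """Delay for a single Markov transition: one exponential holding time."""
    return rng.expovariate(rate)


def delay_via_hidden_state(rng, rate_in, rate_out):
    """Delay when the path crosses one hidden state: sum of two exponential times."""
    return rng.expovariate(rate_in) + rng.expovariate(rate_out)


rng = random.Random(0)  # fixed seed so the run is reproducible
n_samples = 50_000      # assumed sample size, chosen only for illustration

# Both scenarios have the same mean delay (10 time units) under these assumed rates.
direct = [direct_delay(rng, 0.1) for _ in range(n_samples)]
staged = [delay_via_hidden_state(rng, 0.2, 0.2) for _ in range(n_samples)]

cv_direct = coefficient_of_variation(direct)
cv_staged = coefficient_of_variation(staged)

print(f"CV, direct transition:        {cv_direct:.3f}")
print(f"CV, one hidden state en route: {cv_staged:.3f}")
```

With equal rates on both stages, the staged delay follows an Erlang distribution with two phases, whose theoretical coefficient of variation is 1/sqrt(2), roughly 0.71. An empirical coefficient of variation well below 1 in observed transition times is therefore one simple indication that a single-compartment description is missing intermediate states.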