Characterization of SARS-CoV-2 viral diversity within and across hosts

Palash Sashittal, Yunan Luo, Jian Peng, Mohammed El-Kebir

Research output: Working paper


In light of the current COVID-19 pandemic, there is an urgent need to accurately infer the evolutionary and transmission history of the virus to inform real-time outbreak management, public health policies and mitigation strategies. Current phylogenetic and phylodynamic approaches typically use consensus sequences, essentially assuming the presence of a single viral strain per host. Here, we analyze 621 bulk RNA sequencing samples and 7,540 consensus sequences from COVID-19 patients, and identify multiple strains of the virus, SARS-CoV-2, in four major clades that are prevalent within and across hosts. In particular, we find evidence for (i) within-host diversity across phylogenetic clades, (ii) putative cases of recombination, multi-strain and/or superinfections as well as (iii) distinct strain profiles across geographical locations and time. Our findings and algorithms will facilitate more detailed evolutionary analyses and contact tracing that specifically account for within-host viral diversity in the ongoing COVID-19 pandemic as well as future pandemics.

Competing Interest Statement: The authors have declared no competing interest.

Original language: English (US)
Publisher: Cold Spring Harbor Laboratory Press
Number of pages: 36
State: In preparation - May 13 2020

  • Coronavirus
  • COVID-19
  • severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
  • Novel coronavirus
  • 2019-nCoV
  • Pandemic

