Genes in SARS-CoV-2 and other viruses in the order of Nidovirales are expressed by a process of discontinuous transcription which is distinct from alternative splicing in eukaryotes and is mediated by the viral RNA-dependent RNA polymerase. Here, we introduce the DISCONTINUOUS TRANSCRIPT ASSEMBLYproblem of finding transcripts and their abundances given an alignment of paired-end short reads under a maximum likelihood model that accounts for varying transcript lengths. We show, using simulations, that our method, JUMPER, outperforms existing methods for classical transcript assembly. On short-read data of SARS-CoV-1, SARS-CoV-2 and MERS-CoV samples, we find that JUMPER not only identifies canonical transcripts that are part of the reference transcriptome, but also predicts expression of non-canonical transcripts that are supported by subsequent orthogonal analyses. Moreover, application of JUMPER on samples with and without treatment reveals viral drug response at the transcript level. As such, JUMPER enables detailed analyses of Nidovirales transcriptomes under varying conditions.
- Gene expression
- severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
- RNA splicing
- Genome informatics
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology(all)
- Physics and Astronomy(all)