TY - JOUR
T1 - A draft chromosome-scale genome assembly of a commercial sugarcane
AU - Shearman, Jeremy R.
AU - Pootakham, Wirulda
AU - Sonthirod, Chutima
AU - Naktang, Chaiwat
AU - Yoocha, Thippawan
AU - Sangsrakru, Duangjai
AU - Jomchai, Nukoon
AU - Tongsima, Sissades
AU - Piriyapongsa, Jittima
AU - Ngamphiw, Chumpol
AU - Wanasen, Nanchaya
AU - Ukoskit, Kittipat
AU - Punpee, Prapat
AU - Klomsa-ard, Peeraya
AU - Sriroth, Klanarong
AU - Zhang, Jisen
AU - Zhang, Xingtan
AU - Ming, Ray
AU - Tragoonrung, Somvong
AU - Tangphatsornruang, Sithichoke
N1 - Funding Information:
The authors would like to acknowledge funding from the National Science and Technology Development Agency, Thailand, and Mitr Phol Sugarcane Research Center.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
N2 - Sugarcane accounts for a large portion of the world's sugar production. Modern commercial cultivars are complex hybrids of S. officinarum, S. spontaneum, and several other Saccharum species, resulting in an auto-allopolyploid with 8–12 copies of each chromosome. The current genome assembly gold standard is to generate a long read assembly followed by chromatin conformation capture sequencing to scaffold. We used the PacBio RSII and chromatin conformation capture sequencing to sequence and assemble the genome of a South East Asian commercial sugarcane cultivar, known as Khon Kaen 3. The Khon Kaen 3 genome assembled into 104,477 contigs totalling 7 Gb, which scaffolded into 56 pseudochromosomes containing 5.2 Gb of sequence. Genome annotation produced 242,406 genes from 30,927 orthogroups. Aligning the Khon Kaen 3 genome sequence to S. officinarum and S. spontaneum revealed a high level of apparent recombination, indicating a chimeric assembly. This assembly error is explained by high nucleotide identity between S. officinarum and S. spontaneum, where 91.8% of S. spontaneum aligns to S. officinarum at 94% identity. Thus, the subgenomes of commercial sugarcane are so similar that using short reads to correct long PacBio reads produced chimeric long reads. Future attempts to sequence sugarcane must take this information into account.
AB - Sugarcane accounts for a large portion of the world's sugar production. Modern commercial cultivars are complex hybrids of S. officinarum, S. spontaneum, and several other Saccharum species, resulting in an auto-allopolyploid with 8–12 copies of each chromosome. The current genome assembly gold standard is to generate a long read assembly followed by chromatin conformation capture sequencing to scaffold. We used the PacBio RSII and chromatin conformation capture sequencing to sequence and assemble the genome of a South East Asian commercial sugarcane cultivar, known as Khon Kaen 3. The Khon Kaen 3 genome assembled into 104,477 contigs totalling 7 Gb, which scaffolded into 56 pseudochromosomes containing 5.2 Gb of sequence. Genome annotation produced 242,406 genes from 30,927 orthogroups. Aligning the Khon Kaen 3 genome sequence to S. officinarum and S. spontaneum revealed a high level of apparent recombination, indicating a chimeric assembly. This assembly error is explained by high nucleotide identity between S. officinarum and S. spontaneum, where 91.8% of S. spontaneum aligns to S. officinarum at 94% identity. Thus, the subgenomes of commercial sugarcane are so similar that using short reads to correct long PacBio reads produced chimeric long reads. Future attempts to sequence sugarcane must take this information into account.
UR - http://www.scopus.com/inward/record.url?scp=85142898799&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142898799&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-24823-0
DO - 10.1038/s41598-022-24823-0
M3 - Article
C2 - 36443360
AN - SCOPUS:85142898799
SN - 2045-2322
VL - 12
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 20474
ER -