TY - JOUR
T1 - Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and ITS application for pineapple LTR retrotransposons diversity and dynamics
AU - Orozco-Arias, Simon
AU - Liu, Juan
AU - Tabares-Soto, Reinel
AU - Ceballos, Diego
AU - Domingues, Douglas Silva
AU - Garavito, Andréa
AU - Ming, Ray
AU - Guyot, Romain
N1 - Publisher Copyright:
© 2018 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2018
Y1 - 2018
N2 - One particular class of Transposable Elements (TEs), called Long Terminal Repeats (LTRs), retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundreds to up to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large-sized genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to perform RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently-sequenced genome. The pineapple genome assembly comprises 44% of transposable elements, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily: Del, suggesting that this particular lineage has undergone a significant increase in its copy numbers. As demonstrated for the pineapple genome, Inpactor provides comprehensive data of LTR retrotransposons’ classification and dynamics, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.
AB - One particular class of Transposable Elements (TEs), called Long Terminal Repeats (LTRs), retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundreds to up to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large-sized genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to perform RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently-sequenced genome. The pineapple genome assembly comprises 44% of transposable elements, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily: Del, suggesting that this particular lineage has undergone a significant increase in its copy numbers. As demonstrated for the pineapple genome, Inpactor provides comprehensive data of LTR retrotransposons’ classification and dynamics, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.
KW - HPC
KW - Inpactor
KW - LTR retrotransposons
KW - Parallel programming
KW - Pineapple
KW - Transposable elements
UR - http://www.scopus.com/inward/record.url?scp=85048642457&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048642457&partnerID=8YFLogxK
U2 - 10.3390/biology7020032
DO - 10.3390/biology7020032
M3 - Article
C2 - 29799487
AN - SCOPUS:85048642457
SN - 2079-7737
VL - 7
JO - Biology
JF - Biology
IS - 2
M1 - 32
ER -