Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics

Simon Orozco-Arias, Juan Liu, Reinel Tabares-Soto, Diego Ceballos, Douglas Silva Domingues, Andréa Garavito, Ray Ming, Romain Guyot

Research output: Contribution to journal › Article

Abstract

One particular class of Transposable Elements (TEs), the Long Terminal Repeat (LTR) retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundred to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but it remains challenging in large genomes and requires optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to build RT-based phylogenetic trees, and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently sequenced genome. The pineapple genome assembly comprises 44% transposable elements, of which 23% were classified as LTR retrotransposons. Notably, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily, Del, suggesting that this particular lineage has undergone a significant increase in copy number. As demonstrated for the pineapple genome, Inpactor provides comprehensive data on LTR retrotransposon classification and dynamics, allowing a fine-grained understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.

Original language: English (US)
Article number: 32
Journal: Biology
Volume: 7
Issue number: 2
DOIs: 10.3390/biology7020032
State: Published - Jan 1 2018

Fingerprint

Ananas
Retroelements
terminal repeat sequences
Terminal Repeat Sequences
retrotransposons
pineapples
Classifiers
Genes
Genome
genome
genome assembly
DNA Transposable Elements
taxonomy
transposons
computer techniques
Computing Methodologies
Plant Genome
bioinformatics
Supercomputers
Computational Biology

Keywords

  • HPC
  • Inpactor
  • LTR retrotransposons
  • Parallel programming
  • Pineapple
  • Transposable elements

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)

Cite this

Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics. / Orozco-Arias, Simon; Liu, Juan; Tabares-Soto, Reinel; Ceballos, Diego; Domingues, Douglas Silva; Garavito, Andréa; Ming, Ray; Guyot, Romain.

In: Biology, Vol. 7, No. 2, 32, 01.01.2018.

Orozco-Arias, Simon ; Liu, Juan ; Tabares-Soto, Reinel ; Ceballos, Diego ; Domingues, Douglas Silva ; Garavito, Andréa ; Ming, Ray ; Guyot, Romain. / Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics. In: Biology. 2018 ; Vol. 7, No. 2.
@article{314b351ffff4490195fb539275f8fa42,
title = "Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics",
keywords = "HPC, Inpactor, LTR retrotransposons, Parallel programming, Pineapple, Transposable elements",
author = "Simon Orozco-Arias and Juan Liu and Reinel Tabares-Soto and Diego Ceballos and Domingues, {Douglas Silva} and Andr{\'e}a Garavito and Ray Ming and Romain Guyot",
year = "2018",
month = "1",
day = "1",
doi = "10.3390/biology7020032",
language = "English (US)",
volume = "7",
journal = "Biology",
issn = "2079-7737",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "2",

}

TY - JOUR

T1 - Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics

AU - Orozco-Arias, Simon

AU - Liu, Juan

AU - Tabares-Soto, Reinel

AU - Ceballos, Diego

AU - Domingues, Douglas Silva

AU - Garavito, Andréa

AU - Ming, Ray

AU - Guyot, Romain

PY - 2018/1/1

Y1 - 2018/1/1

KW - HPC

KW - Inpactor

KW - LTR retrotransposons

KW - Parallel programming

KW - Pineapple

KW - Transposable elements

UR - http://www.scopus.com/inward/record.url?scp=85048642457&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048642457&partnerID=8YFLogxK

U2 - 10.3390/biology7020032

DO - 10.3390/biology7020032

M3 - Article

AN - SCOPUS:85048642457

VL - 7

JO - Biology

JF - Biology

SN - 2079-7737

IS - 2

M1 - 32

ER -