Datasets for Phylogenomics of microleafhoppers (Hemiptera: Cicadellidae: Typhlocybinae): morphological evolution, divergence times and biogeography

Dataset

Description

The following files were used to reconstruct the phylogeny of the leafhopper subfamily Typhlocybinae using IQ-TREE v1.6.12 and ASTRAL v4.10.5.

<b>1) Taxon_sampling.csv:</b> contains the sample IDs (1st column) and the taxonomic information (2nd column). Sample IDs were used in the alignment files and partition files.
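Because the sample IDs link the taxon table to every alignment, it can help to load the table as a lookup. The following is a minimal sketch, assuming a plain two-column comma-separated layout; the function name `load_taxon_table` and the demo rows are hypothetical and are not taken from the dataset.

```python
import csv
import io

def load_taxon_table(handle):
    """Map sample ID (1st column) to taxonomic information (2nd column)."""
    table = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        table[row[0].strip()] = row[1].strip()
    return table

# Hypothetical rows for illustration only; real IDs come from Taxon_sampling.csv.
demo = io.StringIO("S001,Typhlocybinae sp. A\nS002,Typhlocybinae sp. B\n")
taxa = load_taxon_table(demo)
print(taxa["S001"])
```

For the real file, pass a handle opened with `open("Taxon_sampling.csv", newline="")`. The description does not say whether the file has a header row; if it does, the header would be loaded as an ordinary entry and should be dropped.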

<b>2) concatenated_nt_complete.phy:</b> a complete concatenated nucleotide dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12. The file lists the sequences of 248 samples with 154,992 nucleotide positions (introns included) from 665 loci. Hyphens are used to represent gaps.
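The stated dimensions (248 samples, 154,992 positions) can be checked against the PHYLIP header. This is a minimal sketch, assuming a sequential relaxed PHYLIP layout with one sequence per line; the description does not state whether the files are sequential or interleaved, and the function names are hypothetical.

```python
def phylip_dimensions(text):
    """Return (n_taxa, n_sites) from the first non-empty line of a PHYLIP file."""
    for line in text.splitlines():
        if line.strip():
            n_taxa, n_sites = line.split()[:2]
            return int(n_taxa), int(n_sites)
    raise ValueError("no PHYLIP header found")

def check_alignment(text):
    """True if the record count and every sequence length match the header.

    Assumes sequential PHYLIP: one 'name sequence' record per line.
    """
    n_taxa, n_sites = phylip_dimensions(text)
    records = [ln for ln in text.splitlines() if ln.strip()][1:]
    if len(records) != n_taxa:
        return False
    for record in records:
        parts = record.split(None, 1)
        if len(parts) != 2:
            return False
        sequence = parts[1].replace(" ", "")
        if len(sequence) != n_sites:
            return False
    return True

# Toy alignment for illustration; hyphens represent gaps, as in the dataset.
demo = "2 6\nS001 ACG-TA\nS002 AC--TA\n"
print(phylip_dimensions(demo), check_alignment(demo))
```

Applied to the contents of concatenated_nt_complete.phy, `phylip_dimensions` should return `(248, 154992)` if the header agrees with the description.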

<b>3) concatenated_nt_complete_partition.nex:</b> the partitioning scheme for concatenated_nt_complete.phy. The file partitions the 154,992 nucleotide characters into 426 character sets and defines the best substitution model for each character set.
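The number of character sets and the total number of positions they cover can be tallied directly from a partition file. The sketch below assumes the usual IQ-TREE NEXUS layout, with `charset name = start-end;` statements inside a `begin sets;` block and an optional `\step` suffix on a range; the exact layout of the files in this dataset is not shown in the description, so treat the patterns as assumptions.

```python
import re

# 'charset <name> = <ranges>;' statements ('charpartition' lines do not match).
CHARSET_RE = re.compile(r"\bcharset\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE)
# 'start-end' with an optional '\step' suffix (e.g. 1-300\3 for codon positions).
RANGE_RE = re.compile(r"(\d+)\s*-\s*(\d+)(?:\\(\d+))?")

def charset_sizes(nexus_text):
    """Return {charset name: number of alignment positions it covers}."""
    sizes = {}
    for name, spec in CHARSET_RE.findall(nexus_text):
        total = 0
        for start, end, step in RANGE_RE.findall(spec):
            total += len(range(int(start), int(end) + 1, int(step or 1)))
        sizes[name] = total
    return sizes

# Toy partition file for illustration; names and models are hypothetical.
demo = """#nexus
begin sets;
    charset gene1 = 1-300;
    charset gene2 = 301-450;
    charpartition demo = GTR+G:gene1, HKY+I:gene2;
end;
"""
sizes = charset_sizes(demo)
print(len(sizes), sum(sizes.values()))
```

For concatenated_nt_complete_partition.nex, the description implies 426 character sets covering 154,992 positions in total, provided the sets do not overlap.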

<b>4) concatenated_cds_complete.phy:</b> a complete concatenated coding DNA sequence dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12. The file lists the sequences of 248 samples with 153,525 nucleotide positions (introns excluded) from 665 loci. Hyphens are used to represent gaps.

<b>5) concatenated_cds_complete_partition.nex:</b> the partitioning scheme for concatenated_cds_complete.phy. The file partitions the 153,525 nucleotide characters into 426 character sets and defines the best substitution model for each character set.

<b>6) concatenated_nt_reduced.phy:</b> a reduced concatenated nucleotide dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12. The file lists the sequences of 248 samples with 95,076 nucleotide positions (introns included) from 374 loci. Hyphens are used to represent gaps.

<b>7) concatenated_nt_reduced_partition.nex:</b> the partitioning scheme for concatenated_nt_reduced.phy. The file partitions the 95,076 nucleotide characters into 312 character sets and defines the best substitution model for each character set.

<b>8) concatenated_aa_complete.phy:</b> a complete concatenated amino acid dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12, corresponding to concatenated_cds_complete.phy. The file lists the sequences of 248 samples with 51,175 amino acid positions from 665 loci. Hyphens are used to represent gaps.

<b>9) concatenated_aa_complete_partition.nex:</b> the partitioning scheme for concatenated_aa_complete.phy. The file partitions the 51,175 amino acid characters into 426 character sets and defines the best substitution model for each character set.

<b>10) concatenated_aa_reduced.phy:</b> a reduced concatenated amino acid dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12, corresponding to concatenated_nt_reduced.phy. The file lists the sequences of 248 samples with 31,384 amino acid positions from 374 loci. Hyphens are used to represent gaps.
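The alignment lengths quoted above can be cross-checked with simple codon arithmetic, since each amino acid position corresponds to three coding nucleotides. The sketch below uses only the figures stated in the descriptions; reading the leftover positions as non-coding (intron) sites is an interpretation rather than something the description states.

```python
# Alignment lengths quoted from the file descriptions above.
NT_COMPLETE = 154_992    # concatenated_nt_complete.phy (introns included)
CDS_COMPLETE = 153_525   # concatenated_cds_complete.phy (introns excluded)
AA_COMPLETE = 51_175     # concatenated_aa_complete.phy
NT_REDUCED = 95_076      # concatenated_nt_reduced.phy (introns included)
AA_REDUCED = 31_384      # concatenated_aa_reduced.phy

def codon_consistent(n_nucleotides, n_amino_acids):
    """True if the nucleotide length is exactly three times the amino acid length."""
    return n_nucleotides == 3 * n_amino_acids

# The complete CDS and amino acid alignments agree exactly: 51,175 x 3 = 153,525.
print(codon_consistent(CDS_COMPLETE, AA_COMPLETE))

# Positions in the nucleotide alignments beyond the coding length.
extra_complete = NT_COMPLETE - CDS_COMPLETE
extra_reduced = NT_REDUCED - 3 * AA_REDUCED
print(extra_complete, extra_reduced)
```

No reduced CDS file is listed, so the reduced comparison derives the coding length from the amino acid alignment instead.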

<b>11) concatenated_aa_reduced_partition.nex:</b> the partitioning scheme for concatenated_aa_reduced.phy. The file partitions the 31,384 amino acid characters into 312 character sets and defines the best substitution model for each character set.

<b>12) Individual_gene_alignment.zip:</b> contains 426 FASTA files, each an alignment for one gene. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12, followed by multispecies coalescent analysis using ASTRAL v4.10.5 based on the consensus trees with a minimum average bootstrap value of 70.
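The bootstrap filter described for the gene trees can be sketched as follows. This is an illustration rather than the authors' script: it assumes each gene tree is a Newick string whose internal nodes carry a single numeric support label, and it reads "minimum average bootstrap value of 70" as a mean support of at least 70 per tree. The function names and the demo trees are hypothetical.

```python
import re

# A numeric label directly after ')' is read as that node's support value.
# Taxon names containing parentheses would confuse this simple pattern.
SUPPORT_RE = re.compile(r"\)(\d+(?:\.\d+)?)")

def mean_support(newick):
    """Mean of the internal-node support labels, or None if the tree has none."""
    values = [float(v) for v in SUPPORT_RE.findall(newick)]
    return sum(values) / len(values) if values else None

def keep_tree(newick, threshold=70.0):
    """True if the tree's mean support is at least the threshold."""
    mean = mean_support(newick)
    return mean is not None and mean >= threshold

# Hypothetical gene trees for illustration.
well_supported = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.1)60:0.05,E:0.3);"
poorly_supported = "((A:0.1,B:0.2)40:0.05,(C:0.1,D:0.1)60:0.05,E:0.3);"

kept = [t for t in (well_supported, poorly_supported) if keep_tree(t)]
print(len(kept))
```

Trees that pass the filter would then be concatenated, one Newick string per line, into the input file for ASTRAL.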
Date made available: Jan 1 2023
Publisher: University of Illinois at Urbana-Champaign

Keywords

  • Auchenorrhyncha, Cicadomorpha, Membracoidea, anchored hybrid enrichment
