TY - JOUR
T1 - Assessing support for Blaberoidea phylogeny suggests optimal locus quality
AU - Evangelista, Dominic
AU - Simon, Sabrina
AU - Wilson, Megan M.
AU - Kawahara, Akito Y.
AU - Kohli, Manpreet K.
AU - Ware, Jessica L.
AU - Wipfler, Benjamin
AU - Béthoux, Olivier
AU - Grandcolas, Philippe
AU - Legendre, Frédéric
N1 - Funding Information:
Thanks to the 1KITE consortium who supported this research with preliminary data and advice with software. Specifically, appreciation extends to Karen Meusemann, Alexander Donath, Bernhard Misof, Xin Zhou, Shanlin Liu, Ralph S. Peters, Lars Podsiadlowski, Ward Tollenaar, Mari Fujita, and Ryuichiro Machida. Huge thanks to all breeders (Nicolas Rousseaux, Tristan Shanahan, T.J. Ombrelle and Piotr Sterna), colleagues (Mike Picker), museums (MNHN, MFN, NHMUK and CAS) and curators (Jurgen Deckert and George Beccaloni) who assisted in providing specimens. Great appreciation to New England Biolabs, MycroArray (now Arbor Biosciences), Sara Ruane, Ciara-Mae Mendoza, Melissa Sanchez-Herrera, Steven Ramirez and Mihaela Glamoclija for providing assistance in the lab. Additional thanks to Brian O'Meara for guidance and advice. Great appreciation to the reviewers whose input helped us improve the manuscript greatly. This research could not have been completed without the support of NSF (award # 1608559), all other funding agencies, the MNHN - Paris, Rutgers University and the University of Tennessee - Knoxville. This work was supported by the National Science Foundation (award number 1608559) to DAE, FL and AK. The authors declare no conflicting interests.
Publisher Copyright:
© 2020 Royal Entomological Society
PY - 2021/1
Y1 - 2021/1
N2 - Phylogenomics seeks to use next-generation data to robustly infer an organism's evolutionary history. Yet, the practical caveats of phylogenomics motivate investigation of improved efficiency, particularly when the quality of phylogenies is questionable. To achieve improvements, one goal is to maintain or enhance the quality of phylogenetic inference while severely reducing dataset size. We approach this by assessing which kinds of loci in phylogenomic alignments provide the majority of support for a phylogenetic inference of cockroaches in Blaberoidea. We examine locus substitution rate, saturation, evolutionary divergence, rate heterogeneity, stabilizing selection, and a priori information content as traits that may determine optimality. Our controlled experimental design is based on 265 loci for 102 blaberoidean taxa and 22 outgroup species. Loci with high substitution rate, low saturation, low sequence distance, low rate heterogeneity, and strong stabilizing selection derive more support for phylogenetic relationships. We found that some phylogenetic information content estimators may not be meaningful for assessing information content a priori. We use these findings to design concatenated datasets with an optimized subsample of 100 loci. The tree inferred from the optimized subsample alignment was largely identical to that inferred from all 265 loci but with less evidence of long branch attraction, improved statistical support, and potential 4-6x improvements to computation time. Supported by phylogenetic and morphological evidence, we erect three newly named clades (Anallactinae Evangelista & Wipfler subfam. nov., Orkrasomeria tax. nov. Evangelista, Wipfler, & Béthoux and Hemithyrsocerini Evangelista tribe nov.) and propose other taxonomic modifications. The diagnosis of Pseudophyllodromiidae Grandcolas, 1996 is modified to accommodate Anallactinae and Pseudophyllodromiinae Vickery & Kevan, 1983. The diagnosis of Ectobiidae Brunner von Wattenwyl, 1865 is modified to add novel morphological characters.
AB - Phylogenomics seeks to use next-generation data to robustly infer an organism's evolutionary history. Yet, the practical caveats of phylogenomics motivate investigation of improved efficiency, particularly when the quality of phylogenies is questionable. To achieve improvements, one goal is to maintain or enhance the quality of phylogenetic inference while severely reducing dataset size. We approach this by assessing which kinds of loci in phylogenomic alignments provide the majority of support for a phylogenetic inference of cockroaches in Blaberoidea. We examine locus substitution rate, saturation, evolutionary divergence, rate heterogeneity, stabilizing selection, and a priori information content as traits that may determine optimality. Our controlled experimental design is based on 265 loci for 102 blaberoidean taxa and 22 outgroup species. Loci with high substitution rate, low saturation, low sequence distance, low rate heterogeneity, and strong stabilizing selection derive more support for phylogenetic relationships. We found that some phylogenetic information content estimators may not be meaningful for assessing information content a priori. We use these findings to design concatenated datasets with an optimized subsample of 100 loci. The tree inferred from the optimized subsample alignment was largely identical to that inferred from all 265 loci but with less evidence of long branch attraction, improved statistical support, and potential 4-6x improvements to computation time. Supported by phylogenetic and morphological evidence, we erect three newly named clades (Anallactinae Evangelista & Wipfler subfam. nov., Orkrasomeria tax. nov. Evangelista, Wipfler, & Béthoux and Hemithyrsocerini Evangelista tribe nov.) and propose other taxonomic modifications. The diagnosis of Pseudophyllodromiidae Grandcolas, 1996 is modified to accommodate Anallactinae and Pseudophyllodromiinae Vickery & Kevan, 1983. The diagnosis of Ectobiidae Brunner von Wattenwyl, 1865 is modified to add novel morphological characters.
UR - http://www.scopus.com/inward/record.url?scp=85090975146&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090975146&partnerID=8YFLogxK
U2 - 10.1111/syen.12454
DO - 10.1111/syen.12454
M3 - Article
AN - SCOPUS:85090975146
SN - 0307-6970
VL - 46
SP - 157
EP - 171
JO - Systematic Entomology
JF - Systematic Entomology
IS - 1
ER -