1
Wong MK, Chen WJ. Exploring the phylogeny and depth evolution of cusk eels and their relatives (Ophidiiformes: Ophidioidei). Mol Phylogenet Evol 2024; 199:108164. [PMID: 39084413 DOI: 10.1016/j.ympev.2024.108164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 07/27/2024] [Accepted: 07/27/2024] [Indexed: 08/02/2024]
Abstract
With 289 known species in 51 genera, the ophidiiform family Ophidiidae, together with its relatives in the Carapidae (36 species in eight genera) of the same suborder, Ophidioidei, dominates the deep sea, although some species also occur in shallow-water habitats. Despite their high species diversity in the deep sea and wide bathymetric distributions, their phylogenetic relationships and evolution remain unexplored, due in part to sampling difficulties. Thanks to the biodiversity exploratory program entitled "Tropical Deep-Sea Benthos" and joint efforts between Taiwanese and French teams to sample different localities across the Indo-West Pacific over the last two decades, we were able to compile comprehensive datasets for investigation. In this study, 59 samples representing 36 of the 59 known ophidioid genera were selected and used to construct a multi-gene dataset to infer the phylogenetic relationships of ophidioid fishes and their relatives. Our results reveal that the Ophidiidae forms a paraphyletic group with respect to the Carapidae. The four main clades of Ophidioidei resolved are (1) a clade comprising species from the subfamily Brotulinae; (2) a clade that includes species in the genera Acanthonus and Xyelacyba; (3) a clade grouping Hypopleuron caninum with species from the family Carapidae; and (4) a clade containing the species in the subfamilies Brotulotaeniinae, Neobythitinae (in part), and Ophidiinae. Accordingly, we suggest the following revisions based on our results and proposed morphological diagnoses. The subfamily Brotulinae should be elevated to the family level. The genera Xyelacyba and probably Tauredophidium (unsampled in this study) should be included in the newly established family Acanthonidae with Acanthonus. The families Carapidae and Ophidiidae are redefined.
Our time-calibrated phylogenetic and ancestral depth reconstructions enable us to clarify the evolutionary history of ophidiiform fishes and infer past patterns of species distributions at different depths. While Ophidiiformes is inferred to have originated in shallow waters around 96.25 million years ago (Mya), the common ancestor of the Ophidioidei is inferred to have invaded the deep sea around 90.22 Mya; these dates coincide with Oceanic Anoxic Event 2 (OAE2). The observed bathymetric distribution patterns in Ophidioidei most likely point to the mesopelagic zone as the center of origin and diversification. This was followed by multiple depth transitions or range expansions towards either shallower waters or greater depth zones, which were likely triggered by past climate changes during the Paleogene-Neogene.
Affiliation(s)
- Man-Kwan Wong
- Institute of Oceanography, National Taiwan University, No.1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan.
- Wei-Jen Chen
- Institute of Oceanography, National Taiwan University, No.1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan.
2
Zhang M, Song Y, Wang C, Sun G, Zhuang L, Guo M, Ren L, Wangdue S, Dong G, Dai Q, Cao P, Yang R, Liu F, Feng X, Bennett EA, Zhang X, Chen X, Wang F, Luan F, Dong W, Lu G, Hao D, Hou H, Wang H, Qiao H, Wang Z, Hu X, He W, Xi L, Wang W, Shao J, Sun Z, Yue L, Ding Y, Tashi N, Tsho Y, Tong Y, Yang Y, Zhu S, Miao B, Wang W, Zhang L, Hu S, Ni X, Fu Q. Ancient Mitogenomes Reveal the Maternal Genetic History of East Asian Dogs. Mol Biol Evol 2024; 41:msae062. [PMID: 38507661 PMCID: PMC11003542 DOI: 10.1093/molbev/msae062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 02/27/2024] [Accepted: 03/11/2024] [Indexed: 03/22/2024] Open
Abstract
Recent studies have suggested that dogs were domesticated during the Last Glacial Maximum (LGM) in Siberia, which contrasts with previously proposed domestication centers (e.g. Europe, the Middle East, and East Asia). Ancient DNA provides a powerful resource for the study of mammalian evolution and has been widely used to understand the genetic history of domestic animals. To understand the maternal genetic history of East Asian dogs, we assembled a complete mitogenome dataset of 120 East Asian canids from 38 archaeological sites, including 102 newly sequenced individuals dating from 12.9 to 1 ka BP (1,000 years before present). The majority (112/119, 94.12%) belonged to haplogroup A, and half of these (55/112, 49.11%) belonged to sub-haplogroup A1b. Most existing mitochondrial haplogroups were present in ancient East Asian dogs. However, mitochondrial lineages in ancient northern dogs (northeastern Eurasia and northern East Asia) were deeper and older than those in southern East Asian dogs. Our results suggest that East Asian dogs originated from northeastern Eurasian populations after the LGM, dispersing in two possible directions after domestication. Western Eurasian (Europe and the Middle East) dog maternal ancestries genetically influenced East Asian dogs from approximately 4 ka BP, increased dramatically after 3 ka BP, and afterwards largely replaced most primary maternal lineages in northern East Asia. Additionally, at least three major mitogenome sub-haplogroups of haplogroup A (A1a, A1b, and A3) reveal at least two major dispersal waves onto the Qinghai-Tibet Plateau in ancient times, indicating eastern (A1b and A3) and western (A1a) Eurasian origins.
Affiliation(s)
- Ming Zhang
- China-Central Asia “the Belt and Road” Joint Laboratory on Human and Environment Research, Key Laboratory of Cultural Heritage Research and Conservation, School of Culture Heritage, Northwest University, Xi’an, China
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Yanbo Song
- School of Archaeology, Shandong University, Jinan, China
- Caihui Wang
- China-Central Asia “the Belt and Road” Joint Laboratory on Human and Environment Research, Key Laboratory of Cultural Heritage Research and Conservation, School of Culture Heritage, Northwest University, Xi’an, China
- Guoping Sun
- Zhejiang Provincial Institute of Cultural Relics and Archaeology, Hangzhou, China
- Lele Ren
- School of History and Culture, Lanzhou University, Lanzhou, China
- Shargan Wangdue
- Tibet Institute for Conservation and Research of Cultural Relics, Lhasa, China
- Guanghui Dong
- Key Laboratory of Western China's Environmental Systems (Ministry of Education), College of Earth and Environmental Sciences, Lanzhou University, Lanzhou, China
- Qingyan Dai
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Peng Cao
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Ruowei Yang
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Feng Liu
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Xiaotian Feng
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- E Andrew Bennett
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Xiaoling Zhang
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Xi Chen
- Department of Cultural Heritage and Museology, Nanjing Normal University, Nanjing, China
- Fen Wang
- School of Archaeology, Shandong University, Jinan, China
- Fengshi Luan
- School of Archaeology, Shandong University, Jinan, China
- Wenbin Dong
- Shandong Provincial Institute of Cultural Relics and Archaeology, Jinan, China
- Guoquan Lu
- School of Archaeology, Shandong University, Jinan, China
- Daohua Hao
- Shandong Provincial Institute of Cultural Relics and Archaeology, Jinan, China
- Hongwei Hou
- Gansu Provincial Institute of Cultural Relics and Archaeology, Lanzhou, China
- Hui Wang
- Gansu Provincial Institute of Cultural Relics and Archaeology, Lanzhou, China
- Fudan Archaeological Science Institute, Fudan University, Shanghai, China
- Hong Qiao
- Qinghai Provincial Cultural Relics and Archaeology Institute, Xining, China
- Zhongxin Wang
- Qinghai Provincial Cultural Relics and Archaeology Institute, Xining, China
- Xiaojun Hu
- Qinghai Provincial Cultural Relics and Archaeology Institute, Xining, China
- Wei He
- Tibet Institute for Conservation and Research of Cultural Relics, Lhasa, China
- Lin Xi
- Shaanxi Academy of Archaeology, Xi’an, China
- Weilin Wang
- School of Archaeology and Museology, Shanxi University, Taiyuan, China
- Jing Shao
- Shaanxi Academy of Archaeology, Xi’an, China
- Yan Ding
- Shaanxi Academy of Archaeology, Xi’an, China
- Norbu Tashi
- Tibet Institute for Conservation and Research of Cultural Relics, Lhasa, China
- Yang Tsho
- Tibet Institute for Conservation and Research of Cultural Relics, Lhasa, China
- Yan Tong
- Tibet Institute for Conservation and Research of Cultural Relics, Lhasa, China
- Yangheshan Yang
- School of Ecological and Environmental Sciences, East China Normal University, Shanghai, China
- Shilun Zhu
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- University of the Chinese Academy of Sciences, Beijing, China
- Bo Miao
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Wenjun Wang
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Science and Technology Archaeology, National Centre for Archaeology, Beijing, China
- Lizhao Zhang
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- Songmei Hu
- Joint International Research Laboratory of Environmental and Social Archaeology, Shandong University, Qingdao, China
- Shaanxi Academy of Archaeology, Xi’an, China
- Xijun Ni
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- University of the Chinese Academy of Sciences, Beijing, China
- Qiaomei Fu
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China
- University of the Chinese Academy of Sciences, Beijing, China
3
Seo HW, Wassano NS, Amir Rawa MS, Nickles GR, Damasio A, Keller NP. A Timeline of Biosynthetic Gene Cluster Discovery in Aspergillus fumigatus: From Characterization to Future Perspectives. J Fungi (Basel) 2024; 10:266. [PMID: 38667937 PMCID: PMC11051388 DOI: 10.3390/jof10040266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 03/27/2024] [Accepted: 03/28/2024] [Indexed: 04/28/2024] Open
Abstract
In 1999, the first biosynthetic gene cluster (BGC), synthesizing the virulence factor DHN melanin, was characterized in Aspergillus fumigatus. Since then, 19 additional BGCs have been linked to specific secondary metabolites (SMs) in this species. Here, we provide a comprehensive timeline of A. fumigatus BGC discovery and find that initial advances centered around the commonly expressed SMs where chemical structure informed rational identification of the producing BGC (e.g., gliotoxin, fumigaclavine, fumitremorgin, pseurotin A, helvolic acid, fumiquinazoline). Further advances followed the transcriptional profiling of a ΔlaeA mutant, which aided in the identification of endocrocin, fumagillin, hexadehydroastechrome, trypacidin, and fumisoquin BGCs. These SMs and their precursors are the commonly produced metabolites in most A. fumigatus studies. Characterization of other BGC/SM pairs required additional efforts, such as induction treatments, including co-culture with bacteria (fumicycline/neosartoricin, fumigermin) or growth under copper starvation (fumivaline, fumicicolin). Finally, four BGC/SM pairs were discovered via overexpression technologies, including the use of heterologous hosts (fumicycline/neosartoricin, fumihopaside, sphingofungin, and sartorypyrone). Initial analysis of the two most studied A. fumigatus isolates, Af293 and A1160, suggested that both harbored ca. 34-36 BGCs. However, an examination of 264 available genomes of A. fumigatus shows up to 20 additional BGCs, with some strains showing considerable variations in BGC number and composition. These new BGCs present a new frontier in the future of secondary metabolism characterization in this important species.
Affiliation(s)
- Hye-Won Seo
- Department of Medical Microbiology and Immunology, University of Wisconsin, Madison, WI 53706, USA
- Natalia S. Wassano
- Department of Medical Microbiology and Immunology, University of Wisconsin, Madison, WI 53706, USA
- Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), São Paulo 13083-970, Brazil
- Mira Syahfriena Amir Rawa
- Department of Medical Microbiology and Immunology, University of Wisconsin, Madison, WI 53706, USA
- Grant R. Nickles
- Department of Medical Microbiology and Immunology, University of Wisconsin, Madison, WI 53706, USA
- André Damasio
- Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), São Paulo 13083-970, Brazil
- Nancy P. Keller
- Department of Medical Microbiology and Immunology, University of Wisconsin, Madison, WI 53706, USA
- Department of Plant Pathology, University of Wisconsin, Madison, WI 53706, USA
4
Kiledal EA, Reitz LA, Kuiper EQ, Evans J, Siddiqui R, Denef VJ, Dick GJ. Comparative genomic analysis of Microcystis strain diversity using conserved marker genes. Harmful Algae 2024; 132:102580. [PMID: 38331539 DOI: 10.1016/j.hal.2024.102580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 01/08/2024] [Accepted: 01/09/2024] [Indexed: 02/10/2024]
Abstract
Microcystis-dominated cyanobacterial harmful algal blooms (cyanoHABs) have a global impact on freshwater environments, affecting both wildlife and human health. Microcystis diversity and function in field samples and laboratory cultures can be determined by sequencing whole genomes of cultured isolates or natural populations, but these methods remain computationally and financially expensive. Amplicon sequencing of marker genes is a lower-cost, higher-throughput alternative for characterizing strain composition and diversity in mixed samples. However, the selection of appropriate marker gene region(s) and primers requires prior understanding of the relationship between single-gene genotype, whole-genome content, and phenotype. To identify phylogenetic markers of Microcystis strain diversity, we compared phylogenetic trees built from each of 2,351 individual core genes to an established phylogeny and assessed the ability of these core genes to predict whole-genome content and bioactive compound genotypes. We identified single-copy core genes better able to resolve Microcystis phylogenies than previously identified marker genes. We developed primers suitable for current Illumina-based amplicon sequencing with near-complete coverage of available Microcystis genomes and demonstrate that they outperform existing options for assessing Microcystis strain composition. Results showed that genetic markers can be used to infer Microcystis gene content and phenotypes such as potential production of bioactive compounds, although marker performance varies by bioactive compound gene and sequence similarity. Finally, we demonstrate that these markers can be used to characterize the Microcystis strain composition of laboratory or field samples like those collected for surveillance and modeling of Microcystis-dominated cyanobacterial harmful algal blooms.
Affiliation(s)
- E Anders Kiledal
- Department of Earth and Environmental Sciences, University of Michigan, 2534 North University Building, 1100 North University Avenue Ave, Rm. 2004, Ann Arbor, MI 48109-1005, USA.
- Laura A Reitz
- Department of Earth and Environmental Sciences, University of Michigan, 2534 North University Building, 1100 North University Avenue Ave, Rm. 2004, Ann Arbor, MI 48109-1005, USA
- Esmée Q Kuiper
- Department of Earth and Environmental Sciences, University of Michigan, 2534 North University Building, 1100 North University Avenue Ave, Rm. 2004, Ann Arbor, MI 48109-1005, USA
- Jacob Evans
- Department of Ecology and Evolutionary Biology, University of Michigan, 2220 Biological Sciences Building, 1105 North University Avenue, Ann Arbor, MI 48109-1005, USA
- Ruqaiya Siddiqui
- Microbiome Core, University of Michigan, 1500 MSRB 1, 1150W Medical Center Drive, Ann Arbor, MI 48109-5666, USA
- Vincent J Denef
- Department of Ecology and Evolutionary Biology, University of Michigan, 2220 Biological Sciences Building, 1105 North University Avenue, Ann Arbor, MI 48109-1005, USA
- Gregory J Dick
- Department of Earth and Environmental Sciences, University of Michigan, 2534 North University Building, 1100 North University Avenue Ave, Rm. 2004, Ann Arbor, MI 48109-1005, USA; Cooperative Institute for Great Lakes Research, University of Michigan, 4040 Dana Building, 440 Church Street, Ann Arbor, MI 48109-1041, USA
5
Piwczyński M, Granjon L, Trzeciak P, Carlos Brito J, Oana Popa M, Daba Dinka M, Johnston NP, Boratyński Z. Unraveling phylogenetic relationships and species boundaries in the arid adapted Gerbillus rodents (Muridae: Gerbillinae) by RAD-seq data. Mol Phylogenet Evol 2023; 189:107913. [PMID: 37659480 DOI: 10.1016/j.ympev.2023.107913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 08/25/2023] [Accepted: 08/28/2023] [Indexed: 09/04/2023]
Abstract
Gerbillus is one of the most speciose genera among rodents, with ca. 51 recognized species. Previous attempts to reconstruct the evolutionary history of Gerbillus mainly relied on the mitochondrial cyt-b marker as a source of phylogenetic information. In this study, we utilize RAD-seq genomic data from 37 specimens representing 11 species to reconstruct the phylogenetic tree for Gerbillus, applying concatenation and coalescence methods. We identified four highly supported clades corresponding to the traditionally recognized subgenera: Dipodillus, Gerbillus, Hendecapleura and Monodia. Only two uncertain branches were detected in the resulting trees, with one leading to diversification of the main lineages in the genus, recognized by quartet sampling analysis as uncertain due to possible introgression. We also examined species boundaries for four pairs of sister taxa, including potentially new species from Morocco, using SNAPP. The results strongly supported a speciation model in which all taxa are treated as separate species. The dating analyses confirmed the Plio-Pleistocene diversification of the genus, with the uncertain branch coinciding with the beginning of aridification of the Sahara at the Plio-Pleistocene boundary. This study aligns well with the earlier analyses based on the cyt-b marker, reaffirming its suitability as an adequate marker for estimating genetic diversity in Gerbillus.
Affiliation(s)
- Marcin Piwczyński
- Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland.
- Laurent Granjon
- CBGP, IRD, CIRAD, INRAE, Institut Agro, Université de Montpellier, Montpellier, France
- Paulina Trzeciak
- Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland
- José Carlos Brito
- CIBIO-InBio, Research Center in Biodiversity and Genetic Resources, University of Porto, Campus de Vairão, Rua Padre Armando Quintas 7, 4485-661 Vairão, Portugal; BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, Vairão, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Madalina Oana Popa
- Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland; "Stejarul" Research Centre for Biological Sciences, National Institute of Research and Development for Biological Sciences, Alexandru cel Bun 6, RO-610004, Piatra Neamţ, Romania
- Mergi Daba Dinka
- Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland
- Nikolas P Johnston
- School of Life Sciences, University of Technology Sydney, 15 Broadway, Ultimo, NSW 2007, Australia; Centre for Sustainable Ecosystem Solutions, School of Earth, Atmospheric and Life Sciences, University of Wollongong, Northfields Ave, Wollongong, NSW 2500, Australia
- Zbyszek Boratyński
- CIBIO-InBio, Research Center in Biodiversity and Genetic Resources, University of Porto, Campus de Vairão, Rua Padre Armando Quintas 7, 4485-661 Vairão, Portugal; BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, Vairão, Portugal
6
Nickles GR, Oestereicher B, Keller NP, Drott M. Mining for a new class of fungal natural products: the evolution, diversity, and distribution of isocyanide synthase biosynthetic gene clusters. Nucleic Acids Res 2023; 51:7220-7235. [PMID: 37427794 PMCID: PMC10415135 DOI: 10.1093/nar/gkad573] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 06/16/2023] [Accepted: 07/06/2023] [Indexed: 07/11/2023] Open
Abstract
The products of non-canonical isocyanide synthase (ICS) biosynthetic gene clusters (BGCs) mediate pathogenesis, microbial competition, and metal-homeostasis through metal-associated chemistry. We sought to enable research into this class of compounds by characterizing the biosynthetic potential and evolutionary history of these BGCs across the Fungal Kingdom. We amalgamated a pipeline of tools to predict BGCs based on shared promoter motifs and located 3800 ICS BGCs in 3300 genomes, making ICS BGCs the fifth largest class of specialized metabolites compared to canonical classes found by antiSMASH. ICS BGCs are not evenly distributed across fungi, with evidence of gene-family expansions in several Ascomycete families. We show that the ICS dit1/2 gene cluster family (GCF), which was previously studied only in yeast, is present in ∼30% of all Ascomycetes. The dit variety ICS exhibits greater similarity to bacterial ICS than to other fungal ICS, suggesting a potential convergence of the ICS backbone domain. The evolutionary origins of the dit GCF in Ascomycota are ancient, and these genes are diversifying in some lineages. Our results create a roadmap for future research into ICS BGCs. We developed a website (https://isocyanides.fungi.wisc.edu/) that facilitates the exploration and downloading of all identified fungal ICS BGCs and GCFs.
Affiliation(s)
- Grant R Nickles
- Department of Medical Microbiology and Immunology, University of Wisconsin—Madison, Madison, WI 53706, USA
- Nancy P Keller
- Department of Medical Microbiology and Immunology, University of Wisconsin—Madison, Madison, WI 53706, USA
- Department of Plant Pathology, University of Wisconsin—Madison, Madison, WI 53706, USA
- Milton T Drott
- USDA-ARS Cereal Disease Lab (CDL), St. Paul, MN 55108, USA
7
Samaradiwakara NP, de Farias ARG, Tennakoon DS, Aluthmuhandiram JVS, Bhunjun CS, Chethana KWT, Kumla J, Lumyong S. Appendage-Bearing Sordariomycetes from Dipterocarpus alatus Leaf Litter in Thailand. J Fungi (Basel) 2023; 9:625. [PMID: 37367561 DOI: 10.3390/jof9060625] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/13/2023] [Revised: 05/25/2023] [Accepted: 05/26/2023] [Indexed: 06/28/2023] Open
Abstract
Leaf litter is an essential functional aspect of forest ecosystems, acting as a source of organic matter, a protective layer in forest soils, and a nurturing habitat for micro- and macro-organisms. Through their successional occurrence, litter-inhabiting microfungi play a key role in litter decomposition and nutrient recycling. Despite their importance in terrestrial ecosystems and their abundance and diversity, information on the taxonomy, diversity, and host preference of these decomposer taxa is scarce. This study aims to clarify the taxonomy and phylogeny of four saprobic fungal taxa inhabiting Dipterocarpus alatus leaf litter. Leaf litter samples were collected from Doi Inthanon National Park in Chiang Mai, northern Thailand. Fungal isolates were characterized based on morphology and molecular phylogeny of the nuclear ribosomal DNA (ITS, LSU) and protein-coding genes (tub2, tef1-α, rpb2). One novel saprobic species, Ciliochorella dipterocarpi, and two new host records, Pestalotiopsis dracontomelon and Robillarda australiana, are introduced. The newly described taxa are compared with similar species, and comprehensive descriptions, micrographs, and phylogenetic trees are provided.
Affiliation(s)
- Nethmini P Samaradiwakara
- Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Research Center of Microbial Diversity and Sustainable Utilization, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- School of Science, Mae Fah Luang University, Chiang Rai 57100, Thailand
- Danushka S Tennakoon
- Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Research Center of Microbial Diversity and Sustainable Utilization, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Janith V S Aluthmuhandiram
- School of Science, Mae Fah Luang University, Chiang Rai 57100, Thailand
- Center of Excellence in Fungal Research, Mae Fah Luang University, Chiang Rai 57100, Thailand
- Beijing Key Laboratory of Environment Friendly Management on Fruit Diseases and Pests in North China, Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Chitrabhanu S Bhunjun
- School of Science, Mae Fah Luang University, Chiang Rai 57100, Thailand
- Center of Excellence in Fungal Research, Mae Fah Luang University, Chiang Rai 57100, Thailand
- K W Thilini Chethana
- School of Science, Mae Fah Luang University, Chiang Rai 57100, Thailand
- Center of Excellence in Fungal Research, Mae Fah Luang University, Chiang Rai 57100, Thailand
- Jaturong Kumla
- Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Research Center of Microbial Diversity and Sustainable Utilization, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Saisamorn Lumyong
- Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Research Center of Microbial Diversity and Sustainable Utilization, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
- Academy of Science, The Royal Society of Thailand, Bangkok 10300, Thailand
8
Nickles GR, Oestereicher B, Keller NP, Drott MT. Mining for a New Class of Fungal Natural Products: The Evolution, Diversity, and Distribution of Isocyanide Synthase Biosynthetic Gene Clusters. bioRxiv 2023:2023.04.17.537281. [PMID: 37131656 PMCID: PMC10153163 DOI: 10.1101/2023.04.17.537281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
The products of non-canonical isocyanide synthase (ICS) biosynthetic gene clusters (BGCs) have notable bioactivities that mediate pathogenesis, microbial competition, and metal-homeostasis through metal-associated chemistry. We sought to enable research into this class of compounds by characterizing the biosynthetic potential and evolutionary history of these BGCs across the Fungal Kingdom. We developed the first genome-mining pipeline to identify ICS BGCs, locating 3,800 ICS BGCs in 3,300 genomes. Genes in these clusters share promoter motifs and are maintained in contiguous groupings by natural selection. ICS BGCs are not evenly distributed across fungi, with evidence of gene-family expansions in several Ascomycete families. We show that the ICS dit1/2 gene cluster family (GCF), which was thought to only exist in yeast, is present in ∼30% of all Ascomycetes, including many filamentous fungi. The evolutionary history of the dit GCF is marked by deep divergences and phylogenetic incompatibilities that raise questions about convergent evolution and suggest selection or horizontal gene transfers have shaped the evolution of this cluster in some yeast and dimorphic fungi. Our results create a roadmap for future research into ICS BGCs. We developed a website (www.isocyanides.fungi.wisc.edu) that facilitates the exploration, filtering, and downloading of all identified fungal ICS BGCs and GCFs.
Affiliation(s)
- Grant R. Nickles
- Department of Medical Microbiology and Immunology, University of Wisconsin—Madison, Madison, WI 53706, USA
- Nancy P. Keller
- Department of Medical Microbiology and Immunology, University of Wisconsin—Madison, Madison, WI 53706, USA
- Department of Plant Pathology, University of Wisconsin—Madison, Madison, WI 53706, USA
|
9
|
Chambers EA, Tarvin RD, Santos JC, Ron SR, Betancourth‐Cundar M, Hillis DM, Matz MV, Cannatella DC. 2b or not 2b? 2bRAD is an effective alternative to ddRAD for phylogenomics. Ecol Evol 2023; 13:e9842. [PMID: 36911313 PMCID: PMC9994478 DOI: 10.1002/ece3.9842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 02/02/2023] [Accepted: 02/03/2023] [Indexed: 03/10/2023] Open
Abstract
Restriction-site-associated DNA sequencing (RADseq) has become an accessible way to obtain genome-wide data in the form of single-nucleotide polymorphisms (SNPs) for phylogenetic inference. Nonetheless, how differences in RADseq methods influence phylogenetic estimation is poorly understood because most comparisons have largely relied on conceptual predictions rather than empirical tests. We examine how differences in ddRAD and 2bRAD data influence phylogenetic estimation in two non-model frog groups. We compare the impact of method choice on phylogenetic information, missing data, and allelic dropout, considering different sequencing depths. Given that researchers must balance input (funding, time) with output (amount and quality of data), we also provide comparisons of laboratory effort, computational time, monetary costs, and the repeatability of library preparation and sequencing. Both 2bRAD and ddRAD methods estimated well-supported trees, even at low sequencing depths, and had comparable amounts of missing data, patterns of allelic dropout, and phylogenetic signal. Compared to ddRAD, 2bRAD produced more repeatable datasets, had simpler laboratory protocols, and had an overall faster bioinformatics assembly. However, many fewer parsimony-informative sites per SNP were obtained from 2bRAD data when using native pipelines, highlighting a need for further investigation into the effects of each pipeline on resulting datasets. Our study underscores the importance of comparing RADseq methods, such as expected results and theoretical performance using empirical datasets, before undertaking costly experiments.
Affiliation(s)
- E. Anne Chambers
- Department of Integrative Biology and Biodiversity Center, University of Texas at Austin, Austin, Texas, USA
- Department of Environmental Science, Policy, and Management and Museum of Vertebrate Zoology, University of California Berkeley, Berkeley, California, USA
- Rebecca D. Tarvin
- Department of Integrative Biology and Biodiversity Center, University of Texas at Austin, Austin, Texas, USA
- Department of Integrative Biology and Museum of Vertebrate Zoology, University of California Berkeley, Berkeley, California, USA
- Juan C. Santos
- Department of Biological Sciences, St John's University, New York, New York, USA
- Santiago R. Ron
- Museo de Zoología, Escuela de Ciencias Biológicas, Pontificia Universidad Católica del Ecuador, Quito, Ecuador
- David M. Hillis
- Department of Integrative Biology and Biodiversity Center, University of Texas at Austin, Austin, Texas, USA
- Mikhail V. Matz
- Department of Integrative Biology and Biodiversity Center, University of Texas at Austin, Austin, Texas, USA
- David C. Cannatella
- Department of Integrative Biology and Biodiversity Center, University of Texas at Austin, Austin, Texas, USA
|
10
|
Out of chaos: Phylogenomics of Asian Sonerileae. Mol Phylogenet Evol 2022; 175:107581. [PMID: 35810973 DOI: 10.1016/j.ympev.2022.107581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2022] [Revised: 05/23/2022] [Accepted: 05/26/2022] [Indexed: 11/22/2022]
Abstract
Sonerileae is a diverse Melastomataceae lineage comprising ca. 1000 species in 44 genera, with >70% of genera and species distributed in Asia. Asian Sonerileae are taxonomically intractable with obscure generic circumscriptions. The backbone phylogeny of this group remains poorly resolved, possibly due to complexity caused by rapid species radiation in early and middle Miocene, which hampers further systematic study. Here, we used genome resequencing data to reconstruct the phylogeny of Asian Sonerileae. Three parallel datasets, viz. single-copy ortholog (SCO), genomic SNPs, and whole plastome, were assembled from genome resequencing data of 205 species for this purpose. Based on these genome-scale data, we provided the first well resolved phylogeny of Asian Sonerileae, with 34 major clades identified and 74% of the interclade relationships consistently resolved by both SCO and genomic data. Meanwhile, widespread phylogenetic discordance was detected among SCO gene trees as well as species trees reconstructed using different tree estimation methods (concatenation/site-based coalescent method/summary method) or different datasets (SCO/genomic/plastome). We explored sources of discordance using multiple approaches and found that the observed discordance in Asian Sonerileae was mainly caused by a combination of biased distribution of missing data, random noise from uninformative genes, incomplete lineage sorting, and hybridization/introgression. Exploration of these sources can enable us to generate hypotheses for future testing, which is the first step towards understanding the evolution of Asian Sonerileae. We also detected high levels of homoplasy for some characters traditionally used in taxonomy, which explains current chaotic generic delimitations. The backbone phylogeny of Asian Sonerileae revealed in this study offers a solid basis for future taxonomic revision at the generic level.
|
11
|
Mai U, Mirarab S. Completing gene trees without species trees in sub-quadratic time. Bioinformatics 2022; 38:1532-1541. [PMID: 34978565 DOI: 10.1093/bioinformatics/btab875] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/23/2021] [Revised: 11/27/2021] [Accepted: 12/30/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: As genome-wide reconstruction of phylogenetic trees becomes more widespread, limitations of available data are being appreciated more than ever before. One issue is that phylogenomic datasets are riddled with missing data, and gene trees, in particular, almost always lack representatives from some species otherwise available in the dataset. Since many downstream applications of gene trees require or can benefit from access to complete gene trees, it will be beneficial to algorithmically complete gene trees. Also, gene trees are often unrooted, and rooting them is useful for downstream applications. While completing and rooting a gene tree with respect to a given species tree has been studied, those problems are not studied in depth when we lack such a reference species tree.
RESULTS: We study completion of gene trees without a need for a reference species tree. We formulate an optimization problem to complete the gene trees while minimizing their quartet distance to the given set of gene trees. We extend a seminal algorithm by Brodal et al. to solve this problem in quasi-linear time. In simulated studies and on a large empirical data, we show that completion of gene trees using other gene trees is relatively accurate and, unlike the case where a species tree is available, is unbiased.
AVAILABILITY AND IMPLEMENTATION: Our method, tripVote, is available at https://github.com/uym2/tripVote.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Uyen Mai
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA 92093, USA
- Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA 92093, USA
|
12
|
Jamdade R, Upadhyay M, Al Shaer K, Al Harthi E, Al Sallani M, Al Jasmi M, Al Ketbi A. Evaluation of Arabian Vascular Plant Barcodes (rbcL and matK): Precision of Unsupervised and Supervised Learning Methods towards Accurate Identification. Plants (Basel) 2021; 10:plants10122741. [PMID: 34961211 PMCID: PMC8708657 DOI: 10.3390/plants10122741] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2021] [Revised: 09/16/2021] [Accepted: 09/23/2021] [Indexed: 06/14/2023]
Abstract
Arabia is the largest peninsula in the world, with >3000 species of vascular plants. Not much effort has been made to generate a multi-locus marker barcode library to identify and discriminate the recorded plant species. This study aimed to determine the reliability of the available Arabian plant barcodes (>1500; rbcL and matK) at the public repository (NCBI GenBank) using the unsupervised and supervised methods. Comparative analysis was carried out with the standard dataset (FINBOL) to assess the methods and markers' reliability. Our analysis suggests that from the unsupervised method, TaxonDNA's All Species Barcode criterion (ASB) exhibits the highest accuracy for rbcL barcodes, followed by the matK barcodes using the aligned dataset (FINBOL). However, for the Arabian plant barcode dataset (GBMA), the supervised method performed better than the unsupervised method, where the Random Forest and K-Nearest Neighbor (gappy kernel) classifiers were robust enough. These classifiers successfully recognized true species from both barcode markers belonging to the aligned and alignment-free datasets, respectively. The multi-class classifier showed high species resolution following the two classifiers, though its performance declined when employed to recognize true species. Similar results were observed for the FINBOL dataset through the supervised learning approach; overall, matK marker showed higher accuracy than rbcL. However, the lower rate of species identification in matK in GBMA data could be due to the higher evolutionary rate or gaps and missing data, as observed for the ASB criterion in the FINBOL dataset. Further, a lower number of sequences and singletons could also affect the rate of species resolution, as observed in the GBMA dataset. The GBMA dataset lacks sufficient species membership. We would encourage the taxonomists from the Arabian Peninsula to join our campaign on the Arabian Barcode of Life at the Barcode of Life Data (BOLD) systems. 
Our efforts together could help improve the rate of species identification for the Arabian Vascular plants.
Affiliation(s)
- Rahul Jamdade
- Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority, Sharjah P.O. Box 2926, United Arab Emirates
- Maulik Upadhyay
- Population Genomics Group, Department of Veterinary Sciences, Ludwig Maximillians University, 80539 Munich, Germany
- Khawla Al Shaer
- Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority, Sharjah P.O. Box 2926, United Arab Emirates
- Eman Al Harthi
- Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority, Sharjah P.O. Box 2926, United Arab Emirates
- Mariam Al Sallani
- Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority, Sharjah P.O. Box 2926, United Arab Emirates
- Mariam Al Jasmi
- Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority, Sharjah P.O. Box 2926, United Arab Emirates
- Asma Al Ketbi
- Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority, Sharjah P.O. Box 2926, United Arab Emirates
|
13
|
Abstract
The growing popularity of herbal medicine has increased the demand for medicinal orchids in global markets, leading to their overharvesting from natural habitats for illegal trade. To stop such illegal trade, correct identification of orchid species from their traded products is a foremost requirement. Different species of medicinal orchids are traded as dried or fresh parts (tubers, pseudobulbs, stems), which look so similar that identifying them by morphological observation alone is almost impossible. DNA barcoding could be an important method for overcoming this problem and accurately identifying medicinal orchids. This research therefore evaluated DNA barcoding of medicinal orchids in Asia, where illegal trade of medicinal orchids has long existed. Using genetic distance, similarity-based, and tree-based methods on nearly 7,000 sequences from five single barcodes (ITS, ITS2, matK, rbcL, trnH-psbA) and their seven combinations, this study revealed that DNA barcoding is effective for identifying medicinal orchids. Among single loci, ITS performed best, whereas ITS + matK was the most efficient multi-locus barcode. A barcode library for identifying medicinal orchids has been established; it contains about 7,000 sequences of 380 species (i.e. 90%) of medicinal orchids in Asia.
|
14
|
Nazareno AG, Knowles LL. There Is No 'Rule of Thumb': Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates. Front Plant Sci 2021; 12:677009. [PMID: 34721447 PMCID: PMC8551369 DOI: 10.3389/fpls.2021.677009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2021] [Accepted: 06/16/2021] [Indexed: 06/13/2023]
Abstract
The application of high-density polymorphic single-nucleotide polymorphisms (SNP) markers derived from high-throughput sequencing methods has heralded plenty of biological questions about the linkages of processes operating at micro- and macroevolutionary scales. However, the effects of SNP filtering practices on population genetic inference have received much less attention. By performing sensitivity analyses, we empirically investigated how decisions about the percentage of missing data (MD) and the minor allele frequency (MAF) set in bioinformatic processing of genomic data affect direct (i.e., parentage analysis) and indirect (i.e., fine-scale spatial genetic structure - SGS) gene flow estimates. We focus specifically on these manifestations in small plant populations, and particularly, in the rare tropical plant species Dinizia jueirana-facao, where assumptions implicit to analytical procedures for accurate estimates of gene flow may not hold. Avoiding biases in dispersal estimates is essential given this species is facing extinction risks due to habitat loss, and so we also investigate the effects of forest fragmentation on the accuracy of dispersal estimates under different filtering criteria by testing for recent decrease in the scale of gene flow. Our sensitivity analyses demonstrate that gene flow estimates are robust to different settings of MAF (0.05-0.35) and MD (0-20%). Comparing the direct and indirect estimates of dispersal, we find that the contemporary estimate of gene dispersal distance (σ_rt = 41.8 m) was ∼fourfold smaller than the historical estimates, supporting the hypothesis of a temporal shift in the scale of gene flow in D. jueirana-facao, which is consistent with predictions based on recent, dramatic forest fragmentation process. 
While we identified settings for filtering genomic data to avoid biases in gene flow estimates, we stress that there is no 'rule of thumb' for bioinformatic filtering and that relying on default program settings is not advisable. Instead, we suggest that the approach implemented here be applied independently in each separate empirical study to confirm appropriate settings to obtain unbiased population genetics estimates.
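The sensitivity analysis described in this abstract, re-filtering the same SNP set under a grid of minor allele frequency (MAF) and missing-data (MD) thresholds, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `filter_snps` and `sensitivity_grid`, the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumed coding: 0, 1, 2 = copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def filter_snps(genotypes: List[List[Genotype]],
                min_maf: float,
                max_missing: float) -> List[int]:
    """Return the indices of SNPs (rows) that pass both the MD and MAF filters."""
    kept = []
    for idx, calls in enumerate(genotypes):
        observed = [g for g in calls if g is not None]
        missing_fraction = (len(calls) - len(observed)) / len(calls)
        if not observed or missing_fraction > max_missing:
            continue  # too many missing calls at this SNP
        alt_freq = sum(observed) / (2 * len(observed))  # diploid samples assumed
        maf = min(alt_freq, 1 - alt_freq)
        if maf >= min_maf:
            kept.append(idx)
    return kept


def sensitivity_grid(genotypes: List[List[Genotype]],
                     maf_values: List[float],
                     md_values: List[float]) -> Dict[Tuple[float, float], int]:
    """Count retained SNPs for every (MAF, MD) threshold combination."""
    return {(maf, md): len(filter_snps(genotypes, maf, md))
            for maf in maf_values for md in md_values}


# Hypothetical toy data: 4 SNPs (rows) scored in 5 individuals (columns).
toy_snps = [
    [0, 1, 2, 1, 0],
    [0, 0, 0, 0, 1],
    [0, None, 2, 1, None],
    [2, 2, 2, None, 2],
]

if __name__ == "__main__":
    for thresholds, n_kept in sensitivity_grid(toy_snps, [0.05, 0.35], [0.0, 0.2]).items():
        print(thresholds, n_kept)
```

In a real study the downstream gene flow estimate, not just the SNP count, would be recomputed for each cell of the grid; the count is used here only to keep the sketch short.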
Affiliation(s)
- Alison G. Nazareno
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, United States
- Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, Belo Horizonte, Brazil
- L. Lacey Knowles
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, United States
|
15
|
Unmack PJ, Adams M, Hammer MP, Johnson JB, Gruber B, Gilles A, Young M, Georges A. Plotting for change: an analytical framework to aid decisions on which lineages are candidate species in phylogenomic species discovery. Biol J Linn Soc Lond 2021. [DOI: 10.1093/biolinnean/blab095] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
A recent study argued that coalescent-based models of species delimitation mostly delineate population structure, not species, and called for the validation of candidate species using biological information additional to the genetic information, such as phenotypic or ecological data. Here, we introduce a framework to interrogate genomic datasets and coalescent-based species trees for the presence of candidate species in situations where additional biological data are unavailable, unobtainable or uninformative. For de novo genomic studies of species boundaries, we propose six steps: (1) visualize genetic affinities among individuals to identify both discrete and admixed genetic groups from first principles and to hold aside individuals involved in contemporary admixture for independent consideration; (2) apply phylogenetic techniques to identify lineages; (3) assess diagnosability of those lineages as potential candidate species; (4) interpret the diagnosable lineages in a geographical context (sympatry, parapatry, allopatry); (5) assess significance of difference or trends in the context of sampling intensity; and (6) adopt a holistic approach to available evidence to inform decisions on species status in the difficult cases of allopatry. We adopt this approach to distinguish candidate species from within-species lineages for a widespread species complex of Australian freshwater fishes (Retropinna spp.). Our framework addresses two cornerstone issues in systematics that are often not discussed explicitly in genomic species discovery: diagnosability and how to determine it, and what criteria should be used to decide whether diagnosable lineages are conspecific or represent different species.
Affiliation(s)
- Peter J Unmack
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Centre for Applied Water Science, Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Department of Biology, Brigham Young University, Provo, UT, USA
- Mark Adams
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Department of Biological Sciences, University of Adelaide, Adelaide, SA, Australia
- Michael P Hammer
- Museum & Art Gallery of the Northern Territory, Darwin, NT, Australia
- Jerald B Johnson
- Department of Biology, Brigham Young University, Provo, UT, USA
- Monte L. Bean Life Science Museum, Brigham Young University, Provo, UT, USA
- Bernd Gruber
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- André Gilles
- UMR 1467 RECOVER, Aix Marseille Univ, INRAE, Centre St Charles, 3 place Victor Hugo, Marseille, France
- Matthew Young
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Arthur Georges
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
|
16
|
VanWallendael A, Alvarez M. Alignment-free methods for polyploid genomes: Quick and reliable genetic distance estimation. Mol Ecol Resour 2021; 22:612-622. [PMID: 34478242 DOI: 10.1111/1755-0998.13499] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/04/2020] [Accepted: 08/20/2021] [Indexed: 01/10/2023]
Abstract
Polyploid genomes pose several inherent challenges to population genetic analyses. While alignment-based methods are fundamentally limited in their applicability to polyploids, alignment-free methods bypass most of these limits. We investigated the use of Mash, a k-mer analysis tool that uses the MinHash method to reduce complexity in large genomic data sets, for basic population genetic analyses of polyploid sequences. We measured the degree to which Mash correctly estimated pairwise genetic distance in simulated haploid and polyploid short-read sequences with various levels of missing data. Mash-based estimates of genetic distance were comparable to alignment-based estimates, and were less impacted by missing data. We also used Mash to analyse publicly available short-read data for three polyploid and one diploid species, then compared Mash results to published results. For both simulated and real data, Mash accurately estimated pairwise genetic differences for polyploids as well as diploids as much as 476 times faster than alignment-based methods, though we found that Mash genetic distance estimates could be biased by per-sample read depth. Mash may be a particularly useful addition to the toolkit of polyploid geneticists for rapid confirmation of alignment-based results and for basic population genetics in reference-free systems or those with only poor-quality sequence data available.
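The MinHash distance estimate this abstract relies on can be sketched in a few lines of Python. The sketch below is a simplified stand-in, not Mash itself: it hashes k-mers with SHA-1 from the standard library and skips canonical (reverse-complement) k-mers, and the function names and toy sequences are hypothetical. It keeps the smallest `size` hash values of each sequence as a sketch, estimates the Jaccard index j from the two sketches, and converts it with the Mash distance formula D = -(1/k) ln(2j / (1 + j)).

```python
import hashlib
import math
from typing import List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer: str) -> int:
    """64-bit integer hash of a k-mer (SHA-1 is an assumption, chosen for portability)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq: str, k: int, size: int) -> List[int]:
    """Bottom-`size` MinHash sketch: the smallest hash values among all k-mers."""
    return sorted(hash_kmer(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(a: List[int], b: List[int], size: int) -> float:
    """Estimate the Jaccard index from the bottom-`size` hashes of the merged sketches."""
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j: float, k: int) -> float:
    """Mash distance D = -(1/k) * ln(2j / (1 + j)); defined as 1 when j = 0."""
    if j == 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


if __name__ == "__main__":
    # Hypothetical toy sequences that differ by a single substitution.
    seq_a = "ACGTTGCAAGCTTAGGCTAACGT"
    seq_b = "ACGTTGCAAGCGTAGGCTAACGT"
    k, size = 5, 100
    j = jaccard_estimate(sketch(seq_a, k, size), sketch(seq_b, k, size), size)
    print(j, mash_distance(j, k))
```

Because only the sketches are compared, no alignment or reference genome is needed, which is the property the abstract highlights for polyploid and reference-free systems.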
Affiliation(s)
- Acer VanWallendael
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Mariano Alvarez
- Biology Department, Wesleyan University, Middletown, CT, USA
|
17
|
Smith BT, Mauck WM, Benz BW, Andersen MJ. Uneven Missing Data Skew Phylogenomic Relationships within the Lories and Lorikeets. Genome Biol Evol 2021; 12:1131-1147. [PMID: 32470111 PMCID: PMC7486955 DOI: 10.1093/gbe/evaa113] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Accepted: 05/26/2020] [Indexed: 01/21/2023] Open
Abstract
The resolution of the Tree of Life has accelerated with advances in DNA sequencing technology. To achieve dense taxon sampling, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. However, DNA from historical material is generally degraded, which presents various challenges. In this study, we evaluated how the coverage at variant sites and missing data among historical and modern samples impacts phylogenomic inference. We explored these patterns in the brush-tongued parrots (lories and lorikeets) of Australasia by sampling ultraconserved elements in 105 taxa. Trees estimated with low coverage characters had several clades where relationships appeared to be influenced by whether the sample came from historical or modern specimens, which were not observed when more stringent filtering was applied. To assess if the topologies were affected by missing data, we performed an outlier analysis of sites and loci, and a data reduction approach where we excluded sites based on data completeness. Depending on the outlier test, 0.15% of total sites or 38% of loci were driving the topological differences among trees, and at these sites, historical samples had 10.9× more missing data than modern ones. In contrast, 70% data completeness was necessary to avoid spurious relationships. Predictive modeling found that outlier analysis scores were correlated with parsimony informative sites in the clades whose topologies changed the most by filtering. After accounting for biased loci and understanding the stability of relationships, we inferred a more robust phylogenetic hypothesis for lories and lorikeets.
Affiliation(s)
- Brian Tilston Smith
- Department of Ornithology, American Museum of Natural History, New York, New York
- William M Mauck
- Department of Ornithology, American Museum of Natural History, New York, New York
- New York Genome Center, New York, New York
- Brett W Benz
- Museum of Zoology and Department of Ecology and Evolutionary Biology, University of Michigan
- Michael J Andersen
- Department of Biology and Museum of Southwestern Biology, University of New Mexico
|
18
|
Talavera G, Lukhtanov V, Pierce NE, Vila R. DNA barcodes combined with multi-locus data of representative taxa can generate reliable higher-level phylogenies. Syst Biol 2021; 71:382-395. [PMID: 34022059 PMCID: PMC8830075 DOI: 10.1093/sysbio/syab038] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/30/2019] [Revised: 05/13/2021] [Accepted: 05/25/2021] [Indexed: 12/04/2022] Open
Abstract
Taxa are frequently labeled incertae sedis when their placement is debated at ranks above the species level, such as their subgeneric, generic, or subtribal placement. This is a pervasive problem in groups with complex systematics due to difficulties in identifying suitable synapomorphies. In this study, we propose combining DNA barcodes with a multilocus backbone phylogeny in order to assign taxa to genus or other higher-level categories. This sampling strategy generates molecular matrices containing large amounts of missing data that are not distributed randomly: barcodes are sampled for all representatives, and additional markers are sampled only for a small percentage. We investigate the effects of the degree and randomness of missing data on phylogenetic accuracy using simulations for up to 100 markers in 1000-tips trees, as well as a real case: the subtribe Polyommatina (Lepidoptera: Lycaenidae), a large group including numerous species with unresolved taxonomy. Our simulation tests show that when a strategic and representative selection of species for higher-level categories has been made for multigene sequencing (approximately one per simulated genus), the addition of this multigene backbone DNA data for as few as 5–10% of the specimens in the total data set can produce high-quality phylogenies, comparable to those resulting from 100% multigene sampling. In contrast, trees based exclusively on barcodes performed poorly. This approach was applied to a 1365-specimen data set of Polyommatina (including ca. 80% of described species), with nearly 8% of representative species included in the multigene backbone and the remaining 92% included only by mitochondrial COI barcodes, a phylogeny was generated that highlighted potential misplacements, unrecognized major clades, and placement for incertae sedis taxa. We use this information to make systematic rearrangements within Polyommatina, and to describe two new genera. 
Finally, we propose a systematic workflow to assess higher-level taxonomy in hyperdiverse groups. This research identifies an additional, enhanced value of DNA barcodes for improvements in higher-level systematics using large data sets. [Birabiro; DNA barcoding; incertae sedis; Kipepeo; Lycaenidae; missing data; phylogenomic; phylogeny; Polyommatina; supermatrix; systematics; taxonomy]
Affiliation(s)
- Gerard Talavera
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Passeig del Migdia s/n, 08038 Barcelona, Catalonia, Spain
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
- Vladimir Lukhtanov
- Department of Karyosystematics, Zoological Institute of Russian Academy of Sciences, Universitetskaya nab. 1, 199034 St. Petersburg, Russia
- Naomi E Pierce
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
- Roger Vila
- Institut de Biologia Evolutiva (CSIC-UPF), Passeig Marítim de la Barceloneta, 08003 Barcelona, Catalonia, Spain
|
19
|
Meleshko O, Martin MD, Korneliussen TS, Schröck C, Lamkowski P, Schmutz J, Healey A, Piatkowski BT, Shaw AJ, Weston DJ, Flatberg KI, Szövényi P, Hassel K, Stenøien HK. Extensive Genome-Wide Phylogenetic Discordance Is Due to Incomplete Lineage Sorting and Not Ongoing Introgression in a Rapidly Radiated Bryophyte Genus. Mol Biol Evol 2021; 38:2750-2766. [PMID: 33681996 PMCID: PMC8233498 DOI: 10.1093/molbev/msab063] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 12/13/2022] Open
Abstract
The relative importance of introgression for diversification has long been a highly disputed topic in speciation research and remains an open question despite the great attention it has received over the past decade. Gene flow leaves traces in the genome similar to those created by incomplete lineage sorting (ILS), and identification and quantification of gene flow in the presence of ILS is challenging and requires knowledge about the true phylogenetic relationship among the species. We use whole nuclear, plastid, and organellar genomes from 12 species in the rapidly radiated, ecologically diverse, actively hybridizing genus of peatmoss (Sphagnum) to reconstruct the species phylogeny and quantify introgression using a suite of phylogenomic methods. We found extensive phylogenetic discordance among nuclear and organellar phylogenies, as well as across the nuclear genome and the nodes in the species tree, best explained by extensive ILS following the rapid radiation of the genus rather than by postspeciation introgression. Our analyses support the idea of ancient introgression among the ancestral lineages followed by ILS, whereas recent gene flow among the species is highly restricted despite widespread interspecific hybridization known in the group. Our results contribute to phylogenomic understanding of how speciation proceeds in rapidly radiated, actively hybridizing species groups, and demonstrate that employing a combination of diverse phylogenomic methods can facilitate untangling complex phylogenetic patterns created by ILS and introgression.
Affiliation(s)
- Olena Meleshko
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway
- Michael D Martin
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway
- Paul Lamkowski
- Institute of Botany and Landscape Ecology, University of Greifswald, Greifswald, Germany
- Jeremy Schmutz
- United States Department of Energy, Joint Genome Institute, Berkeley, CA, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Adam Healey
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- David J Weston
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Climate Change Science Institute, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Kjell Ivar Flatberg
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway
- Péter Szövényi
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zurich, Switzerland
- Kristian Hassel
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway
- Hans K Stenøien
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway
|
20
|
Gatesy J, Sloan DB, Warren JM, Baker RH, Simmons MP, Springer MS. Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts. Mol Phylogenet Evol 2019; 139:106539. [DOI: 10.1016/j.ympev.2019.106539] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/03/2018] [Revised: 06/10/2019] [Accepted: 06/17/2019] [Indexed: 12/26/2022]
|
21
|
Freitas ES, Datta-Roy A, Karanth P, Grismer LL, Siler CD. Multilocus phylogeny and a new classification for African, Asian and Indian supple and writhing skinks (Scincidae: Lygosominae). Zool J Linn Soc 2019. [DOI: 10.1093/zoolinnean/zlz001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
Abstract
The genera Lepidothyris, Lygosoma and Mochlus comprise the writhing or supple skinks, a group of semi-fossorial, elongate-bodied skinks distributed across the Old World Tropics. Due to their generalized morphology and lack of diagnostic characters, species- and clade-level relationships have long been debated. Recent molecular phylogenetic studies of the group have provided some clarification of species-level relationships, but a number of issues regarding higher-level relationships among genera still remain. Here we present a phylogenetic estimate of relationships among species in Lygosoma, Mochlus and Lepidothyris generated by concatenated and species tree analyses of multilocus data using the most extensive taxonomic sampling of the group to date. We also use multivariate statistics to examine species and clade distributions in morphospace. Our results reject the monophyly of Lygosoma s.l., Lygosoma s.s. and Mochlus, which highlights the instability of the current taxonomic classification of the group. We, therefore, revise the taxonomy of the writhing skinks to better reflect the evolutionary history of Lygosoma s.l. by restricting Lygosoma to Southeast Asia, resurrecting the genus Riopa for a clade of Indian and Southeast Asian species, expanding the genus Mochlus to include all African species of writhing skinks and describing a new genus in Southeast Asia.
Affiliation(s)
- Elyse S Freitas
  - Sam Noble Oklahoma Museum of Natural History and Department of Biology, University of Oklahoma, Norman, OK, USA
- Aniruddha Datta-Roy
  - Centre for Ecological Sciences, Indian Institute of Science, Bangalore, India
  - School of Biological Sciences, National Institute of Science Education and Research, Bhubaneswar, Odisha, India
- Praveen Karanth
  - Centre for Ecological Sciences, Indian Institute of Science, Bangalore, India
- L Lee Grismer
  - Department of Biology, La Sierra University, Riverside, California, USA
- Cameron D Siler
  - Sam Noble Oklahoma Museum of Natural History and Department of Biology, University of Oklahoma, Norman, OK, USA
|
22
|
Montingelli GG, Grazziotin FG, Battilana J, Murphy RW, Zhang Y, Zaher H. Higher‐level phylogenetic affinities of the Neotropical genus Mastigodryas Amaral, 1934 (Serpentes: Colubridae), species‐group definition and description of a new genus for Mastigodryas bifossatus. J Zool Syst Evol Res 2019. [DOI: 10.1111/jzs.12262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/01/2022]
Affiliation(s)
- Giovanna G. Montingelli
  - Department of Life Sciences, Natural History Museum, London, UK
  - Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil
- Robert W. Murphy
  - Royal Ontario Museum, Centre for Biodiversity and Conservation Biology, Toronto, Ontario, Canada
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Kunming, China
- Ya‐Ping Zhang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Kunming, China
  - Laboratory for Conservation and Utilization of Bio‐Resources, Yunnan University, Kunming, China
- Hussam Zaher
  - Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil
|
23
|
Kang Q, Schardl CL, Moore N, Yoshida R. CURatio: Genome-wide phylogenomic analysis method using ratios of total branch lengths. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 17:10.1109/TCBB.2018.2878564. [PMID: 30387738 PMCID: PMC7372714 DOI: 10.1109/tcbb.2018.2878564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Evolutionary hypotheses provide important underpinnings of biological and medical sciences, and a comprehensive, genome-wide understanding of evolutionary relationships among organisms is needed to test and refine such hypotheses. Theory and empirical evidence clearly indicate that phylogenies (trees) of different genes (loci) should not display precisely matching topologies. The main reason for such phylogenetic incongruence is the reticulated evolutionary history of most species due to meiotic sexual recombination in eukaryotes, or horizontal transfers of genetic material in prokaryotes. Nevertheless, many genes should display topologically related phylogenies, and should group into one or more (for genetic hybrids) clusters in poly-dimensional "tree space". Unusual evolutionary histories or effects of selection may result in "outlier" genes with phylogenies that fall outside the main distribution(s) of trees in tree space. We present a new phylogenomic method, CURatio, which uses ratios of total branch lengths in gene trees to help identify phylogenetic outliers in a given set of ortholog groups from multiple genomes. An advantage of CURatio over other methods is that genes absent from and/or duplicated in some genomes can be included in the analysis. We conducted a simulation study under the coalescent model, and showed that, given sufficient species depth and topological difference, these ratios are significantly higher for the "outlier" gene phylogenies. Also, we applied CURatio to a set of annotated genomes of the fungal family Clavicipitaceae, and identified alkaloid biosynthesis genes as outliers, probably due to a history of duplication and loss. The source code is available at https://github.com/QiwenKang/CURatio, and the empirical data set on Clavicipitaceae and simulated data set are available at Mendeley https://data.mendeley.com/datasets/mrxts7wjrr/1.
|
24
|
Collins RA, Hrbek T. An In Silico Comparison of Protocols for Dated Phylogenomics. Syst Biol 2018; 67:633-650. [PMID: 29319797 DOI: 10.1093/sysbio/syx089] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/23/2015] [Accepted: 10/24/2017] [Indexed: 01/02/2023]
Abstract
In the age of genome-scale DNA sequencing, choice of molecular marker arguably remains an important decision in planning a phylogenetic study. Using published genomes from 23 primate species, we make a standardized comparison of four of the most frequently used protocols in phylogenomics, viz., targeted sequence-enrichment using ultraconserved element and exon-capture probes, and restriction-site-associated DNA sequencing (RADseq and ddRADseq). Here, we present a procedure to perform in silico extractions from genomes and create directly comparable data sets for each class of marker. We then compare these data sets in terms of both phylogenetic resolution and ability to consistently and precisely estimate clade ages using fossil-calibrated molecular-clock models. Furthermore, we were also able to directly compare these results to previously published data sets from Sanger-sequenced nuclear exons and mitochondrial genomes under the same analytical conditions. Our results show (with the exception of the mitochondrial genome data set and the smallest ddRADseq data set) that for uncontroversial nodes all data classes performed equally well; that is, they recovered the same well-supported topology. However, for one difficult-to-resolve node comprising a rapid diversification, we report well-supported but conflicting topologies among the marker classes consistent with the mismodeling of gene tree heterogeneity as demonstrated by species tree analyses of single nucleotide polymorphisms. Likewise, clade age estimates showed consistent discrepancies between data sets under strict and relaxed clock models; for recent nodes, clade ages estimated by nuclear exon data sets were younger than those of the UCE, RADseq and mitochondrial data, but vice versa for the deepest nodes in the primate phylogeny.
This observation is explained by temporal differences in phylogenetic informativeness (PI), with the data sets with strong PI peaks toward the present underestimating the deepest node ages. Finally, we conclude by emphasizing that while huge numbers of loci are probably not required for uncontroversial phylogenetic questions (for which practical considerations such as ease of data generation, sharing, and aggregating therefore become increasingly important), accurately modeling heterogeneous data remains as relevant as ever for the more recalcitrant problems.
Affiliation(s)
- Rupert A Collins
  - Laboratório de Evolução e Genética Animal, Department of Genetics, Federal University of Amazonas, Av. Rodrigo Otavio Ramos, 3000, Manaus, AM, 69077-000, Brazil
  - School of Biological Sciences, Life Sciences Building, University of Bristol, 24 Tyndall Ave, Bristol BS8 1TH, UK
- Tomas Hrbek
  - Laboratório de Evolução e Genética Animal, Department of Genetics, Federal University of Amazonas, Av. Rodrigo Otavio Ramos, 3000, Manaus, AM, 69077-000, Brazil
  - Department of Biology, 4102 LSB Brigham Young University, Provo, UT, 84602, USA
|
25
|
Tang Q, Edwards SV, Rheindt FE. Rapid diversification and hybridization have shaped the dynamic history of the genus Elaenia. Mol Phylogenet Evol 2018; 127:522-533. [DOI: 10.1016/j.ympev.2018.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/11/2017] [Revised: 04/11/2018] [Accepted: 05/08/2018] [Indexed: 01/04/2023]
|
26
|
Herrando-Moraira S. Exploring data processing strategies in NGS target enrichment to disentangle radiations in the tribe Cardueae (Compositae). Mol Phylogenet Evol 2018; 128:69-87. [PMID: 30036700 DOI: 10.1016/j.ympev.2018.07.012] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 04/14/2018] [Revised: 07/13/2018] [Accepted: 07/14/2018] [Indexed: 12/17/2022]
Abstract
Target enrichment is a cost-effective sequencing technique that holds promise for elucidating evolutionary relationships in fast-evolving lineages. However, potential biases and the impact of bioinformatic sequence treatments on phylogenetic inference have not yet been thoroughly explored. Here, we investigate this issue with the ultimate goal of shedding light on a highly diversified group of Compositae (Asteraceae) constituted by four main genera: Arctium, Cousinia, Saussurea, and Jurinea. Specifically, we compared sequence data extraction methods implemented in two easy-to-use workflows, PHYLUCE and HybPiper, and assessed the impact of two filtering practices intended to reduce phylogenetic noise. In addition, we compared two phylogenetic inference methods: (1) the concatenation approach, in which all loci were concatenated in a supermatrix; and (2) the coalescence approach, in which gene trees were produced independently and then used to construct a species tree under coalescence assumptions. Here we confirm the usefulness of the set of 1061 COS targets (a nuclear conserved orthology loci set developed for the Compositae) across a variety of taxonomic levels. Intergeneric relationships were completely resolved: there are two sister groups, Arctium-Cousinia and Saussurea-Jurinea, which are in agreement with a morphological hypothesis. Intrageneric relationships among species of Arctium, Cousinia, and Saussurea are also well defined. Conversely, conflicting species relationships remain for Jurinea. Methodological choices significantly affected phylogenies in terms of topology, branch length, and support. Across all analyses, the phylogeny obtained using HybPiper and the strictest scheme of removing fast-evolving sites was identified as optimal.
Regarding methodological choices, we conclude that: (1) trees obtained under the coalescence approach are topologically more congruent with one another than those inferred using the concatenation approach; (2) refining treatments only improved support values under the concatenation approach; and (3) branch support values are maximized when fast-evolving sites are removed in the concatenation approach, and when a higher number of loci is analyzed in the coalescence approach.
Affiliation(s)
- Sonia Herrando-Moraira
  - Botanic Institute of Barcelona (IBB, CSIC-ICUB), Pg. del Migdia, s.n., 08038 Barcelona, Spain
|
27
|
Sayyari E, Whitfield JB, Mirarab S. Fragmentary Gene Sequences Negatively Impact Gene Tree and Species Tree Reconstruction. Mol Biol Evol 2018; 34:3279-3291. [PMID: 29029241 DOI: 10.1093/molbev/msx261] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Indexed: 12/21/2022]
Abstract
Species tree reconstruction from genome-wide data is increasingly being attempted, in most cases using a two-step approach of first estimating individual gene trees and then summarizing them to obtain a species tree. The accuracy of this approach, which promises to account for gene tree discordance, depends on the quality of the inferred gene trees. At the same time, phylogenomic and phylotranscriptomic analyses typically use involved bioinformatics pipelines for data preparation. Errors and shortcomings resulting from these preprocessing steps may impact the species tree analyses at the other end of the pipeline. In this article, we first show that the presence of fragmentary data for some species in a gene alignment, as often seen on real data, can result in substantial deterioration of gene trees, and as a result, the species tree. We then investigate a simple filtering strategy where individual fragmentary sequences are removed from individual genes but the rest of the gene is retained. Both in simulations and by reanalyzing a large insect phylotranscriptomic data set, we show the effectiveness of this simple filtering strategy.
Affiliation(s)
- Erfan Sayyari
  - Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA
|
28
|
Schaer J, McMichael L, Gordon AN, Russell D, Matuschewski K, Perkins SL, Field H, Power M. Phylogeny of Hepatocystis parasites of Australian flying foxes reveals distinct parasite clade. International Journal for Parasitology: Parasites and Wildlife 2018; 7:207-212. [PMID: 29988481 PMCID: PMC6024243 DOI: 10.1016/j.ijppaw.2018.06.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/02/2018] [Revised: 05/31/2018] [Accepted: 06/05/2018] [Indexed: 12/25/2022]
Abstract
Hepatocystis parasites are close relatives of mammalian Plasmodium species and infect a range of primates and bats. Here, we present the phylogenetic relationships of Hepatocystis parasites of three Australian flying fox species. Multilocus phylogenetic analysis revealed that Hepatocystis parasites of Pteropus species from Australia and Asia form a distinct clade that is sister to all other Hepatocystis parasites of primates and bats from Africa and Asia. No patterns of host specificity were recovered within the Pteropus-specific parasite clade and the Hepatocystis sequences from all three Australian host species sampled fell into two divergent clades. First molecular phylogeny of Hepatocystis parasites in Australian flying foxes. Hepatocystis parasites of Pteropus form a distinct clade. Lack of host species specificity as distinct hallmark of Hepatocystis parasites.
Affiliation(s)
- Juliane Schaer
  - Department of Biological Sciences, Macquarie University, North Ryde, 2109, Australia
  - Department of Molecular Parasitology, Institute of Biology, Humboldt University, 10117, Berlin, Germany
- Lee McMichael
  - School of Veterinary Science, University of Queensland, Gatton Campus, Gatton, QLD, 4343, Australia
- Anita N Gordon
  - Biosecurity Sciences Laboratory, Health and Food Science Precinct, 39 Kessels Rd, Coopers Plains, Queensland, 4108, Australia
- Daniel Russell
  - Department of Biological Sciences, Macquarie University, North Ryde, 2109, Australia
- Kai Matuschewski
  - Department of Molecular Parasitology, Institute of Biology, Humboldt University, 10117, Berlin, Germany
- Susan L Perkins
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY, 10024, USA
- Hume Field
  - EcoHealth Alliance, New York, NY, 10001, USA
- Michelle Power
  - Department of Biological Sciences, Macquarie University, North Ryde, 2109, Australia
|
29
|
Gates DJ, Pilson D, Smith SD. Filtering of target sequence capture individuals facilitates species tree construction in the plant subtribe Iochrominae (Solanaceae). Mol Phylogenet Evol 2018; 123:26-34. [DOI: 10.1016/j.ympev.2018.02.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/13/2017] [Revised: 01/30/2018] [Accepted: 02/01/2018] [Indexed: 10/18/2022]
|
30
|
Nute M, Chou J, Molloy EK, Warnow T. The performance of coalescent-based species tree estimation methods under models of missing data. BMC Genomics 2018; 19:286. [PMID: 29745854 PMCID: PMC5998899 DOI: 10.1186/s12864-018-4619-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 11/25/2022]
Abstract
BACKGROUND: Estimation of species trees from multiple genes is complicated by processes such as incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer, that result in gene trees that differ from each other and from the species phylogeny. Methods to estimate species trees in the presence of gene tree discord due to incomplete lineage sorting have been developed and proved to be statistically consistent when gene tree discord is due only to incomplete lineage sorting and every gene tree includes the full set of species. RESULTS: We establish statistical consistency of certain coalescent-based species tree estimation methods under some models of taxon deletion from genes. We also evaluate the impact of missing data on four species tree estimation methods (ASTRAL-II, ASTRID, MP-EST, and SVDquartets) using simulated datasets with varying levels of incomplete lineage sorting, gene tree estimation error, and degrees/patterns of missing data. CONCLUSIONS: All the species tree estimation methods improved in accuracy as the number of genes increased and often produced highly accurate species trees even when the amount of missing data was large. These results together indicate that accurate species tree estimation is possible under a variety of conditions, even when there are substantial amounts of missing data.
Affiliation(s)
- Michael Nute
  - Department of Statistics, University of Illinois at Urbana-Champaign, 725 S. Wright St., Champaign, IL, 61820 USA
- Jed Chou
  - Department of Mathematics, University of Illinois at Urbana-Champaign, 1409 W. Green St., Urbana, IL, 61801 USA
- Erin K. Molloy
  - Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL, 61801 USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL, 61801 USA
|
31
|
Affiliation(s)
- David Posada
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain
|
32
|
Mallo D, Posada D. Multilocus inference of species trees and DNA barcoding. Philos Trans R Soc Lond B Biol Sci 2017; 371:rstb.2015.0335. [PMID: 27481787 PMCID: PMC4971187 DOI: 10.1098/rstb.2015.0335] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Accepted: 04/10/2016] [Indexed: 11/30/2022]
Abstract
The unprecedented amount of data resulting from next-generation sequencing has opened a new era in phylogenetic estimation. Although large datasets should, in theory, increase phylogenetic resolution, massive, multilocus datasets have uncovered a great deal of phylogenetic incongruence among different genomic regions, due both to stochastic error and to the action of different evolutionary processes such as incomplete lineage sorting, gene duplication and loss and horizontal gene transfer. This incongruence violates one of the fundamental assumptions of the DNA barcoding approach, which assumes that gene history and species history are identical. In this review, we explain some of the most important challenges we will have to face to reconstruct the history of species, and the advantages and disadvantages of different strategies for the phylogenetic analysis of multilocus data. In particular, we describe the evolutionary events that can generate species-tree/gene-tree discordance, compare the most popular methods for species tree reconstruction, highlight the challenges we need to face when using them and discuss their potential utility in barcoding. Current barcoding methods sacrifice a great amount of statistical power by only considering one locus, and a transition to multilocus barcodes would not only improve current barcoding methods, but also facilitate an eventual transition to species-tree-based barcoding strategies, which could better accommodate scenarios where the barcode gap is too small or nonexistent. This article is part of the themed issue ‘From DNA barcodes to biomes’.
Affiliation(s)
- Diego Mallo
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo 36310, Spain
- David Posada
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo 36310, Spain
|
33
|
Yu G, Rao D, Matsui M, Yang J. Coalescent-based delimitation outperforms distance-based methods for delineating less divergent species: the case of Kurixalus odontotarsus species group. Sci Rep 2017; 7:16124. [PMID: 29170403 PMCID: PMC5700917 DOI: 10.1038/s41598-017-16309-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 02/16/2017] [Accepted: 11/10/2017] [Indexed: 12/30/2022]
Abstract
Few empirical studies have compared coalescent-based methods to distance-based methods for delimitation of less divergent species. In this study, we used two coalescent-based (BFD and BPP) and two distance-based barcoding (ABGD and jMOTU) methods to delimit closely related species in the Kurixalus odontotarsus species group. Phylogenetic analyses revealed that the K. odontotarsus species group comprises 11 distinct maternal clades with strong support values. Based on the genetic and morphological evidence, we consider that species diversity in the K. odontotarsus species group was underestimated and the 11 clades represent 11 species, of which six are unnamed. The coalescent-based delimitations decisively supported the 11-species scenario corresponding to the 11 clades. However, the distance-based ABGD only obtained 3-6 candidate species, which is not consistent with morphological evidence. These results indicate that BFD and BPP are more conservative than ABGD with respect to false negatives (lumping). The fixed-threshold method (jMOTU) may obtain a resolution similar to that inferred by BFD and BPP, but it relies heavily on a subjective choice of threshold and lacks statistical support. We consider that coalescent-based BFD and BPP approaches outperform distance-based methods for delineation of less divergent species.
Affiliation(s)
- Guohua Yu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 32 Jiaochang Donglu, Kunming, Yunnan, 650223, China
- Dingqi Rao
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 32 Jiaochang Donglu, Kunming, Yunnan, 650223, China
- Masafumi Matsui
  - Graduate School of Human and Environmental Studies, Kyoto University, Yoshida Nihonmatsu, Kakyo-ku, Kyoto, 606-8501, Japan
- Junxing Yang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 32 Jiaochang Donglu, Kunming, Yunnan, 650223, China
|
34
|
Myers EA, Burgoon JL, Ray JM, Martínez-Gómez JE, Matías-Ferrer N, Mulcahy DG, Burbrink FT. Coalescent Species Tree Inference of Coluber and Masticophis. Copeia 2017. [DOI: 10.1643/ch-16-552] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
|
35
|
Molloy EK, Warnow T. To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods. Syst Biol 2017; 67:285-303. [DOI: 10.1093/sysbio/syx077] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Received: 12/08/2016] [Accepted: 09/13/2017] [Indexed: 01/27/2023]
Affiliation(s)
- Erin K Molloy
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
|
36
|
Suchan T, Espíndola A, Rutschmann S, Emerson BC, Gori K, Dessimoz C, Arrigo N, Ronikier M, Alvarez N. Assessing the potential of RAD-sequencing to resolve phylogenetic relationships within species radiations: The fly genus Chiastocheta (Diptera: Anthomyiidae) as a case study. Mol Phylogenet Evol 2017. [DOI: 10.1016/j.ympev.2017.06.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 01/01/2023]
|
37
|
Shen XX, Zhou X, Kominek J, Kurtzman CP, Hittinger CT, Rokas A. Reconstructing the Backbone of the Saccharomycotina Yeast Phylogeny Using Genome-Scale Data. G3 (Bethesda, Md.) 2016; 6:3927-3939. [PMID: 27672114 PMCID: PMC5144963 DOI: 10.1534/g3.116.034744] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8] [Received: 08/18/2016] [Accepted: 09/21/2016] [Indexed: 01/20/2023]
Abstract
Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. However, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.
Affiliation(s)
- Xing-Xing Shen
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
- Xiaofan Zhou
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
- Jacek Kominek
  - Laboratory of Genetics, Genome Center of Wisconsin, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, Wisconsin 53706
- Cletus P Kurtzman
  - Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, Illinois 61604
- Chris Todd Hittinger
  - Laboratory of Genetics, Genome Center of Wisconsin, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, Wisconsin 53706
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
|
38
|
Zhao L, Li X, Zhang N, Zhang SD, Yi TS, Ma H, Guo ZH, Li DZ. Phylogenomic analyses of large-scale nuclear genes provide new insights into the evolutionary relationships within the rosids. Mol Phylogenet Evol 2016; 105:166-176. [DOI: 10.1016/j.ympev.2016.06.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 11/28/2015] [Revised: 06/06/2016] [Accepted: 06/27/2016] [Indexed: 12/28/2022]
|
39
|
Arbizu CI, Ellison SL, Senalik D, Simon PW, Spooner DM. Genotyping-by-sequencing provides the discriminating power to investigate the subspecies of Daucus carota (Apiaceae). BMC Evol Biol 2016; 16:234. [PMID: 27793080 PMCID: PMC5084430 DOI: 10.1186/s12862-016-0806-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 05/31/2016] [Accepted: 10/14/2016] [Indexed: 12/05/2022]
Abstract
BACKGROUND: The majority of the subspecies of Daucus carota have not yet been discriminated clearly by various molecular or morphological methods and hence their phylogeny and classification remain unresolved. Recent studies using 94 nuclear orthologs and morphological characters, and studies employing other molecular approaches were unable to distinguish clearly many of the subspecies. Fertile intercrosses among traditionally recognized subspecies are well documented. We here explore the utility of single nucleotide polymorphisms (SNPs) generated by genotyping-by-sequencing (GBS) to serve as an effective molecular method to discriminate the subspecies of the D. carota complex. RESULTS: We used GBS to obtain SNPs covering all nine Daucus carota chromosomes from 162 accessions of Daucus and two related genera. To study Daucus phylogeny, we scored a total of 10,814 or 38,920 SNPs with a maximum of 10 or 30 % missing data, respectively. To investigate the subspecies of D. carota, we employed two data sets including 150 accessions: (i) rate of missing data 10 % with a total of 18,565 SNPs, and (ii) rate of missing data 30 %, totaling 43,713 SNPs. Consistent with prior results, the topology of both data sets separated species with 2n = 18 chromosomes from all other species. Our results place all cultivated carrots (D. carota subsp. sativus) in a single clade. The wild members of D. carota from central Asia were on a clade with eastern members of subsp. sativus. The other subspecies of D. carota were in four clades associated with geographic groups: (1) the Balkan Peninsula and the Middle East, (2) North America and Europe, (3) North Africa exclusive of Morocco, and (4) the Iberian Peninsula and Morocco. Daucus carota subsp. maximus was discriminated, but neither it nor subsp. gummifer (defined in a broad sense) is monophyletic. CONCLUSIONS: Our study suggests that (1) the morphotypes identified as D. carota subspecies gummifer (as currently broadly circumscribed), all confined to areas near the Atlantic Ocean and the western Mediterranean Sea, have separate origins from sympatric members of other subspecies of D. carota, (2) D. carota subsp. maximus, on two clades with some accessions of subsp. carota, can be distinguished from each other but only with poor morphological support, (3) D. carota subsp. capillifolius, well distinguished morphologically, is an apospecies relative to North African populations of D. carota subsp. carota, (4) the eastern cultivated carrots have origins closer to wild carrots from central Asia than to western cultivated carrots, and (5) large SNP data sets are suitable for species-level phylogenetic studies in Daucus.
Affiliation(s)
- Carlos I Arbizu: Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA
- Shelby L Ellison: Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA
- Douglas Senalik: Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA; USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA
- Philipp W Simon: Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA; USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA
- David M Spooner: Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA; USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin-Madison, 1575 Linden Drive, Madison, WI, 53706-1590, USA
|
40
|
Šmíd J, Shobrak M, Wilms T, Joger U, Carranza S. Endemic diversification in the mountains: genetic, morphological, and geographical differentiation of the Hemidactylus geckos in southwestern Arabia. Org Divers Evol 2016. [DOI: 10.1007/s13127-016-0293-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 11/25/2022]
|
41
|
Improved resolution and a novel phylogeny for the Neotropical triplefin blennies (Teleostei: Tripterygiidae). Mol Phylogenet Evol 2016; 96:70-78. [DOI: 10.1016/j.ympev.2015.12.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/17/2015] [Revised: 11/24/2015] [Accepted: 12/01/2015] [Indexed: 11/22/2022]
|
42
|
Massatti R, Reznicek AA, Knowles LL. Utilizing RADseq data for phylogenetic analysis of challenging taxonomic groups: A case study in Carex sect. Racemosae. Am J Bot 2016; 103:337-347. [PMID: 26851268 DOI: 10.3732/ajb.1500315] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/05/2015] [Accepted: 12/29/2015] [Indexed: 06/05/2023]
Abstract
PREMISE OF THE STUDY: Relationships among closely related and recently diverged taxa can be especially difficult to resolve. Here we use both Sanger sequencing and next-generation RADseq data sets to estimate phylogenetic relationships among species of Carex section Racemosae (Cyperaceae), a clade largely restricted to high latitudes and elevations. Interest in relationships among these taxa derives from questions about the species' biogeographic histories and possible links between diversification and Pleistocene glaciations. METHODS: A combination of approaches and molecular markers was used to estimate relationships among Carex species within sect. Racemosae and taxa from closely related sections. Nuclear and chloroplast loci generated by Sanger sequencing were analyzed with *BEAST, and SNP data from RADseq loci were analyzed as a concatenated data set using maximum likelihood and as independent loci using SVDquartets. KEY RESULTS: Sanger sequencing data sets resolved relationships among taxa at intermediate phylogenetic depths (albeit with low levels of support). Only the RADseq data resolved relationships with strong support at all phylogenetic depths. Moreover, different methods and data partitions of the RADseq data resulted in nearly identical topologies. Carex sect. Racemosae is a strongly supported clade, although a handful of species were found to group with closely related sections. Herbarium specimens up to 35 yr old successfully produced informative RADseq data. CONCLUSIONS: Despite the short read lengths of RADseq data, they nevertheless resolved relationships that Sanger sequencing data did not. Resolution of the phylogenetic relationships among recently and rapidly diversifying taxa within sect. Racemosae clades suggests a role for the Pleistocene glaciations in clade diversification.
Affiliation(s)
- Rob Massatti: Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 41809-1079 USA
- Anton A Reznicek: Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 41809-1079 USA
- L Lacey Knowles: Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 41809-1079 USA
|
43
|
Xi Z, Liu L, Davis CC. The Impact of Missing Data on Species Tree Estimation. Mol Biol Evol 2015; 33:838-60. [DOI: 10.1093/molbev/msv266] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Indexed: 01/03/2023]
|
44
|
Streicher JW, Schulte JA, Wiens JJ. How Should Genes and Taxa be Sampled for Phylogenomic Analyses with Missing Data? An Empirical Study in Iguanian Lizards. Syst Biol 2015; 65:128-45. [PMID: 26330450 DOI: 10.1093/sysbio/syv058] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Received: 11/14/2014] [Accepted: 08/04/2015] [Indexed: 11/12/2022]
Abstract
Targeted sequence capture is becoming a widespread tool for generating large phylogenomic data sets to address difficult phylogenetic problems. However, this methodology often generates data sets in which increasing the number of taxa and loci increases amounts of missing data. Thus, a fundamental (but still unresolved) question is whether sampling should be designed to maximize sampling of taxa or genes, or to minimize the inclusion of missing data cells. Here, we explore this question for an ancient, rapid radiation of lizards, the pleurodont iguanians. Pleurodonts include many well-known clades (e.g., anoles, basilisks, iguanas, and spiny lizards) but relationships among families have proven difficult to resolve strongly and consistently using traditional sequencing approaches. We generated up to 4921 ultraconserved elements with sampling strategies including 16, 29, and 44 taxa, from 1179 to approximately 2.4 million characters per matrix and approximately 30% to 60% total missing data. We then compared mean branch support for interfamilial relationships under these 15 different sampling strategies for both concatenated (maximum likelihood) and species tree (NJst) approaches (after showing that mean branch support appears to be related to accuracy). We found that both approaches had the highest support when including loci with up to 50% missing taxa (matrices with ~40-55% missing data overall). Thus, our results show that simply excluding all missing data may be highly problematic as the primary guiding principle for the inclusion or exclusion of taxa and genes. The optimal strategy was somewhat different for each approach, a pattern that has not been shown previously. For concatenated analyses, branch support was maximized when including many taxa (44) but fewer characters (1.1 million). For species-tree analyses, branch support was maximized with minimal taxon sampling (16) but many loci (4789 of 4921). We also show that the choice of these sampling strategies can be critically important for phylogenomic analyses, since some strategies lead to demonstrably incorrect inferences (using the same method) that have strong statistical support. Our preferred estimate provides strong support for most interfamilial relationships in this important but phylogenetically challenging group.
Affiliation(s)
- Jeffrey W Streicher: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA; Department of Life Sciences, The Natural History Museum, London SW7 5BD, UK
- James A Schulte: Department of Biology, Clarkson University, Potsdam, NY 13699, USA
- John J Wiens: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
|
45
|
DaCosta JM, Sorenson MD. ddRAD-seq phylogenetics based on nucleotide, indel, and presence-absence polymorphisms: Analyses of two avian genera with contrasting histories. Mol Phylogenet Evol 2015; 94:122-35. [PMID: 26279345 DOI: 10.1016/j.ympev.2015.07.026] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 02/09/2015] [Revised: 07/22/2015] [Accepted: 07/29/2015] [Indexed: 11/16/2022]
Abstract
Genotype-by-sequencing (GBS) methods have revolutionized the field of molecular ecology, but their application in molecular phylogenetics remains somewhat limited. In addition, most phylogenetic studies based on large GBS data sets have relied on analyses of concatenated data rather than species tree methods that explicitly account for genealogical stochasticity among loci. We explored the utility of "double-digest" restriction site-associated DNA sequencing (ddRAD-seq) for phylogenetic analyses of the Lagonosticta firefinches (family Estrildidae) and the Vidua brood parasitic finches (family Viduidae). As expected, the number of homologous loci shared among samples was negatively correlated with genetic distance due to the accumulation of restriction site polymorphisms. Nonetheless, for each genus, we obtained data sets of ∼3000 loci shared in common among all samples, including a more distantly related outgroup taxon. For all samples combined, we obtained >1000 homologous loci despite ∼20my divergence between estrildid and parasitic finches. In addition to nucleotide polymorphisms, the ddRAD-seq data yielded large sets of indel and locus presence-absence polymorphisms, all of which had higher consistency indices than mtDNA sequence data in the context of concatenated parsimony analyses. Species tree methods, using individual gene trees or single nucleotide polymorphisms as input, generated results broadly consistent with analyses of concatenated data, particularly for Lagonosticta, which appears to have a well resolved, bifurcating history. Results for Vidua were also generally consistent across methods and data sets, although nodal support and results from different species tree methods were more variable. Lower gene tree congruence in Vidua is likely the result of its unique evolutionary history, which includes rapid speciation by host shift and occasional hybridization and introgression due to incomplete reproductive isolation. We conclude that ddRAD-seq is a cost-effective method for generating robust phylogenetic data sets, particularly for analyses of closely related species and genera.
|
46
|
Liu L, Xi Z, Wu S, Davis CC, Edwards SV. Estimating phylogenetic trees from genome-scale data. Ann N Y Acad Sci 2015; 1360:36-53. [DOI: 10.1111/nyas.12747] [Citation(s) in RCA: 129] [Impact Index Per Article: 14.3] [Indexed: 11/30/2022]
Affiliation(s)
- Liang Liu: Department of Statistics, University of Georgia, Athens, Georgia; Institute of Bioinformatics, University of Georgia, Athens, Georgia
- Zhenxiang Xi: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts
- Shaoyuan Wu: Department of Biochemistry and Molecular Biology & Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Charles C. Davis: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts
- Scott V. Edwards: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts
|
47
|
Nguyen HDT, Jančič S, Meijer M, Tanney JB, Zalar P, Gunde-Cimerman N, Seifert KA. Application of the phylogenetic species concept to Wallemia sebi from house dust and indoor air revealed by multi-locus genealogical concordance. PLoS One 2015; 10:e0120894. [PMID: 25799362 PMCID: PMC4370657 DOI: 10.1371/journal.pone.0120894] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 11/14/2014] [Accepted: 01/27/2015] [Indexed: 01/17/2023]
Abstract
A worldwide survey of Wallemia occurring in house dust and indoor air was conducted. The isolated strains were identified as W. sebi and W. muriae. Previous studies suggested that the W. sebi phylogenetic clade contained cryptic species but conclusive evidence was lacking because only the internal transcribed spacer (ITS) marker was analyzed. The ITS and four protein-coding genes (MCM7, RPB1, RPB2, and TSR1) were sequenced for 85 isolates. Based on an initial neighbor joining analysis of the concatenated genes, W. muriae remained monophyletic but four clades were found in W. sebi, which we designated as W. sebi clades 1, 2, 3, and 4. We hypothesized that these clades represent distinct phylogenetic species within the Wallemia sebi species complex (WSSC). We then conducted multiple phylogenetic analyses and demonstrated genealogical concordance, which supports the existence of four phylogenetic species within the WSSC. Geographically, W. muriae was only found in Europe, W. sebi clade 3 was only found in Canada, W. sebi clade 4 was found in subtropical regions, while W. sebi clades 1 and 2 were found worldwide. Haplotype analysis showed that W. sebi clades 1 and 2 had multiple haplotypes while W. sebi clades 3 and 4 had one haplotype and may have been undersampled. We describe W. sebi clades 2, 3, and 4 as new species in a companion study.
Affiliation(s)
- Hai D. T. Nguyen: Department of Biology, Faculty of Science, University of Ottawa, Ottawa, Ontario, Canada; Biodiversity (Mycology), Eastern Cereal and Oilseed Research Centre, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
- Sašo Jančič: Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Martin Meijer: CBS-KNAW Fungal Biodiversity Centre, Utrecht, The Netherlands
- Joey B. Tanney: Biodiversity (Mycology), Eastern Cereal and Oilseed Research Centre, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
- Polona Zalar: Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Nina Gunde-Cimerman: Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Keith A. Seifert: Department of Biology, Faculty of Science, University of Ottawa, Ottawa, Ontario, Canada; Biodiversity (Mycology), Eastern Cereal and Oilseed Research Centre, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
|
48
|
Tonini J, Moore A, Stern D, Shcheglovitova M, Ortí G. Concatenation and Species Tree Methods Exhibit Statistically Indistinguishable Accuracy under a Range of Simulated Conditions. PLoS Curr 2015; 7. [PMID: 25901289 PMCID: PMC4391732 DOI: 10.1371/currents.tol.34260cc27551a527b124ec5f6334b6be] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Indexed: 11/18/2022]
Abstract
Phylogeneticists have long understood that several biological processes can cause a gene tree to disagree with its species tree. In recent years, molecular phylogeneticists have increasingly foregone traditional supermatrix approaches in favor of species tree methods that account for one such source of error, incomplete lineage sorting (ILS). While gene tree-species tree discordance no doubt poses a significant challenge to phylogenetic inference with molecular data, researchers have only recently begun to systematically evaluate the relative accuracy of traditional and ILS-sensitive methods. Here, we report on simulations demonstrating that concatenation can perform as well as or better than methods that attempt to account for sources of error introduced by ILS. Based on these and similar results from other researchers, we argue that concatenation remains a useful component of the phylogeneticist's toolbox and highlight that phylogeneticists should continue to make explicit comparisons of results produced by contemporaneous and classical methods.
Affiliation(s)
- João Tonini: Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
- Andrew Moore: Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
- David Stern: Computational Biology Institute, Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
- Maryia Shcheglovitova: Department of Geography & Environmental Systems, University of Maryland Baltimore County, Baltimore, MD, USA
- Guillermo Ortí: Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
|
49
|
Zheng Y, Wiens JJ. Do missing data influence the accuracy of divergence-time estimation with BEAST? Mol Phylogenet Evol 2015; 85:41-9. [PMID: 25681677 DOI: 10.1016/j.ympev.2015.02.002] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 10/08/2014] [Revised: 01/26/2015] [Accepted: 02/01/2015] [Indexed: 10/24/2022]
Abstract
Time-calibrated phylogenies have become essential to evolutionary biology. A recurrent and unresolved question for dating analyses is whether genes with missing data cells should be included or excluded. This issue is particularly unclear for the most widely used dating method, the uncorrelated lognormal approach implemented in BEAST. Here, we test the robustness of this method to missing data. We compare divergence-time estimates from a nearly complete dataset (20 nuclear genes for 32 species of squamate reptiles) to those from subsampled matrices, including those with 5 or 2 complete loci only and those with 5 or 8 incomplete loci added. In general, missing data had little impact on estimated dates (mean error of ∼5Myr per node or less, given an overall age of ∼220Myr in squamates), even when 80% of sampled genes had 75% missing data. Mean errors were somewhat higher when all genes were 75% incomplete (∼17Myr). However, errors increased dramatically when only 2 of 9 fossil calibration points were included (∼40Myr), regardless of missing data. Overall, missing data (and even numbers of genes sampled) may have only minor impacts on the accuracy of divergence dating with BEAST, relative to the dramatic effects of fossil calibrations.
Affiliation(s)
- Yuchi Zheng: Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China; Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721-088, USA
- John J Wiens: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721-088, USA
|
50
|
Boubli JP, Ribas C, Lynch Alfaro JW, Alfaro ME, da Silva MNF, Pinho GM, Farias IP. Spatial and temporal patterns of diversification on the Amazon: A test of the riverine hypothesis for all diurnal primates of Rio Negro and Rio Branco in Brazil. Mol Phylogenet Evol 2015; 82 Pt B:400-12. [DOI: 10.1016/j.ympev.2014.09.005] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Received: 10/01/2013] [Revised: 08/27/2014] [Accepted: 09/09/2014] [Indexed: 11/16/2022]
|