1
Sousa FD, Bertrand YJ, Zizka A, Cangrén P, Oxelman B, Pfeil BE. Chloroplast genome and nuclear loci data for 71 Medicago species. Data Brief 2024; 54:110540. [PMID: 38868387 PMCID: PMC11166683 DOI: 10.1016/j.dib.2024.110540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 04/15/2024] [Accepted: 05/13/2024] [Indexed: 06/14/2024]
Abstract
We present a dataset containing nuclear and chloroplast sequences for 71 species in genus Medicago (Fabaceae), as well as for 8 species in genera Melilotus and Trigonella. Sequence data for a total of 130 samples were obtained with high-throughput sequencing of enriched genomic DNA libraries targeting 61 single-copy nuclear genes from across the Medicago truncatula genome. Chloroplast sequence reads were also generated, allowing for the recovery of chloroplast genome sequences for all 130 samples. A fully resolved phylogenetic tree was inferred from the chloroplast dataset using maximum-likelihood methods. More than 80% of accepted Medicago species are represented in this dataset, including three subspecies of Medicago sativa (alfalfa). These data can be further utilised for phylogenetic analyses in Medicago and related genera, but also for probe and primer design and plant breeding studies.
Affiliation(s)
- Filipe de Sousa
  - cE3c-Centre for Ecology, Evolution and Environmental Changes & CHANGE-Global Change and Sustainability Institute, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisbon, Portugal
- Yann J.K. Bertrand
  - Institute of Botany, Academy of Sciences of the Czech Republic, CZ-252 43 Průhonice, Czech Republic
- Alexander Zizka
  - Department of Biology, Philipps-University Marburg, Karl-von-Frisch-Straße 8, 35043 Marburg, Germany
- Patrik Cangrén
  - Department of Biological and Environmental Sciences, University of Gothenburg, Medicinaregatan 7B, Göteborg 413 90, Sweden
- Bengt Oxelman
  - Department of Biological and Environmental Sciences, University of Gothenburg, Medicinaregatan 7B, Göteborg 413 90, Sweden
- Bernard E. Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Medicinaregatan 7B, Göteborg 413 90, Sweden
2
Crameri S, Fior S, Zoller S, Widmer A. A target capture approach for phylogenomic analyses at multiple evolutionary timescales in rosewoods (Dalbergia spp.) and the legume family (Fabaceae). Mol Ecol Resour 2022; 22:3087-3105. [PMID: 35689779 PMCID: PMC9796917 DOI: 10.1111/1755-0998.13666] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/06/2021] [Revised: 03/29/2022] [Accepted: 06/01/2022] [Indexed: 01/07/2023]
Abstract
Understanding the genetic changes associated with the evolution of biological diversity is of fundamental interest to molecular ecologists. The assessment of genetic variation at hundreds or thousands of unlinked genetic loci forms a sound basis to address questions ranging from micro- to macroevolutionary timescales, and is now possible thanks to advances in sequencing technology. Major difficulties are associated with (i) the lack of genomic resources for many taxa, especially from tropical biodiversity hotspots; (ii) scaling the numbers of individuals analysed and loci sequenced; and (iii) building tools for reproducible bioinformatic analyses of such data sets. To address these challenges, we developed target capture probes for genomic studies of the highly diverse, pantropically distributed and economically significant rosewoods (Dalbergia spp.), explored the performance of an overlapping probe set for target capture across the legume family (Fabaceae), and built the general purpose bioinformatic pipeline CaptureAl. Phylogenomic analyses of Malagasy Dalbergia species yielded highly resolved and well supported hypotheses of evolutionary relationships. Population genomic analyses identified differences between closely related species and revealed the existence of a potentially new species, suggesting that the diversity of Malagasy Dalbergia species has been underestimated. Analyses at the family level corroborated previous findings by the recovery of monophyletic subfamilies and many well-known clades, as well as high levels of gene tree discordance, especially near the root of the family. The new genomic and bioinformatic resources, including the Fabaceae1005 and Dalbergia2396 probe sets, will hopefully advance systematics and ecological genetics research in legumes, and promote conservation of the highly diverse and endangered Dalbergia rosewoods.
Affiliation(s)
- Simon Crameri
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Simone Fior
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Stefan Zoller
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
  - Genetic Diversity Centre (GDC), ETH Zurich, Zürich, Switzerland
- Alex Widmer
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
3
Schneider JV, Paule J, Jungcurt T, Cardoso D, Amorim AM, Berberich T, Zizka G. Resolving Recalcitrant Clades in the Pantropical Ochnaceae: Insights From Comparative Phylogenomics of Plastome and Nuclear Genomic Data Derived From Targeted Sequencing. Front Plant Sci 2021; 12:638650. [PMID: 33613613 PMCID: PMC7890083 DOI: 10.3389/fpls.2021.638650] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/07/2020] [Accepted: 01/15/2021] [Indexed: 05/13/2023]
Abstract
Plastid DNA sequence data have been traditionally widely used in plant phylogenetics because of the high copy number of plastids, their uniparental inheritance, and the blend of coding and non-coding regions with divergent substitution rates that allow the reconstruction of phylogenetic relationships at different taxonomic ranks. In the present study, we evaluate the utility of the plastome for the reconstruction of phylogenetic relationships in the pantropical plant family Ochnaceae (Malpighiales). We used the off-target sequence read fraction of a targeted sequencing study (targeting nuclear loci only) to recover more than 100 kb of the plastid genome from the majority of the more than 200 species of Ochnaceae and all but two genera using de novo and reference-based assembly strategies. Most of the recalcitrant nodes in the family's backbone were resolved by our plastome-based phylogenetic inference, corroborating the most recent classification system of Ochnaceae and findings from a phylogenomic study based on nuclear loci. Nonetheless, the phylogenetic relationships within the major clades of tribe Ochnineae, which comprise about two thirds of the family's species diversity, received mostly low support. Generally, the phylogenetic resolution was lowest at the infrageneric level. Overall there was little phylogenetic conflict compared to a recent analysis of nuclear loci. Effects of taxon sampling were invoked as the most likely reason for some of the few well-supported discords. Our study demonstrates the utility of the off-target fraction of a target enrichment study for assembling near-complete plastid genomes for a large proportion of samples.
Affiliation(s)
- Julio V. Schneider
  - Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
  - Entomology III, Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
- Juraj Paule
  - Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
  - Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
- Tanja Jungcurt
  - Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
  - Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
- Domingos Cardoso
  - Instituto de Biologia, Universidade Federal da Bahia (UFBA), Salvador, Brazil
- André Márcio Amorim
  - Universidade Estadual de Santa Cruz (UESC), Ilhéus, Brazil
  - Herbário André Maurício Vieira de Carvalho, CEPEC, CEPLAC, Itabuna, Brazil
- Thomas Berberich
  - Senckenberg Biodiversity and Climate Research Center, Lab-Center, Frankfurt am Main, Germany
- Georg Zizka
  - Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
  - Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
  - *Correspondence: Georg Zizka
4
Alzahrani DA, Yaradua SS, Albokhari EJ, Abba A. Complete chloroplast genome sequence of Barleria prionitis, comparative chloroplast genomics and phylogenetic relationships among Acanthoideae. BMC Genomics 2020; 21:393. [PMID: 32532210 PMCID: PMC7291470 DOI: 10.1186/s12864-020-06798-2] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 09/10/2019] [Accepted: 05/27/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND The plastome of Barleria prionitis, a medicinal and endangered species in the Kingdom of Saudi Arabia, was sequenced. The plastome was compared with those of seven Acanthoideae species in order to describe the plastome, identify microsatellites, assess the dissimilarities among the sampled plastomes and infer their phylogenetic relationships. RESULTS The plastome of B. prionitis was 152,217 bp in length, with guanine-cytosine and adenine-thymine contents of 38.3% and 61.7%, respectively. It is circular and quadripartite in structure, consisting of a large single-copy region (LSC, 83,772 bp), a small single-copy region (SSC, 17,803 bp) and a pair of inverted repeats (IRa and IRb, 25,321 bp each). A total of 131 genes were identified in the plastome, of which 113 are unique and 18 are repeated in the IR region. The genome contains 4 rRNA, 30 tRNA and 80 protein-coding genes. The analysis of long repeats showed that all types of repeats were present in the plastome, with palindromic repeats the most frequent. A total of 98 SSRs were also identified, most of which were mononucleotide adenine-thymine repeats located in non-coding regions. Comparative genomic analysis among the plastomes revealed that the pair of inverted repeats is more conserved than the single-copy regions. In addition, higher variation is observed in the intergenic spacer regions than in the coding regions. The genes ycf1 and ndhF are located at the border junction of the small single-copy region and the IRb region in all the plastomes. The analysis of sequence divergence in the protein-coding genes indicates that the following genes are under positive selection: atpF, petD, psbZ, rpl20, petB, rpl16, rps16, rpoC, rps7, rpl32 and ycf3. Phylogenetic analysis indicated a sister relationship between Ruellieae and Justicieae. In addition, Barleria, Justicia and Ruellia are paraphyletic, suggesting that Justicieae, Ruellieae, Andrographideae and Barlerieae should be treated as tribes.
CONCLUSIONS This study sequenced and assembled the first plastome of the taxon Barleria and reported basic resources for evolutionary studies of B. prionitis, as well as tools for studying phylogenetic relationships within the core Acanthaceae.
Affiliation(s)
- Dhafer A Alzahrani
  - Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Samaila S Yaradua
  - Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Biology, Umaru Musa Yaradua University, Centre for Biodiversity and Conservation, Katsina, Nigeria
- Enas J Albokhari
  - Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Biological Sciences, Umm Al-Qura University, Makkah, Saudi Arabia
- Abidina Abba
  - Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
5
Andermann T, Torres Jiménez MF, Matos-Maraví P, Batista R, Blanco-Pastor JL, Gustafsson ALS, Kistler L, Liberal IM, Oxelman B, Bacon CD, Antonelli A. A Guide to Carrying Out a Phylogenomic Target Sequence Capture Project. Front Genet 2020; 10:1407. [PMID: 32153629 PMCID: PMC7047930 DOI: 10.3389/fgene.2019.01407] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 10/02/2019] [Accepted: 12/24/2019] [Indexed: 12/17/2022]
Abstract
High-throughput DNA sequencing techniques enable time- and cost-effective sequencing of large portions of the genome. Instead of sequencing and annotating whole genomes, many phylogenetic studies focus sequencing effort on large sets of pre-selected loci, which further reduces costs and bioinformatic challenges while increasing coverage. One common approach that enriches loci before sequencing is often referred to as target sequence capture. This technique has been shown to be applicable to phylogenetic studies of greatly varying evolutionary depth. Moreover, it has proven to produce powerful, large multi-locus DNA sequence datasets suitable for phylogenetic analyses. However, target capture requires careful considerations, which may greatly affect the success of experiments. Here we provide a simple flowchart for designing phylogenomic target capture experiments. We discuss necessary decisions from the identification of target loci to the final bioinformatic processing of sequence data. We outline challenges and solutions related to the taxonomic scope, sample quality, and available genomic resources of target capture projects. We hope this review will serve as a useful roadmap for designing and carrying out successful phylogenetic target capture studies.
Affiliation(s)
- Tobias Andermann
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Maria Fernanda Torres Jiménez
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Pável Matos-Maraví
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
  - Institute of Entomology, Biology Centre of the Czech Academy of Sciences, České Budějovice, Czechia
- Romina Batista
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
  - Programa de Pós-Graduação em Genética, Conservação e Biologia Evolutiva, PPG GCBEv–Instituto Nacional de Pesquisas da Amazônia—INPA Campus II, Manaus, Brazil
  - Coordenação de Zoologia, Museu Paraense Emílio Goeldi, Belém, Brazil
- José L. Blanco-Pastor
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - INRAE, Centre Nouvelle-Aquitaine-Poitiers, Lusignan, France
- Logan Kistler
  - Department of Anthropology, National Museum of Natural History, Smithsonian Institution, Washington, DC, United States
- Isabel M. Liberal
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Bengt Oxelman
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Christine D. Bacon
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Alexandre Antonelli
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
  - Royal Botanic Gardens, Kew, Richmond-Surrey, United Kingdom
6
Granados Mendoza C, Jost M, Hágsater E, Magallón S, van den Berg C, Lemmon EM, Lemmon AR, Salazar GA, Wanke S. Target Nuclear and Off-Target Plastid Hybrid Enrichment Data Inform a Range of Evolutionary Depths in the Orchid Genus Epidendrum. Front Plant Sci 2020; 10:1761. [PMID: 32063915 PMCID: PMC7000662 DOI: 10.3389/fpls.2019.01761] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/02/2019] [Accepted: 12/16/2019] [Indexed: 05/12/2023]
Abstract
Universal angiosperm enrichment probe sets designed to enrich hundreds of putatively orthologous nuclear single-copy loci are increasingly being applied to infer phylogenetic relationships of different lineages of angiosperms at a range of evolutionary depths. Studies applying such probe sets have focused on testing the universality and performance of the target nuclear loci, but they have not taken advantage of off-target data from other genome compartments generated alongside the nuclear loci. Here we do so to infer phylogenetic relationships in the orchid genus Epidendrum and closely related genera of subtribe Laeliinae. Our aims are to: 1) test the technical viability of applying the plant anchored hybrid enrichment (AHE) method (Angiosperm v.1 probe kit) to our focal group, 2) mine plastid protein coding genes from off-target reads; and 3) evaluate the performance of the target nuclear and off-target plastid loci in resolving and supporting phylogenetic relationships along a range of taxonomical depths. Phylogenetic relationships were inferred from the nuclear data set through coalescent summary and site-based methods, whereas plastid loci were analyzed in a concatenated partitioned matrix under maximum likelihood. The usefulness of target and flanking non-target nuclear regions and plastid loci was assessed through the estimation of their phylogenetic informativeness. Our study successfully applied the plant AHE probe kit to Epidendrum, supporting the universality of this kit in angiosperms. Moreover, it demonstrated the feasibility of mining plastome loci from off-target reads generated with the Angiosperm v.1 probe kit to obtain additional, uniparentally inherited sequence data at no extra sequencing cost. Our analyses detected some strongly supported incongruences between nuclear and plastid data sets at shallow divergences, an indication of potential lineage sorting, hybridization, or introgression events in the group. 
Lastly, we found that the per site phylogenetic informativeness of the ycf1 plastid gene surpasses that of all other plastid genes and several nuclear loci, making it an excellent candidate for assessing phylogenetic relationships at medium to low taxonomic levels in orchids.
Affiliation(s)
- Carolina Granados Mendoza
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Matthias Jost
  - Institut für Botanik, Technische Universität Dresden, Dresden, Germany
- Eric Hágsater
  - Herbario AMO, Instituto Chinoin, A.C., Mexico City, Mexico
- Susana Magallón
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Cássio van den Berg
  - Departamento de Ciências Biológicas, Universidade Estadual de Feira de Santana, Feira de Santana, Brazil
- Emily Moriarty Lemmon
  - Department of Biological Science, Florida State University, Tallahassee, FL, United States
- Alan R. Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL, United States
- Gerardo A. Salazar
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Stefan Wanke
  - Institut für Botanik, Technische Universität Dresden, Dresden, Germany
7
Blanco-Pastor JL, Bertrand YJK, Liberal IM, Wei Y, Brummer EC, Pfeil BE. Evolutionary networks from RADseq loci point to hybrid origins of Medicago carstiensis and Medicago cretacea. Am J Bot 2019; 106:1219-1228. [PMID: 31535720 DOI: 10.1002/ajb2.1352] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/21/2019] [Accepted: 07/12/2019] [Indexed: 06/10/2023]
Abstract
PREMISE Although hybridization has played an important role in the evolution of many plant species, phylogenetic reconstructions that include hybridizing lineages have been historically constrained by the available models and data. Restriction-site-associated DNA sequencing (RADseq) has been a popular sequencing technique for the reconstruction of hybridization in the next-generation sequencing era. However, the utility of RADseq for the reconstruction of complex evolutionary networks has not been thoroughly investigated. Conflicting phylogenetic relationships in the genus Medicago have been mainly attributed to hybridization, but the specific hybrid origins of taxa have not been yet clarified. METHODS We obtained new molecular data from diploid species of Medicago section Medicago using single-digest RADseq to reconstruct evolutionary networks from gene trees, an approach that is computationally tractable with data sets that include several species and complex hybridization patterns. RESULTS Our analyses revealed that assembly filters to exclusively select a small set of loci with high phylogenetic information led to the most-divergent network topologies. Conversely, alternative clustering thresholds or filters on the number of samples per locus had a lower impact on networks. A strong hybridization signal was detected for M. carstiensis and M. cretacea, while signals were less clear for M. rugosa, M. rhodopea, M. suffruticosa, M. marina, M. scutellata, and M. sativa. CONCLUSIONS Complex network reconstructions from RADseq gene trees were not robust under variations of the assembly parameters and filters. But when the most-divergent networks were discarded, all remaining analyses consistently supported a hybrid origin for M. carstiensis and M. cretacea.
Affiliation(s)
- José Luis Blanco-Pastor
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
  - INRA, Centre Nouvelle-Aquitaine-Poitiers, UR4 (URP3F), 86600, Lusignan, France
- Yann J K Bertrand
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
  - Institute of Botany, Czech Academy of Sciences, Zámek 1, 25243, Průhonice, Czech Republic
- Yanling Wei
  - Plant Breeding Center, Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- E Charles Brummer
  - Plant Breeding Center, Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- Bernard E Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
8
Sousa F, Neiva J, Martins N, Jacinto R, Anderson L, Raimondi PT, Serrão EA, Pearson GA. Increased evolutionary rates and conserved transcriptional response following allopolyploidization in brown algae. Evolution 2019; 73:59-72. [PMID: 30421788 DOI: 10.1111/evo.13645] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/29/2018] [Revised: 10/23/2018] [Accepted: 10/24/2018] [Indexed: 01/08/2023]
Abstract
Genome mergers between independently evolving lineages, via allopolyploidy, can potentially lead to instantaneous sympatric speciation. However, little is known about the consequences of allopolyploidy and the resultant "genome shock" on genome evolution and expression beyond the plant and fungal branches of the Tree of Life. The aim of this study was to compare substitution rates and gene expression patterns in two allopolyploid brown algae (Phaeophyceae and Heterokonta) and their progenitors in the genus Pelvetiopsis N. L. Gardner in the north-east Pacific, and to date their relationships. We used RNA-seq data, all potential orthologues, and putative single-copy loci for phylogenomic, divergence, and gene expression analyses. The multispecies coalescent placed the origin of allopolyploids in the late Pleistocene (0.35-0.05 Ma). Homoeologues displayed increased nonsynonymous divergence compared with parental orthologues, consistent with relaxed selective constraint following allopolyploidization, including for genes with no evidence of pseudogenization or neofunctionalization. Patterns of homoeologue-orthologue expression conservation and expression level dominance were largely shared with both natural plant and fungal allopolyploids. Our results provide further support for common cross-Kingdom patterns of allopolyploid genome evolution and transcriptional responses, here in the evolutionarily distinct marine heterokont brown algae.
Affiliation(s)
- Filipe Sousa
  - CCMAR-Centro de Ciências do Mar da Universidade do Algarve, Edifício 7, Gambelas, Faro, 8005-139, Portugal
- João Neiva
  - CCMAR-Centro de Ciências do Mar da Universidade do Algarve, Edifício 7, Gambelas, Faro, 8005-139, Portugal
- Neusa Martins
  - CCMAR-Centro de Ciências do Mar da Universidade do Algarve, Edifício 7, Gambelas, Faro, 8005-139, Portugal
- Rita Jacinto
  - CCMAR-Centro de Ciências do Mar da Universidade do Algarve, Edifício 7, Gambelas, Faro, 8005-139, Portugal
- Laura Anderson
  - Long Marine Laboratory, University of California, Santa Cruz, California, 95064
- Peter T Raimondi
  - Long Marine Laboratory, University of California, Santa Cruz, California, 95064
- Ester A Serrão
  - CCMAR-Centro de Ciências do Mar da Universidade do Algarve, Edifício 7, Gambelas, Faro, 8005-139, Portugal
- Gareth A Pearson
  - CCMAR-Centro de Ciências do Mar da Universidade do Algarve, Edifício 7, Gambelas, Faro, 8005-139, Portugal
9
de Abreu NL, Alves RJV, Cardoso SRS, Bertrand YJ, Sousa F, Hall CF, Pfeil BE, Antonelli A. The use of chloroplast genome sequences to solve phylogenetic incongruences in Polystachya Hook (Orchidaceae Juss). PeerJ 2018; 6:e4916. [PMID: 29922511 PMCID: PMC6005162 DOI: 10.7717/peerj.4916] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/05/2018] [Accepted: 05/16/2018] [Indexed: 11/20/2022]
Abstract
BACKGROUND Current evidence suggests that for more robust estimates of species tree and divergence times, several unlinked genes are required. However, most phylogenetic trees for non-model organisms are based on single sequences or just a few regions, using traditional sequencing methods. Techniques for massive parallel sequencing or next generation sequencing (NGS) are an alternative to traditional methods that allow access to hundreds of DNA regions. Here we use this approach to resolve the phylogenetic incongruence found in Polystachya Hook. (Orchidaceae), a genus that stands out due to several interesting aspects, including cytological (polyploid and diploid species), evolutionary (reticulate evolution) and biogeographical (species widely distributed in the tropics and high endemism in Brazil). The genus has a notoriously complicated taxonomy, with several sections that are widely used but probably not monophyletic. METHODS We generated the complete plastid genome of 40 individuals from one clade within the genus. The method consisted in construction of genomic libraries, hybridization to RNA probes designed from available sequences of a related species, and subsequent sequencing of the product. We also tested how well a smaller sample of the plastid genome would perform in phylogenetic inference in two ways: by duplicating a fast region and analyzing multiple copies of this dataset, and by sampling without replacement from all non-coding regions in our alignment. We further examined the phylogenetic implications of non-coding sequences that appear to have undergone hairpin inversions (reverse complemented sequences associated with small loops). RESULTS We retrieved 131,214 bp, including coding and non-coding regions of the plastid genome. The phylogeny was able to fully resolve the relationships among all species in the targeted clade with high support values. 
The first divergent species are represented by African accessions and the most recent ones are among Neotropical species. DISCUSSION Our results indicate that using the entire plastid genome is a better option than screening highly variable markers, especially when the expected tree is likely to contain many short branches. The phylogeny inferred is consistent with the proposed origin of the genus, showing a probable origin in Africa, with later dispersal into the Neotropics, as evidenced by a clade containing all Neotropical individuals. The multiple positions of Polystachya concreta (Jacq.) Garay & Sweet in the phylogeny are explained by allotetraploidy. Polystachya estrellensis Rchb.f. can be considered a genetically distinct species from P. concreta and P. foliosa (Lindl.) Rchb.f., but the delimitation of P. concreta remains uncertain. Our study shows that NGS provides a powerful tool for inferring relationships at low taxonomic levels, even in taxonomically challenging groups with short branches and intricate morphology.
Affiliation(s)
- Narjara Lopes de Abreu
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
  - Museu Nacional, Universidade Federal do Rio de Janeiro, São Cristóvão, Rio de Janeiro, Brasil
- Ruy José Válka Alves
  - Museu Nacional, Universidade Federal do Rio de Janeiro, São Cristóvão, Rio de Janeiro, Brasil
- Sérgio Ricardo Sodré Cardoso
  - Instituto de Pesquisas, Jardim Botânico do Rio de Janeiro, Diretoria de Pesquisa Científica, Rio de Janeiro, Brasil
- Yann J.K. Bertrand
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Filipe Sousa
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
  - Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal
- Climbiê Ferreira Hall
  - Campus Três Lagoas, Universidade Federal de Mato Grosso do Sul, Três Lagoas, Mato Grosso do Sul, Brasil
- Bernard E. Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Alexandre Antonelli
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
  - Gothenburg Botanical Garden, Göteborg, Sweden
  - Gothenburg Global Biodiversity Centre, Göteborg, Sweden
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
10
Vatanparast M, Powell A, Doyle JJ, Egan AN. Targeting legume loci: A comparison of three methods for target enrichment bait design in Leguminosae phylogenomics. Appl Plant Sci 2018; 6:e1036. [PMID: 29732266 PMCID: PMC5895186 DOI: 10.1002/aps3.1036] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 09/26/2017] [Accepted: 02/22/2018] [Indexed: 05/19/2023]
Abstract
PREMISE OF THE STUDY The development of pipelines for locus discovery has spurred the use of target enrichment for plant phylogenomics. However, few studies have compared pipelines from locus discovery and bait design, through validation, to tree inference. We compared three methods within Leguminosae (Fabaceae) and present a workflow for future efforts. METHODS Using 30 transcriptomes, we compared Hyb-Seq, MarkerMiner, and the Yang and Smith (Y&S) pipelines for locus discovery, validated 7501 baits targeting 507 loci across 25 genera via Illumina sequencing, and inferred gene and species trees via concatenation- and coalescent-based methods. RESULTS Hyb-Seq discovered loci with the longest mean length. MarkerMiner discovered the most conserved loci with the least flagged as paralogous. Y&S offered the most parsimony-informative sites and putative orthologs. Target recovery averaged 93% across taxa. We optimized our targeted locus set based on a workflow designed to minimize paralog/ortholog conflation and thus present 423 loci for legume phylogenomics. CONCLUSIONS Methods differed across criteria important for phylogenetic marker development. We recommend Hyb-Seq as a method that may be useful for most phylogenomic projects. Our targeted locus set is a resource for future, community-driven efforts to reconstruct the legume tree of life.
Affiliation(s)
- Mohammad Vatanparast
- Department of Botany, National Museum of Natural History, Smithsonian Institution, P.O. Box 37012, MRC 166, Washington, DC 20560, USA
- Present address: Forest, Nature, and Biomass Section, Department of Geosciences and Natural Resource Management, Rolighedsvej 23, 1958 Frederiksberg C., University of Copenhagen, Denmark
| | - Adrian Powell
- Section of Plant Breeding and Genetics, School of Integrated Plant Sciences, Cornell University, 512 Mann Library, Ithaca, New York 14853, USA
- Present address: Boyce Thompson Institute, 533 Tower Road, Ithaca, New York 14853, USA
| | - Jeff J. Doyle
- Section of Plant Breeding and Genetics, School of Integrated Plant Sciences, Cornell University, 512 Mann Library, Ithaca, New York 14853, USA
| | - Ashley N. Egan
- Department of Botany, National Museum of Natural History, Smithsonian Institution, P.O. Box 37012, MRC 166, Washington, DC 20560, USA
|
11
|
Sousa F, Bertrand YJK, Doyle JJ, Oxelman B, Pfeil BE. Using Genomic Location and Coalescent Simulation to Investigate Gene Tree Discordance in Medicago L. Syst Biol 2018; 66:934-949. [PMID: 28177088 DOI: 10.1093/sysbio/syx035] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2015] [Accepted: 02/01/2017] [Indexed: 12/28/2022] Open
Abstract
Several well-documented evolutionary processes are known to cause conflict between species-level phylogenies and gene-level phylogenies. Three of the most challenging processes for species tree inference are incomplete lineage sorting, hybridization and gene duplication, which may result in unwarranted comparisons of paralogous genes. Several existing methods have dealt with these processes but none has yet been able to untangle all three at once. Here, we propose a stepwise method by which these processes can be discerned using information on genomic location coupled with coalescent simulations. In the first step, highly discordant genes within genomic blocks (putative paralogs) are identified and excluded from the data set and, in the second step, blocks of linked genes are grouped according to their hybrid history. Existing multispecies coalescent software can then be applied to recover the principal tree(s) that make up the species tree/network without violating the underlying model. The potential of the approach is evaluated on simulated data derived from a species network composed of nine species, of which one is of hybrid origin, and displaying a single-gene duplication that leads to paralogous comparisons. We apply our method to an empirical set of 12 genes from 7 species sampled in the plant genus Medicago that display phylogenetic discordance. We identify the causes of the discordance and demonstrate that the Medicago orbicularis lineage experienced an episode of ancient hybridization. Our results show promise as a new way to explore phylogenetic sequence data that can significantly improve species tree inference in presence of hybridization and undetected paralogy or other causes leading to extremely discordant gene trees. [Coalescent simulation; gene tree; genomic location; hybridization; incomplete lineage sorting; paralogy; phylogenetic incongruence; principal tree; species tree.].
Affiliation(s)
- F Sousa
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| | - Y J K Bertrand
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| | - J J Doyle
- Department of Plant Biology, Cornell University, 404 Mann Library Building, Ithaca, NY 14853, USA
| | - B Oxelman
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| | - B E Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
|
12
|
Eriksson JS, de Sousa F, Bertrand YJK, Antonelli A, Oxelman B, Pfeil BE. Allele phasing is critical to revealing a shared allopolyploid origin of Medicago arborea and M. strasseri (Fabaceae). BMC Evol Biol 2018; 18:9. [PMID: 29374461 PMCID: PMC5787288 DOI: 10.1186/s12862-018-1127-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2017] [Accepted: 01/22/2018] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Whole genome duplication plays a central role in plant evolution. There are two main classes of polyploids: autopolyploids, which arise within one species by doubling of similar homologous genomes, and allopolyploids (hybrid polyploids), which arise via hybridization and subsequent doubling of nonhomologous (homoeologous) genomes. The distinction between polyploid origins can be made using gene phylogenies, if alleles from each genome can be correctly retrieved. We examined whether two closely related tetraploid Mediterranean shrubs (Medicago arborea and M. strasseri) have an allopolyploid origin, a question that has remained unsolved despite substantial previous research. We sequenced and analyzed ten low-copy nuclear genes from these and related species, phasing all alleles. To test the effect of allele phasing on the ability to recover the evolutionary origin of polyploids, we compared these results to analyses using unphased sequences. RESULTS In eight of the gene trees the alleles inferred from the tetraploids formed two clades, in a non-sister relationship. Each of these clades was more closely related to alleles sampled from other species of Medicago, a pattern typical of allopolyploids. However, we also observed that alleles from one of the remaining genes formed two clades that were sister to one another, as is expected for autopolyploids. Trees inferred from unphased sequences were very different, with the tetraploids often placed in poorly supported and different positions compared to results obtained using phased alleles. CONCLUSIONS The complex phylogenetic history of M. arborea and M. strasseri is explained predominantly by shared allotetraploidy. We also observed that an increase in woodiness is correlated with polyploidy in this group of species and present a new possibility that woodiness could be a transgressive phenotype.
Correctly phased homoeologues are likely to be critical for inferring the hybrid origin of allopolyploid species, when most genes retain more than one homoeologue. Ignoring homoeologous variation by merging the homoeologues can obscure the signal of hybrid polyploid origins and produce inaccurate results.
Affiliation(s)
- Jonna S Eriksson
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden
| | - Filipe de Sousa
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Gothenburg, Sweden
| | - Yann J K Bertrand
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Gothenburg, Sweden
| | - Alexandre Antonelli
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden
- Gothenburg Botanical Garden, SE-41319, Göteborg, Sweden
| | - Bengt Oxelman
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden
| | - Bernard E Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden
|
13
|
A pilot study applying the plant Anchored Hybrid Enrichment method to New World sages (Salvia subgenus Calosphace; Lamiaceae). Mol Phylogenet Evol 2017; 117:124-134. [DOI: 10.1016/j.ympev.2017.02.006] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2016] [Revised: 02/06/2017] [Accepted: 02/06/2017] [Indexed: 11/18/2022]
|
14
|
Wanke S, Granados Mendoza C, Müller S, Paizanni Guillén A, Neinhuis C, Lemmon AR, Lemmon EM, Samain MS. Recalcitrant deep and shallow nodes in Aristolochia (Aristolochiaceae) illuminated using anchored hybrid enrichment. Mol Phylogenet Evol 2017; 117:111-123. [DOI: 10.1016/j.ympev.2017.05.014] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2016] [Revised: 05/12/2017] [Accepted: 05/15/2017] [Indexed: 01/05/2023]
|
15
|
Moore AJ, Vos JMD, Hancock LP, Goolsby E, Edwards EJ. Targeted Enrichment of Large Gene Families for Phylogenetic Inference: Phylogeny and Molecular Evolution of Photosynthesis Genes in the Portullugo Clade (Caryophyllales). Syst Biol 2017; 67:367-383. [DOI: 10.1093/sysbio/syx078] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2017] [Accepted: 09/18/2017] [Indexed: 01/01/2023] Open
Affiliation(s)
- Abigail J Moore
- Department of Ecology and Evolutionary Biology, Brown University, Box G-W, Providence, RI 02912, USA
- Department of Microbiology and Plant Biology and Oklahoma Biological Survey, University of Oklahoma, 770 Van Vleet Oval, Norman, OK 73019, USA
| | - Jurriaan M De Vos
- Department of Ecology and Evolutionary Biology, Brown University, Box G-W, Providence, RI 02912, USA
- Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, UK
- Department of Environmental Sciences—Botany, University of Basel, Totengässlein 3, 4051 Basel, Switzerland
| | - Lillian P Hancock
- Department of Ecology and Evolutionary Biology, Brown University, Box G-W, Providence, RI 02912, USA
| | - Eric Goolsby
- Department of Ecology and Evolutionary Biology, Brown University, Box G-W, Providence, RI 02912, USA
- Department of Ecology and Evolutionary Biology, Yale University, PO Box 208105, New Haven, CT 06520, USA
| | - Erika J Edwards
- Department of Ecology and Evolutionary Biology, Brown University, Box G-W, Providence, RI 02912, USA
- Department of Ecology and Evolutionary Biology, Yale University, PO Box 208105, New Haven, CT 06520, USA
|
16
|
Kadlec M, Bellstedt DU, Le Maitre NC, Pirie MD. Targeted NGS for species level phylogenomics: "made to measure" or "one size fits all"? PeerJ 2017; 5:e3569. [PMID: 28761782 PMCID: PMC5530999 DOI: 10.7717/peerj.3569] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2017] [Accepted: 06/22/2017] [Indexed: 12/05/2022] Open
Abstract
Targeted high-throughput sequencing using hybrid-enrichment offers a promising source of data for inferring multiple, meaningfully resolved, independent gene trees suitable to address challenging phylogenetic problems in species complexes and rapid radiations. The targets in question can either be adopted directly from more or less universal tools, or custom made for particular clades at considerably greater effort. We applied custom made scripts to select sets of homologous sequence markers from transcriptome and WGS data for use in the flowering plant genus Erica (Ericaceae). We compared the resulting targets to those that would be selected both using different available tools (Hyb-Seq; MarkerMiner), and when optimising for broader clades of more distantly related taxa (Ericales; eudicots). Approaches comparing more divergent genomes (including MarkerMiner, irrespective of input data) delivered fewer and shorter potential markers than those targeted for Erica. The latter may nevertheless be effective for sequence capture across the wider family Ericaceae. We tested the targets delivered by our scripts by obtaining an empirical dataset. The resulting sequence variation was lower than that of standard nuclear ribosomal markers (that in Erica fail to deliver a well resolved gene tree), confirming the importance of maximising the lengths of individual markers. We conclude that rather than searching for "one size fits all" universal markers, we should improve and make more accessible the tools necessary for developing "made to measure" ones.
Affiliation(s)
- Malvina Kadlec
- Institut für Organismische und Molekulare Evolutionsbiologie, Johannes-Gutenberg Universität Mainz, Mainz, Germany
| | - Dirk U. Bellstedt
- Department of Biochemistry, University of Stellenbosch, Stellenbosch, South Africa
| | | | - Michael D. Pirie
- Institut für Organismische und Molekulare Evolutionsbiologie, Johannes-Gutenberg Universität Mainz, Mainz, Germany
|
17
|
Ruggieri V, Anzar I, Paytuvi A, Calafiore R, Cigliano RA, Sanseverino W, Barone A. Exploiting the great potential of Sequence Capture data by a new tool, SUPER-CAP. DNA Res 2017; 24:81-91. [PMID: 28011720 PMCID: PMC5381350 DOI: 10.1093/dnares/dsw050] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2016] [Accepted: 10/26/2016] [Indexed: 01/08/2023] Open
Abstract
The recent development of Sequence Capture methodology represents a powerful strategy for enhancing data generation to assess genetic variation of targeted genomic regions. Here, we present SUPER-CAP, a bioinformatics web tool aimed at handling Sequence Capture data, finely calculating the allele frequencies of variants and building genotype-specific sequences of captured genes. The dataset used to develop this in silico strategy consists of 378 loci and related regulatory regions in a collection of 44 tomato landraces. About 14,000 high-quality variants were identified. The high depth of coverage (>40×) and appropriate filtering criteria allowed identification of about 4,000 rare variants and 10 genes showing copy number variation. We also show that the tool is able to reconstruct genotype-specific sequences for each genotype by using the detected variants. This makes it possible to evaluate the combined effect of multiple variants in the same protein. The architecture and functionality of SUPER-CAP make the software appropriate for a broad set of analyses including SNP discovery and mining. Its functionality, together with the capability to process large data sets and efficient detection of sequence variation, makes SUPER-CAP a valuable bioinformatics tool for genomics and breeding purposes.
Affiliation(s)
- Valentino Ruggieri
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055 Portici (NA), Italy
- Sequentia Biotech SL, Calle Compte d'Urgell, 240, 08035 Barcelona, Spain
| | - Irantzu Anzar
- Sequentia Biotech SL, Calle Compte d'Urgell, 240, 08035 Barcelona, Spain
| | - Andreu Paytuvi
- Sequentia Biotech SL, Calle Compte d'Urgell, 240, 08035 Barcelona, Spain
| | - Roberta Calafiore
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055 Portici (NA), Italy
| | | | - Walter Sanseverino
- Sequentia Biotech SL, Calle Compte d'Urgell, 240, 08035 Barcelona, Spain
| | - Amalia Barone
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055 Portici (NA), Italy
|
18
|
Léveillé-Bourret É, Starr JR, Ford BA, Moriarty Lemmon E, Lemmon AR. Resolving Rapid Radiations within Angiosperm Families Using Anchored Phylogenomics. Syst Biol 2017; 67:94-112. [DOI: 10.1093/sysbio/syx050] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2016] [Accepted: 04/28/2017] [Indexed: 11/13/2022] Open
|
19
|
Kaur P, Gaikwad K. From Genomes to GENE-omes: Exome Sequencing Concept and Applications in Crop Improvement. FRONTIERS IN PLANT SCIENCE 2017; 8:2164. [PMID: 29312405 PMCID: PMC5742236 DOI: 10.3389/fpls.2017.02164] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/19/2017] [Accepted: 12/08/2017] [Indexed: 05/13/2023]
Abstract
Exome sequencing is the targeted capture and sequencing of the 1-2% of the genome made up of 'high-value genomic regions', which are enriched for functional variants and harbor few repetitive regions. We give an overview of exome sequencing, ways to approach plant exomes, and the advantages and applicability of this approach in deciphering functional regions of genomes. Although this approach was initially developed as an alternative to whole genome sequencing (WGS), the many benefits of sequence capture via hybridization have created a niche for it in solving many biological riddles, particularly resolving phylogenetic distances. The technique has also proved successful in understanding the basis of natural and induced molecular variation, in marker development, and in developing genomic resources for complex, wild and non-model species that are still intractable for WGS efforts. Thus, given the broad applications of this sequencing strategy, the near future is expected to witness a collective expansion of both techniques, i.e., sequence capture via hybridization for evolutionary and ecological research and WGS approaches for its universal accessibility.
|
20
|
Kadlec M, Bellstedt DU, Le Maitre NC, Pirie MD. Targeted NGS for species level phylogenomics: "made to measure" or "one size fits all"? PeerJ 2017. [PMID: 28761782 DOI: 10.7717/peerj.3569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/30/2023] Open
Abstract
Targeted high-throughput sequencing using hybrid-enrichment offers a promising source of data for inferring multiple, meaningfully resolved, independent gene trees suitable to address challenging phylogenetic problems in species complexes and rapid radiations. The targets in question can either be adopted directly from more or less universal tools, or custom made for particular clades at considerably greater effort. We applied custom made scripts to select sets of homologous sequence markers from transcriptome and WGS data for use in the flowering plant genus Erica (Ericaceae). We compared the resulting targets to those that would be selected both using different available tools (Hyb-Seq; MarkerMiner), and when optimising for broader clades of more distantly related taxa (Ericales; eudicots). Approaches comparing more divergent genomes (including MarkerMiner, irrespective of input data) delivered fewer and shorter potential markers than those targeted for Erica. The latter may nevertheless be effective for sequence capture across the wider family Ericaceae. We tested the targets delivered by our scripts by obtaining an empirical dataset. The resulting sequence variation was lower than that of standard nuclear ribosomal markers (that in Erica fail to deliver a well resolved gene tree), confirming the importance of maximising the lengths of individual markers. We conclude that rather than searching for "one size fits all" universal markers, we should improve and make more accessible the tools necessary for developing "made to measure" ones.
Affiliation(s)
- Malvina Kadlec
- Institut für Organismische und Molekulare Evolutionsbiologie, Johannes-Gutenberg Universität Mainz, Mainz, Germany
| | - Dirk U Bellstedt
- Department of Biochemistry, University of Stellenbosch, Stellenbosch, South Africa
| | - Nicholas C Le Maitre
- Department of Biochemistry, University of Stellenbosch, Stellenbosch, South Africa
| | - Michael D Pirie
- Institut für Organismische und Molekulare Evolutionsbiologie, Johannes-Gutenberg Universität Mainz, Mainz, Germany
|
21
|
Li X, Hao B, Pan D, Schneeweiss GM. Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes. FRONTIERS IN PLANT SCIENCE 2017; 8:1973. [PMID: 29218053 PMCID: PMC5704539 DOI: 10.3389/fpls.2017.01973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2017] [Accepted: 11/01/2017] [Indexed: 05/02/2023]
Abstract
Phylogenomic approaches, employing next-generation sequencing (NGS) techniques, have revolutionized systematic and evolutionary biology. Target enrichment is an efficient and cost-effective method in phylogenomics and is becoming increasingly popular. Depending on availability and quality of reference data as well as on biological features of the study system, (semi-)automated identification of suitable markers will require specific bioinformatic pipelines. Here, we established a highly flexible bioinformatic pipeline, BaitsFinder, to identify putative orthologous single copy genes (SCGs) and to construct bait sequences in a single workflow. Additionally, this pipeline has been constructed to be able to cope with challenging data sets, such as the nutritionally heterogeneous plant family Orobanchaceae. To this end, we used transcriptome data of differing quality available for four Orobanchaceae species and, as reference, SCG data from monkeyflower (Erythranthe guttata, syn. Mimulus g.; 1,915 genes) and tomato (Solanum lycopersicum; 391 genes). Depending on whether gaps were permitted in initial blast searches of the four Orobanchaceae species against the reference, our pipeline identified 1,307 and 981 SCGs with average length of 994 bp and 775 bp, respectively. Automated bait sequence construction (using 2× tiling) resulted in 38,170 and 21,856 bait sequences, respectively. In comparison to the recently published MarkerMiner 1.0 pipeline BaitsFinder identified about 1.6 times as many SCGs (of at least 900 bp length). Skipping steps specific to analyses of Orobanchaceae, BaitsFinder was successfully used in a group of non-parasitic plants (three Asteraceae species and, as reference, SCG data from Arabidopsis thaliana based on previously compiled SCGs). Thus, BaitsFinder is expected to be broadly applicable in groups, where only transcriptomes or partial genome data of differing quality are available.
|
22
|
Albayrak L, Khanipov K, Pimenova M, Golovko G, Rojas M, Pavlidis I, Chumakov S, Aguilar G, Chávez A, Widger WR, Fofanov Y. The ability of human nuclear DNA to cause false positive low-abundance heteroplasmy calls varies across the mitochondrial genome. BMC Genomics 2016; 17:1017. [PMID: 27955616 PMCID: PMC5153897 DOI: 10.1186/s12864-016-3375-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2016] [Accepted: 12/05/2016] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%) are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA). RESULTS The analysis revealed up to a 25-fold variation in the lengths of the longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. The longest subsequences shared between nDNA and mtDNA in several regions of the mitochondrial genome were found to be as short as 11 bases, which not only allows using these regions to design new, very specific PCR primers, but also supports the hypothesis of the non-random introduction of mtDNA into the human nuclear DNA. CONCLUSION Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer (76 and 150 bases) reads, the proportion of the mitochondrial genome where nDNA presence will not interfere was found to be 44.5% and 67.9%, respectively, while low-abundance mutations at 100% of locations can be identified using single reads 417 bases long.
This observation suggests that the analysis of low-abundance variations in mitochondria population can be extended to a variety of large data collections such as NCBI Sequence Read Archive, European Nucleotide Archive, The Cancer Genome Atlas, and International Cancer Genome Consortium.
Affiliation(s)
- Levent Albayrak
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, 77555-0144, USA
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX, USA
- Department of Computer Science, University of Houston, Houston, TX, USA
| | - Kamil Khanipov
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, 77555-0144, USA
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX, USA
- Department of Computer Science, University of Houston, Houston, TX, USA
| | - Maria Pimenova
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, 77555-0144, USA
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX, USA
| | - George Golovko
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, 77555-0144, USA
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX, USA
| | - Mark Rojas
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, 77555-0144, USA
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX, USA
| | - Ioannis Pavlidis
- Department of Computer Science, University of Houston, Houston, TX, USA
| | - Sergei Chumakov
- Department of Physics, University of Guadalajara, Guadalajara, Jalisco, Mexico
| | - Gerardo Aguilar
- Department of Physics, University of Guadalajara, Guadalajara, Jalisco, Mexico
| | - Arturo Chávez
- Department of Physics, University of Guadalajara, Guadalajara, Jalisco, Mexico
| | - William R Widger
- Department of Biology and Biochemistry, University of Houston, Houston, TX, USA
| | - Yuriy Fofanov
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, 77555-0144, USA
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX, USA
|
23
|
Eriksson JS, Blanco-Pastor JL, Sousa F, Bertrand YJK, Pfeil BE. A cryptic species produced by autopolyploidy and subsequent introgression involving Medicago prostrata (Fabaceae). Mol Phylogenet Evol 2016; 107:367-381. [PMID: 27919807 DOI: 10.1016/j.ympev.2016.11.020] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2016] [Revised: 11/21/2016] [Accepted: 11/29/2016] [Indexed: 01/28/2023]
Abstract
Although hybridisation through genome duplication is well known, hybridisation without genome duplication (homoploid hybrid speciation, HHS) is not. Few well-documented cases have been reported. A possible instance of HHS in Medicago prostrata Jacq. was suggested previously, based on only two genes and one individual. We tested whether this species was formed through HHS by sampling eight nuclear loci and 22 individuals, with additional individuals from related species, using gene capture and Illumina sequencing. Phylogenetic inference and coalescent simulations were performed to infer the causes of gene tree incongruence. We found no evidence that phylogenetic differences among M. prostrata individuals were the result of HHS. Instead, an autopolyploid origin of tetraploids with introgression from tetraploids of the M. sativa complex is likely. We argue that tetraploid M. prostrata individuals constitute a new species, characterised by a partially non-overlapping distribution and distinctive alleles (from the M. sativa complex). No gene flow from tetraploid to diploid M. prostrata is apparent, suggesting partial reproductive isolation. Thus, speciation via autopolyploidy appears to have been reinforced by introgression. This raises the intriguing possibility that introgressed alleles may be responsible for the increased range exploited by tetraploid M. prostrata with respect to that of the diploids.
Affiliation(s)
- J S Eriksson
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden.
| | - J L Blanco-Pastor
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| | - F Sousa
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| | - Y J K Bertrand
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| | - B E Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
| |
|
24
|
Fisher AE, Hasenstab KM, Bell HL, Blaine E, Ingram AL, Columbus JT. Evolutionary history of chloridoid grasses estimated from 122 nuclear loci. Mol Phylogenet Evol 2016; 105:1-14. [DOI: 10.1016/j.ympev.2016.08.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 04/16/2016] [Revised: 08/09/2016] [Accepted: 08/18/2016] [Indexed: 10/25/2022]
|
25
|
Egan AN, Vatanparast M, Cagle W. Parsing polyphyletic Pueraria: Delimiting distinct evolutionary lineages through phylogeny. Mol Phylogenet Evol 2016; 104:44-59. [DOI: 10.1016/j.ympev.2016.08.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 05/11/2016] [Revised: 07/18/2016] [Accepted: 08/01/2016] [Indexed: 11/25/2022]
|
26
|
Hollingsworth PM, Li DZ, van der Bank M, Twyford AD. Telling plant species apart with DNA: from barcodes to genomes. Philos Trans R Soc Lond B Biol Sci 2016; 371:20150338. [PMID: 27481790 PMCID: PMC4971190 DOI: 10.1098/rstb.2015.0338] [Citation(s) in RCA: 151] [Impact Index Per Article: 18.9] [Accepted: 06/01/2016] [Indexed: 12/17/2022] Open
Abstract
Land plants underpin a multitude of ecosystem functions, support human livelihoods and represent a critically important component of terrestrial biodiversity, yet many tens of thousands of species await discovery, and plant identification remains a substantial challenge, especially where material is juvenile, fragmented or processed. In this opinion article, we tackle two main topics. Firstly, we provide a short summary of the strengths and limitations of plant DNA barcoding for addressing these issues. Secondly, we discuss options for enhancing current plant barcodes, focusing on increasing discriminatory power via either gene capture of nuclear markers or genome skimming. The former has the advantage of establishing a defined set of target loci maximizing efficiency of sequencing effort, data storage and analysis. The challenge is developing a probe set for large numbers of nuclear markers that works over sufficient phylogenetic breadth. Genome skimming has the advantage of using existing protocols and being backward compatible with existing barcodes; and the depth of sequence coverage can be increased as sequencing costs fall. Its non-targeted nature does, however, present a major informatics challenge for upscaling to large sample sets. This article is part of the themed issue 'From DNA barcodes to biomes'.
Affiliation(s)
| | - De-Zhu Li
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, 132 Lanhei Road, Heilongtan, Kunming, Yunnan 650201, People's Republic of China
| | - Michelle van der Bank
- Department of Botany and Plant Biotechnology, University of Johannesburg, Auckland park, Johannesburg PO Box 524, South Africa
| | - Alex D Twyford
- Ashworth Laboratories, Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
| |
|
27
|
Schmickl R, Liston A, Zeisek V, Oberlander K, Weitemier K, Straub SCK, Cronn RC, Dreyer LL, Suda J. Phylogenetic marker development for target enrichment from transcriptome and genome skim data: the pipeline and its application in southern African Oxalis (Oxalidaceae). Mol Ecol Resour 2015; 16:1124-35. [DOI: 10.1111/1755-0998.12487] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 07/24/2015] [Revised: 10/06/2015] [Accepted: 11/05/2015] [Indexed: 01/08/2023]
Affiliation(s)
- Roswitha Schmickl
- Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
| | - Aaron Liston
- Department of Botany and Plant Pathology; Oregon State University; 2082 Cordley Hall Corvallis OR 97331 USA
| | - Vojtěch Zeisek
- Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
- Department of Botany; Faculty of Science; Charles University in Prague; Benátská 2 128 01 Prague Czech Republic
| | - Kenneth Oberlander
- Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
- Department of Conservation Ecology and Entomology; Stellenbosch University; Private Bag X1 Matieland 7602 South Africa
| | - Kevin Weitemier
- Department of Botany and Plant Pathology; Oregon State University; 2082 Cordley Hall Corvallis OR 97331 USA
| | - Shannon C. K. Straub
- Department of Biology; Hobart and William Smith Colleges; 213 Eaton Hall Geneva NY 14456 USA
| | - Richard C. Cronn
- USDA Forest Service; Pacific Northwest Research Station; 3200 SW Jefferson Way Corvallis OR 97331 USA
| | - Léanne L. Dreyer
- Department of Botany and Zoology; Stellenbosch University; Private Bag X1 Matieland 7602 South Africa
| | - Jan Suda
- Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
- Department of Botany; Faculty of Science; Charles University in Prague; Benátská 2 128 01 Prague Czech Republic
| |
|
28
|
Folk RA, Mandel JR, Freudenstein JV. A protocol for targeted enrichment of intron-containing sequence markers for recent radiations: A phylogenomic example from Heuchera (Saxifragaceae). APPLICATIONS IN PLANT SCIENCES 2015; 3:apps1500039. [PMID: 26312196 PMCID: PMC4542943 DOI: 10.3732/apps.1500039] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 04/06/2015] [Accepted: 07/09/2015] [Indexed: 05/18/2023]
Abstract
PREMISE OF THE STUDY: Phylogenetic inference is moving to large multilocus data sets, yet there remains uncertainty in the choice of marker and sequencing method at low taxonomic levels. To address this gap, we present a method for enriching long loci spanning intron-exon boundaries in the genus Heuchera.
METHODS: Two hundred seventy-eight loci were designed using a splice-site prediction method combining transcriptomic and genomic data. Biotinylated probes were designed for enrichment of these loci. Reference-based assembly was performed using genomic references; additionally, chloroplast and mitochondrial genomes were used as references for off-target reads. The data were aligned and subjected to coalescent and concatenated phylogenetic analyses to demonstrate support for major relationships.
RESULTS: Complete or nearly complete (>99%) sequences were assembled from essentially all loci from all taxa. Aligned introns showed a fourfold increase in divergence as opposed to exons. Concatenated analysis gave decisive support to all nodes, and support was also high and relationships mostly similar in the coalescent analysis. Organellar phylogenies were also well-supported and conflicted with the nuclear signal.
DISCUSSION: Our approach shows promise for resolving a recent radiation. Enrichment for introns is highly successful with little or no sequencing dropout at low taxonomic levels despite higher substitution and indel frequencies, and should be exploited in studies of species complexes.
Affiliation(s)
- Ryan A. Folk
- Herbarium, The Ohio State University, Columbus, Ohio 43212 USA
- Author for correspondence:
| | - Jennifer R. Mandel
- Department of Biology, University of Memphis, Memphis, Tennessee 38152 USA
| | | |
|
29
|
Folk RA, Mandel JR, Freudenstein JV. A protocol for targeted enrichment of intron-containing sequence markers for recent radiations: A phylogenomic example from Heuchera (Saxifragaceae). APPLICATIONS IN PLANT SCIENCES 2015. [PMID: 26312196 DOI: 10.5061/dryad.4cn66] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2023]
Affiliation(s)
- Ryan A Folk
- Herbarium, The Ohio State University, Columbus, Ohio 43212 USA
| | - Jennifer R Mandel
- Department of Biology, University of Memphis, Memphis, Tennessee 38152 USA
| | | |
|
30
|
Correction: Phylogenetic properties of 50 nuclear loci in Medicago (Leguminosae) generated using multiplexed sequence capture and next-generation sequencing. PLoS One 2015; 10:e0127130. [PMID: 25932925 PMCID: PMC4416814 DOI: 10.1371/journal.pone.0127130] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022] Open
|