1
Phylotranscriptomic Analyses of Mycoheterotrophic Monocots Show a Continuum of Convergent Evolutionary Changes in Expressed Nuclear Genes From Three Independent Nonphotosynthetic Lineages. Genome Biol Evol 2022; 15:6965378. [PMID: 36582124 PMCID: PMC9887272 DOI: 10.1093/gbe/evac183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 12/13/2022] [Accepted: 12/18/2022] [Indexed: 12/31/2022]
Abstract
Mycoheterotrophy is an alternative nutritional strategy whereby plants obtain sugars and other nutrients from soil fungi. Mycoheterotrophy and associated loss of photosynthesis have evolved repeatedly in plants, particularly in monocots. Although reductive evolution of plastomes in mycoheterotrophs is well documented, the dynamics of nuclear genome evolution remain largely unknown. Transcriptome datasets were generated from four mycoheterotrophs in three families (Orchidaceae, Burmanniaceae, Triuridaceae) and related green plants and used for phylogenomic analyses to resolve relationships among the mycoheterotrophs, their relatives, and representatives across the monocots. Phylogenetic trees based on 602 genes were mostly congruent with plastome phylogenies, except for an Asparagales + Liliales clade inferred in the nuclear trees. Reduction and loss of chlorophyll synthesis and photosynthetic gene expression and relaxation of purifying selection on retained genes were progressive, with greater loss in older nonphotosynthetic lineages. One hundred seventy-four of 1375 plant benchmark universally conserved orthologous genes were not detected in any mycoheterotroph transcriptome or in the genome of the mycoheterotrophic orchid Gastrodia but were expressed in green relatives, providing evidence for massively convergent gene loss in nonphotosynthetic lineages. We designate this set of deleted or undetected genes Missing in Mycoheterotrophs (MIM). MIM genes mainly encode photosynthetic or plastid membrane proteins, but also include genes involved in a diverse set of plastid, mitochondrial, and cellular processes, as well as genes of unknown function. Transcription of a photosystem II gene (psb29) in all lineages implies a nonphotosynthetic function for this and other genes retained in mycoheterotrophs. Nonphotosynthetic plants enable novel insights into gene function as well as gene expression shifts, gene loss, and convergence in nuclear genomes.
2
Phylogenomic resolution of order- and family-level monocot relationships using 602 single-copy nuclear genes and 1375 BUSCO genes. Frontiers in Plant Science 2022; 13:876779. [PMID: 36483967 PMCID: PMC9723157 DOI: 10.3389/fpls.2022.876779] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/15/2022] [Accepted: 09/29/2022] [Indexed: 05/26/2023]
Abstract
We assess relationships among 192 species in all 12 monocot orders and 72 of 77 families, using 602 conserved single-copy (CSC) genes and 1375 benchmarking single-copy ortholog (BUSCO) genes extracted from genomic and transcriptomic datasets. Phylogenomic inferences based on these data, using both coalescent-based and supermatrix analyses, are largely congruent with the most comprehensive plastome-based analysis, and nuclear-gene phylogenomic analyses with less comprehensive taxon sampling. The strongest discordance between the plastome and nuclear gene analyses is the monophyly of a clade comprising Asparagales and Liliales in our nuclear gene analyses, versus the placement of Asparagales and Liliales as successive sister clades to the commelinids in the plastome tree. Within orders, around six of 72 families shifted positions relative to the recent plastome analysis, but four of these involve poorly supported inferred relationships in the plastome-based tree. In Poales, the nuclear data place a clade comprising Ecdeiocoleaceae+Joinvilleaceae as sister to the grasses (Poaceae); Typhaceae (rather than Bromeliaceae) are resolved as sister to all other Poales. In Commelinales, nuclear data place Philydraceae sister to all other families rather than to a clade comprising Haemodoraceae+Pontederiaceae as seen in the plastome tree. In Liliales, nuclear data place Liliaceae sister to Smilacaceae, and Melanthiaceae are placed sister to all other Liliales except Campynemataceae. Finally, in Alismatales, nuclear data strongly place Tofieldiaceae, rather than Araceae, as sister to all the other families, providing an alternative resolution of what has been the most problematic node to resolve using plastid data, outside of those involving achlorophyllous mycoheterotrophs. As seen in numerous prior studies, orders Acorales and Alismatales are placed as successive sister lineages to all other extant monocots.
Only 21.2% of BUSCO genes were demonstrably single-copy, yet phylogenomic inferences based on BUSCO and CSC genes did not differ, and overall functional annotations of the two sets were very similar. Our analyses also reveal significant gene tree-species tree discordance despite high support values, as expected given incomplete lineage sorting (ILS) related to rapid diversification. Our study advances understanding of monocot relationships and the robustness of phylogenetic inferences based on large numbers of nuclear single-copy genes that can be obtained from transcriptomes and genomes.
3
Access to RNA-sequencing data from 1,173 plant species: The 1000 Plant transcriptomes initiative (1KP). Gigascience 2019; 8:giz126. [PMID: 31644802 PMCID: PMC6808545 DOI: 10.1093/gigascience/giz126] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 06/27/2019] [Revised: 08/08/2019] [Accepted: 09/28/2019] [Indexed: 11/12/2022]
Abstract
BACKGROUND The 1000 Plant transcriptomes initiative (1KP) explored genetic diversity by sequencing RNA from 1,342 samples representing 1,173 species of green plants (Viridiplantae). FINDINGS This data release accompanies the initiative's final/capstone publication on a set of 3 analyses inferring species trees, whole genome duplications, and gene family expansions. These and previous analyses are based on de novo transcriptome assemblies and related gene predictions. Here, we assess their data and assembly qualities and explain how we detected potential contaminations. CONCLUSIONS These data will be useful to plant and/or evolutionary scientists with interests in particular gene families, either across the green plant tree of life or in more focused lineages.
4
Shifts in gene expression profiles are associated with weak and strong Crassulacean acid metabolism. American Journal of Botany 2018; 105:587-601. [PMID: 29746718 DOI: 10.1002/ajb2.1017] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 06/01/2017] [Accepted: 10/19/2017] [Indexed: 06/08/2023]
Abstract
PREMISE OF THE STUDY The relative ease of high throughput sequencing is facilitating comprehensive phylogenomic and gene expression studies, even for nonmodel groups. To date, however, these two approaches have not been merged; while phylogenomic methods might use transcriptome sequences to resolve relationships, assessment of gene expression patterns in a phylogenetic context is less common. Here we analyzed both carbon assimilation and gene expression patterns of closely related species within the Agavoideae (Asparagaceae) to elucidate changes in gene expression across weak and strong phenotypes for Crassulacean acid metabolism (CAM). METHODS Gene expression patterns were compared across four genera: Agave (CAM), which is paraphyletic with Polianthes (weak CAM) and Manfreda (CAM), and Beschorneria (weak CAM). RNA-sequencing was paired with measures of gas exchange and titratable acidity. Climate niche space was compared across the four lineages to examine abiotic factors and their correlation to CAM. KEY RESULTS Expression of homologous genes showed both shared and variable patterns in weak and strong CAM species. Network analysis highlights that despite shared expression patterns, highly connected genes differ between weak and strong CAM, implicating shifts in regulatory gene function as key for the evolution of CAM. Variation in carbohydrate metabolism between weak and strong CAM supports the importance of sugar turnovers for CAM physiology. CONCLUSIONS Integration of phylogenetics and RNA-sequencing provides a powerful tool to study the evolution of CAM photosynthesis across closely related but photosynthetically variable species. Our findings regarding shared or shifted gene expression and regulation of CAM via carbohydrate metabolism have important implications for efforts to engineer the CAM pathway into C3 food and biofuel crops.
5
Abstract
Comparisons of flowering plant genomes reveal multiple rounds of ancient polyploidy characterized by large intragenomic syntenic blocks. Three such whole-genome duplication (WGD) events, designated as rho (ρ), sigma (σ), and tau (τ), have been identified in the genomes of cereal grasses. Precise dating of these WGD events is necessary to investigate how they have influenced diversification rates, evolutionary innovations, and genomic characteristics such as the GC profile of protein-coding sequences. The timing of these events has remained uncertain due to the paucity of monocot genome sequence data outside the grass family (Poaceae). Phylogenomic analysis of protein-coding genes from sequenced genomes and transcriptome assemblies from 35 species, including representatives of all families within the Poales, has resolved the timing of rho and sigma relative to speciation events and placed tau prior to divergence of Asparagales and the commelinids but after divergence with eudicots. Examination of gene family phylogenies indicates that rho occurred just prior to the diversification of Poaceae and sigma occurred before early diversification of Poales lineages but after the Poales-commelinid split. Additional lineage-specific WGD events were identified on the basis of the transcriptome data. Gene families exhibiting high GC content are underrepresented among those with duplicate genes that persisted following these genome duplications. However, genome duplications had little overall influence on lineage-specific changes in the GC content of coding genes. Improved resolution of the timing of WGD events in monocot history provides evidence for the influence of polyploidization on functional evolution and species diversification.
6
Abstract
Reconstructing the origin and evolution of land plants and their algal relatives is a fundamental problem in plant phylogenetics, and is essential for understanding how critical adaptations arose, including the embryo, vascular tissue, seeds, and flowers. Despite advances in molecular systematics, some hypotheses of relationships remain weakly resolved. Inferring deep phylogenies with bouts of rapid diversification can be problematic; however, genome-scale data should significantly increase the number of informative characters for analyses. Recent phylogenomic reconstructions focused on the major divergences of plants have resulted in promising but inconsistent results. One limitation is sparse taxon sampling, likely resulting from the difficulty and cost of data generation. To address this limitation, transcriptome data for 92 streptophyte taxa were generated and analyzed along with 11 published plant genome sequences. Phylogenetic reconstructions were conducted using up to 852 nuclear genes and 1,701,170 aligned sites. Sixty-nine analyses were performed to test the robustness of phylogenetic inferences to permutations of the data matrix or to phylogenetic method, including supermatrix, supertree, and coalescent-based approaches, maximum-likelihood and Bayesian methods, partitioned and unpartitioned analyses, and amino acid versus DNA alignments. Among other results, we find robust support for a sister-group relationship between land plants and one group of streptophyte green algae, the Zygnematophyceae. Strong and robust support for a clade comprising liverworts and mosses is inconsistent with a widely accepted view of early land plant evolution, and suggests that phylogenetic hypotheses used to understand the evolution of fundamental plant traits should be reevaluated.
7
Abstract
The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.
8
Characterization of the basal angiosperm Aristolochia fimbriata: a potential experimental system for genetic studies. BMC Plant Biology 2013; 13:13. [PMID: 23347749 PMCID: PMC3621149 DOI: 10.1186/1471-2229-13-13] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 05/24/2012] [Accepted: 12/12/2012] [Indexed: 05/15/2023]
Abstract
BACKGROUND Previous studies in basal angiosperms have provided insight into the diversity within the angiosperm lineage and helped to polarize analyses of flowering plant evolution. However, there is still not an experimental system for genetic studies among basal angiosperms to facilitate comparative studies and functional investigation. It would be desirable to identify a basal angiosperm experimental system that possesses many of the features found in existing plant model systems (e.g., Arabidopsis and Oryza). RESULTS We have considered all basal angiosperm families for general characteristics important for experimental systems, including availability to the scientific community, growth habit, and membership in a large basal angiosperm group that displays a wide spectrum of phenotypic diversity. Most basal angiosperms are woody or aquatic, thus are not well-suited for large scale cultivation, and were excluded. We further investigated members of Aristolochiaceae for ease of culture, life cycle, genome size, and chromosome number. We demonstrated self-compatibility for Aristolochia elegans and A. fimbriata, and transformation with a GFP reporter construct for Saruma henryi and A. fimbriata. Furthermore, A. fimbriata was easily cultivated with a life cycle of just three months, could be regenerated in a tissue culture system, and had one of the smallest genomes among basal angiosperms. An extensive multi-tissue EST dataset was produced for A. fimbriata that includes over 3.8 million 454 sequence reads. CONCLUSIONS Aristolochia fimbriata has numerous features that facilitate genetic studies and is suggested as a potential model system for use with a wide variety of technologies. Emerging genetic and genomic tools for A. fimbriata and closely related species can aid the investigation of floral biology, developmental genetics, biochemical pathways important in plant-insect interactions as well as human health, and various other features present in early angiosperms.
9
The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 2012; 488:213-7. [PMID: 22801500 DOI: 10.1038/nature11241] [Citation(s) in RCA: 603] [Impact Index Per Article: 50.3] [Received: 02/10/2012] [Accepted: 05/18/2012] [Indexed: 01/17/2023]
Abstract
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.
10
Phylogenomic analysis of transcriptome data elucidates co-occurrence of a paleopolyploid event and the origin of bimodal karyotypes in Agavoideae (Asparagaceae). American Journal of Botany 2012; 99:397-406. [PMID: 22301890 DOI: 10.3732/ajb.1100537] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Indexed: 05/27/2023]
Abstract
PREMISE OF THE STUDY The stability of the bimodal karyotype found in Agave and closely related species has long interested botanists. The origin of the bimodal karyotype has been attributed to allopolyploidy, but this hypothesis has not been tested. Next-generation transcriptome sequence data were used to test whether a paleopolyploid event occurred on the same branch of the Agavoideae phylogenetic tree as the origin of the Yucca-Agave bimodal karyotype. METHODS Illumina RNA-seq data were generated for phylogenetically strategic species in Agavoideae. Paleopolyploidy was inferred in analyses of frequency plots for synonymous substitutions per synonymous site (K(s)) between Hosta, Agave, and Chlorophytum paralogous and orthologous gene pairs. Phylogenies of gene families including paralogous genes for these species and outgroup species were estimated to place inferred paleopolyploid events on a species tree. KEY RESULTS K(s) frequency plots suggested paleopolyploid events in the history of the genera Agave, Hosta, and Chlorophytum. Phylogenetic analyses of gene families estimated from transcriptome data revealed two polyploid events: one predating the last common ancestor of Agave and Hosta and one within the lineage leading to Chlorophytum. CONCLUSIONS We found that polyploidy and the origin of the Yucca-Agave bimodal karyotype co-occur on the same lineage consistent with the hypothesis that the bimodal karyotype is a consequence of allopolyploidy. We discuss this and alternative mechanisms for the formation of the Yucca-Agave bimodal karyotype. More generally, we illustrate how the use of next-generation sequencing technology is a cost-efficient means for assessing genome evolution in nonmodel species.
11
A genome triplication associated with early diversification of the core eudicots. Genome Biol 2012; 13:R3. [PMID: 22280555 PMCID: PMC3334584 DOI: 10.1186/gb-2012-13-1-r3] [Citation(s) in RCA: 286] [Impact Index Per Article: 23.8] [Received: 11/03/2011] [Accepted: 01/26/2012] [Indexed: 11/23/2022]
Abstract
Background Although it is agreed that a major polyploidy event, gamma, occurred within the eudicots, the phylogenetic placement of the event remains unclear. Results To determine when this polyploidization occurred relative to speciation events in angiosperm history, we employed a phylogenomic approach to investigate the timing of gene set duplications located on syntenic gamma blocks. We populated 769 putative gene families with large sets of homologs obtained from public transcriptomes of basal angiosperms, magnoliids, asterids, and more than 91.8 gigabases of new next-generation transcriptome sequences of non-grass monocots and basal eudicots. The overwhelming majority (95%) of well-resolved gamma duplications was placed before the separation of rosids and asterids and after the split of monocots and eudicots, providing strong evidence that the gamma polyploidy event occurred early in eudicot evolution. Further, the majority of gene duplications was placed after the divergence of the Ranunculales and core eudicots, indicating that the gamma appears to be restricted to core eudicots. Molecular dating estimates indicate that the duplication events were intensely concentrated around 117 million years ago. Conclusions The rapid radiation of core eudicot lineages that gave rise to nearly 75% of angiosperm species appears to have occurred coincidentally or shortly following the gamma triplication event. Reconciliation of gene trees with a species phylogeny can elucidate the timing of major events in genome evolution, even when genome sequences are only available for a subset of species represented in the gene trees. Comprehensive transcriptome datasets are valuable complements to genome sequences for high-resolution phylogenomic analysis.
12
A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure. Genome Biol 2011; 12:R48. [PMID: 21619600 PMCID: PMC3219971 DOI: 10.1186/gb-2011-12-5-r48] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 11/21/2010] [Revised: 05/19/2011] [Accepted: 05/27/2011] [Indexed: 01/19/2023]
Abstract
Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.
13
Ancestral polyploidy in seed plants and angiosperms. Nature 2011; 473:97-100. [PMID: 21478875 DOI: 10.1038/nature09916] [Citation(s) in RCA: 1300] [Impact Index Per Article: 100.0] [Received: 08/29/2010] [Accepted: 02/10/2011] [Indexed: 11/09/2022]
Abstract
Whole-genome duplication (WGD), or polyploidy, followed by gene loss and diploidization has long been recognized as an important evolutionary force in animals, fungi and other organisms, especially plants. The success of angiosperms has been attributed, in part, to innovations associated with gene or whole-genome duplications, but evidence for proposed ancient genome duplications pre-dating the divergence of monocots and eudicots remains equivocal in analyses of conserved gene order. Here we use comprehensive phylogenomic analyses of sequenced plant genomes and more than 12.6 million new expressed-sequence-tag sequences from phylogenetically pivotal lineages to elucidate two groups of ancient gene duplications: one in the common ancestor of extant seed plants and the other in the common ancestor of extant angiosperms. Gene duplication events were intensely concentrated around 319 and 192 million years ago, implicating two WGDs in ancestral lineages shortly before the diversification of extant seed plants and extant angiosperms, respectively. Significantly, these ancestral WGDs resulted in the diversification of regulatory genes important to seed and flower development, suggesting that they were involved in major innovations that ultimately contributed to the rise and eventual dominance of seed plants and angiosperms.