1
Hellmuth M, Stadler PF. The Theory of Gene Family Histories. Methods Mol Biol 2024; 2802:1-32. [PMID: 38819554 DOI: 10.1007/978-1-0716-3838-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Most genes are part of larger families of evolutionarily related genes. The history of gene families typically involves duplications and losses of genes as well as horizontal transfers into other organisms. The reconstruction of detailed gene family histories, i.e., the precise dating of evolutionary events relative to the phylogenetic tree of the underlying species, has remained a challenging topic despite its importance as a basis for detailed investigations into adaptation and functional evolution of individual members of the gene family. The identification of orthologs, moreover, is a particularly important subproblem of the more general setting considered here. In the last few years, an extensive body of mathematical results has appeared that tightly links orthology, a formal notion of best matches among genes, and horizontal gene transfer. The purpose of this chapter is to broadly outline some of the key mathematical insights and to discuss their implications for practical applications. In particular, we focus on tree-free methods, i.e., methods to infer orthology or horizontal gene transfer as well as gene trees, species trees, and reconciliations between them without using a priori knowledge of the underlying trees or statistical models for the inference of phylogenetic trees. Instead, the initial step aims to extract binary relations among genes.
Affiliation(s)
- Marc Hellmuth
- Department of Mathematics, Faculty of Science, Stockholm University, Stockholm, Sweden
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, Leipzig University, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Universidad Nacional de Colombia, Bogotá, Colombia.
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.
- Santa Fe Institute, Santa Fe, NM, USA.
2
Gao H, Suo X, Zhao L, Ma X, Cheng R, Wang G, Zhang H. Molecular evolution, diversification, and expression assessment of MADS gene family in Setaria italica, Setaria viridis, and Panicum virgatum. Plant Cell Reports 2023; 42:1003-1024. [PMID: 37012438 DOI: 10.1007/s00299-023-03009-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/03/2023] [Accepted: 03/20/2023] [Indexed: 05/12/2023]
Abstract
KEY MESSAGE: This paper sheds light on the evolution and expression patterns of MADS genes in Setaria and Panicum virgatum. SiMADS51 and SiMADS64 may be involved in the ABA-dependent pathway of drought response. The MADS gene family is a key regulatory factor family that controls growth, reproduction, and response to abiotic stress in plants. However, the molecular evolution of this family is rarely reported. Here, a total of 265 MADS genes were identified in Setaria italica (foxtail millet), Setaria viridis (green millet), and Panicum virgatum (switchgrass) and analyzed by bioinformatics, including physicochemical characteristics, subcellular localization, chromosomal position and duplication, motif distribution, genetic structure, genetic evolvement, and expression patterns. Phylogenetic analysis was used to categorize these genes into M and MIKC types. The distribution of motifs and gene structure were similar for the corresponding types. According to a collinearity study, the MADS genes have been mostly conserved during evolution. The principal cause of their expansion is segmental duplication. However, the MADS gene family tends to shrink in foxtail millet, green millet, and switchgrass. The MADS genes were subjected to purifying selection, but several positive selection sites were also identified in the three species. Most of the promoters of MADS genes contain cis-elements related to stress and hormonal response. RNA-seq and quantitative real-time PCR (qRT-PCR) analyses were also performed. According to the qRT-PCR analysis, SiMADS gene expression levels changed considerably in response to various treatments. This sheds fresh light on the evolution and expansion of the MADS family in foxtail millet, green millet, and switchgrass, and lays the foundation for further research on their functions.
Affiliation(s)
- Hui Gao
- Hebei Key Laboratory of Crop Stress Biology (in Preparation), Department of Life Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, 066600, Hebei, China
- Institute of Millet Crops, Hebei Academy of Agriculture and Forestry Sciences/Key Laboratory of Genetic Improvement and Utilization for Featured Coarse Cereals (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs/National Foxtail Millet Improvement Center/Key Laboratory of Minor Cereal Crops of Hebei Province, Hebei Academy of Agriculture and Forestry Sciences, Shijiazhuang, China
- Xiaoman Suo
- Hebei Key Laboratory of Crop Stress Biology (in Preparation), Department of Life Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, 066600, Hebei, China
- Ling Zhao
- Institute of Millet Crops, Hebei Academy of Agriculture and Forestry Sciences/Key Laboratory of Genetic Improvement and Utilization for Featured Coarse Cereals (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs/National Foxtail Millet Improvement Center/Key Laboratory of Minor Cereal Crops of Hebei Province, Hebei Academy of Agriculture and Forestry Sciences, Shijiazhuang, China
- Xinlei Ma
- Hebei Key Laboratory of Crop Stress Biology (in Preparation), Department of Life Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, 066600, Hebei, China
- Ruhong Cheng
- Institute of Millet Crops, Hebei Academy of Agriculture and Forestry Sciences/Key Laboratory of Genetic Improvement and Utilization for Featured Coarse Cereals (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs/National Foxtail Millet Improvement Center/Key Laboratory of Minor Cereal Crops of Hebei Province, Hebei Academy of Agriculture and Forestry Sciences, Shijiazhuang, China.
- Genping Wang
- Institute of Millet Crops, Hebei Academy of Agriculture and Forestry Sciences/Key Laboratory of Genetic Improvement and Utilization for Featured Coarse Cereals (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs/National Foxtail Millet Improvement Center/Key Laboratory of Minor Cereal Crops of Hebei Province, Hebei Academy of Agriculture and Forestry Sciences, Shijiazhuang, China.
- Haoshan Zhang
- Institute of Millet Crops, Hebei Academy of Agriculture and Forestry Sciences/Key Laboratory of Genetic Improvement and Utilization for Featured Coarse Cereals (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs/National Foxtail Millet Improvement Center/Key Laboratory of Minor Cereal Crops of Hebei Province, Hebei Academy of Agriculture and Forestry Sciences, Shijiazhuang, China.
- Chinese Academy of Agricultural Sciences Institute of Crop Sciences, Beijing, 100081, China.
3
Montes-Rodríguez IM, Cadilla CL, López-Garriga J, González-Méndez R. Bioinformatic Characterization and Molecular Evolution of the Lucina pectinata Hemoglobins. Genes (Basel) 2022; 13:2041. [PMID: 36360278 PMCID: PMC9690805 DOI: 10.3390/genes13112041] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/12/2022] [Revised: 10/24/2022] [Accepted: 10/26/2022] [Indexed: 10/01/2023] Open
Abstract
(1) Introduction: Lucina pectinata is a clam found in sulfide-rich mud environments that has three hemoglobins believed to be responsible for the transport of hydrogen sulfide (HbILp) and oxygen (HbIILp and HbIIILp) to chemoautotrophic endosymbionts. The physiological roles and evolution of these globins in sulfide-rich environments are not well understood. (2) Methods: We performed bioinformatic and phylogenetic analyses with 32 homologous mollusk globin sequences. Phylogenetics suggests a first gene duplication resulting in sulfide binding and oxygen binding genes. A more recent gene duplication gave rise to the two oxygen-binding hemoglobins. Multidimensional scaling analysis of the sequence space shows evolutionary drift of HbIILp and HbIIILp, while HbILp was closer to the Calyptogena hemoglobins. Further corroboration is seen by conservation in the coding region of hemoglobins from L. pectinata compared to those from Calyptogena. (3) Conclusions: Presence of glutamine in position E7 in organisms living in sulfide-rich environments can be considered an adaptation to prevent loss of protein function. In HbILp a substitution of phenylalanine in position B10 is accountable for its unique reactivity towards H2S. It appears that HbILp has been changing over time, apparently not subject to functional constraints of binding oxygen, and acquired a unique function for a specialized environment.
Affiliation(s)
- Ingrid M. Montes-Rodríguez
- Cancer Biology Division, PROMIC, Comprehensive Cancer Center of the University of Puerto Rico, San Juan, PR 00936-3027, USA
- Carmen L. Cadilla
- Department of Biochemistry, School of Medicine, University of Puerto Rico-Medical Sciences Campus, San Juan, PR 00936-5067, USA
- Juan López-Garriga
- Department of Chemistry, Faculty of Arts and Sciences, University of Puerto Rico—Mayagüez Campus, Mayagüez, PR 00681-9000, USA
- Ricardo González-Méndez
- Department of Radiological Sciences, School of Medicine, University of Puerto Rico-Medical Sciences Campus, San Juan, PR 00936-5067, USA
4
Rei Liao JY, Friso G, Forsythe ES, Michel EJS, Williams AM, Boguraev SS, Ponnala L, Sloan DB, van Wijk KJ. Proteomics, phylogenetics, and co-expression analyses indicate novel interactions in the plastid CLP chaperone-protease system. J Biol Chem 2022; 298:101609. [PMID: 35065075 PMCID: PMC8889267 DOI: 10.1016/j.jbc.2022.101609] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/28/2021] [Revised: 01/13/2022] [Accepted: 01/16/2022] [Indexed: 12/20/2022] Open
Abstract
The chloroplast chaperone CLPC1 unfolds and delivers substrates to the stromal CLPPRT protease complex for degradation. We previously used an in vivo trapping approach to identify interactors with CLPC1 in Arabidopsis thaliana by expressing a STREPII-tagged copy of CLPC1 mutated in its Walker B domains (CLPC1-TRAP) followed by affinity purification and mass spectrometry. To create a larger pool of candidate substrates, adaptors, or regulators, we carried out a far more sensitive and comprehensive in vivo protein trapping analysis. We identified 59 highly enriched CLPC1 protein interactors, in particular proteins belonging to families of unknown functions (DUF760, DUF179, DUF3143, UVR-DUF151, HugZ/DUF2470), as well as the UVR domain proteins EXE1 and EXE2 implicated in singlet oxygen damage and signaling. Phylogenetic and functional domain analyses identified other members of these families that appear to localize (nearly) exclusively to plastids. In addition, several of these DUF proteins are of very low abundance as determined through the Arabidopsis PeptideAtlas http://www.peptideatlas.org/builds/arabidopsis/ showing that enrichment in the CLPC1-TRAP was extremely selective. Evolutionary rate covariation indicated that the HugZ/DUF2470 family coevolved with the plastid CLP machinery suggesting functional and/or physical interactions. Finally, mRNA-based coexpression networks showed that all 12 CLP protease subunits tightly coexpressed as a single cluster with deep connections to DUF760-3. Coexpression modules for other trapped proteins suggested specific functions in biological processes, e.g., UVR2 and UVR3 were associated with extraplastidic degradation, whereas DUF760-6 is likely involved in senescence. This study provides a strong foundation for discovery of substrate selection by the chloroplast CLP protease system.
Affiliation(s)
- Jui-Yun Rei Liao
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York, USA
- Giulia Friso
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York, USA
- Evan S Forsythe
- Graduate Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, Colorado, USA
- Elena J S Michel
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York, USA
- Alissa M Williams
- Graduate Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, Colorado, USA
- Sasha S Boguraev
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York, USA
- Daniel B Sloan
- Graduate Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, Colorado, USA
- Klaas J van Wijk
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York, USA.
5
Du H, Ong YS, Knittel M, Mawhorter R, Liu N, Gross G, Tojo R, Libeskind-Hadas R, Wu YC. Multiple Optimal Reconciliations Under the Duplication-Loss-Coalescence Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2144-2156. [PMID: 31199267 DOI: 10.1109/tcbb.2019.2922337] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/09/2023]
Abstract
Gene trees can differ from species trees due to a variety of biological phenomena, the most prevalent being gene duplication, horizontal gene transfer, gene loss, and coalescence. To explain topological incongruence between the two trees, researchers apply reconciliation methods, often relying on a maximum parsimony framework. However, while several studies have investigated the space of maximum parsimony reconciliations (MPRs) under the duplication-loss and duplication-transfer-loss models, the space of MPRs under the duplication-loss-coalescence (DLC) model remains poorly understood. To address this problem, we present new algorithms for computing the size of MPR space under the DLC model and sampling from this space uniformly at random. Our algorithms are efficient in practice, with runtime polynomial in the size of the species and gene tree when the number of genes that map to any given species is fixed, thus proving that the MPR problem is fixed-parameter tractable. We have applied our methods to a biological data set of 16 fungal species to provide the first key insights in the space of MPRs under the DLC model. Our results show that a plurality reconciliation, and underlying events, are likely to be representative of MPR space.
6
Qi F, Zhao Y, Zhao N, Wang K, Li Z, Wang Y. Structural variation and evolution of chloroplast tRNAs in green algae. PeerJ 2021; 9:e11524. [PMID: 34131524 PMCID: PMC8176911 DOI: 10.7717/peerj.11524] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Accepted: 05/05/2021] [Indexed: 01/18/2023] Open
Abstract
As one of the important groups of the core Chlorophyta (Green algae), Chlorophyceae plays an important role in the evolution of plants. As a carrier of amino acids, tRNA plays an indispensable role in life activities. However, the structural variation of chloroplast tRNA and its evolutionary characteristics in Chlorophyta species have not been well studied. In this study, we analyzed the chloroplast genome tRNAs of 14 species in five categories in the green algae. We found that the number of chloroplast tRNAs of Chlorophyceae is maintained between 28-32, and the length of the gene sequence ranges from 71 nt to 91 nt. There are 23-27 anticodon types of tRNAs, and some tRNAs have missing anticodons that are compensated for by other types of anticodons of that tRNA. In addition, three tRNAs were found to contain introns in the anti-codon loop of the tRNA, but the analysis scored poorly and it is presumed that these introns are not functional. After multiple sequence alignment, the Ψ-loop is the most conserved structural unit in the tRNA secondary structure, containing mostly U-U-C-x-A-x-U conserved sequences. The number of transitions in tRNA is higher than the number of transversions. In the replication loss analysis, it was found that green algal chloroplast tRNAs may have undergone substantial gene loss during the course of evolution. Based on the constructed phylogenetic tree, mutations were found to accompany the evolution of the Green algae chloroplast tRNA. Moreover, chloroplast tRNAs of Chlorophyceae are consistent with those of monocotyledons and gymnosperms in terms of evolutionary patterns, sharing a common multi-phylogenetic pattern and rooted in a rich common ancestor. Sequence alignment and systematic analysis of tRNA in the chloroplast genome of Chlorophyceae clarified the characteristics and rules of tRNA changes, which will promote understanding of the evolutionary relationships of tRNAs and of the origin and evolution of chloroplasts.
Affiliation(s)
- Fangbing Qi
- State Key Laboratory of Biotechnology of Shannxi Province, Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Science, Northwest University, Xi’an, China
- Yajing Zhao
- State Key Laboratory of Biotechnology of Shannxi Province, Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Science, Northwest University, Xi’an, China
- Ningbo Zhao
- State Key Laboratory of Biotechnology of Shannxi Province, Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Science, Northwest University, Xi’an, China
- Kai Wang
- State Key Laboratory of Biotechnology of Shannxi Province, Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Science, Northwest University, Xi’an, China
- Zhonghu Li
- State Key Laboratory of Biotechnology of Shannxi Province, Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Science, Northwest University, Xi’an, China
- Yingjuan Wang
- State Key Laboratory of Biotechnology of Shannxi Province, Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Science, Northwest University, Xi’an, China
7
Forsythe ES, Williams AM, Sloan DB. Genome-wide signatures of plastid-nuclear coevolution point to repeated perturbations of plastid proteostasis systems across angiosperms. The Plant Cell 2021; 33:980-997. [PMID: 33764472 PMCID: PMC8226287 DOI: 10.1093/plcell/koab021] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/31/2020] [Accepted: 01/16/2021] [Indexed: 05/05/2023]
Abstract
Nuclear and plastid (chloroplast) genomes experience different mutation rates, levels of selection, and transmission modes, yet key cellular functions depend on their coordinated interactions. Functionally related proteins often show correlated changes in rates of sequence evolution across a phylogeny [evolutionary rate covariation (ERC)], offering a means to detect previously unidentified suites of coevolving and cofunctional genes. We performed phylogenomic analyses across angiosperm diversity, scanning the nuclear genome for genes that exhibit ERC with plastid genes. As expected, the strongest hits were highly enriched for genes encoding plastid-targeted proteins, providing evidence that cytonuclear interactions affect rates of molecular evolution at genome-wide scales. Many identified nuclear genes functioned in post-transcriptional regulation and the maintenance of protein homeostasis (proteostasis), including protein translation (in both the plastid and cytosol), import, quality control, and turnover. We also identified nuclear genes that exhibit strong signatures of coevolution with the plastid genome, but their encoded proteins lack organellar-targeting annotations, making them candidates for having previously undescribed roles in plastids. In sum, our genome-wide analyses reveal that plastid-nuclear coevolution extends beyond the intimate molecular interactions within chloroplast enzyme complexes and may be driven by frequent rewiring of the machinery responsible for maintenance of plastid proteostasis in angiosperms.
Affiliation(s)
- Evan S Forsythe
- Department of Biology, Colorado State University, Fort Collins, Colorado 80523, USA
- Alissa M Williams
- Department of Biology, Colorado State University, Fort Collins, Colorado 80523, USA
- Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, Colorado 80523, USA
8
Novel Structural Variation and Evolutionary Characteristics of Chloroplast tRNA in Gossypium Plants. Genes (Basel) 2021; 12:genes12060822. [PMID: 34071968 PMCID: PMC8228828 DOI: 10.3390/genes12060822] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/29/2021] [Revised: 05/16/2021] [Accepted: 05/24/2021] [Indexed: 11/16/2022] Open
Abstract
Cotton is one of the most important fiber and oil crops in the world. Chloroplast genomes harbor their own genetic materials and are considered to be highly conserved. Transfer RNAs (tRNAs) act as "bridges" in protein synthesis by carrying amino acids. Currently, the variation and evolutionary characteristics of tRNAs in the cotton chloroplast genome are poorly understood. Here, we analyzed the structural variation and evolution of chloroplast tRNA (cp tRNA) based on eight diploid and two allotetraploid cotton species. We also investigated the nucleotide evolution of chloroplast genomes in cotton species. We found that cp tRNAs in cotton encoded 36 or 37 tRNAs, and 28 or 29 anti-codon types with lengths ranging from 60 to 93 nucleotides. Cotton chloroplast tRNA sequences possessed specific conservation and, in particular, the Ψ-loop contained the conserved U-U-C-X3-U. The cp tRNAs of Gossypium L. contained introns, and cp tRNAIle contained the anti-codon (C-A-U), which was generally the anti-codon of tRNAMet. The transition and transversion analyses showed that cp tRNAs in cotton species were iso-acceptor specific and had undergone unequal rates of evolution. The intergenic region was more variable than coding regions, and non-synonymous mutations have been fixed in cotton cp genomes. On the other hand, phylogeny analyses indicated that cp tRNAs of cotton were derived from several inferred ancestors with greater gene duplications. This study provides new insights into the structural variation and evolution of chloroplast tRNAs in cotton plants. Our findings could contribute to understanding the detailed characteristics and evolutionary variation of the tRNA family.
9
Schaller D, Geiß M, Stadler PF, Hellmuth M. Complete Characterization of Incorrect Orthology Assignments in Best Match Graphs. J Math Biol 2021; 82:20. [PMID: 33606106 PMCID: PMC7894253 DOI: 10.1007/s00285-021-01564-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/04/2020] [Revised: 09/23/2020] [Accepted: 12/21/2020] [Indexed: 02/06/2023]
Abstract
Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graphs (BMGs) that cannot correspond to orthologs for any gene tree that explains the BMG. Moreover, we provide a polynomial-time algorithm to identify all u-fp orthology assignments in a BMG. Simulations show that at least [Formula: see text] of all incorrect orthology assignments can be detected in this manner. All results rely only on the structure of the BMGs and not on any a priori knowledge about underlying gene or species trees.
Affiliation(s)
- David Schaller
- Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, D-04103, Leipzig, Germany; Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107, Leipzig, Germany
- Manuela Geiß
- Software Competence Center Hagenberg GmbH, Softwarepark 21, A-4232, Hagenberg, Austria
- Peter F Stadler
- Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, D-04103, Leipzig, Germany; Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, D-04107, Leipzig, Germany; Inst. f. Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090, Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia; Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Marc Hellmuth
- Department of Mathematics, Faculty of Science, Stockholm University, SE 106 91, Stockholm, Sweden.
10
Fofanov MV, Prokopov DY, Kuhl H, Schartl M, Trifonov VA. Evolution of MicroRNA Biogenesis Genes in the Sterlet (Acipenser ruthenus) and Other Polyploid Vertebrates. Int J Mol Sci 2020; 21:E9562. [PMID: 33334059 PMCID: PMC7765534 DOI: 10.3390/ijms21249562] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/14/2020] [Revised: 12/09/2020] [Accepted: 12/14/2020] [Indexed: 01/14/2023] Open
Abstract
MicroRNAs play a crucial role in eukaryotic gene regulation. For a long time, only little was known about microRNA-based gene regulatory mechanisms in polyploid animal genomes due to difficulties of polyploid genome assembly. However, in recent years, several polyploid genomes of fish, amphibian, and even invertebrate species have been sequenced and assembled. Here we investigated several key microRNA-associated genes in the recently sequenced sterlet (Acipenser ruthenus) genome, whose lineage has undergone a whole genome duplication around 180 MYA. We show that two paralogs of drosha, dgcr8, xpo1, and xpo5 as well as most ago genes have been retained after the acipenserid-specific whole genome duplication, while ago1 and ago3 genes have lost one paralog. While most diploid vertebrates possess only a single copy of dicer1, we strikingly found four paralogs of this gene in the sterlet genome, derived from a tandem segmental duplication that occurred prior to the last whole genome duplication. ago1,3,4 and exportins1,5 look to be prone to additional segment duplications producing up to four-five paralog copies in ray-finned fishes. We demonstrate for the first time exon microsatellite amplification in the acipenserid drosha2 gene, resulting in a highly variable protein product, which may indicate sub- or neofunctionalization. Paralogous copies of most microRNA metabolism genes exhibit different expression profiles in various tissues and remain functional despite the rediploidization process. Subfunctionalization of microRNA processing gene paralogs may be beneficial for different pathways of microRNA metabolism. Genetic variability of microRNA processing genes may represent a substrate for natural selection, and, by increasing genetic plasticity, could facilitate adaptations to changing environments.
Affiliation(s)
- Mikhail V. Fofanov
- Institute of Molecular and Cellular Biology SB RAS, Lavrentiev Ave. 8/2, 630090 Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, Pirogova 2, 630090 Novosibirsk, Russia
- Dmitry Yu. Prokopov
- Institute of Molecular and Cellular Biology SB RAS, Lavrentiev Ave. 8/2, 630090 Novosibirsk, Russia
- Heiner Kuhl
- Leibniz-Institute of Freshwater Ecology and Inland Fisheries, Müggelseedamm 301 and 310, 12587 Berlin, Germany
- Manfred Schartl
- Developmental Biochemistry, Biocenter, University of Wuerzburg, Am Hubland, 97074 Wuerzburg, Germany
- Xiphophorus Genetic Stock Center, Texas State University, 601 University Drive, 419 Centennial Hall, San Marcos, TX 78666-4616, USA
- Vladimir A. Trifonov
- Institute of Molecular and Cellular Biology SB RAS, Lavrentiev Ave. 8/2, 630090 Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, Pirogova 2, 630090 Novosibirsk, Russia
11
Li Q, Scornavacca C, Galtier N, Chan YB. The Multilocus Multispecies Coalescent: A Flexible New Model of Gene Family Evolution. Syst Biol 2020; 70:822-837. [PMID: 33169795 DOI: 10.1093/sysbio/syaa084] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/12/2020] [Revised: 05/07/2020] [Accepted: 10/19/2020] [Indexed: 02/06/2023] Open
Abstract
Incomplete lineage sorting (ILS), the interaction between coalescence and speciation, can generate incongruence between gene trees and species trees, as can gene duplication (D), transfer (T), and loss (L). These processes are usually modeled independently, but in reality, ILS can affect gene copy number polymorphism, that is, interfere with DTL. This has been previously recognized, but not treated in a satisfactory way, mainly because DTL events are naturally modeled forward-in-time, while ILS is naturally modeled backward-in-time with the coalescent. Here, we consider the joint action of ILS and DTL on the gene tree/species tree problem in all its complexity. In particular, we show that the interaction between ILS and duplications/transfers (without losses) can result in patterns usually interpreted as resulting from gene loss, and that the realized rate of D, T, and L becomes nonhomogeneous in time when ILS is taken into account. We introduce algorithmic solutions to these problems. Our new model, the multilocus multispecies coalescent, which also accounts for any level of linkage between loci, generalizes the multispecies coalescent (MSC) model and offers a versatile, powerful framework for proper simulation, and inference of gene family evolution. [Gene duplication; gene loss; horizontal gene transfer; incomplete lineage sorting; multispecies coalescent; hemiplasy; recombination.].
Affiliation(s)
- Qiuyi Li
- School of Mathematics and Statistics / Melbourne Integrative Genomics, The University of Melbourne, Melbourne 3010, Australia
| | - Celine Scornavacca
- Institut des Sciences de l'Evolution, Université Montpellier, CNRS, IRD, EPHE, Montpellier, 34095, France
| | - Nicolas Galtier
- Institut des Sciences de l'Evolution, Université Montpellier, CNRS, IRD, EPHE, Montpellier, 34095, France
| | - Yao-Ban Chan
- School of Mathematics and Statistics / Melbourne Integrative Genomics, The University of Melbourne, Melbourne 3010, Australia
|
12
|
Wang B, Sugiyama S. Phylogenetic signal of host plants in the bacterial and fungal root microbiomes of cultivated angiosperms. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 104:522-531. [PMID: 32744366 DOI: 10.1111/tpj.14943] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/10/2019] [Accepted: 06/25/2020] [Indexed: 06/11/2023]
Abstract
Root microbiomes are established through selective recruitment by host plants from pools of potential partners. However, the assembly rules of root microbiomes remain unclear. To elucidate (i) the effects of host plant phylogeny on root microbiome assembly and (ii) which microbial groups affect differences in root microbiome assemblies, the structures of bacterial and fungal root microbiomes from 20 cultivated angiosperms were compared. Surface-sterilized seeds from each species were sown in identical soil, and DNA was extracted from the plant roots after 7-8 weeks. The bacterial (16S rRNA) and fungal (ITS) communities were then examined using Illumina MiSeq. The phylogenetic distances of host plants and assembly dissimilarities of bacterial microbiomes, but not of fungal ones, were significantly correlated, as were the topologies of the host plant phylogenetic tree and the community dissimilarity tree, thereby confirming the phylogenetic conservation of bacterial root microbiomes. Furthermore, host plant phylogeny mainly affected only a few specific bacterial lineages, including the Betaproteobacteria, Gammaproteobacteria, and Chloroflexi. Burkholderia (Betaproteobacteria) taxa were more abundant in monocots than in dicots, whereas Streptomyces (Actinobacteria) taxa were less abundant. These findings suggest that bacterial root microbiomes have significantly contributed to the functional divergence of angiosperms at higher taxonomic levels.
Affiliation(s)
- Boxi Wang
- Faculty of Agriculture and Life Science, Hirosaki University, Hirosaki, Aomori, Japan
- The United Graduate School of Agricultural Sciences, Iwate University, Morioka, Iwate, Japan
| | - Shuichi Sugiyama
- Faculty of Agriculture and Life Science, Hirosaki University, Hirosaki, Aomori, Japan
|
13
|
Himmel NJ, Gray TR, Cox DN. Phylogenetics Identifies Two Eumetazoan TRPM Clades and an Eighth TRP Family, TRP Soromelastatin (TRPS). Mol Biol Evol 2020; 37:2034-2044. [PMID: 32159767 PMCID: PMC7306681 DOI: 10.1093/molbev/msaa065] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022]
Abstract
Transient receptor potential melastatins (TRPMs) are most well known as cold and menthol sensors, but are in fact broadly critical for life, from ion homeostasis to reproduction. Yet, the evolutionary relationship between TRPM channels remains largely unresolved, particularly with respect to the placement of several highly divergent members. To characterize the evolution of TRPM and like channels, we performed a large-scale phylogenetic analysis of >1,300 TRPM-like sequences from 14 phyla (Annelida, Arthropoda, Brachiopoda, Chordata, Cnidaria, Echinodermata, Hemichordata, Mollusca, Nematoda, Nemertea, Phoronida, Priapulida, Tardigrada, and Xenacoelomorpha), including sequences from a variety of recently sequenced genomes that fill what would otherwise be substantial taxonomic gaps. These findings suggest: 1) the previously recognized TRPM family is in fact two distinct families, including canonical TRPM channels and an eighth major previously undescribed family of animal TRP channel, TRP soromelastatin; 2) two TRPM clades predate the last bilaterian-cnidarian ancestor; and 3) the vertebrate-centric trend of categorizing TRPM channels as 1-8 is inappropriate for most phyla, including other chordates.
Affiliation(s)
| | - Thomas R Gray
- Neuroscience Institute, Georgia State University, Atlanta, GA
| | - Daniel N Cox
- Neuroscience Institute, Georgia State University, Atlanta, GA
|
14
|
Delabre M, El-Mabrouk N, Huber KT, Lafond M, Moulton V, Noutahi E, Castellanos MS. Evolution through segmental duplications and losses: a Super-Reconciliation approach. Algorithms Mol Biol 2020; 15:12. [PMID: 32508979 PMCID: PMC7249433 DOI: 10.1186/s13015-020-00171-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/26/2019] [Accepted: 05/05/2020] [Indexed: 02/02/2023]
Abstract
The classical gene and species tree reconciliation, used to infer the history of gene gain and loss explaining the evolution of gene families, assumes an independent evolution for each family. While this assumption is reasonable for genes that are far apart in the genome, it is not appropriate for genes grouped into syntenic blocks, which are more plausibly the result of a concerted evolution. Here, we introduce the Super-Reconciliation problem which consists in inferring a history of segmental duplication and loss events (involving a set of neighboring genes) leading to a set of present-day syntenies from a single ancestral one. In other words, we extend the traditional Duplication-Loss reconciliation problem of a single gene tree, to a set of trees, accounting for segmental duplications and losses. Existence of a Super-Reconciliation depends on individual gene tree consistency. In addition, ignoring rearrangements implies that existence also depends on gene order consistency. We first show that the problem of reconstructing a most parsimonious Super-Reconciliation, if any, is NP-hard and give an exact exponential-time algorithm to solve it. Alternatively, we show that accounting for rearrangements in the evolutionary model, but still only minimizing segmental duplication and loss events, leads to an exact polynomial-time algorithm. We finally assess time efficiency of the former exponential time algorithm for the Duplication-Loss model on simulated datasets, and give a proof of concept on the opioid receptor genes.
|
15
|
Evolutionary relationships between the transcriptional repressors of the polyhydroxyalkanoate reserve storage system in prokaryotes: Conserved but phylogenetically heterogeneous. Gene 2020; 735:144397. [PMID: 31991161 DOI: 10.1016/j.gene.2020.144397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/12/2019] [Revised: 12/19/2019] [Accepted: 01/23/2020] [Indexed: 11/23/2022]
Abstract
Bacteria and archaea accumulate cytoplasmic polyhydroxyalkanoate (PHA) granules under nutrient-limited conditions with excess carbon. The transcriptional regulatory (TR) proteins found on the surface of PHA granules act as repressors as well as activators for the expression of major surface proteins called phasins. Until now, detailed information on the evolutionary relationships between these transcription regulators has not been available. Here, we conducted homology searches and analyzed information available for the domains and protein families of the TR proteins through phylogenetic studies. A total of 282 TR proteins were identified and further classified into four distinct subfamilies based upon the presence of conserved motifs: PHB_acc, TetR-like, AbrB-like, and PadR-like. Depending upon the particular family, the DNA-binding domains were located at either the N- or C-terminus. Our results indicated that TR proteins containing the PHB_acc domain are highly conserved within the bacteria, while other TR proteins are present only within archaea (AbrB-like), gram positive bacteria (PadR-like), or the Pseudomonas genera (TetR-like). The repression domains are charged, hydrophobic, and rich in leucine or glutamine. In phylogenetic analyses, many groups of TR proteins were clustered together according to identical domain architectures showing the independent origins of the TR proteins in the PHA reserve storage system. Further analyses revealed that the TR proteins have experienced multiple gene duplications across prokaryotes. Thus, this study investigated the evolutionary framework of TR proteins and has provided a comprehensive catalog of TR proteins for ongoing studies to characterize the functions of these proteins within diverse organisms.
|
16
|
Geiß M, Laffitte MEG, Sánchez AL, Valdivia DI, Hellmuth M, Rosales MH, Stadler PF. Best match graphs and reconciliation of gene trees with species trees. J Math Biol 2020; 80:1459-1495. [PMID: 32002659 PMCID: PMC7052050 DOI: 10.1007/s00285-020-01469-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/19/2019] [Revised: 01/08/2020] [Indexed: 11/19/2022]
Abstract
A wide variety of problems in computational biology, most notably the assessment of orthology, are solved with the help of reciprocal best matches. Using an evolutionary definition of best matches that captures the intuition behind the concept we clarify rigorously the relationships between reciprocal best matches, orthology, and evolutionary events under the assumption of duplication/loss scenarios. We show that the orthology graph is a subgraph of the reciprocal best match graph (RBMG). We furthermore give conditions under which an RBMG that is a cograph identifies the correct orthology relation. Using computer simulations we find that most false positive orthology assignments can be identified as so-called good quartets (and thus corrected) in the absence of horizontal transfer. Horizontal transfer, however, may also introduce false-negative orthology assignments.
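As a rough illustration of what a reciprocal best match is, the following minimal Python sketch computes best matches from a table of pairwise distances. This is a simplification: the abstract uses an evolutionary (tree-based) definition of best matches, whereas the distance table here is only a stand-in. The gene names, the `species_of` mapping, the distances, and both function names are hypothetical.

```python
from collections import defaultdict

def best_matches(genes, species_of, dist):
    """For each gene x, collect its closest genes in every other species.

    ASSUMPTION: closeness is read from a symmetric distance table, used here
    as a proxy for the evolutionary best-match definition.
    """
    bm = defaultdict(set)
    for x in genes:
        by_species = defaultdict(list)
        for y in genes:
            if species_of[y] != species_of[x]:
                by_species[species_of[y]].append(y)
        for candidates in by_species.values():
            d_min = min(dist[frozenset((x, y))] for y in candidates)
            bm[x].update(y for y in candidates if dist[frozenset((x, y))] == d_min)
    return bm

def reciprocal_best_matches(genes, species_of, dist):
    """Return the unordered pairs {x, y} that are best matches of each other."""
    bm = best_matches(genes, species_of, dist)
    return {frozenset((x, y)) for x in genes for y in bm[x] if x in bm[y]}

# Hypothetical toy data: one gene in species A, two paralogs in species B.
genes = ["a1", "b1", "b2"]
species_of = {"a1": "A", "b1": "B", "b2": "B"}
dist = {
    frozenset(("a1", "b1")): 1.0,
    frozenset(("a1", "b2")): 3.0,
    frozenset(("b1", "b2")): 2.0,
}
rbm = reciprocal_best_matches(genes, species_of, dist)
```

In this toy case `a1` is the best match of `b2`, but the best match of `a1` in species B is `b1`, so only the pair `a1`/`b1` is reciprocal.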
Affiliation(s)
- Manuela Geiß
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
| | - Marcos E González Laffitte
- CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230, Juriquilla, Querétaro, QRO, Mexico
| | - Alitzel López Sánchez
- CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230, Juriquilla, Querétaro, QRO, Mexico
| | - Dulce I Valdivia
- Centro de Ciencias Básicas, Universidad Autónoma de Aguascalientes, Av. Universidad 940, 20131, Aguascalientes, AGS, México
- Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230, Juriquilla, Querétaro, QRO, Mexico
| | - Marc Hellmuth
- Institute of Mathematics and Computer Science, University of Greifswald, Walther-Rathenau-Straße 47, 17487, Greifswald, Germany
- Center for Bioinformatics, Saarland University, Building E 2.1, P.O. Box 151150, 66041, Saarbrücken, Germany
| | - Maribel Hernández Rosales
- CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230, Juriquilla, Querétaro, QRO, Mexico
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Competence Center for Scalable Data Services and Solutions, Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
- Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, 04103, Leipzig, Germany
- Inst. f. Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090, Wien, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
|
17
|
Inferring Pareto-optimal reconciliations across multiple event costs under the duplication-loss-coalescence model. BMC Bioinformatics 2019; 20:639. [PMID: 31842732 PMCID: PMC6916210 DOI: 10.1186/s12859-019-3206-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/18/2022]
Abstract
BACKGROUND Reconciliation methods are widely used to explain incongruence between a gene tree and species tree. However, the common approach of inferring maximum parsimony reconciliations (MPRs) relies on user-defined costs for each type of event, which can be difficult to estimate. Prior work has explored the relationship between event costs and maximum parsimony reconciliations in the duplication-loss and duplication-transfer-loss models, but no studies have addressed this relationship in the more complicated duplication-loss-coalescence model. RESULTS We provide a fixed-parameter tractable algorithm for computing Pareto-optimal reconciliations and recording all events that arise in those reconciliations, along with their frequencies. We apply this method to a case study of 16 fungi to systematically characterize the complexity of MPR space across event costs and identify events supported across this space. CONCLUSION This work provides a new framework for studying the relationship between event costs and reconciliations that incorporates both macro-evolutionary events and population effects and is thus broadly applicable across eukaryotic species.
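To make the notion of Pareto optimality over event costs concrete, here is a minimal Python sketch. It is not the fixed-parameter tractable algorithm of the paper: it assumes each candidate reconciliation has already been summarized as a hypothetical tuple of event counts (duplications, losses, coalescence events) and simply filters out the dominated ones by brute force.

```python
def dominates(a, b):
    """True if count vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(vectors):
    """Return the non-dominated event-count vectors, sorted for readability."""
    unique = set(vectors)
    return sorted(v for v in unique if not any(dominates(u, v) for u in unique))

# Hypothetical (duplications, losses, coalescence events) counts.
candidates = [(2, 5, 0), (3, 3, 1), (4, 3, 1), (2, 6, 0), (1, 7, 2)]
front = pareto_front(candidates)
```

A vector on the front is optimal for at least some trade-off between event types, which is why such a set can be examined without first fixing user-defined event costs.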
|
18
|
Beichman AC, Koepfli KP, Li G, Murphy W, Dobrynin P, Kliver S, Tinker MT, Murray MJ, Johnson J, Lindblad-Toh K, Karlsson EK, Lohmueller KE, Wayne RK. Aquatic Adaptation and Depleted Diversity: A Deep Dive into the Genomes of the Sea Otter and Giant Otter. Mol Biol Evol 2019; 36:2631-2655. [PMID: 31212313 PMCID: PMC7967881 DOI: 10.1093/molbev/msz101] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/22/2022]
Abstract
Despite its recent invasion into the marine realm, the sea otter (Enhydra lutris) has evolved a suite of adaptations for life in cold coastal waters, including limb modifications and dense insulating fur. This uniquely dense coat led to the near-extinction of sea otters during the 18th-20th century fur trade and an extreme population bottleneck. We used the de novo genome of the southern sea otter (E. l. nereis) to reconstruct its evolutionary history, identify genes influencing aquatic adaptation, and detect signals of population bottlenecks. We compared the genome of the southern sea otter with the tropical freshwater-living giant otter (Pteronura brasiliensis) to assess common and divergent genomic trends between otter species, and with the closely related northern sea otter (E. l. kenyoni) to uncover population-level trends. We found signals of positive selection in genes related to aquatic adaptations, particularly limb development and polygenic selection on genes related to hair follicle development. We found extensive pseudogenization of olfactory receptor genes in both the sea otter and giant otter lineages, consistent with patterns of sensory gene loss in other aquatic mammals. At the population level, the southern sea otter and the northern sea otter showed extremely low genomic diversity, signals of recent inbreeding, and demographic histories marked by population declines. These declines may predate the fur trade and appear to have resulted in an increase in putatively deleterious variants that could impact the future recovery of the sea otter.
Affiliation(s)
- Annabel C Beichman
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
| | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Gang Li
- College of Life Science, Shaanxi Normal University, Xi’an, Shaanxi, China
| | - William Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX
| | - Pasha Dobrynin
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Sergei Kliver
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Martin T Tinker
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA
| | | | - Jeremy Johnson
- Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Kerstin Lindblad-Toh
- Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Elinor K Karlsson
- Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA
| | - Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
| | - Robert K Wayne
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
|
19
|
Hellmuth M, Huber KT, Moulton V. Reconciling event-labeled gene trees with MUL-trees and species networks. J Math Biol 2019; 79:1885-1925. [PMID: 31410552 DOI: 10.1007/s00285-019-01414-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/21/2018] [Revised: 05/08/2019] [Indexed: 11/30/2022]
Abstract
Phylogenomics commonly aims to construct evolutionary trees from genomic sequence information. One way to approach this problem is to first estimate event-labeled gene trees (i.e., rooted trees whose non-leaf vertices are labeled by speciation or gene duplication events), and to then look for a species tree which can be reconciled with this tree through a reconciliation map between the trees. In practice, however, it can happen that there is no such map from a given event-labeled tree to any species tree. An important situation where this might arise is where the species evolution is better represented by a network instead of a tree. In this paper, we therefore consider the problem of reconciling event-labeled trees with species networks. In particular, we prove that any event-labeled gene tree can be reconciled with some network and that, under certain mild assumptions on the gene tree, the network can even be assumed to be multi-arc free. To prove this result, we show that we can always reconcile the gene tree with some multi-labeled (MUL-)tree, which can then be "folded up" to produce the desired reconciliation and network. In addition, we study the interplay between reconciliation maps from event-labeled gene trees to MUL-trees and networks. Our results could be useful for understanding how genomes have evolved after undergoing complex evolutionary events such as polyploidy.
Affiliation(s)
- Marc Hellmuth
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | - Katharina T Huber
- School of Computing Sciences, University of East Anglia, Norwich, UK
| | - Vincent Moulton
- School of Computing Sciences, University of East Anglia, Norwich, UK
|
20
|
Van Iersel L, Janssen R, Jones M, Murakami Y, Zeh N. Polynomial-Time Algorithms for Phylogenetic Inference Problems involving duplication and reticulation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 17:14-26. [PMID: 31425045 DOI: 10.1109/tcbb.2019.2934957] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/10/2023]
Abstract
A common problem in phylogenetics is to try to infer a species phylogeny from gene trees. We consider different variants of this problem. The first variant, called Unrestricted Minimal Episodes Inference, aims at inferring a species tree based on a model with speciation and duplication where duplications are clustered in duplication episodes. The goal is to minimize the number of such episodes. The second variant, Parental Hybridization, aims at inferring a species network based on a model with speciation and reticulation. The goal is to minimize the number of reticulation events. It is a variant of the well-studied Hybridization Number problem with a more generous view on which gene trees are consistent with a given species network. We show that these seemingly different problems are in fact closely related and can, surprisingly, both be solved in polynomial time, using a structure we call "beaded trees". However, we also show that methods based on these problems have to be used with care because the optimal species phylogenies always have a restricted form. To mitigate this problem, we introduce a new variant of Unrestricted Minimal Episodes Inference that minimizes the duplication episode depth. We prove that this new variant of the problem can also be solved in polynomial time.
|
21
|
Diversity and evolution of chitin synthases in oomycetes (Straminipila: Oomycota). Mol Phylogenet Evol 2019; 139:106558. [PMID: 31288106 DOI: 10.1016/j.ympev.2019.106558] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/06/2018] [Revised: 07/05/2019] [Accepted: 07/05/2019] [Indexed: 12/24/2022]
Abstract
The oomycetes are filamentous eukaryotic microorganisms, distinct from true fungi, many of which act as crop or fish pathogens that cause devastating losses in agriculture and aquaculture. Chitin is present in all true fungi, but it occurs in only small amounts in some Saprolegniomycetes and it is absent in Peronosporomycetes. However, the growth of several oomycetes is severely impacted by competitive chitin synthase (CHS) inhibitors. Here, we shed light on the diversity, evolution and function of oomycete CHS proteins. We show by phylogenetic analysis of 93 putative CHSs from 48 highly diverse oomycetes, including the early diverging Eurychasma dicksonii, that all available oomycete genomes contain at least one putative CHS gene. All gene products contain conserved CHS motifs essential for enzymatic activity and form two Peronosporomycete-specific and six Saprolegniale-specific clades. Proteins of all clades, except one, contain an N-terminal microtubule interacting and trafficking (MIT) domain as predicted by protein domain databases or manual analysis, which is supported by homology modelling and comparison of conserved structural features from sequence logos. We identified at least three groups of CHSs conserved among all oomycete lineages and used phylogenetic reconciliation analysis to infer the dynamic evolution of CHSs in oomycetes. The evolutionary aspects of CHS diversity in modern-day oomycetes are discussed. In addition, we observed hyphal tip rupture in Phytophthora infestans upon treatment with the CHS inhibitor nikkomycin Z. Combining data on phylogeny, gene expression, and response to CHS inhibitors, we propose the association of different CHS clades with certain developmental stages.
|
22
|
Gamble T. Duplications in Corneous Beta Protein Genes and the Evolution of Gecko Adhesion. Integr Comp Biol 2019; 59:193-202. [DOI: 10.1093/icb/icz010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023]
Abstract
Corneous proteins are an important component of the tetrapod integument. Duplication and diversification of keratins and associated proteins are linked with the origin of most novel integumentary structures like mammalian hair, avian feathers, and scutes covering turtle shells. Accordingly, the loss of integumentary structures often coincides with the loss of genes encoding keratin and associated proteins. For example, many hair keratins in dolphins and whales have become pseudogenes. The adhesive setae of geckos and anoles are composed of both intermediate filament keratins (IF-keratins, formerly known as alpha-keratins) and corneous beta-proteins (CBPs, formerly known as beta-keratins) and recent whole genome assemblies of two gecko species and an anole uncovered duplications in seta-specific CBPs in each of these lineages. While anoles evolved adhesive toepads just once, there are two competing hypotheses about the origin(s) of digital adhesion in geckos involving either a single origin or multiple origins. Using data from three published gecko genomes, I examine CBP gene evolution in geckos and find support for a hypothesis where CBP gene duplications are associated with the repeated evolution of digital adhesion. Although these results are preliminary, I discuss how additional gecko genome assemblies, combined with phylogenies of keratin and associated protein genes and gene duplication models, can provide rigorous tests of several hypotheses related to gecko CBP evolution. This includes a taxon sampling strategy for sequencing and assembly of gecko genomes that could help resolve competing hypotheses surrounding the origin(s) of digital adhesion.
Affiliation(s)
- Tony Gamble
- Department of Biological Sciences, Marquette University, Milwaukee, WI 53201, USA
- Bell Museum of Natural History, University of Minnesota, Saint Paul, MN 55113, USA
- Milwaukee Public Museum, Milwaukee, WI 53233, USA
|
23
|
Ravindran A, Sunderrajan S, Pennathur G. Phylogenetic Studies on the Prodigiosin Biosynthetic Operon. Curr Microbiol 2019; 76:597-606. [DOI: 10.1007/s00284-019-01665-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/08/2018] [Accepted: 03/01/2019] [Indexed: 11/30/2022]
|
24
|
Gene tree species tree reconciliation with gene conversion. J Math Biol 2019; 78:1981-2014. [PMID: 30767052 DOI: 10.1007/s00285-019-01331-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/04/2017] [Revised: 10/03/2018] [Indexed: 01/19/2023]
Abstract
Gene tree/species tree reconciliation is a recent decisive progress in phylogenetic methods, accounting for the possible differences between gene histories and species histories. Reconciliation consists in explaining these differences by gene-scale events such as duplication, loss, transfer, which translates mathematically into a mapping between gene tree nodes and species tree nodes or branches. Gene conversion is a frequent and important evolutionary event, which results in the replacement of a gene by a copy of another from the same species and in the same gene tree. Including this event in reconciliation models has never been attempted because it introduces a dependency between lineages, and standard algorithms based on dynamic programming become ineffective. We propose here a novel mathematical framework including gene conversion as an evolutionary event in gene tree/species tree reconciliation. We describe a randomized algorithm that finds, in polynomial running time, a reconciliation minimizing the number of duplications, losses and conversions in the case when their weights are equal. We show that the space of optimal reconciliations includes an analog of the last common ancestor reconciliation, but is not limited to it. Our algorithm outputs any optimal reconciliation with a non-null probability. We argue that this study opens a research avenue on including gene conversion in reconciliation, and discuss its possible importance in biology.
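The mapping between gene tree nodes and species tree nodes mentioned above can be sketched with the classical last common ancestor (LCA) reconciliation. The Python sketch below covers only that classical case and counts duplications; it does not model losses, transfers, or gene conversion, and it is not the randomized algorithm of the paper. Trees are written as nested tuples, gene leaves are labeled directly by their species, and all names are hypothetical.

```python
def build_parents(tree, parent=None, parents=None):
    """Map every species-tree node (a nested tuple or a leaf name) to its parent."""
    if parents is None:
        parents = {}
    parents[tree] = parent
    if isinstance(tree, tuple):
        for child in tree:
            build_parents(child, tree, parents)
    return parents

def ancestors(node, parents):
    """List the node and all of its ancestors up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path

def lca(u, v, parents):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(u, parents))
    for w in ancestors(v, parents):
        if w in seen:
            return w
    raise ValueError("nodes are not in the same tree")

def lca_reconcile(gene_tree, parents):
    """Return (species node the gene node maps to, duplications in its subtree).

    A binary gene node is counted as a duplication when it maps to the same
    species node as one of its children.
    """
    if not isinstance(gene_tree, tuple):
        return gene_tree, 0
    left, right = gene_tree
    m_left, d_left = lca_reconcile(left, parents)
    m_right, d_right = lca_reconcile(right, parents)
    m = lca(m_left, m_right, parents)
    is_duplication = m in (m_left, m_right)
    return m, d_left + d_right + int(is_duplication)

# Hypothetical example: a duplication in the ancestor of species A and B.
species_tree = (("A", "B"), "C")
gene_tree = ((("A", "B"), ("A", "B")), "C")
root_image, n_dups = lca_reconcile(gene_tree, build_parents(species_tree))
```

Here the gene tree root maps to the species tree root, and the single inferred duplication sits on the ancestor of A and B.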
|
25
|
Auxier B, Dee J, Berbee ML, Momany M. Diversity of opisthokont septin proteins reveals structural constraints and conserved motifs. BMC Evol Biol 2019; 19:4. [PMID: 30616529 PMCID: PMC6323724 DOI: 10.1186/s12862-018-1297-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/01/2018] [Accepted: 11/19/2018] [Indexed: 01/22/2023]
Abstract
BACKGROUND Septins are cytoskeletal proteins important in cell division and in establishing and maintaining cell polarity. Although septins are found in various eukaryotes, septin genes had the richest history of duplication and diversification in the animals, fungi and protists that comprise opisthokonts. Opisthokont septin paralogs encode modular proteins that assemble into heteropolymeric higher order structures. The heteropolymers can create physical barriers to diffusion or serve as scaffolds organizing other morphogenetic proteins. How the paralogous septin modules interact to form heteropolymers is still unclear. Through comparative analyses, we hoped to clarify the evolutionary origin of septin diversity and to suggest which amino acid residues were responsible for subunit binding specificity. RESULTS Here we take advantage of newly sequenced genomes to reconcile septin gene trees with a species phylogeny from 22 animals, fungi and protists. Our phylogenetic analysis divided 120 septins representing the 22 taxa into seven clades (Groups) of paralogs. Suggesting that septin genes duplicated early in opisthokont evolution, animal and fungal lineages share septin Groups 1A, 4 and possibly also 1B and 2. Group 5 septins were present in fungi but not in animals and whether they were present in the opisthokont ancestor was unclear. Protein homology folding showed that previously identified conserved septin motifs were all located near interface regions between the adjacent septin monomers. We found specific interface residues associated with each septin Group that are candidates for providing subunit binding specificity. CONCLUSIONS This work reveals that duplication of septin genes began in an ancestral opisthokont more than a billion years ago and continued through the diversification of animals and fungi. 
Evidence for evolutionary conservation of ~ 49 interface residues will inform mutagenesis experiments and lead to improved understanding of the rules guiding septin heteropolymer formation and from there, to improved understanding of development of form in animals and fungi.
Affiliation(s)
- Benjamin Auxier
  - Department of Botany, University of British Columbia, Vancouver, Canada
  - Current address: Laboratory of Genetics, Wageningen University and Research, P.O. Box 16, 6700AA, Wageningen, The Netherlands
- Jaclyn Dee
  - Department of Botany, University of British Columbia, Vancouver, Canada
- Mary L. Berbee
  - Department of Botany, University of British Columbia, Vancouver, Canada
- Michelle Momany
  - Fungal Biology Group and Plant Biology Department, University of Georgia, Athens, USA
|
26
|
Romero V, Nakaoka H, Hosomichi K, Inoue I. High Order Formation and Evolution of Hornerin in Primates. Genome Biol Evol 2018; 10:3167-3175. [PMID: 30256937 PMCID: PMC6280949 DOI: 10.1093/gbe/evy208] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Accepted: 09/25/2018] [Indexed: 12/15/2022] Open
Abstract
Genomic duplication or loss can accelerate evolution because the number of repeats could affect molecular pathways and phenotypes. We have previously reported that the repeated region of filaggrin (FLG), a crucial component of the outer layers of mammalian skin, had high levels of nucleotide diversity with species-specific divergence and expansion and that it evolved under the birth-and-death model. We focused on hornerin (HRNR), a member of the same gene family that harbor similar tandem repeats as FLG, and examined the formation process of repeated regions and the evolutional model that best fit the HRNR repeated region in the crab-eating macaque (Macaca fascicularis), orangutan (Pongo abelii), gorilla (Gorilla gorilla), and chimpanzee (Pan troglodytes) and compared them with the human (Homo sapiens) sequence. Paar et al. (2011) and Takaishi et al. (2005) have different theories as to the formation of the repeated region of HRNR; both groups share the longest repeat length of 1,404 bp (quartic or longest unit), but they differed in the process. We identified the formation described by Paar et al. {[(“39 bp (primary) × 9” × 2 (secondary)) × 2 (tertiary)] × 5 (quartic)} to be conserved in all species except the crab-eating macaque. We detected high nucleotide diversities between the longest repeats, which fits the birth-and-death model. We concluded that the high order repeat formation of HRNR was conserved in primates except the crab-eating macaque. As previously identified in FLG, the longest repeats have high levels of nucleotide diversity, which could contribute to phenotypic differences between closely related species.
Affiliation(s)
- Vanessa Romero
  - Department of Genetics, School of Life Sciences, SOKENDAI (Graduate University for Advanced Studies), Mishima, Japan
  - Division of Human Genetics, National Institute of Genetics, Mishima, Japan
- Hirofumi Nakaoka
  - Department of Genetics, School of Life Sciences, SOKENDAI (Graduate University for Advanced Studies), Mishima, Japan
  - Division of Human Genetics, National Institute of Genetics, Mishima, Japan
- Kazuyoshi Hosomichi
  - Department of Bioinformatics and Genomics, Graduate School of Medical Sciences, Kanazawa University, Japan
- Ituro Inoue
  - Department of Genetics, School of Life Sciences, SOKENDAI (Graduate University for Advanced Studies), Mishima, Japan
  - Division of Human Genetics, National Institute of Genetics, Mishima, Japan
|
27
|
Dąbkowski D, Tabaszewski P, Górecki P. Minimizing the deep coalescence cost. J Bioinform Comput Biol 2018; 16:1840021. [DOI: 10.1142/s0219720018400218] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
Metagenomic studies identify the species present in an environmental sample usually by using procedures that match molecular sequences, e.g. genes, with the species taxonomy. Here, we first formulate the problem of gene-species matching in the parsimony framework using binary phylogenetic gene and species trees under the deep coalescence cost and the assumption that each gene is paired uniquely with one species. In particular, we solve the problem in the cases when one of the trees is a caterpillar. Next, we propose a dynamic programming algorithm, which solves the problem exactly, however, its time and space complexity is exponential. Next, we generalize the problem to include non-binary trees and show the solution for caterpillar trees. We then propose time and space-efficient heuristic algorithms for solving the gene-species matching problem for any input trees. Finally, we present the results of computational experiments on simulated and empirical datasets consisting of binary tree pairs.
Affiliation(s)
- Dawid Dąbkowski
  - Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Banacha 2, Warsaw 02-097, Poland
- Paweł Tabaszewski
  - Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Banacha 2, Warsaw 02-097, Poland
- Paweł Górecki
  - Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Banacha 2, Warsaw 02-097, Poland
|
28
|
Mykowiecka A, Szczesny P, Gorecki P. Inferring Gene-Species Assignments in the Presence of Horizontal Gene Transfer. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1571-1578. [PMID: 28541905 DOI: 10.1109/tcbb.2017.2707083] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/07/2023]
Abstract
BACKGROUND Microbial communities from environmental samples show great diversity, as bacteria respond quickly to changes in their ecosystems. To assess the scenario of the actual changes, metagenomics experiments aimed at sequencing genomic DNA from such samples are performed. These newly obtained sequences, together with already known ones, are used to infer phylogenetic trees assessing the taxonomic groups to which the species with these genes belong. Here, we propose the first approach to the gene-species assignment problem by using reconciliation with horizontal gene transfer. RESULTS We propose efficient algorithms that search for optimal gene-species mappings taking into account gene duplication, loss and transfer events under two tractable models of HGT reconciliation. CONCLUSIONS We calculate both the optimal cost and all possible optimal scenarios. Furthermore, as the number of optimal reconstructions can be large, we use a Monte-Carlo method for the inference of approximate distributions of gene-species assignments. We demonstrate the applicability on empirical and simulated datasets.
|
29
|
Abstract
This chapter covers the theory and practice of ortholog gene set computation. In the theoretical part we give detailed and formal descriptions of the relevant concepts. We also cover the topic of graph-based clustering as a tool to compute ortholog gene sets. In the second part we provide an overview of practical considerations intended for researchers who need to determine orthologous genes from a collection of annotated genomes, briefly describing some of the most popular programs and resources currently available for this task.
|
30
|
Investigation on the Evolutionary Relation of Diverse Polyhydroxyalkanoate Gene Clusters in Betaproteobacteria. J Mol Evol 2018; 86:470-483. [PMID: 30062554 DOI: 10.1007/s00239-018-9859-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/24/2017] [Accepted: 07/24/2018] [Indexed: 10/28/2022]
Abstract
Products of numerous genes (phaC, phaA, phaB, phaP, phaR, and phaZ) are involved in the synthesis and degradation processes of the ubiquitous prokaryotic polyhydroxyalkanoate (PHA) intracellular reserve storage system. In this study, we performed a bioinformatics analysis to identify PHA-related genes and proteins in the genome of 66 selected organisms (class: Betaproteobacteria) that occur in various habitats; besides, evolutionary trajectories of the PHA system are reported here. The identified PHA-related genes were organized into clusters, and the gene arrangement was highly diverse. The occurrence and distribution of PHA-related clusters revealed that a single cluster was primarily segmented into small gene groups among various genomes, which were further reorganized as novel clusters based on various functional genes. The individual phylogenies of gene and protein sequences supported that the clusters were assembled through the relocation of native orthologous genes that underwent insertion, deletion, and elongation events. Furthermore, the neighboring genes provided valuable evolutionary and functional cues regarding the conservation and maintenance of PHA-related genes in the genome. Overall, the aforementioned results strongly indicate the influence of horizontal gene transfer on the organization of PHA-related gene clusters. Therefore, our results reveal new insights into the organization, evolutionary history, and cluster conservation of the PHA-related gene inventories among Betaproteobacterial organisms.
|
31
|
Ciach MA, Muszewska A, Górecki P. Locus-aware decomposition of gene trees with respect to polytomous species trees. Algorithms Mol Biol 2018; 13:11. [PMID: 29881445 PMCID: PMC5985597 DOI: 10.1186/s13015-018-0128-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/16/2017] [Accepted: 05/11/2018] [Indexed: 12/29/2022] Open
Abstract
Background Horizontal gene transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them either rely on approximate, non-phylogenetic methods or on the tree reconciliation, which is computationally intensive and sensitive to parameter values. Results We investigate the locus tree inference problem as a possible alternative that combines the advantages of both approaches. We present several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification. Conclusions Our approach allows for faster comparisons of gene and species trees and improves known algorithms for duplication inference in the presence of polytomies in the species trees. We have implemented our algorithms in a software tool available at https://github.com/mciach/LocusTreeInference.
|
32
|
Curran DM, Gilleard JS, Wasmuth JD. MIPhy: identify and quantify rapidly evolving members of large gene families. PeerJ 2018; 6:e4873. [PMID: 29868279 PMCID: PMC5983006 DOI: 10.7717/peerj.4873] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/28/2018] [Accepted: 05/10/2018] [Indexed: 11/20/2022] Open
Abstract
After transitioning to a new environment, species often exhibit rapid phenotypic innovation. One of the fastest mechanisms for this is duplication followed by specialization of existing genes. When this happens to a member of a gene family, it tends to leave a detectable phylogenetic signature of lineage-specific expansions and contractions. These can be identified by analyzing the gene family across several species and identifying patterns of gene duplication and loss that do not correlate with the known relationships between those species. This signature, termed phylogenetic instability, has been previously linked to adaptations that change the way an organism samples and responds to its environment; conversely, low phylogenetic instability has been previously linked to proteins with endogenous functions. With the increase in genome-level data, there is a need to identify and quantify phylogenetic instability. Here, we present Minimizing Instability in Phylogenetics (MIPhy), a tool that solves this problem by quantifying the incongruence of a gene's evolutionary history. The motivation behind MIPhy was to produce a tool to aid in interpreting phylogenetic trees. It can predict which members of a gene family are under adaptive evolution, working only from a gene tree and the relationship between the species under consideration. While it does not conduct any estimation of positive selection (which is the typical indication of adaptive evolution), the results tend to agree. We demonstrate the usefulness of MIPhy by accurately predicting which members of the mammalian cytochrome P450 gene superfamily metabolize xenobiotics and which metabolize endogenous compounds. Our predictions correlate very well with known substrate specificities of the human enzymes. We also analyze the Caenorhabditis collagen gene family and use MIPhy to predict genes that produce an observable phenotype when knocked down in C. elegans, and show that our predictions correlate well with existing knowledge. The software can be downloaded and installed from https://github.com/dave-the-scientist/miphy and is also available as an online web tool at http://www.miphy.wasmuthlab.org.
Affiliation(s)
- David M. Curran
  - Department of Ecosystem and Public Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
- John S. Gilleard
  - Department of Comparative Biology and Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
- James D. Wasmuth
  - Department of Ecosystem and Public Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
|
33
|
Thomas GWC, Ather SH, Hahn MW. Gene-Tree Reconciliation with MUL-Trees to Resolve Polyploidy Events. Syst Biol 2018; 66:1007-1018. [PMID: 28419377 DOI: 10.1093/sysbio/syx044] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 08/18/2016] [Accepted: 03/30/2017] [Indexed: 11/13/2022] Open
Abstract
Polyploidy can have a huge impact on the evolution of species, and it is a common occurrence, especially in plants. The two types of polyploids, autopolyploids and allopolyploids, differ in the level of divergence between the genes that are brought together in the new polyploid lineage. Because allopolyploids are formed via hybridization, the homoeologous copies of genes within them are at least as divergent as orthologs in the parental species that came together to form them. This means that common methods for estimating the parental lineages of allopolyploidy events are not accurate, and can lead to incorrect inferences about the number of gene duplications and losses. Here, we have adapted an algorithm for topology-based gene-tree reconciliation to work with multi-labeled trees (MUL-trees). By definition, MUL-trees have some tips with identical labels, which makes them a natural representation of the genomes of polyploids. Using this new reconciliation algorithm we can: accurately place allopolyploidy events on a phylogeny, identify the parental lineages that hybridized to form allopolyploids, distinguish between allo-, auto-, and (in most cases) no polyploidy, and correctly count the number of duplications and losses in a set of gene trees. We validate our method using gene trees simulated with and without polyploidy, and revisit the history of polyploidy in data from the clades including both baker's yeast and bread wheat. Our re-analysis of the yeast data confirms the allopolyploid origin and parental lineages previously identified for this group. The method presented here should find wide use in the growing number of genomes from species with a history of polyploidy. [Polyploidy; reconciliation; whole-genome duplication.]
Affiliation(s)
- Gregg W C Thomas
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
- S Hussain Ather
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
|
34
|
Nøjgaard N, Geiß M, Merkle D, Stadler PF, Wieseke N, Hellmuth M. Time-consistent reconciliation maps and forbidden time travel. Algorithms Mol Biol 2018; 13:2. [PMID: 29441122 PMCID: PMC5800358 DOI: 10.1186/s13015-018-0121-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/21/2017] [Accepted: 01/20/2018] [Indexed: 12/04/2022] Open
Abstract
Background In the absence of horizontal gene transfer it is possible to reconstruct the history of gene families from empirically determined orthology relations, which are equivalent to event-labeled gene trees. Knowledge of the event labels considerably simplifies the problem of reconciling a gene tree T with a species tree S, relative to the reconciliation problem without prior knowledge of the event types. It is well-known that optimal reconciliations in the unlabeled case may violate time-consistency and thus are not biologically feasible. Here we investigate the mathematical structure of the event labeled reconciliation problem with horizontal transfer. Results We investigate the issue of time-consistency for the event-labeled version of the reconciliation problem, provide a convenient axiomatic framework, and derive a complete characterization of time-consistent reconciliations. This characterization depends on certain weak conditions on the event-labeled gene trees that reflect conditions under which evolutionary events are observable at least in principle. We give an $\mathcal{O}(|V(T)|\log(|V(S)|))$-time algorithm to decide whether a time-consistent reconciliation map exists. It does not require the construction of explicit timing maps, but relies entirely on the comparably easy task of checking whether a small auxiliary graph is acyclic. The algorithms are implemented in C++ using the boost graph library and are freely available at https://github.com/Nojgaard/tc-recon. Significance The combinatorial characterization of time consistency and thus biologically feasible reconciliation is an important step towards the inference of gene family histories with horizontal transfer from orthology data, i.e., without presupposed gene and species trees. The fast algorithm to decide time consistency is useful in a broader context because it constitutes an attractive component for all tools that address tree reconciliation problems.
|
35
|
Bayzid MS, Warnow T. Gene tree parsimony for incomplete gene trees: addressing true biological loss. Algorithms Mol Biol 2018; 13:1. [PMID: 29387142 PMCID: PMC5774205 DOI: 10.1186/s13015-017-0120-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/28/2017] [Accepted: 12/27/2017] [Indexed: 11/10/2022] Open
Abstract
Motivation Species tree estimation from gene trees can be complicated by gene duplication and loss, and “gene tree parsimony” (GTP) is one approach for estimating species trees from multiple gene trees. In its standard formulation, the objective is to find a species tree that minimizes the total number of gene duplications and losses with respect to the input set of gene trees. Although much is known about GTP, little is known about how to treat inputs containing some incomplete gene trees (i.e., gene trees lacking one or more of the species). Results We present new theory for GTP considering whether the incompleteness is due to gene birth and death (i.e., true biological loss) or taxon sampling, and present dynamic programming algorithms that can be used for an exact but exponential time solution for small numbers of taxa, or as a heuristic for larger numbers of taxa. We also prove that the “standard” calculations for duplications and losses exactly solve GTP when incompleteness results from taxon sampling, although they can be incorrect when incompleteness results from true biological loss. The software for the DP algorithm is freely available as open source code at https://github.com/smirarab/DynaDup.
|
36
|
Mohanta TK, Syed AS, Ameen F, Bae H. Novel Genomic and Evolutionary Perspective of Cyanobacterial tRNAs. Front Genet 2017; 8:200. [PMID: 29321793 PMCID: PMC5733544 DOI: 10.3389/fgene.2017.00200] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/12/2017] [Accepted: 11/21/2017] [Indexed: 11/30/2022] Open
Abstract
Transfer RNA (tRNA) plays a central role in protein synthesis and acts as an adaptor molecule between an mRNA and an amino acid. A tRNA has an L-shaped clover leaf-like structure and contains an acceptor arm, D-arm, D-loop, anti-codon arm, anti-codon loop, variable loop, Ψ-arm and Ψ-loop. All of these arms and loops are important in protein translation. Here, we aimed to delineate the genomic architecture of these arms and loops in cyanobacterial tRNA. Studies from tRNA sequences from 61 cyanobacterial species showed that, except for few tRNAs (tRNAAsn, tRNALeu, tRNAGln, and tRNAMet), all contained a G nucleotide at the 1st position in the acceptor arm. tRNALeu and tRNAMet did not contain any conserved nucleotides at the 1st position whereas tRNAAsn and tRNAGln contained a conserved U1 nucleotide. In several tRNA families, the variable region also contained conserved nucleotides. Except for tRNAMet and tRNAGlu, all other tRNAs contained a conserved A nucleotide at the 1st position in the D-loop. The Ψ-loop contained a conserved U1-U2-C3-x-A5-x-U7 sequence, except for tRNAGly, tRNAAla, tRNAVal, tRNAPhe, tRNAThr, and tRNAGln in which the U7 nucleotide was not conserved. However, in tRNAAsp, the U7 nucleotide was substituted with a C7 nucleotide. Additionally, tRNAArg, tRNAGly, and tRNALys of cyanobacteria contained a group I intron within the anti-codon loop region. Maximum composite likelihood study on the transition/transversion of cyanobacterial tRNA revealed that the rate of transition was higher than the rate of transversion. An evolutionary tree was constructed to understand the evolution of cyanobacterial tRNA and analyses revealed that cyanobacterial tRNA may have evolved polyphyletically with high rate of gene loss.
Affiliation(s)
- Tapan K Mohanta
  - School of Biotechnology, Yeungnam University, Gyeongsan, South Korea
- Asad S Syed
  - Department of Botany and Microbiology, College of Science, King Saud University, Riyadh, Saudi Arabia
- Fuad Ameen
  - Department of Botany and Microbiology, College of Science, King Saud University, Riyadh, Saudi Arabia
- Hanhong Bae
  - School of Biotechnology, Yeungnam University, Gyeongsan, South Korea
|
37
|
Inferring incomplete lineage sorting, duplications, transfers and losses with reconciliations. J Theor Biol 2017; 432:1-13. [DOI: 10.1016/j.jtbi.2017.08.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 03/10/2017] [Revised: 07/31/2017] [Accepted: 08/08/2017] [Indexed: 01/20/2023]
|
38
|
Hellmuth M. Biologically feasible gene trees, reconciliation maps and informative triples. Algorithms Mol Biol 2017; 12:23. [PMID: 28861118 PMCID: PMC5576477 DOI: 10.1186/s13015-017-0114-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/12/2017] [Accepted: 08/16/2017] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND The history of gene families, which are equivalent to event-labeled gene trees, can be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are biologically feasible, that is, if there is a possible true history that would explain a given gene tree. In practice, this problem is boiled down to finding a reconciliation map (also known as a DTL-scenario) between the event-labeled gene trees and a (possibly unknown) species tree. RESULTS In this contribution, we first characterize whether there is a valid reconciliation map for binary event-labeled gene trees T that contain speciation, duplication and horizontal gene transfer events and some unknown species tree S in terms of "informative" triples that are displayed in T and provide information on the topology of S. These informative triples are used to infer the unknown species tree S for T. We obtain a similar result for non-binary gene trees. To this end, however, the reconciliation map needs to be further restricted. We provide a polynomial-time algorithm to decide whether there is a species tree for a given event-labeled gene tree, and in the positive case, to construct the species tree and the respective (restricted) reconciliation map. However, informative triples as well as DTL-scenarios have their limitations when they are used to explain the biological feasibility of gene trees. While reconciliation maps imply biological feasibility, we show that the converse is not true in general. Moreover, we show that informative triples neither provide enough information to characterize "relaxed" DTL-scenarios nor non-restricted reconciliation maps for non-binary biologically feasible gene trees.
Affiliation(s)
- Marc Hellmuth
  - Institute of Mathematics and Computer Science, University of Greifswald, Walther-Rathenau-Strasse 47, 17487 Greifswald, Germany
  - Center for Bioinformatics, Saarland University, Building E 2.1, P.O. Box 151150, 66041 Saarbrücken, Germany
|
39
|
Rogers J, Fishberg A, Youngs N, Wu YC. Reconciliation feasibility in the presence of gene duplication, loss, and coalescence with multiple individuals per species. BMC Bioinformatics 2017; 18:292. [PMID: 28583091 PMCID: PMC5460407 DOI: 10.1186/s12859-017-1701-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/27/2016] [Accepted: 05/22/2017] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND In phylogenetics, we often seek to reconcile gene trees with species trees within the framework of an evolutionary model. While the most popular models for eukaryotic species allow for only gene duplication and gene loss or only multispecies coalescence, recent work has combined these phenomena through a reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes the duplication-loss and coalescent history of a gene family. However, the LCT makes the simplifying assumption that only one individual is sampled per species whereas, with advances in gene sequencing, we now have access to multiple samples per species. RESULTS We demonstrate that with these additional samples, there exist gene tree topologies that are impossible to reconcile with any species tree. In particular, the multiple samples enforce new constraints on the placement of duplications within a valid reconciliation. To model these constraints, we extend the LCT to a new structure, the partially labeled coalescent tree (PLCT) and demonstrate how to use the PLCT to evaluate the feasibility of a gene tree topology. We apply our algorithm to two clades of apes and flies to characterize possible sources of infeasibility. CONCLUSION Going forward, we believe that this model represents a first step towards understanding reconciliations in duplication-loss-coalescence models with multiple samples per species.
Affiliation(s)
- Jennifer Rogers
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Andrew Fishberg
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Nora Youngs
  - Department of Mathematics, Harvey Mudd College, Claremont, 91711, California, USA
  - Current address: Department of Mathematics and Statistics, Colby College, Waterville, 04901, Maine, USA
- Yi-Chieh Wu
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
|
40
|
Zayneb C, Imen RH, Walid K, Grubb CD, Bassem K, Franck V, Hafedh M, Amine E. The phytochelatin synthase gene in date palm (Phoenix dactylifera L.): Phylogeny, evolution and expression. Ecotoxicology and Environmental Safety 2017; 140:7-17. [PMID: 28231507 DOI: 10.1016/j.ecoenv.2017.02.020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/22/2016] [Revised: 02/11/2017] [Accepted: 02/13/2017] [Indexed: 06/06/2023]
Abstract
We studied date palm phytochelatin synthase type I (PdPCS1), which catalyzes the cytosolic synthesis of phytochelatins (PCs), a heavy metal binding protein, in plant cells. The gene encoding PdPCS1 (Pdpcs) consists of 8 exons and 7 introns and encodes a protein of 528 amino acids. PCs gene history was studied using Notung phylogeny. During evolution, gene loss from several lineages was predicted, including Proteobacteria, Bilateria and Brassicaceae. In addition, eleven gene duplication events appeared toward interior nodes of the reconciled tree and four gene duplication events appeared toward the external nodes. These latter sequences belong to species with a second copy of PCs, suggesting that this gene evolved through subfunctionalization. Pdpcs1 gene expression was measured in seedling hypocotyls exposed to Cd, Cu and Cr using quantitative real-time polymerase chain reaction (qPCR). Pdpcs1 overexpression was observed in P. dactylifera seedlings exposed to metals, suggesting that (1) the Pdpcs1 gene is functional and (2) the enzyme is implicated in metal detoxification mechanisms. Additionally, the structure of PdPCS1 was predicted using its homologue from Nostoc (cyanobacterium, NsPCS) as a template in Discovery studio and PyMol software. These analyses allowed us to identify the phytochelatin synthase type I enzyme in date palm (PdPCS1) via recognition of key consensus amino acids involved in the catalytic mechanism, and to propose a hypothetical binding and catalytic site for an additional substrate binding cavity.
Collapse
Affiliation(s)
- Chaâbene Zayneb
- Laboratory of Plant Biotechnology, Faculty of Sciences, University of Sfax, BP 1171, 3000 Sfax, Tunisia; Laboratoire de Génie Civil et géo-Environnement, Université de Lille 1, F-59655 Villeneuve d'Ascq, France
| | - Rekik Hakim Imen
- Laboratory of Plant Biotechnology, Faculty of Sciences, University of Sfax, BP 1171, 3000 Sfax, Tunisia
| | - Kriaa Walid
- Laboratory of Plant Biotechnology, Faculty of Sciences, University of Sfax, BP 1171, 3000 Sfax, Tunisia
| | - C Douglas Grubb
- Biorecycling Operations Research Laboratory, Des Moines, IA, USA
| | - Khemakhem Bassem
- Laboratory of Plant Biotechnology, Faculty of Sciences, University of Sfax, BP 1171, 3000 Sfax, Tunisia
| | - Vandenbulcke Franck
- Laboratoire de Génie Civil et géo-Environnement, Université de Lille 1, F-59655 Villeneuve d'Ascq, France
| | - Mejdoub Hafedh
- Laboratory of Plant Biotechnology, Faculty of Sciences, University of Sfax, BP 1171, 3000 Sfax, Tunisia
| | - Elleuch Amine
- Laboratory of Plant Biotechnology, Faculty of Sciences, University of Sfax, BP 1171, 3000 Sfax, Tunisia.
|
41
|
Thiruketheeswaran P, Thomalla P, Krüger E, Hinssen H, D'Haese J. Four paralog gelsolin genes are differentially expressed in the earthworm Lumbricus terrestris. Comp Biochem Physiol B Biochem Mol Biol 2017; 208-209:58-67. [PMID: 28400331 DOI: 10.1016/j.cbpb.2017.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2016] [Revised: 04/04/2017] [Accepted: 04/06/2017] [Indexed: 11/25/2022]
Abstract
We have identified and characterized four distinct variants of the gelsolin-related protein (EWAM P1-P4) in the earthworm L. terrestris. All of these proteins biochemically qualify as gelsolins since they sever actin filaments in a calcium dependent manner. P1, P2 and P3 are present in the Lumbricus body wall muscle whereas in the gizzard muscle P3 and P4 were found. P1-P4 are encoded by four paralog genes and are differentially expressed in various muscle cell tissues. While the genes for P1 and P2 contain one intron, there was no intron in both P3 and P4 genes. The coding sequences consist of 1104bp (368 amino acids) for P1/P4 and 1101bp (367 amino acids) for P2/P3. Corresponding genes were confirmed by northern blot analysis which revealed three (calculated lengths: 3100, 2300 and 2100 nucleotides) and two (calculated lengths: 2300 and 1700 nucleotides) mRNA transcripts in the body wall and the gizzard, respectively. EWAM mRNA was localized by fluorescence in situ hybridization in the body wall and the gizzard muscle. P1 mRNA was detected in the inner proximal layers of both the circular and longitudinal muscle of the body wall whereas in the gizzard no significant staining was observed for P1. P2-P4 mRNAs were abundant in the outer distal layers of both the circular and the longitudinal muscles of both body wall and gizzard. The differential expression of four paralog gelsolin genes suggests a functional adaptation of different muscle cells with respect to actin filament turnover and modulation of its polymer state.
Affiliation(s)
- Prasath Thiruketheeswaran
- Institute for Cell Biology, Department Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225 Düsseldorf, Germany
| | - Paul Thomalla
- Institute for Cell Biology, Department Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225 Düsseldorf, Germany
| | - Evelyn Krüger
- Institute for Cell Biology, Department Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225 Düsseldorf, Germany
| | - Horst Hinssen
- Biochemical Cell Biology, Faculty of Biology, University of Bielefeld, Universitätsstrasse 25, D-33615 Bielefeld, Germany
| | - Jochen D'Haese
- Institute for Cell Biology, Department Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225 Düsseldorf, Germany.
|
42
|
Romero V, Hosomichi K, Nakaoka H, Shibata H, Inoue I. Structure and evolution of the filaggrin gene repeated region in primates. BMC Evol Biol 2017; 17:10. [PMID: 28077068 PMCID: PMC5225520 DOI: 10.1186/s12862-016-0851-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/14/2016] [Accepted: 12/12/2016] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND The evolutionary dynamics of repeat sequences is quite complex, with some duplicates never having differentiated from each other. Two models can explain the complex evolutionary process for repeated genes: concerted and birth-and-death, of which the latter is driven by duplications maintained by selection. Copy number variations caused by random duplications and losses in repeat regions may modulate molecular pathways and therefore affect phenotypic characteristics in a population, resulting in individuals that are able to adapt to new environments. In this study, we investigated the filaggrin gene (FLG), which codes for filaggrin, an important component of the outer layers of mammalian skin, and contains tandem repeats that exhibit copy number variation between and within species. To examine which model best fits the evolutionary pathway for the complete tandem repeats within a single exon of FLG, we determined the repeat sequences in crab-eating macaque (Macaca fascicularis), orangutan (Pongo abelii), gorilla (Gorilla gorilla), and chimpanzee (Pan troglodytes) and compared these with the sequence in human (Homo sapiens).
RESULTS In this study we compared concerted and birth-and-death evolution models, commonly used for gene copies. We found that there is high nucleotide diversity between filaggrin repeat regions, which fits the birth-and-death model. Phylogenetic analyses also suggested that independent duplication events created the repeat sequences in crab-eating macaques and orangutans, while different duplication and loss events created the repeats in gorillas, chimpanzees, and humans. Comparison of the repeat sequences detected purifying selection within species and lineage-specific duplications across species. We also found variation in the length of the repeated region within species such as chimpanzee and crab-eating macaque.
CONCLUSIONS We conclude that the copy number variation in the repeat sequences of FLG between primates may be a consequence of species-specific divergence and expansion.
Affiliation(s)
- Vanessa Romero
- Department of Genetics, School of Life Sciences, Graduate University for Advanced Studies (SOKENDAI), Mishima, 411-8540, Japan; Division of Human Genetics, National Institute of Genetics, Mishima, 411-8540, Japan
| | - Kazuyoshi Hosomichi
- Division of Human Genetics, National Institute of Genetics, Mishima, 411-8540, Japan; Present address: Department of Bioinformatics and Genomics, Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-8640, Japan
| | - Hirofumi Nakaoka
- Department of Genetics, School of Life Sciences, Graduate University for Advanced Studies (SOKENDAI), Mishima, 411-8540, Japan; Division of Human Genetics, National Institute of Genetics, Mishima, 411-8540, Japan
| | - Hiroki Shibata
- Division of Genomics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, 812-8582, Japan
| | - Ituro Inoue
- Department of Genetics, School of Life Sciences, Graduate University for Advanced Studies (SOKENDAI), Mishima, 411-8540, Japan; Division of Human Genetics, National Institute of Genetics, Mishima, 411-8540, Japan.
|
43
|
Vasco A, Smalls TL, Graham SW, Cooper ED, Wong GKS, Stevenson DW, Moran RC, Ambrose BA. Challenging the paradigms of leaf evolution: Class III HD-Zips in ferns and lycophytes. The New Phytologist 2016; 212:745-758. [PMID: 27385116 DOI: 10.1111/nph.14075] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 12/07/2015] [Accepted: 05/23/2016] [Indexed: 05/06/2023]
Abstract
Despite the extraordinary significance leaves have for life on Earth, their origin and development remain vigorously debated. More than a century of paleobotanical, morphological, and phylogenetic research has still not resolved fundamental questions about leaves. Developmental genetic data are sparse in ferns, and comparative studies of lycophytes and seed plants have reached opposing conclusions on the conservation of a leaf developmental program. We performed phylogenetic and expression analyses of a leaf developmental regulator (Class III HD-Zip genes; C3HDZs) spanning lycophytes and ferns. We show that a duplication and neofunctionalization of C3HDZs probably occurred in the ancestor of euphyllophytes, and that there is a common leaf developmental mechanism conserved between ferns and seed plants. We show C3HDZ expression in lycophyte and fern sporangia and show that C3HDZs have conserved expression patterns during initiation of lateral primordia (leaves or sporangia). This expression is maintained throughout sporangium development in lycophytes and ferns and indicates an ancestral role of C3HDZs in sporangium development. We hypothesize that there is a deep homology of all leaves and that a sporangium-specific developmental program was coopted independently for the development of lycophyte and euphyllophyte leaves. This provides molecular genetic support for a paradigm shift in theories of lycophyte leaf evolution.
Affiliation(s)
- Alejandra Vasco
- The New York Botanical Garden, 2900 Southern Blvd, Bronx, NY, 10458-5126, USA
- Instituto de Biología, Universidad Nacional Autónoma de México (UNAM), Mexico DF, 04510, Mexico
| | - Tynisha L Smalls
- The New York Botanical Garden, 2900 Southern Blvd, Bronx, NY, 10458-5126, USA
| | - Sean W Graham
- Department of Botany, University of British Columbia, 6270 University Boulevard, Vancouver, BC, V6T 1Z4, Canada
- UBC Botanical Garden & Centre for Plant Research, University of British Columbia, 6804 Marine Drive SW, Vancouver, BC, V6T 1Z4, Canada
| | - Endymion D Cooper
- School of Biological and Chemical Sciences, Queen Mary University of London, London, E1 4NS, UK
| | - Gane Ka-Shu Wong
- Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- Department of Medicine, University of Alberta, Edmonton, AB, T6G 2E1, Canada
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Dennis W Stevenson
- The New York Botanical Garden, 2900 Southern Blvd, Bronx, NY, 10458-5126, USA
| | - Robbin C Moran
- The New York Botanical Garden, 2900 Southern Blvd, Bronx, NY, 10458-5126, USA
| | - Barbara A Ambrose
- The New York Botanical Garden, 2900 Southern Blvd, Bronx, NY, 10458-5126, USA.
|
44
|
De Fine Licht HH, Jensen AB, Eilenberg J. Comparative transcriptomics reveal host-specific nucleotide variation in entomophthoralean fungi. Mol Ecol 2016; 26:2092-2110. [PMID: 27717247 DOI: 10.1111/mec.13863] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 08/15/2015] [Revised: 09/13/2016] [Accepted: 09/15/2016] [Indexed: 12/15/2022]
Abstract
Obligate parasites are under strong selection to increase exploitation of their host to survive while evading detection by host immune defences. This has often led to elaborate pathogen adaptations and extreme host specificity. Specialization on one host, however, often incurs a trade-off influencing the capacity to infect alternate hosts. Here, we investigate host adaptation in two morphologically indistinguishable and closely related obligate specialist insect-pathogenic fungi from the phylum Entomophthoromycota, Entomophthora muscae sensu stricto and E. muscae sensu lato, pathogens of houseflies (Musca domestica) and cabbage flies (Delia radicum), respectively. We compared single nucleotide polymorphisms within and between these two E. muscae species using 12 RNA-seq transcriptomes from five biological samples. All five isolates contained intra-isolate polymorphisms that segregate in 50:50 ratios, indicative of genetic duplication events or functional diploidy. Comparative analysis of dN/dS ratios between the multinucleate E. muscae s.str. and E. muscae s.l. revealed molecular signatures of positive selection in transcripts related to utilization of host lipids and the potential secretion of toxins that interfere with the host immune response. Phylogenetic comparison with the nonobligate generalist insect-pathogenic fungus Conidiobolus coronatus revealed a gene-family expansion of trehalase enzymes in E. muscae. The main sugar in insect haemolymph is trehalose, and efficient sugar utilization was probably important for the evolutionary transition to obligate insect pathogenicity in E. muscae. These results support the hypothesis that genetically based host specialization in specialist pathogens evolves in response to the challenge of using resources and dealing with the immune system of different hosts.
Affiliation(s)
- Henrik H De Fine Licht
- Section for Organismal Biology, Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871, Frederiksberg, Denmark
| | - Annette B Jensen
- Section for Organismal Biology, Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871, Frederiksberg, Denmark
| | - Jørgen Eilenberg
- Section for Organismal Biology, Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871, Frederiksberg, Denmark
|
45
|
Mendes FK, Hahn Y, Hahn MW. Gene Tree Discordance Can Generate Patterns of Diminishing Convergence over Time. Mol Biol Evol 2016; 33:3299-3307. [PMID: 27634870 DOI: 10.1093/molbev/msw197] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 11/13/2022] Open
Abstract
Phenotypic convergence is an exciting outcome of adaptive evolution, occurring when different species find similar solutions to the same problem. Unraveling the molecular basis of convergence provides a way to link genotype to adaptive phenotypes, but can also shed light on the extent to which molecular evolution is repeatable and predictable. Many recent genome-wide studies have uncovered a striking pattern of diminishing convergence over time, ascribing this pattern to the presence of intramolecular epistatic interactions. Here, we consider gene tree discordance as an alternative cause of changes in convergence levels over time in a primate dataset. We demonstrate that gene tree discordance can produce patterns of diminishing convergence by itself, and that controlling for discordance as a cause of apparent convergence makes the pattern disappear. We also show that synonymous substitutions, where neither selection nor epistasis should be prevalent, have the same diminishing pattern of molecular convergence in primates. Finally, we demonstrate that even in situations where biological discordance is not possible, discordance due to errors in species tree inference can drive similar patterns. Though intramolecular epistasis could in principle create a pattern of declining convergence over time, our results suggest a possible alternative explanation for this widespread pattern. These results contribute to a growing appreciation not just of the presence of gene tree discordance, but of the unpredictable effects this discordance can have on analyses of molecular evolution.
Affiliation(s)
- Fábio K Mendes
- Department of Biology, Indiana University, Bloomington, IN
| | - Yoonsoo Hahn
- Department of Life Science, Research Center for Biomolecules and Biosystems, Chung-Ang University, Seoul, Republic of Korea
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN; School of Informatics and Computing, Indiana University, Bloomington, IN
|
46
|
Zallot R, Harrison KJ, Kolaczkowski B, de Crécy-Lagard V. Functional Annotations of Paralogs: A Blessing and a Curse. Life (Basel) 2016; 6:life6030039. [PMID: 27618105 PMCID: PMC5041015 DOI: 10.3390/life6030039] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 07/01/2016] [Revised: 08/29/2016] [Accepted: 09/02/2016] [Indexed: 12/15/2022] Open
Abstract
Gene duplication followed by mutation is a classic mechanism of neofunctionalization, producing gene families with functional diversity. In some cases, a single point mutation is sufficient to change the substrate specificity and/or the chemistry performed by an enzyme, making it difficult to accurately separate enzymes with identical functions from homologs with different functions. Because sequence similarity is often used as a basis for assigning functional annotations to genes, non-isofunctional gene families pose a great challenge for genome annotation pipelines. Here we describe how integrating evolutionary and functional information such as genome context, phylogeny, metabolic reconstruction and signature motifs may be required to correctly annotate multifunctional families. These integrative analyses can also lead to the discovery of novel gene functions, as hints from specific subgroups can guide the functional characterization of other members of the family. We demonstrate how careful manual curation processes using comparative genomics can disambiguate subgroups within large multifunctional families and discover their functions. We present the COG0720 protein family as a case study. We also discuss strategies to automate this process to improve the accuracy of genome functional annotation pipelines.
Affiliation(s)
- Rémi Zallot
- Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL 32611, USA.
| | - Katherine J Harrison
- Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL 32611, USA.
| | - Bryan Kolaczkowski
- Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL 32611, USA.
| | - Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL 32611, USA.
|
47
|
Greenwood JM, Ezquerra AL, Behrens S, Branca A, Mallet L. Current analysis of host–parasite interactions with a focus on next generation sequencing data. Zoology 2016; 119:298-306. [DOI: 10.1016/j.zool.2016.06.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/07/2015] [Revised: 06/22/2016] [Accepted: 06/22/2016] [Indexed: 01/21/2023]
|
48
|
Mendes FK, Hahn MW. Gene Tree Discordance Causes Apparent Substitution Rate Variation. Syst Biol 2016; 65:711-21. [DOI: 10.1093/sysbio/syw018] [Citation(s) in RCA: 118] [Impact Index Per Article: 14.8] [Received: 10/15/2015] [Accepted: 02/23/2016] [Indexed: 01/01/2023] Open
|
49
|
Mohanta TK, Mohanta N, Parida P, Panda SK, Ponpandian LN, Bae H. Genome-Wide Identification of Mitogen-Activated Protein Kinase Gene Family across Fungal Lineage Shows Presence of Novel and Diverse Activation Loop Motifs. PLoS One 2016; 11:e0149861. [PMID: 26918378 PMCID: PMC4769017 DOI: 10.1371/journal.pone.0149861] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/30/2015] [Accepted: 02/05/2016] [Indexed: 01/24/2023] Open
Abstract
The mitogen-activated protein kinase (MAPK) is characterized by the presence of the T-E-Y, T-D-Y, and T-G-Y motifs in its activation loop region and plays a significant role in regulating diverse cellular responses in eukaryotic organisms. Availability of large-scale genome data in the fungal kingdom encouraged us to identify and analyse the MAPK gene family across 173 fungal species. The analysis of the MAPK gene family resulted in the discovery of several novel activation loop motifs (T-T-Y, T-I-Y, T-N-Y, T-H-Y, T-S-Y, K-G-Y, T-Q-Y, S-E-Y and S-D-Y) in fungal MAPKs. The phylogenetic analysis suggests that fungal MAPKs are non-polymorphic, evolved from their common ancestors around 1500 million years ago, and are distantly related to plant MAPKs. We are the first to report the presence of nine novel activation loop motifs in fungal MAPKs. The specificity of the activation loop motif plays a significant role in controlling different growth and stress related pathways in fungi. Hence, the presence of these nine novel activation loop motifs in fungi is of special interest.
Affiliation(s)
- Tapan Kumar Mohanta
- Free Major of Natural Science, College of Basic Studies, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, 712749, Republic of Korea
| | - Nibedita Mohanta
- Department Of Biotechnology, North Orissa University, Takatpur, Baripada, 757003, India
| | - Pratap Parida
- Regional Medical Research Centre, NE Region, Indian Council of Medical Research, Dibrugarh, 786001, Assam, India
| | - Sujogya Kumar Panda
- Department of Zoology; North Orissa University; Baripada, Odisha, 757003, India
| | - Hanhong Bae
- School of Biotechnology, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, 712749, Republic of Korea
|
50
|
Tekaia F. Inferring Orthologs: Open Questions and Perspectives. Genomics Insights 2016; 9:17-28. [PMID: 26966373 PMCID: PMC4778853 DOI: 10.4137/gei.s37925] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 11/18/2015] [Revised: 12/30/2015] [Accepted: 01/02/2016] [Indexed: 01/25/2023]
Abstract
With the increasing number of sequenced genomes and their comparisons, the detection of orthologs is crucial for reliable functional annotation and evolutionary analyses of genes and species. Yet, the dynamic remodeling of genome content through gain, loss, transfer of genes, and segmental and whole-genome duplication hinders reliable orthology detection. Moreover, the lack of direct functional evidence and the questionable quality of some available genome sequences and annotations present additional difficulties in assessing orthology. This article reviews the existing computational methods and their potential accuracy in the high-throughput era of genome sequencing and anticipates open questions in terms of methodology, reliability, and computation. Appropriate taxon sampling, together with a combination of methods based on similarity, phylogeny, synteny, and evolutionary knowledge that may help detect speciation events, appears to be the most accurate strategy. This review also raises perspectives on the potential determination of orthology throughout the whole species phylogeny.
Affiliation(s)
- Fredj Tekaia
- Institut Pasteur, Unit of Structural Microbiology, CNRS URA 3528 and University Paris Diderot, Sorbonne Paris Cité, Paris, France
|