1
Chen B, Ren C, Ouyang Z, Xu J, Xu K, Li Y, Guo H, Bai X, Tian M, Xu X, Wang Y, Li H, Bo X, Chen H. Stratifying TAD boundaries pinpoints focal genomic regions of regulation, damage, and repair. Brief Bioinform 2024; 25:bbae306. [PMID: 38935071 PMCID: PMC11210073 DOI: 10.1093/bib/bbae306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 06/01/2024] [Accepted: 06/13/2024] [Indexed: 06/28/2024] Open
Abstract
Advances in chromatin mapping have exposed the complex chromatin hierarchical organization in mammals, including topologically associating domains (TADs) and their substructures, yet the functional implications of this hierarchy in gene regulation and disease progression are not fully elucidated. Our study delves into the phenomenon of shared TAD boundaries, which are pivotal in maintaining the hierarchical chromatin structure and regulating gene activity. By integrating high-resolution Hi-C data, chromatin accessibility, and DNA double-strand breaks (DSBs) data from various cell lines, we systematically explore the complex regulatory landscape at high-level TAD boundaries. Our findings indicate that these boundaries are not only key architectural elements but also vibrant hubs, enriched with functionally crucial genes and complex transcription factor binding site-clustered regions. Moreover, they exhibit a pronounced enrichment of DSBs, suggesting a nuanced interplay between transcriptional regulation and genomic stability. Our research provides novel insights into the intricate relationship between the 3D genome structure, gene regulation, and DNA repair mechanisms, highlighting the role of shared TAD boundaries in maintaining genomic integrity and resilience against perturbations. The implications of our findings extend to understanding the complexities of genomic diseases and open new avenues for therapeutic interventions targeting the structural and functional integrity of TAD boundaries.
Affiliation(s)
- Bijia Chen
  - Academy of Military Medical Sciences, Beijing 100850, China
- Chao Ren
  - Academy of Military Medical Sciences, Beijing 100850, China
- Zhangyi Ouyang
  - Academy of Military Medical Sciences, Beijing 100850, China
- Jingxuan Xu
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Gastrointestinal Surgery, Peking University Cancer Hospital & Institute, Beijing 100142, China
- Kang Xu
  - School of Software, Shandong University, Jinan 250101, China
- Yaru Li
  - Academy of Military Medical Sciences, Beijing 100850, China
- Hejiang Guo
  - Academy of Military Medical Sciences, Beijing 100850, China
- Xuemei Bai
  - Academy of Military Medical Sciences, Beijing 100850, China
- Mengge Tian
  - The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Xiang Xu
  - Academy of Military Medical Sciences, Beijing 100850, China
- Yuyang Wang
  - College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
- Hao Li
  - Academy of Military Medical Sciences, Beijing 100850, China
- Xiaochen Bo
  - Academy of Military Medical Sciences, Beijing 100850, China
- Hebing Chen
  - Academy of Military Medical Sciences, Beijing 100850, China
2
McCoy MJ, Fire AZ. Parallel gene size and isoform expansion of ancient neuronal genes. Curr Biol 2024; 34:1635-1645.e3. [PMID: 38460513 PMCID: PMC11043017 DOI: 10.1016/j.cub.2024.02.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/16/2023] [Accepted: 02/11/2024] [Indexed: 03/11/2024]
Abstract
How nervous systems evolved is a central question in biology. A diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning hundreds of thousands of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. Although many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and, in many cases, the emergence of neurons as dedicated cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes potentially driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection, as demonstrated by low dN/dS ratios, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J McCoy
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA.
- Andrew Z Fire
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA; Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA.
3
Duchêne DA, Duchêne S, Stiller J, Heller R, Ho SYW. ClockstaRX: Testing Molecular Clock Hypotheses With Genomic Data. Genome Biol Evol 2024; 16:evae064. [PMID: 38526019 PMCID: PMC10999959 DOI: 10.1093/gbe/evae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 01/11/2024] [Accepted: 03/21/2024] [Indexed: 03/26/2024] Open
Abstract
Phylogenomic data provide valuable opportunities for studying evolutionary rates and timescales. These analyses require theoretical and statistical tools based on molecular clocks. We present ClockstaRX, a flexible platform for exploring and testing evolutionary rate signals in phylogenomic data. Here, information about evolutionary rates in branches across gene trees is placed in Euclidean space, allowing data transformation, visualization, and hypothesis testing. ClockstaRX implements formal tests for identifying groups of loci and branches that make a large contribution to patterns of rate variation. This information can then be used to test for drivers of genomic evolutionary rates or to inform models for molecular dating. Drawing on the results of a simulation study, we recommend forms of data exploration and filtering that might be useful prior to molecular-clock analyses.
Affiliation(s)
- David A Duchêne
  - Center for Evolutionary Hologenomics, University of Copenhagen, Copenhagen 1352, Denmark
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen 1352, Denmark
- Sebastián Duchêne
  - Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3010, Australia
- Josefin Stiller
  - Villum Centre for Biodiversity Genomics, University of Copenhagen, 2100 Copenhagen, Denmark
- Rasmus Heller
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
4
van Westerhoven AC, Aguilera-Galvez C, Nakasato-Tagami G, Shi-Kunne X, Martinez de la Parte E, Chavarro-Carrero E, Meijer HJG, Feurtey A, Maryani N, Ordóñez N, Schneiders H, Nijbroek K, Wittenberg AHJ, Hofstede R, García-Bastidas F, Sørensen A, Swennen R, Drenth A, Stukenbrock EH, Kema GHJ, Seidl MF. Segmental duplications drive the evolution of accessory regions in a major crop pathogen. The New Phytologist 2024; 242:610-625. [PMID: 38402521 DOI: 10.1111/nph.19604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 02/01/2024] [Indexed: 02/26/2024]
Abstract
Many pathogens evolved compartmentalized genomes with conserved core and variable accessory regions (ARs) that carry effector genes mediating virulence. The fungal plant pathogen Fusarium oxysporum has such ARs, often spanning entire chromosomes. The presence of specific ARs influences the host range, and horizontal transfer of ARs can modify the pathogenicity of the receiving strain. However, how these ARs evolve in strains that infect the same host remains largely unknown. We defined the pan-genome of 69 diverse F. oxysporum strains that cause Fusarium wilt of banana, a significant constraint to global banana production, and analyzed the diversity and evolution of the ARs. Accessory regions in F. oxysporum strains infecting the same banana cultivar are highly diverse, and we could not identify any shared genomic regions and in planta-induced effectors. We demonstrate that segmental duplications drive the evolution of ARs. Furthermore, we show that recent segmental duplications specifically in accessory chromosomes cause the expansion of ARs in F. oxysporum. Taken together, we conclude that extensive recent duplications drive the evolution of ARs in F. oxysporum, which contribute to the evolution of virulence.
Affiliation(s)
- Anouk C van Westerhoven
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
  - Department of Biology, Theoretical Biology & Bioinformatics, Utrecht University, Padualaan 8, 3584 CH, Utrecht, the Netherlands
- Carolina Aguilera-Galvez
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Giuliana Nakasato-Tagami
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Xiaoqian Shi-Kunne
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Einar Martinez de la Parte
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Edgar Chavarro-Carrero
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Harold J G Meijer
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
  - Department Biointeractions and Plant Health, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Alice Feurtey
  - Christian-Albrechts University of Kiel, Christian-Albrechts-Platz 4, 24118, Kiel, Germany
  - Max Planck Institute for Evolutionary Biology, August-Thienemann-Straße 2, 24306, Plön, Germany
  - Plant Pathology, Eidgenössische Technische Hochschule Zürich, Rämistrasse 101, 8092, Zürich, Switzerland
- Nani Maryani
  - Biology Education, Universitas Sultan Ageng Tirtayasa, Jalan Raya Palka No.Km 3, 42163, Banten, Indonesia
- Nadia Ordóñez
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Harrie Schneiders
  - KeyGene, Agro Business Park 90, 6708 PW, Wageningen, the Netherlands
- Koen Nijbroek
  - KeyGene, Agro Business Park 90, 6708 PW, Wageningen, the Netherlands
- Rene Hofstede
  - KeyGene, Agro Business Park 90, 6708 PW, Wageningen, the Netherlands
- Anker Sørensen
  - KeyGene, Agro Business Park 90, 6708 PW, Wageningen, the Netherlands
- Ronny Swennen
  - Division of Crop Biotechnics, Laboratory of Tropical Crop Improvement, Catholic University of Leuven, Oude Markt 13, 3000, Leuven, Belgium
  - International Institute of Tropical Agriculture, Plot 15 Naguru E Rd, Kampala, PO Box 7878, Uganda
- Andre Drenth
  - The University of Queensland, St Lucia, 4072, Brisbane, Queensland, Australia
- Eva H Stukenbrock
  - Christian-Albrechts University of Kiel, Christian-Albrechts-Platz 4, 24118, Kiel, Germany
  - Max Planck Institute for Evolutionary Biology, August-Thienemann-Straße 2, 24306, Plön, Germany
- Gert H J Kema
  - Laboratory of Phytopathology, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Michael F Seidl
  - Department of Biology, Theoretical Biology & Bioinformatics, Utrecht University, Padualaan 8, 3584 CH, Utrecht, the Netherlands
5
Maurer-Alcalá XX, Cote-L’Heureux A, Kosakovsky Pond SL, Katz LA. Somatic genome architecture and molecular evolution are decoupled in "young" linage-specific gene families in ciliates. PLoS One 2024; 19:e0291688. [PMID: 38271450 PMCID: PMC10810533 DOI: 10.1371/journal.pone.0291688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Accepted: 09/02/2023] [Indexed: 01/27/2024] Open
Abstract
The evolution of lineage-specific gene families remains poorly studied across the eukaryotic tree of life, with most analyses focusing on the recent evolution of de novo genes in model species. Here we explore the origins of lineage-specific genes in ciliates, a ~1 billion year old clade of microeukaryotes that are defined by their division of somatic and germline functions into distinct nuclei. Previous analyses on conserved gene families have shown the effect of ciliates' unusual genome architecture on gene family evolution: extensive genome processing (the generation of thousands of gene-sized somatic chromosomes from canonical germline chromosomes) is associated with larger and more diverse gene families. To further study the relationship between ciliate genome architecture and gene family evolution, we analyzed lineage-specific gene families from a set of 46 transcriptomes and 12 genomes representing x species from eight ciliate classes. We assess how the evolution of lineage-specific gene families occurs among four groups of ciliates: extensive fragmenters with gene-size somatic chromosomes, non-extensive fragmenters with "large" multi-gene somatic chromosomes, Heterotrichea with highly polyploid somatic genomes and Karyorelictea with 'paradiploid' somatic genomes. Our analyses demonstrate that: 1) most lineage-specific gene families are found at shallow taxonomic scales; 2) extensive genome processing (i.e., gene unscrambling) during development likely influences the size and number of young lineage-specific gene families; and 3) the influence of somatic genome architecture on molecular evolution is increasingly apparent in older gene families. Altogether, these data highlight the influences of genome architecture on the evolution of lineage-specific gene families in eukaryotes.
Affiliation(s)
- Xyrus X. Maurer-Alcalá
  - Institute of Cell Biology, University of Bern, Bern, Switzerland
  - Department of Invertebrate Zoology, American Museum of Natural History, New York, New York, United States of America
- Auden Cote-L’Heureux
  - Department of Biological Sciences, Smith College, Northampton, Massachusetts, United States of America
- Sergei L. Kosakovsky Pond
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, Pennsylvania, United States of America
- Laura A. Katz
  - Department of Biological Sciences, Smith College, Northampton, Massachusetts, United States of America
  - Program in Organismic and Evolutionary Biology, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
6
Larue GE, Roy SW. Where the minor things are: a pan-eukaryotic survey suggests neutral processes may explain much of minor intron evolution. Nucleic Acids Res 2023; 51:10884-10908. [PMID: 37819006 PMCID: PMC10639083 DOI: 10.1093/nar/gkad797] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/24/2023] [Revised: 09/12/2023] [Accepted: 09/19/2023] [Indexed: 10/13/2023] Open
Abstract
Spliceosomal introns are gene segments removed from RNA transcripts by ribonucleoprotein machineries called spliceosomes. In some eukaryotes a second 'minor' spliceosome is responsible for processing a tiny minority of introns. Despite its seemingly modest role, minor splicing has persisted for roughly 1.5 billion years of eukaryotic evolution. Identifying minor introns in over 3000 eukaryotic genomes, we report diverse evolutionary histories including surprisingly high numbers in some fungi and green algae, repeated loss, as well as general biases in their positional and genic distributions. We estimate that ancestral minor intron densities were comparable to those of vertebrates, suggesting a trend of long-term stasis. Finally, three findings suggest a major role for neutral processes in minor intron evolution. First, highly similar patterns of minor and major intron evolution contrast with both functionalist and deleterious model predictions. Second, observed functional biases among minor intron-containing genes are largely explained by these genes' greater ages. Third, no association of intron splicing with cell proliferation in a minor intron-rich fungus suggests that regulatory roles are lineage-specific and thus cannot offer a general explanation for minor splicing's persistence. These data constitute the most comprehensive view of minor introns and their evolutionary history to date, and provide a foundation for future studies of these remarkable genetic elements.
Affiliation(s)
- Graham E Larue
  - Quantitative and Systems Biology Graduate Program, University of California Merced, Merced, CA 95343, USA
- Scott W Roy
  - Department of Molecular and Cell Biology, University of California Merced, Merced, CA 95343, USA
  - Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
7
Jiang H, Zhao Z, Yu H, Lin Q, Liu Y. Evolutionary traits and functional roles of chemokines and their receptors in the male pregnancy of the Syngnathidae. Marine Life Science & Technology 2023; 5:500-510. [PMID: 38045539 PMCID: PMC10689615 DOI: 10.1007/s42995-023-00205-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/18/2022] [Accepted: 10/27/2023] [Indexed: 12/05/2023]
Abstract
Vertebrates have developed various modes of reproduction, some of which are found in Teleosts. Over 300 species of the Syngnathidae (seahorses, pipefishes and seadragons) exhibit male pregnancies; the males have specialized brood pouches that provide immune protection, nourishment, and oxygen regulation. Chemokines play a vital role at the mammalian maternal-fetal interface; however, their functions in fish reproduction are unclear. This study revealed the evolutionary traits and potential functions of chemokine genes in 22 oviparous, ovoviviparous, and viviparous fish species through comparative genomic analyses. Our results showed that chemokine gene copy numbers and evolutionary rates vary among species with different modes of reproduction. Syngnathidae lost cxcl13 and cxcr5, which are involved in key receptor-ligand pairs for lymphoid organ development. Notably, Syngnathidae have site-specific mutations in cxcl12b and ccl44, suggesting immune function during gestation. Moreover, transcriptome analysis revealed that chemokine gene expression varies among Syngnathidae species with different types of brood pouches, suggesting adaptive variations in chemokine functions among seahorses and their relatives. Furthermore, challenge experiments on seahorse brood pouches revealed a joint immune function of chemokine genes during male pregnancy. This study provides insights into the evolutionary diversity of chemokine genes associated with different reproductive modes in fish.
Affiliation(s)
- Han Jiang
  - CAS Key Laboratory of Tropical Marine Bio-Resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - Guangdong Provincial Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - University of Chinese Academy of Sciences, Beijing, 101400 China
- Zhanwei Zhao
  - CAS Key Laboratory of Tropical Marine Bio-Resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - Guangdong Provincial Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - University of Chinese Academy of Sciences, Beijing, 101400 China
- Haiyan Yu
  - CAS Key Laboratory of Tropical Marine Bio-Resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - Guangdong Provincial Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
- Qiang Lin
  - CAS Key Laboratory of Tropical Marine Bio-Resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - Guangdong Provincial Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - University of Chinese Academy of Sciences, Beijing, 101400 China
- Yali Liu
  - CAS Key Laboratory of Tropical Marine Bio-Resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - Guangdong Provincial Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, 510301 China
  - University of Chinese Academy of Sciences, Beijing, 101400 China
8
Suresh H, Crow M, Jorstad N, Hodge R, Lein E, Dobin A, Bakken T, Gillis J. Comparative single-cell transcriptomic analysis of primate brains highlights human-specific regulatory evolution. Nat Ecol Evol 2023; 7:1930-1943. [PMID: 37667001 PMCID: PMC10627823 DOI: 10.1038/s41559-023-02186-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 08/02/2023] [Indexed: 09/06/2023]
Abstract
Enhanced cognitive function in humans is hypothesized to result from cortical expansion and increased cellular diversity. However, the mechanisms that drive these phenotypic innovations remain poorly understood, in part because of the lack of high-quality cellular resolution data in human and non-human primates. Here, we take advantage of single-cell expression data from the middle temporal gyrus of five primates (human, chimp, gorilla, macaque and marmoset) to identify 57 homologous cell types and generate cell type-specific gene co-expression networks for comparative analysis. Although orthologue expression patterns are generally well conserved, we find 24% of genes with extensive differences between human and non-human primates (3,383 out of 14,131), which are also associated with multiple brain disorders. To assess the functional significance of gene expression differences in an evolutionary context, we evaluate changes in network connectivity across meta-analytic co-expression networks from 19 animals. We find that a subset of these genes has deeply conserved co-expression across all non-human animals, and strongly divergent co-expression relationships in humans (139 out of 3,383, <1% of primate orthologues). Genes with human-specific cellular expression and co-expression profiles (such as NHEJ1, GTF2H2, C2 and BBS5) typically evolve under relaxed selective constraints and may drive rapid evolutionary change in brain function.
Affiliation(s)
- Hamsini Suresh
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Ed Lein
  - Allen Institute for Brain Science, Seattle, WA, USA
- Alexander Dobin
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
  - Department of Physiology, University of Toronto, Toronto, Ontario, Canada.
9
Balasooriya GI, Wee TL, Spector DL. A sub-set of guanine- and cytosine-rich genes are actively transcribed at the nuclear Lamin B1 region. bioRxiv: the preprint server for biology 2023:2023.10.28.564411. [PMID: 37961255 PMCID: PMC10634887 DOI: 10.1101/2023.10.28.564411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Chromatin organization in the mammalian cell nucleus plays a vital role in the regulation of gene expression. The lamina-associated domain at the inner nuclear membrane has been proposed to harbor heterochromatin, while the nuclear interior has been shown to contain most of the euchromatin. Here, we show that a sub-set of actively transcribing genes, marked by RNA Pol II pSer2, are associated with Lamin B1 at the inner nuclear envelope in mESCs and the number of genes proportionally increases upon in vitro differentiation of mESC to olfactory precursor cells. These nuclear periphery-associated actively transcribing genes primarily represent housekeeping genes, and their gene bodies are significantly enriched with guanine and cytosine compared to genes actively transcribed at the nuclear interior. We found the promoters of these genes to also be significantly enriched with guanine and to be predominantly regulated by zinc finger protein transcription factors. We provide evidence supporting the emerging notion that the Lamin B1 region is not solely transcriptionally silent.
10
Jain A, Begum T, Ahmad S. Analysis and Prediction of Pathogen Nucleic Acid Specificity for Toll-like Receptors in Vertebrates. J Mol Biol 2023; 435:168208. [PMID: 37479078 DOI: 10.1016/j.jmb.2023.168208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 06/20/2023] [Accepted: 07/13/2023] [Indexed: 07/23/2023]
Abstract
Identification of key sequence, expression and function related features of nucleic acid-sensing host proteins is of fundamental importance to understand the dynamics of pathogen-specific host responses. To meet this objective, we considered toll-like receptors (TLRs), a representative class of membrane-bound sensor proteins, from 17 vertebrate species covering mammals, birds, reptiles, amphibians, and fishes in this comparative study. We identified the molecular signatures of host TLRs that are responsible for sensing pathogen nucleic acids or other pathogen-associated molecular patterns (PAMPs), and potentially play important roles in host defence mechanism. Interestingly, our findings reveal that such host-specific features are directly related to the strand (single or double) specificity of nucleic acid from pathogens. However, during host-pathogen interactions, such features were unable to explain the pathogenic PAMP (i.e., DNA, RNA or other) selectivity, suggesting a more complex mechanism. Using these features, we developed a number of machine learning models, of which Random Forest achieved a high performance (94.57% accuracy) to predict strand specificity of TLRs from protein-derived features. We applied the trained model to propose strand specificity of some previously uncharacterized distinct fish-specific novel TLRs (TLR18, TLR23, TLR24, TLR25, TLR27).
Affiliation(s)
- Anuja Jain
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India. https://twitter.com/@Anuja334
- Tina Begum
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
11
McCoy MJ, Fire AZ. Ancient origins of complex neuronal genes. bioRxiv: the preprint server for biology 2023:2023.03.28.534655. [PMID: 37034725 PMCID: PMC10081198 DOI: 10.1101/2023.03.28.534655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
How nervous systems evolved is a central question in biology. An increasing diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning millions of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. While many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and in many cases the emergence of dedicated neuronal cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection as demonstrated by low dN/dS scores, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J. McCoy
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Whitman Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA
- Andrew Z. Fire
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
12
Jain N, Richter F, Adzhubei I, Sharp AJ, Gelb BD. Small open reading frames: a comparative genetics approach to validation. BMC Genomics 2023; 24:226. [PMID: 37127568 PMCID: PMC10152738 DOI: 10.1186/s12864-023-09311-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 04/13/2023] [Indexed: 05/03/2023] Open
Abstract
Open reading frames (ORFs) with fewer than 100 codons are generally not annotated in genomes, although bona fide genes of that size are known. Newer biochemical studies have suggested that thousands of small protein-coding ORFs (smORFs) may exist in the human genome, but the true number and the biological significance of the micropeptides they encode remain uncertain. Here, we used a comparative genomics approach to identify high-confidence smORFs that are likely protein-coding. We identified 3,326 high-confidence smORFs using constraint within human populations and evolutionary conservation as additional lines of evidence. Next, we validated that, as a group, our high-confidence smORFs are conserved at the amino-acid level rather than merely residing in highly conserved non-coding regions. Finally, we found that high-confidence smORFs are enriched among disease-associated variants from GWAS. Overall, our results highlight that smORF-encoded peptides likely have important functional roles in human disease.
Affiliation(s)
- Niyati Jain
  - Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, 1470 Madison Avenue, New York, NY, 10029, USA
  - Present Address: Committee On Genetics, Genomics, and Systems Biology, The University of Chicago, Chicago, IL, USA
- Felix Richter
  - Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ivan Adzhubei
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
- Andrew J Sharp
  - Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, 1470 Madison Avenue, New York, NY, 10029, USA
- Bruce D Gelb
  - Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, 1470 Madison Avenue, New York, NY, 10029, USA
  - Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
13
|
Goodheart JA, Collins AG, Cummings MP, Egger B, Rawlinson KA. A phylogenomic approach to resolving interrelationships of polyclad flatworms, with implications for life-history evolution. Royal Society Open Science 2023; 10:220939. [PMID: 36998763 PMCID: PMC10049750 DOI: 10.1098/rsos.220939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 03/07/2023] [Indexed: 06/19/2023]
Abstract
Platyhelminthes (flatworms) are a diverse invertebrate phylum useful for exploring life-history evolution. Within Platyhelminthes, only two clades develop through a larval stage: free-living polyclads and parasitic neodermatans. Neodermatan larvae are considered evolutionarily derived, whereas polyclad larvae are hypothesized to be ancestral due to ciliary band similarities among polyclad and other spiralian larvae. However, larval evolution has been challenging to investigate within polyclads due to low support for deeper phylogenetic relationships. To investigate polyclad life-history evolution, we generated transcriptomic data for 21 species of polyclads to build a well-supported phylogeny for the group. The resulting tree provides strong support for deeper nodes, and we recover a new monophyletic clade of early branching cotyleans. We then used ancestral state reconstructions to investigate ancestral modes of development within Polycladida and more broadly within flatworms. In polyclads, we were unable to reconstruct the ancestral state of deeper nodes with significant support because early branching clades show diverse modes of development. This suggests a complex history of larval evolution in polyclads that likely includes multiple losses and/or multiple gains. However, our ancestral state reconstruction across a previously published platyhelminth phylogeny supports a direct developing prorhynchid/polyclad ancestor, which suggests that a larval stage in the life cycle evolved along the polyclad stem lineage or within polyclads.
Affiliation(s)
- Jessica A. Goodheart
  - Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
  - Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA 92037, USA
- Allen G. Collins
  - NMFS, National Systematics Laboratory, National Museum of Natural History, Smithsonian Institution, MRC-153, PO Box 37012, Washington, DC 20013, USA
- Michael P. Cummings
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA
- Bernhard Egger
  - Universität Innsbruck, Department of Zoology, Technikerstr. 25, 6020 Innsbruck, Austria
- Kate A. Rawlinson
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
  - Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, MA, 02543
|
14
|
Gupta MK, Vadde R. Next-generation development and application of codon model in evolution. Front Genet 2023; 14:1091575. [PMID: 36777719 PMCID: PMC9911445 DOI: 10.3389/fgene.2023.1091575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 01/17/2023] [Indexed: 01/28/2023] Open
Abstract
To date, numerous nucleotide, amino acid, and codon substitution models have been developed to estimate the evolutionary history of a sequence or organism more comprehensively. Of these three, the codon substitution model is the most powerful. These models have been used extensively to detect selective pressure on a protein and codon usage bias, and for ancestral and phylogenetic reconstruction. However, because codon substitution models are more computationally demanding than nucleotide and amino acid substitution models, only a few studies have employed them to understand the heterogeneity of the evolutionary process in a genome-scale analysis. Hence, there is always a question of how to develop more robust but less computationally demanding codon substitution models to get more accurate results. In this review article, the authors attempt to understand the basis of the development of different types of codon substitution models and how this information can be used to develop more robust but less computationally demanding ones. The codon substitution model makes it possible to detect the selection regime under which any gene or gene region is evolving, codon usage bias in any organism or tissue-specific region, and phylogenetic relationships between different lineages more accurately than nucleotide and amino acid substitution models. Thus, in the near future, these codon models can be used in the fields of conservation, breeding and medicine.
|
15
|
Moutinho AF, Eyre-Walker A, Dutheil JY. Strong evidence for the adaptive walk model of gene evolution in Drosophila and Arabidopsis. PLoS Biol 2022; 20:e3001775. [PMID: 36099311 PMCID: PMC9470001 DOI: 10.1371/journal.pbio.3001775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 08/01/2022] [Indexed: 11/19/2022] Open
Abstract
Understanding the dynamics of species adaptation to their environments has long been a central focus of the study of evolution. Theories of adaptation propose that populations evolve by “walking” in a fitness landscape. This “adaptive walk” is characterised by a pattern of diminishing returns, where populations further away from their fitness optimum take larger steps than those closer to their optimal conditions. Hence, we expect young genes to evolve faster and experience mutations with stronger fitness effects than older genes because they are further away from their fitness optimum. Testing this hypothesis, however, constitutes an arduous task. Young genes are small, encode proteins with a higher degree of intrinsic disorder, are expressed at lower levels, and are involved in species-specific adaptations. Since all these factors lead to increased protein evolutionary rates, they could be masking the effect of gene age. While controlling for these factors, we used population genomic data sets of Arabidopsis and Drosophila and estimated the rate of adaptive substitutions across genes from different phylostrata. We found that a gene’s evolutionary age significantly impacts the molecular rate of adaptation. Moreover, we observed that substitutions in young genes tend to have larger physicochemical effects. Our study, therefore, provides strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale. This study uses population genomic datasets from Arabidopsis and Drosophila to show that young genes adapt faster and are subject to mutations of larger fitness effects, providing strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale.
Affiliation(s)
- Ana Filipa Moutinho
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Julien Y. Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Unité Mixte de Recherche 5554 Institut des Sciences de l’Evolution, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
|
16
|
Karamycheva S, Wolf YI, Persi E, Koonin EV, Makarova KS. Analysis of lineage-specific protein family variability in prokaryotes combined with evolutionary reconstructions. Biol Direct 2022; 17:22. [PMID: 36042479 PMCID: PMC9425974 DOI: 10.1186/s13062-022-00337-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/18/2022] [Accepted: 08/13/2022] [Indexed: 12/24/2022] Open
Abstract
Background: Evolutionary rate is a key characteristic of gene families that is linked to the functional importance of the respective genes as well as specific biological functions of the proteins they encode. Accurate estimation of evolutionary rates is a challenging task that requires precise phylogenetic analysis. Here we present an easy to estimate protein family level measure of sequence variability based on alignment column homogeneity in multiple alignments of protein sequences from Clade-Specific Clusters of Orthologous Genes (csCOGs).
Results: We report genome-wide estimates of variability for 8 diverse groups of bacteria and archaea and investigate the connection between variability and various genomic and biological features. The variability estimates are based on homogeneity distributions across amino acid sequence alignments and can be obtained for multiple groups of genomes at minimal computational expense. About half of the variance in variability values can be explained by the analyzed features, with the greatest contribution coming from the extent of gene paralogy in the given csCOG. The correlation between variability and paralogy appears to originate, primarily, not from gene duplication, but from acquisition of distant paralogs and xenologs, introducing sequence variants that are more divergent than those that could have evolved in situ during the lifetime of the given group of organisms. Both high-variability and low-variability csCOGs were identified in all functional categories, but as expected, proteins encoded by integrated mobile elements as well as proteins involved in defense functions and cell motility are, on average, more variable than proteins with housekeeping functions. Additionally, using linear discriminant analysis, we found that variability and fraction of genomes carrying a given gene are the two variables that provide the best prediction of gene essentiality as compared to the results of transposon mutagenesis in Sulfolobus islandicus.
Conclusions: Variability, a measure of sequence diversity within an alignment relative to the overall diversity within a group of organisms, offers a convenient proxy for evolutionary rate estimates and is informative with respect to prediction of functional properties of proteins. In particular, variability is a strong predictor of gene essentiality for the respective organisms and indicative of sub- or neofunctionalization of paralogs.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13062-022-00337-7.
Affiliation(s)
- Svetlana Karamycheva
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
- Erez Persi
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
- Kira S Makarova
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
|
17
|
Gui S, Wei W, Jiang C, Luo J, Chen L, Wu S, Li W, Wang Y, Li S, Yang N, Li Q, Fernie AR, Yan J. A pan-Zea genome map for enhancing maize improvement. Genome Biol 2022; 23:178. [PMID: 35999561 PMCID: PMC9396798 DOI: 10.1186/s13059-022-02742-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 03/11/2022] [Accepted: 07/27/2022] [Indexed: 12/22/2022] Open
Abstract
Background: Maize (Zea mays L.) is at the vanguard facing the upcoming breeding challenges. However, both a super pan-genome for the Zea genus and a comprehensive genetic variation map for maize breeding are still lacking.
Results: Here, we construct an approximately 6.71-Gb pan-Zea genome that contains around 4.57-Gb non-B73 reference sequences from fragmented de novo assemblies of 721 pan-Zea individuals. We annotate a total of 58,944 pan-Zea genes and find around 44.34% of them are dispensable in the pan-Zea population. Moreover, 255,821 common structural variations are identified and genotyped in a maize association mapping panel. Further analyses reveal gene presence/absence variants and their potential roles during domestication of maize. Combining genetic analyses with multi-omics data, we demonstrate how structural variants are associated with complex agronomic traits.
Conclusions: Our results highlight the underexplored role of the pan-Zea genome and structural variations to further understand domestication of maize and explore their potential utilization in crop improvement.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02742-7.
Affiliation(s)
- Songtao Gui
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Wenjie Wei
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Chenglin Jiang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Jingyun Luo
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Lu Chen
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Shenshen Wu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Wenqiang Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Yuebin Wang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Shuyan Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Ning Yang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
- Qing Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
- Alisdair R Fernie
  - Department of Molecular Physiology, Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam, Golm, Germany
- Jianbing Yan
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
|
18
|
Song H, Guo Z, Zhang X, Sui J. De novo genes in Arachis hypogaea cv. Tifrunner: systematic identification, molecular evolution, and potential contributions to cultivated peanut. The Plant Journal: for Cell and Molecular Biology 2022; 111:1081-1095. [PMID: 35748398 DOI: 10.1111/tpj.15875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2021] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 06/15/2023]
Abstract
De novo genes are derived from non-coding sequences, and they can play essential roles in organisms. Cultivated peanut (Arachis hypogaea) is a major oil and protein crop derived from a cross between Arachis duranensis and Arachis ipaensis. However, few de novo genes have been documented in Arachis. Here, we identified 381 de novo genes in A. hypogaea cv. Tifrunner based on comparison with five closely related Arachis species. There are distinct differences in gene expression patterns and gene structures between conserved and de novo genes. The identified de novo genes originated from ancestral sequence regions associated with metabolic and biosynthetic processes, and they were subsequently integrated into existing regulatory networks. De novo paralogs and homoeologs were identified in A. hypogaea cv. Tifrunner. De novo paralogs and homoeologs with conserved expression have mismatching cis-acting elements under normal growth conditions. De novo genes potentially have pluripotent functions in responses to biotic stresses as well as in growth and development based on quantitative trait locus data. This work provides a foundation for future research examining gene birth processes and gene function in Arachis and related taxa.
Affiliation(s)
- Hui Song
  - Grassland Agri-husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Zhonglong Guo
  - State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, School of Life Sciences and School of Advanced Agricultural Sciences, Peking University, Beijing, China
- Xiaojun Zhang
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
- Jiongming Sui
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
|
19
|
Purkanti R, Thattai M. Genome doubling enabled the expansion of yeast vesicle traffic pathways. Sci Rep 2022; 12:11213. [PMID: 35780185 PMCID: PMC9250509 DOI: 10.1038/s41598-022-15419-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Accepted: 06/23/2022] [Indexed: 11/09/2022] Open
Abstract
Vesicle budding and fusion in eukaryotes depend on a suite of protein types, such as Arfs, Rabs, coats and SNAREs. Distinct paralogs of these proteins act at distinct intracellular locations, suggesting a link between gene duplication and the expansion of vesicle traffic pathways. Genome doubling, a common source of paralogous genes in fungi, provides an ideal setting in which to explore this link. Here we trace the fates of paralog doublets derived from the 100-Ma-old hybridization event that gave rise to the whole genome duplication clade of budding yeast. We find that paralog doublets involved in specific vesicle traffic functions and pathways are convergently retained across the entire clade. Vesicle coats and adaptors involved in secretory and early-endocytic pathways are retained as doublets, at rates several-fold higher than expected by chance. Proteins involved in later endocytic steps and intra-Golgi traffic, including the entire set of multi-subunit and coiled-coil tethers, have reverted to singletons. These patterns demonstrate that selection has acted to expand and diversify the yeast vesicle traffic apparatus, across species and time.
Affiliation(s)
- Ramya Purkanti
  - Center for Integrative Genomics, Université de Lausanne, Lausanne, Switzerland
- Mukund Thattai
  - Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
|
20
|
Jiang L, Fan T, Li X, Xu J. Functional Heterogeneity of the Young and Old Duplicate Genes in Tung Tree (Vernicia fordii). Frontiers in Plant Science 2022; 13:902649. [PMID: 35800614 PMCID: PMC9253867 DOI: 10.3389/fpls.2022.902649] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2022] [Accepted: 05/12/2022] [Indexed: 06/15/2023]
Abstract
Genes are subject to birth and death during the long evolutionary period. Here, young and old duplicate genes were identified in Vernicia fordii. We performed integrative analyses, including expression pattern, gene complexity, evolution, and functional divergence between young and old duplicate genes. Compared with young genes, old genes have higher values of Ka and Ks, lower Ka/Ks values, and lower average intrinsic structural disorder (ISD) values. Gene ontology and RNA-seq suggested that most young and old duplicate genes contained asymmetric functions. Only old duplicate genes are likely to participate in response to Fusarium wilt infection and exhibit divergent expression patterns. Our data suggest that young genes differ from older genes not only by evolutionary properties but also by their function and structure. These results highlighted the characteristics and diversification of the young and old genes in V. fordii and provided a systematic analysis of these genes in the V. fordii genome.
Affiliation(s)
- Lan Jiang
  - Key Laboratory of Non-coding RNA Transformation Research of Anhui Higher Education Institution, Yijishan Hospital of Wannan Medical College, Wuhu, China
  - Central Laboratory, Yijishan Hospital of Wannan Medical College, Wuhu, China
  - Clinical Research Center for Critical Respiratory Medicine of Anhui Province, Wuhu, China
- Tingting Fan
  - The Laboratory of Forestry Genetics, Central South University of Forestry and Technology, Changsha, China
- Xiaoxu Li
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha, China
- Jun Xu
  - Hunan Institute of Microbiology, Changsha, China
|
21
|
Raxwal VK, Singh S, Agarwal M, Riha K. Transcriptional and post-transcriptional regulation of young genes in plants. BMC Biol 2022; 20:134. [PMID: 35676681 PMCID: PMC9178820 DOI: 10.1186/s12915-022-01339-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/29/2022] [Accepted: 05/30/2022] [Indexed: 12/03/2022] Open
Abstract
Background: New genes continuously emerge from non-coding DNA or by diverging from existing genes, but most of them are rapidly lost and only a few become fixed within the population. We hypothesized that young genes are subject to transcriptional and post-transcriptional regulation to limit their expression and minimize their exposure to purifying selection.
Results: We performed a protein-based homology search across the tree of life to determine the evolutionary age of protein-coding genes present in the rice genome. We found that young genes in rice have relatively low expression levels, which can be attributed to distal enhancers, and closed chromatin conformation at their transcription start sites (TSS). The chromatin in TSS regions can be re-modeled in response to abiotic stress, indicating conditional expression of young genes. Furthermore, transcripts of young genes in Arabidopsis tend to be targeted by nonsense-mediated RNA decay, presenting another layer of regulation limiting their expression.
Conclusions: These data suggest that transcriptional and post-transcriptional mechanisms contribute to the conditional expression of young genes, which may alleviate purging selection while providing an opportunity for phenotypic exposure and functionalization.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01339-7.
Affiliation(s)
- Vivek Kumar Raxwal
  - Department of Botany, University of Delhi, Delhi, 110007, India
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
- Somya Singh
  - Department of Botany, University of Delhi, Delhi, 110007, India
- Manu Agarwal
  - Department of Botany, University of Delhi, Delhi, 110007, India
- Karel Riha
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
|
22
|
Zhang Y, Chai M, Zhang X, Yang G, Yao X, Song H. The fate of drought-related genes after polyploidization in Arachis hypogaea cv. Tifrunner. Physiology and Molecular Biology of Plants: an International Journal of Functional Plant Biology 2022; 28:1249-1259. [PMID: 35910439 PMCID: PMC9334475 DOI: 10.1007/s12298-022-01198-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/09/2022] [Revised: 05/25/2022] [Accepted: 06/06/2022] [Indexed: 06/03/2023]
Abstract
Drought stress affects plant growth and development. Cultivated peanut (Arachis hypogaea) was formed by a cross between A. duranensis and A. ipaensis. The drought tolerance of A. duranensis and A. ipaensis is reportedly stronger than that of cultivated peanut. However, there has been little study of drought tolerance genes in Arachis. In this study, we compared drought tolerance genes between A. hypogaea cv. Tifrunner and its diploid donors. We have observed that polyploidization does not generate more drought tolerance genes in A. hypogaea cv. Tifrunner but promotes the loss of many ancient drought tolerance genes. Although putative drought tolerance genes occurred on gene duplication events in A. hypogaea cv. Tifrunner, most copies lacked drought tolerance. These findings suggest that the loss of drought tolerance genes in A. hypogaea cv. Tifrunner could possibly result in weaker drought tolerance. In addition, we have observed that the three Arachis species stochastically lost putative drought tolerance genes. The evolution of drought tolerance genes could possibly have correlated with environmental changes. Our results enhance the current understanding of drought tolerance and polyploidy evolution in Arachis species.
Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-022-01198-0.
Affiliation(s)
- Yongli Zhang
  - Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Maofeng Chai
  - Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Xiaojun Zhang
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
- Guofeng Yang
  - Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Xiang Yao
  - Institute of Botany, Jiangsu Province and Chinese Academy of Sciences (Nanjing Botanical Garden Mem. Sun Yat-Sen), Nanjing, China
- Hui Song
  - Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
|
23
|
Li Z, Zhang Y, Li W, Irwin AJ, Finkel ZV. Conservation and architecture of housekeeping genes in the model marine diatom Thalassiosira pseudonana. The New Phytologist 2022; 234:1363-1376. [PMID: 35179783 DOI: 10.1111/nph.18039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 02/06/2022] [Indexed: 06/14/2023]
Abstract
Housekeeping genes (HKGs) are constitutively expressed with low variation across tissues/conditions. They are thought to be highly conserved and fundamental to cellular maintenance, with distinctive genomic features. Here, we identify 1505 HKGs in the unicellular marine diatom Thalassiosira pseudonana based on an RNA-seq analysis of 232 samples taken under 12 experimental conditions over 0-72 h. We identify promising internal reference genes (IRGs) for T. pseudonana from the most stably expressed HKGs. A comparative analysis indicates < 18% of HKGs in T. pseudonana have orthologs in other eukaryotes, including other diatom species. Contrary to work on human tissues, T. pseudonana HKGs are longer than non-HKGs, due to elongated introns. More ancient HKGs tend to be shorter than more recent HKGs, and expression levels of HKGs decrease more rapidly with gene length relative to non-HKGs. Our results indicate that HKGs are highly variable across the tree of life and thus unlikely to be universally fundamental for cellular maintenance. We hypothesize that the distinct genomic features of HKGs of T. pseudonana may be a consequence of selection pressures associated with high expression and low variance across conditions.
Affiliation(s)
- Zhengke Li
  - School of Food and Biological Engineering, Shaanxi University of Science and Technology, Weiyang University Park, Xi'an, Shaanxi, 710021, China
  - Department of Oceanography, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
- Yong Zhang
  - Department of Oceanography, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
  - College of Environmental Science and Engineering, Fujian Key Laboratory of Pollution Control and Resource Recycling, Fujian Normal University, No. 8 Shangsan Road, Fuzhou, Fujian, 350007, China
- Wei Li
  - College of Life and Environmental Sciences, Huangshan University, 39 Xihai Road, Huangshan, Anhui, 245041, China
- Andrew J Irwin
  - Department of Mathematics & Statistics, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
- Zoe V Finkel
  - Department of Oceanography, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
|
24
|
Soni V, Eyre-Walker A. OUP accepted manuscript. Genome Biol Evol 2022; 14:6528851. [PMID: 35166775 PMCID: PMC8882387 DOI: 10.1093/gbe/evac028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/09/2022] [Indexed: 12/05/2022] Open
Abstract
The rate of amino acid substitution has been shown to be correlated to a number of factors including the rate of recombination, the age of the gene, the length of the protein, mean expression level, and gene function. However, the extent to which these correlations are due to adaptive and nonadaptive evolution has not been studied in detail, at least not in hominids. We find that the rate of adaptive evolution is significantly positively correlated to the rate of recombination, protein length and gene expression level, and negatively correlated to gene age. These correlations remain significant when each factor is controlled for in turn, except when controlling for expression in an analysis of protein length; and they also generally remain significant when biased gene conversion is taken into account. However, the positive correlations could be an artifact of population size contraction. We also find that the rate of nonadaptive evolution is negatively correlated to each factor, and all these correlations survive controlling for each other and biased gene conversion. Finally, we examine the effect of gene function on rates of adaptive and nonadaptive evolution; we confirm that virus-interacting proteins (VIPs) have higher rates of adaptive and lower rates of nonadaptive evolution, but we also demonstrate that there is significant variation in the rate of adaptive and nonadaptive evolution between GO categories when removing VIPs. We estimate that the VIP/non-VIP axis explains about 5–8 fold more of the variance in evolutionary rate than GO categories.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
  - Corresponding author: E-mail:
|
25
|
Ruzzante L, Feron R, Reijnders MJMF, Thiébaut A, Waterhouse RM. Functional constraints on insect immune system components govern their evolutionary trajectories. Mol Biol Evol 2021; 39:6459179. [PMID: 34893861 PMCID: PMC8788225 DOI: 10.1093/molbev/msab352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Roles of constraints in shaping evolutionary outcomes are often considered in the contexts of developmental biology and population genetics, in terms of capacities to generate new variants and how selection limits or promotes consequent phenotypic changes. Comparative genomics also recognizes the role of constraints, in terms of shaping evolution of gene and genome architectures, sequence evolutionary rates, and gene gains or losses, as well as on molecular phenotypes. Characterizing patterns of genomic change where putative functions and interactions of system components are relatively well described offers opportunities to explore whether genes with similar roles exhibit similar evolutionary trajectories. Using insect immunity as our test case system, we hypothesize that characterizing gene evolutionary histories can define distinct dynamics associated with different functional roles. We develop metrics that quantify gene evolutionary histories, employ these to characterize evolutionary features of immune gene repertoires, and explore relationships between gene family evolutionary profiles and their roles in immunity to understand how different constraints may relate to distinct dynamics. We identified three main axes of evolutionary trajectories characterized by gene duplication and synteny, maintenance/stability and sequence conservation, and loss and sequence divergence, highlighting similar and contrasting patterns across these axes amongst subsets of immune genes. Our results suggest that where and how genes participate in immune responses limit the range of possible evolutionary scenarios they exhibit. The test case study system of insect immunity highlights the potential of applying comparative genomics approaches to characterize how functional constraints on different components of biological systems govern their evolutionary trajectories.
Affiliation(s)
- Livio Ruzzante
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Romain Feron
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Maarten J M F Reijnders
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Antonin Thiébaut
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Robert M Waterhouse
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
|
26
|
Papadopoulos C, Callebaut I, Gelly JC, Hatin I, Namy O, Renard M, Lespinet O, Lopes A. Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution. Genome Res 2021; 31:2303-2315. [PMID: 34810219 PMCID: PMC8647833 DOI: 10.1101/gr.275638.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/13/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023]
Abstract
The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
Affiliation(s)
- Chris Papadopoulos
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
- Jean-Christophe Gelly
  - Université de Paris, Biologie Intégrée du Globule Rouge, UMR_S1134, BIGR, INSERM, F-75015 Paris, France
  - Laboratoire d'Excellence GR-Ex, 75015 Paris, France
  - Institut National de la Transfusion Sanguine, F-75015 Paris, France
- Isabelle Hatin
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Namy
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Maxime Renard
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Lespinet
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Anne Lopes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
|
27
|
Vedelek B, Kovács Á, Boros IM. Evolutionary mode for the functional preservation of fast-evolving Drosophila telomere capping proteins. Open Biol 2021; 11:210261. [PMID: 34784790 PMCID: PMC8596017 DOI: 10.1098/rsob.210261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023] Open
Abstract
DNA end protection is fundamental for the long-term preservation of the genome. In vertebrates the Shelterin protein complex protects telomeric DNA ends, thereby contributing to the maintenance of genome integrity. In the Drosophila genus, this function is thought to be performed by the Terminin complex, an assembly of fast-evolving subunits. Considering that DNA end protection is fundamental for successful genome replication, the accelerated evolution of Terminin subunits is counterintuitive, as conservation is supposed to maintain the assembly and concerted function of the interacting partners. This problem extends beyond Drosophila telomere biology and provides insight into the evolution of protein assemblies. To learn more about the mechanistic details of this phenomenon, we investigated the intra- and interspecies assemblies of Verrocchio (Ver) and Modigliani (Moi), two Terminin subunits, using in vitro assays. Based on our results and on homology-based three-dimensional models for Ver and Moi, we conclude that both proteins contain an OB-fold and contribute to the ssDNA binding of the Terminin complex. We propose that the preservation of Ver function is achieved by conservation of specific amino acids responsible for folding or localized in interacting surfaces. We also provide here the first evidence of Moi DNA binding.
Affiliation(s)
- Balázs Vedelek
  - Department of Biochemistry and Molecular Biology, University of Szeged, Szeged, Hungary
  - Institute of Biochemistry, Biological Research Centre, Szeged, Hungary
- Ákos Kovács
  - Department of Biochemistry and Molecular Biology, University of Szeged, Szeged, Hungary
- Imre M. Boros
  - Department of Biochemistry and Molecular Biology, University of Szeged, Szeged, Hungary
  - Institute of Biochemistry, Biological Research Centre, Szeged, Hungary
|
28
|
de Souza ID, Reis CF, Morais DAA, Fernandes VGS, Cavalcante JVF, Dalmolin RJS. Ancestry analysis indicates two different sets of essential genes in eukaryotic model species. Funct Integr Genomics 2021; 21:523-531. [PMID: 34279742 DOI: 10.1007/s10142-021-00794-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2020] [Revised: 06/02/2021] [Accepted: 06/10/2021] [Indexed: 11/28/2022]
Abstract
Essential genes are so called because they are crucial for organism perpetuation. These genes are usually related to functions essential to cellular metabolism or multicellular homeostasis. Deleterious alterations in essential genes produce a spectrum of phenotypes in multicellular organisms. The effects range from impairment of the fertilization process and disruption of fetal development to loss of reproductive capacity. Essential genes are described as more evolutionarily conserved than non-essential genes. However, there is no consensus about the relationship between gene essentiality and gene age. Here, we identified essential genes in five model eukaryotic species (Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila melanogaster, Caenorhabditis elegans, and Mus musculus) and estimated their evolutionary ancestry and their network properties. We observed that essential genes, on average, are older than other genes in all species investigated. The relationship between network properties and gene essentiality agrees with previous findings, showing essential genes as important nodes in biological networks. As expected, we also observed that essential orthologs shared by the five species evaluated here are old. However, all the species evaluated here have a specific set of young essential genes not shared among them. Additionally, these two groups of essential genes are involved in distinct biological functions, suggesting two sets of essential genes: (i) a set of old essential genes common to all the evaluated species, regulating basic cellular functions, and (ii) a set of young essential genes exclusive to each species, which perform specific essential functions in each species.
Affiliation(s)
- Iara D de Souza
  - Bioinformatics Multidisciplinary Environment - IMD, Federal University of Rio Grande Do Norte, Av. Odilon Gomes de Lima, 1722, Capim Macio, Natal, RN, 59078-400, Brazil
- Clovis F Reis
  - Bioinformatics Multidisciplinary Environment - IMD, Federal University of Rio Grande Do Norte, Av. Odilon Gomes de Lima, 1722, Capim Macio, Natal, RN, 59078-400, Brazil
- Diego A A Morais
  - Bioinformatics Multidisciplinary Environment - IMD, Federal University of Rio Grande Do Norte, Av. Odilon Gomes de Lima, 1722, Capim Macio, Natal, RN, 59078-400, Brazil
- Vítor G S Fernandes
  - Bioinformatics Multidisciplinary Environment - IMD, Federal University of Rio Grande Do Norte, Av. Odilon Gomes de Lima, 1722, Capim Macio, Natal, RN, 59078-400, Brazil
- João Vitor F Cavalcante
  - Bioinformatics Multidisciplinary Environment - IMD, Federal University of Rio Grande Do Norte, Av. Odilon Gomes de Lima, 1722, Capim Macio, Natal, RN, 59078-400, Brazil
- Rodrigo J S Dalmolin
  - Bioinformatics Multidisciplinary Environment - IMD, Federal University of Rio Grande Do Norte, Av. Odilon Gomes de Lima, 1722, Capim Macio, Natal, RN, 59078-400, Brazil
  - Department of Biochemistry - CB, Federal University of Rio Grande Do Norte, Campus Universitário UFRN, Lagoa Nova, Natal, RN, 59078-970, Brazil
|
29
|
Congrains C, Zucchi RA, de Brito RA. Phylogenomic approach reveals strong signatures of introgression in the rapid diversification of neotropical true fruit flies (Anastrepha: Tephritidae). Mol Phylogenet Evol 2021; 162:107200. [PMID: 33984467 DOI: 10.1016/j.ympev.2021.107200] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/27/2020] [Revised: 01/30/2021] [Accepted: 05/03/2021] [Indexed: 01/08/2023]
Abstract
New sequencing techniques have allowed us to explore the variation in thousands of genes and elucidate evolutionary relationships of lineages even in complex scenarios, such as when there is rapid diversification. That seems to be the case for species in the genus Anastrepha, which shows great species diversity that has been divided into 21 species groups, several of which show wide geographical distribution. The fraterculus group has several economically important species and is also an outstanding model for speciation studies, since it includes several lineages that have diverged recently, possibly in the presence of interspecific gene flow. Our main goal is to test whether we can infer phylogenetic relationships of recently diverged taxa with gene flow, such as what is expected for the fraterculus group, and to determine whether certain genes remain informative even in this complex scenario. An analysis of thousands of orthologous genes derived from transcriptome datasets of 10 different lineages across the genus, including some of the economically most important pests, revealed signals of incomplete lineage sorting, vestiges of ancestral introgression between more distant lineages and ongoing gene flow between closely related lineages. Though these patterns affect the phylogenetic signal, the phylogenomic inferences consistently show that the morphologically identified species investigated here are in different evolutionary lineages, with the sole exception involving Brazilian lineages of A. fraterculus, which has been suggested to be a complex assembly of cryptic species. A tree space analysis suggested that genes with greater phylogenetic resolution have evolved under similar selection pressures and are more resilient to intraspecific gene flow, which would make it more likely that these genomic regions may be useful for identifying fraterculus group lineages.
Our findings help establish relationships among the most important Anastrepha species groups, as well as bring further data to indicate that the diversification of fraterculus group lineages, and even other lineages in the genus Anastrepha, has been strongly influenced by interspecific gene flow.
Affiliation(s)
- Carlos Congrains
  - Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, SP, Brazil
- Roberto A Zucchi
  - Escola Superior de Agricultura "Luiz de Queiroz" - ESALQ, Universidade de São Paulo - USP, Piracicaba, SP, Brazil
- Reinaldo A de Brito
  - Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, SP, Brazil
|
30
|
Park JC, Kim DH, Kim MS, Hagiwara A, Lee JS. The genome of the euryhaline rotifer Brachionus paranguensis: Potential use in molecular ecotoxicology. Comp Biochem Physiol Part D Genomics Proteomics 2021; 39:100836. [PMID: 33940320 DOI: 10.1016/j.cbd.2021.100836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2021] [Revised: 04/03/2021] [Accepted: 04/07/2021] [Indexed: 10/21/2022]
Abstract
Brachionus spp. rotifers have been proposed as model organisms for ecotoxicological studies. We assembled the whole-genome sequence of B. paranguensis with NextDenovo, resulting in a total length of 106.2 Mb in 71 contigs. The N50 and the GC content were 4.13 Mb and 28%, respectively. A total of 18,501 genes were predicted within the genome of B. paranguensis. Prominent detoxification-related gene families of phase I and II detoxification were investigated. As in other Brachionus rotifers, high gene expansion was observed in CYP clan 3 and the GST sigma class in B. paranguensis. Moreover, species-specific expansion of sulfotransferases (SULTs) and gain of UDP-glucuronosyltransferases (UGTs) through horizontal gene transfer were found specifically within the B. plicatilis complex. This whole-genome analysis of B. paranguensis provides a basis for molecular ecotoxicological studies and useful information for comparative studies of the evolution of detoxification mechanisms in Brachionus spp.
Affiliation(s)
- Jun Chul Park
  - Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Duck-Hyun Kim
  - Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Min-Sub Kim
  - Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Atsushi Hagiwara
  - Graduate School of Fisheries and Environmental Sciences, Nagasaki University, Nagasaki 852-8521, Japan
  - Organization for Marine Science and Technology, Nagasaki University, Nagasaki 852-8521, Japan
- Jae-Seong Lee
  - Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
|
31
|
Desvignes T, Sydes J, Montfort J, Bobe J, Postlethwait JH. Evolution after Whole-Genome Duplication: Teleost MicroRNAs. Mol Biol Evol 2021; 38:3308-3331. [PMID: 33871629 PMCID: PMC8321539 DOI: 10.1093/molbev/msab105] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 12/12/2022] Open
Abstract
MicroRNAs (miRNAs) are important gene expression regulators implicated in many biological processes, but we lack a global understanding of how miRNA genes evolve and contribute to developmental canalization and phenotypic diversification. Whole-genome duplication events likely provide a substrate for species divergence and phenotypic change by increasing gene numbers and relaxing evolutionary pressures. To understand the consequences of genome duplication on miRNA evolution, we studied miRNA genes following the teleost genome duplication (TGD). Analysis of miRNA genes in four teleosts and in spotted gar, whose lineage diverged before the TGD, revealed that miRNA genes were retained in ohnologous pairs more frequently than protein-coding genes, and that gene losses occurred rapidly after the TGD. Genomic context influenced retention rates, with clustered miRNA genes retained more often than nonclustered miRNA genes and intergenic miRNA genes retained more frequently than intragenic miRNA genes, which often shared the evolutionary fate of their protein-coding host. Expression analyses revealed both conserved and divergent expression patterns across species in line with miRNA functions in phenotypic canalization and diversification, respectively. Finally, major strands of miRNA genes experienced stronger purifying selection, especially in their seeds and 3'-complementary regions, compared with minor strands, which nonetheless also displayed evolutionary features compatible with constrained function. This study provides the first genome-wide, multispecies analysis of the mechanisms influencing metazoan miRNA evolution after whole-genome duplication.
Affiliation(s)
- Thomas Desvignes
  - Institute of Neuroscience, University of Oregon, Eugene, OR, USA
- Jason Sydes
  - Institute of Neuroscience, University of Oregon, Eugene, OR, USA
|
32
|
Aromolaran O, Aromolaran D, Isewon I, Oyelade J. Machine learning approach to gene essentiality prediction: a review. Brief Bioinform 2021; 22:6219158. [PMID: 33842944 DOI: 10.1093/bib/bbab128] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 01/18/2021] [Revised: 03/04/2021] [Accepted: 03/17/2021] [Indexed: 12/17/2022] Open
Abstract
Essential genes are critical for the growth and survival of any organism. The machine learning approach complements experimental methods to minimize the resources required for essentiality assays. Previous studies revealed the need to discover relevant features that significantly classify essential genes, improve the generalizability of prediction models across organisms, and construct a robust gold standard as the class label for the training data to enhance prediction. Findings also show that a significant limitation of the machine learning approach is predicting conditionally essential genes. The essentiality status of a gene can change due to a specific condition of the organism. This review examines various methods applied to the essential gene prediction task, their strengths and limitations, and the factors responsible for effective computational prediction of essential genes. We discuss categories of features and how they contribute to the classification performance of essentiality prediction models. Five categories of features, namely gene sequence, protein sequence, network topology, homology and gene ontology-based features, were generated for Caenorhabditis elegans to perform a comparative analysis of their essentiality prediction capacity. The gene ontology-based feature category outperformed other categories of features, mainly due to its high correlation with the genes' biological functions. However, the topology feature category provided the highest discriminatory power, making it more suitable for essentiality prediction. The major factor limiting the use of machine learning to predict conditional gene essentiality is the unavailability of labeled data, for the conditions of interest, with which to train a classifier. Therefore, cooperative machine learning could further exploit models that can perform well in conditional essentiality predictions.
SHORT ABSTRACT: Identification of essential genes is imperative because it provides an understanding of the core structure and function, accelerating drug targets' discovery, among other functions. Recent studies have applied machine learning to complement the experimental identification of essential genes. However, several factors are limiting the performance of machine learning approaches. This review aims to present the standard procedure and resources available for predicting essential genes in organisms, and also to highlight the factors responsible for the current limitation in using machine learning for conditional gene essentiality prediction. The choice of features and ML technique was identified as an important factor in predicting essential genes effectively.
Affiliation(s)
- Olufemi Aromolaran
  - Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, Nigeria
- Damilare Aromolaran
  - Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, Nigeria
- Itunuoluwa Isewon
  - Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, Nigeria
- Jelili Oyelade
  - Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, Nigeria
|
33
|
Lopes I, Altab G, Raina P, de Magalhães JP. Gene Size Matters: An Analysis of Gene Length in the Human Genome. Front Genet 2021; 12:559998. [PMID: 33643374 PMCID: PMC7905317 DOI: 10.3389/fgene.2021.559998] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 05/07/2020] [Accepted: 01/06/2021] [Indexed: 12/23/2022] Open
Abstract
While it is expected for gene length to be associated with factors such as intron number and evolutionary conservation, we are yet to understand the connections between gene length and function in the human genome. In this study, we show that, as expected, there is a strong positive correlation between gene length, transcript length, and protein size as well as a correlation with the number of genetic variants and introns. Among tissue-specific genes, we find that the longest transcripts tend to be expressed in the blood vessels, nerves, thyroid, cervix uteri, and the brain, while the smallest transcripts tend to be expressed in the pancreas, skin, stomach, vagina, and testis. We report, as shown previously, that natural selection suppresses changes for genes with longer transcripts and promotes changes for genes with smaller transcripts. We also observe that genes with longer transcripts tend to have a higher number of co-expressed genes and protein-protein interactions, as well as more associated publications. In the functional analysis, we show that bigger transcripts are often associated with neuronal development, while smaller transcripts tend to play roles in skin development and in the immune system. Furthermore, pathways related to cancer, neurons, and heart diseases tend to have genes with longer transcripts, with smaller transcripts being present in pathways related to immune responses and neurodegenerative diseases. Based on our results, we hypothesize that longer genes tend to be associated with functions that are important in the early development stages, while smaller genes tend to play a role in functions that are important throughout the whole life, like the immune system, which requires fast responses.
Affiliation(s)
- João Pedro de Magalhães
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
|
34
|
Gao C, Ma C, Wang H, Zhong H, Zang J, Zhong R, He F, Yang D. Intrinsic disorder in protein domains contributes to both organism complexity and clade-specific functions. Sci Rep 2021; 11:2985. [PMID: 33542394 PMCID: PMC7862400 DOI: 10.1038/s41598-021-82656-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/09/2020] [Accepted: 01/22/2021] [Indexed: 11/09/2022] Open
Abstract
Interestingly, some protein domains are intrinsically disordered (abbreviated as IDDs), and the degree of disorder of the same domain may differ in different contexts. However, the evolutionary causes and biological significance of these phenomena are unclear. Here, we address these issues by genome-wide analyses of the evolutionary and functional features of IDDs in 1,870 species across the three superkingdoms. We find a significant positive correlation between the proportion of IDDs and organism complexity, with some interesting exceptions. These phenomena may be due to the high disorder of clade-specific domains and the different degrees of disorder of the domains shared among different clades. The functions of IDDs are clade-specific, and the higher proportion of post-translational modification sites may contribute to their complex functions. Compared with metazoans, fungi have more IDDs with a consecutive disorder region but a low disorder ratio, which reflects their different functional requirements. Disorder variation is greater for domains among different proteins than for those within the same proteins. Some clade-specific ‘no-variation’ or ‘high-variation’ domains are involved in clade-specific functions. In sum, intrinsic domain disorder is related to both organism complexity and clade-specific functions. These results deepen the understanding of the evolution and function of IDDs.
Collapse
Affiliation(s)
- Chao Gao
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China
| | - Chong Ma
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China.,Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
| | - Huqiang Wang
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China
| | - Haolin Zhong
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China
| | - Jiayin Zang
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China
| | - Rugang Zhong
- Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
| | - Fuchu He
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China.
| | - Dong Yang
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, 38 Science Park Road, Changping District, Beijing, 102206, China.
| |
|
35
|
Evolutionary History of Alzheimer Disease-Causing Protein Family Presenilins with Pathological Implications. J Mol Evol 2020; 88:674-688. [DOI: 10.1007/s00239-020-09966-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/09/2019] [Accepted: 09/22/2020] [Indexed: 12/14/2022]
|
36
|
Bogomolovas J, Feng W, Yu MD, Huang S, Zhang L, Trexler C, Gu Y, Spinozzi S, Chen J. Atypical ALPK2 kinase is not essential for cardiac development and function. Am J Physiol Heart Circ Physiol 2020; 318:H1509-H1515. [PMID: 32383995 PMCID: PMC7311700 DOI: 10.1152/ajpheart.00249.2020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/10/2020] [Revised: 04/28/2020] [Accepted: 05/05/2020] [Indexed: 01/18/2023]
Abstract
Protein kinases play an integral role in cardiac development, function, and disease. Recent experimental and clinical data have implied that protein kinases belonging to a family of atypical α-protein kinases, including α-protein kinase 2 (ALPK2), are important for regulating cardiac development and maintaining function via regulation of WNT signaling. A recent study in zebrafish reported that loss of ALPK2 leads to severe cardiac defects; however, the relevance of ALPK2 has not been studied in a mammalian animal model. To assess the role of ALPK2 in the mammalian heart, we generated two independent global Alpk2-knockout (Alpk2-gKO) mouse lines, using CRISPR/Cas9 technology. We performed physiological and biochemical analyses of Alpk2-gKO mice to determine the functional, morphological, and molecular consequences of Alpk2 deletion at the organismal level. We found that Alpk2-gKO mice exhibited normal cardiac function and morphology up to one year of age. Moreover, we did not observe altered WNT signaling in neonatal Alpk2-gKO mouse hearts. In conclusion, Alpk2 is dispensable for cardiac development and function in the murine model. Our results suggest that Alpk2 is a rapidly evolving gene that lost its essential cardiac functions in mammals. NEW & NOTEWORTHY Several studies indicated the importance of ALPK2 for cardiac function and development. A recent study in zebrafish reported that loss of ALPK2 leads to severe cardiac defects. In contrast, murine Alpk2-gKO models developed in this work display no overt cardiac phenotype. Our results suggest ALPK2, as a rapidly evolving gene, lost its essential cardiac functions in mammals.
Affiliation(s)
- Julius Bogomolovas
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Wei Feng
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Matthew Daniel Yu
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Serena Huang
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Lunfeng Zhang
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Christa Trexler
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Yusu Gu
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Simone Spinozzi
- Department of Medicine, University of California, San Diego, La Jolla, California
| | - Ju Chen
- Department of Medicine, University of California, San Diego, La Jolla, California
| |
|
37
|
Heames B, Schmitz J, Bornberg-Bauer E. A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila. J Mol Evol 2020; 88:382-398. [PMID: 32253450 PMCID: PMC7162840 DOI: 10.1007/s00239-020-09939-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/14/2019] [Accepted: 03/13/2020] [Indexed: 12/13/2022]
Abstract
Orphan genes, lacking detectable homologs in outgroup species, typically represent 10-30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7-39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. In support of them having functional roles, we show that de novo genes have robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over large evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes.
Affiliation(s)
- Brennen Heames
- Institute for Evolution and Biodiversity, 48149, Münster, Germany
| | - Jonathan Schmitz
- Institute for Evolution and Biodiversity, 48149, Münster, Germany
| | | |
|
38
|
Vakirlis N, Carvunis AR, McLysaght A. Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes. eLife 2020; 9:e53500. [PMID: 32066524 PMCID: PMC7028367 DOI: 10.7554/elife.53500] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 11/11/2019] [Accepted: 01/07/2020] [Indexed: 12/20/2022] Open
Abstract
The origin of 'orphan' genes, species-specific sequences that lack detectable homologues, has remained mysterious since the dawn of the genomic era. There are two dominant explanations for orphan genes: complete sequence divergence from ancestral genes, such that homologues are not readily detectable; and de novo emergence from ancestral non-genic sequences, such that homologues genuinely do not exist. The relative contribution of the two processes remains unknown. Here, we harness the special circumstance of conserved synteny to estimate the contribution of complete divergence to the pool of orphan genes. By separately comparing yeast, fly and human genes to related taxa using conservative criteria, we find that complete divergence accounts, on average, for at most a third of eukaryotic orphan and taxonomically restricted genes. We observe that complete divergence occurs at a stable rate within a phylum but at different rates between phyla, and is frequently associated with gene shortening akin to pseudogenization.
Affiliation(s)
- Nikolaos Vakirlis
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, United States
| | - Aoife McLysaght
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
| |
|
39
|
Navarro-Muñoz JC, Collemare J. Evolutionary Histories of Type III Polyketide Synthases in Fungi. Front Microbiol 2020; 10:3018. [PMID: 32038517 PMCID: PMC6985275 DOI: 10.3389/fmicb.2019.03018] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 09/17/2019] [Accepted: 12/16/2019] [Indexed: 12/30/2022] Open
Abstract
Type III polyketide synthases (PKSs) produce secondary metabolites with diverse biological activities, including antimicrobials. While they have been extensively studied in plants and bacteria, only a handful of type III PKSs from fungi have been characterized in the last 15 years. The exploitation of fungal type III PKSs to produce novel bioactive compounds requires understanding the diversity of these enzymes, as well as of their biosynthetic pathways. Here, phylogenetic and reconciliation analyses of 522 type III PKSs from 1,193 fungal genomes revealed complex evolutionary histories with massive gene duplications and losses, explaining their discontinuous distribution in the fungal tree of life. In addition, horizontal gene transfer events from bacteria to fungi and, to a lesser extent, between fungi, could be inferred. Ancestral gene duplication events have resulted in the divergence of eight phylogenetic clades. In particular, two clades show ancestral linkage and functional co-evolution between a type III PKS gene and a reducing PKS gene. Investigation of the occurrence of protein domains in fungal type III PKS predicted gene clusters highlighted the diversity of biosynthetic pathways, likely reflecting a large chemical landscape. Type III PKS genes are most often located next to genes encoding cytochrome P450s, MFS transporters and transcription factors, defining ancestral core gene clusters. This analysis also allowed predicting gene clusters for the characterized fungal type III PKSs and provides working hypotheses for the elucidation of the full biosynthetic pathways. Altogether, our analyses provide the fundamental knowledge to motivate further characterization and exploitation of fungal type III PKS biosynthetic pathways.
|
40
|
Chen H, Köllner TG, Li G, Wei G, Chen X, Zeng D, Qian Q, Chen F. Combinatorial Evolution of a Terpene Synthase Gene Cluster Explains Terpene Variations in Oryza. Plant Physiology 2020; 182:480-492. [PMID: 31712306 PMCID: PMC6945850 DOI: 10.1104/pp.19.00948] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/01/2019] [Accepted: 10/28/2019] [Indexed: 05/21/2023]
Abstract
Terpenes are specialized metabolites ubiquitously produced by plants via the action of terpene synthases (TPSs). There are enormous variations in the types and amounts of terpenes produced by individual species. To understand the mechanisms responsible for such vast diversity, here we investigated the origin and evolution of a cluster of tandemly arrayed TPS genes in Oryza. In the Oryza species analyzed, TPS genes occur as a three-TPS cluster, a two-TPS cluster, and a single TPS gene in five, one, and one species, respectively. Phylogenetic analysis revealed the origins of the two-TPS and three-TPS clusters and the role of species-specific losses of TPS genes. Within the three-TPS clusters, one orthologous group exhibited conserved catalytic activities. The other two groups, both of which contained pseudogenes and/or nonfunctional genes, exhibited distinct profiles of terpene products. Sequence and structural analyses combined with functional validation identified several amino acids in the active site that are critical for catalytic activity divergence of the three orthologous groups. In the five Oryza species containing the three-TPS cluster, their functional TPS genes showed both conserved and species-specific expression patterns in insect-damaged and untreated plants. Emission patterns of volatile terpenes from each species were largely consistent with the expression of their respective TPS genes and the catalytic activities of the encoded enzymes. This study indicates the importance of combinatorial evolution of TPS genes in determining terpene variations among individual species, which includes gene duplication, retention/loss/degradation of duplicated genes, varying selection pressure, retention/divergence in catalytic activities, and divergence in expression regulation.
Affiliation(s)
- Hao Chen
- Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee 37996
| | - Tobias G Köllner
- Department of Biochemistry, Max Planck Institute for Chemical Ecology, D-07745 Jena, Germany
| | - Guanglin Li
- Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee 37996
| | - Guo Wei
- Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee 37996
| | - Xinlu Chen
- Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee 37996
| | - Dali Zeng
- State Key Lab for Rice Biology, China National Rice Research Institute, Hangzhou 310006, China
| | - Qian Qian
- State Key Lab for Rice Biology, China National Rice Research Institute, Hangzhou 310006, China
| | - Feng Chen
- Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee 37996
| |
|
41
|
Cambridge SB. Hypothesis: protein and RNA attributes are continuously optimized over time. BMC Genomics 2019; 20:1012. [PMID: 31870287 PMCID: PMC6929361 DOI: 10.1186/s12864-019-6371-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/16/2019] [Accepted: 12/05/2019] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Little is known about why proteins and RNAs exhibit half-lives varying over several orders of magnitude. Despite many efforts, a conclusive link between half-lives and gene function could not be established, suggesting that other determinants may influence these molecular attributes. RESULTS Here, I find that with increasing gene age there is a gradual and significant increase of protein and RNA half-lives, protein structure, and other molecular attributes that tend to affect protein abundance. These observations are accommodated in a hypothesis which posits that new genes at ‘birth’ are not optimized and thus their products exhibit low half-lives and less structure, but continuous mutagenesis eventually improves these attributes. Thus, the protein and RNA products of the oldest genes obtained their high degrees of stability and structure only after billions of years, while the products of younger genes had less time to be optimized and are therefore less stable and structured. Because more stable proteins with lower turnover require less transcription to maintain the same level of abundance, reduced transcription-associated mutagenesis (TAM) would fix the changes by increasing gene conservation. CONCLUSIONS Consequently, the currently observed diversity of molecular attributes is a snapshot of gene products being at different stages along their temporal path of optimization.
Affiliation(s)
- Sidney B Cambridge
- Department of Functional Neuroanatomy, Heidelberg University, Heidelberg, Germany.
| |
|
42
|
Yin H, Li M, Xia L, He C, Zhang Z. Computational determination of gene age and characterization of evolutionary dynamics in human. Brief Bioinform 2019; 20:2141-2149. [PMID: 30184145 DOI: 10.1093/bib/bby074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/14/2018] [Revised: 08/01/2018] [Accepted: 08/02/2018] [Indexed: 12/23/2022] Open
Abstract
Genes originate at different evolutionary time scales and possess different ages, accordingly presenting diverse functional characteristics and reflecting distinct adaptive evolutionary innovations. In the past decades, progress has been made in gene age identification by a variety of methods that are principally based on comparative genomics. Here we summarize methods for computational determination of gene age and evaluate the effectiveness of different computational methods for age identification. Our results show that improved age determination can be achieved by combining homolog clustering with phylogeny inference, which enables more accurate age identification in human genes. Accordingly, we characterize evolutionary dynamics of human genes based on an extremely long evolutionary time scale spanning ~4,000 million years from archaea/bacteria to human, revealing that young genes are clustered on certain chromosomes and that Mendelian disease genes (including monogenic disease and polygenic disease genes) and cancer genes exhibit divergent evolutionary origins. Taken together, deciphering genes' ages as well as their evolutionary dynamics is of fundamental significance in unveiling the underlying mechanisms during evolution and better understanding how young or new genes become indispensable integrants coupled with novel phenotypes and biological diversity.
Affiliation(s)
- Hongyan Yin
- Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, Institute of Tropical Agriculture and Forestry, Hainan University, China
| | - Mengwei Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Lin Xia
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Chaozu He
- Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, Institute of Tropical Agriculture and Forestry, Hainan University, China
| | - Zhang Zhang
- BIG Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| |
|
43
|
Mustafin ZS, Zamyatin VI, Konstantinov DK, Doroshkov AV, Lashin SA, Afonnikov DA. Phylostratigraphic Analysis Shows the Earliest Origination of the Abiotic Stress Associated Genes in A. thaliana. Genes (Basel) 2019; 10:genes10120963. [PMID: 31766757 PMCID: PMC6947294 DOI: 10.3390/genes10120963] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/28/2019] [Revised: 11/16/2019] [Accepted: 11/18/2019] [Indexed: 12/27/2022] Open
Abstract
Plants constantly contend with stress factors such as high or low temperature, drought, soil salinity and flooding. Plants have evolved a set of stress response mechanisms, which involve physiological and biochemical changes that result in adaptive or morphological changes. At a molecular level, stress response in plants is performed by genetic networks, which also undergo changes in the process of evolution. The study of the network structure and evolution may highlight mechanisms of plant adaptation to adverse conditions, as well as their response to stresses, and help in the discovery and functional characterization of stress-related genes. We performed an analysis of Arabidopsis thaliana genes associated with several types of abiotic stresses (heat, cold, water-related, light, osmotic, salt, and oxidative) at the network level using a phylostratigraphic approach. Our results show that a substantial fraction of genes associated with various types of abiotic stress is of ancient origin and evolves under strong purifying selection. The interaction networks of genes associated with stress response have a modular structure with a regulatory component being one of the largest for five of seven stress types. We demonstrated a positive relationship between the number of interactions of a gene in the stress gene network and its age. Moreover, genes of the same age tend to be connected in stress gene networks. We also demonstrated that old stress-related genes usually participate in the response to various types of stress and are involved in numerous biological processes unrelated to stress. Our results demonstrate that the stress response genes represent one of the ancient and fundamental molecular systems in plants.
Affiliation(s)
- Zakhar S. Mustafin
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia; (Z.S.M.); (V.I.Z.); (D.K.K.); (A.V.D.)
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
| | - Vladimir I. Zamyatin
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia; (Z.S.M.); (V.I.Z.); (D.K.K.); (A.V.D.)
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
| | - Dmitrii K. Konstantinov
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia; (Z.S.M.); (V.I.Z.); (D.K.K.); (A.V.D.)
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
| | - Aleksej V. Doroshkov
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia; (Z.S.M.); (V.I.Z.); (D.K.K.); (A.V.D.)
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
| | - Sergey A. Lashin
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia; (Z.S.M.); (V.I.Z.); (D.K.K.); (A.V.D.)
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Correspondence: (S.A.L.); (D.A.A.); Tel.: +7-383-363-49-63 (D.A.A.)
| | - Dmitry A. Afonnikov
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia; (Z.S.M.); (V.I.Z.); (D.K.K.); (A.V.D.)
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Correspondence: (S.A.L.); (D.A.A.); Tel.: +7-383-363-49-63 (D.A.A.)
| |
|
44
|
Song H, Guo Z, Hu X, Qian L, Miao F, Zhang X, Chen J. Evolutionary balance between LRR domain loss and young NBS-LRR genes production governs disease resistance in Arachis hypogaea cv. Tifrunner. BMC Genomics 2019; 20:844. [PMID: 31722670 PMCID: PMC6852974 DOI: 10.1186/s12864-019-6212-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/07/2019] [Accepted: 10/22/2019] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Cultivated peanut (Arachis hypogaea L.) is an important oil and protein crop, but it has low disease resistance; therefore, it is important to reveal the number, sequence features, function, and evolution of genes that confer resistance. Nucleotide-binding site-leucine-rich repeats (NBS-LRRs) are resistance genes that are involved in response to various pathogens. RESULTS We identified 713 full-length NBS-LRRs in A. hypogaea cv. Tifrunner. Genetic exchange events occurred on NBS-LRRs in A. hypogaea cv. Tifrunner, which were detected in the same subgenomes and also found in different subgenomes. Relaxed selection acted on NBS-LRR proteins and LRR domains in A. hypogaea cv. Tifrunner. Using quantitative trait loci (QTL), we found that NBS-LRRs were involved in response to late leaf spot, tomato spotted wilt virus, and bacterial wilt in A. duranensis (2 NBS-LRRs), A. ipaensis (39 NBS-LRRs), and A. hypogaea cv. Tifrunner (113 NBS-LRRs). In A. hypogaea cv. Tifrunner, 113 NBS-LRRs were classified as 75 young and 38 old NBS-LRRs, indicating that young NBS-LRRs were involved in response to disease after tetraploidization. However, compared to A. duranensis and A. ipaensis, fewer LRR domains were found in A. hypogaea cv. Tifrunner NBS-LRR proteins, partly explaining the lower disease resistance of the cultivated peanut. CONCLUSIONS Although relaxed selection acted on NBS-LRR proteins and LRR domains, LRR domains were preferentially lost in A. hypogaea cv. Tifrunner compared to A. duranensis and A. ipaensis. The QTL results suggested that young NBS-LRRs were important for resistance against diseases in A. hypogaea cv. Tifrunner. Our results provide insight into the greater susceptibility of A. hypogaea cv. Tifrunner to disease compared to A. duranensis and A. ipaensis.
Affiliation(s)
- Hui Song
- Grassland Agri-husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China.
| | - Zhonglong Guo
- State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, School of Life Sciences and School of Advanced Agricultural Sciences, Peking University, Beijing, China
| | - Xiaohui Hu
- Shandong Peanut Research Institute, Qingdao, China
| | - Lang Qian
- Dalian Academy of Agricultural Sciences, Dalian, China
| | - Fuhong Miao
- Grassland Agri-husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
| | - Xiaojun Zhang
- College of Agronomy, Qingdao Agricultural University, Qingdao, China
| | - Jing Chen
- Shandong Peanut Research Institute, Qingdao, China.
| |
|
45
|
Willemsen A, Félez-Sánchez M, Bravo IG. Genome Plasticity in Papillomaviruses and De Novo Emergence of E5 Oncogenes. Genome Biol Evol 2019; 11:1602-1617. [PMID: 31076746 PMCID: PMC6557308 DOI: 10.1093/gbe/evz095] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Accepted: 04/29/2019] [Indexed: 02/06/2023] Open
Abstract
The clinical presentations of papillomavirus (PV) infections come in many different flavors. While most PVs are part of a healthy skin microbiota and are not associated with physical lesions, other PVs cause benign lesions, and only a handful of PVs are associated with malignant transformations linked to the specific activities of the E5, E6, and E7 oncogenes. The functions and origin of E5 remain to be elucidated. These E5 open reading frames (ORFs) are present in the genomes of a few polyphyletic PV lineages, located between the early and the late viral gene cassettes. We have computationally assessed whether these E5 ORFs have a common origin and whether they display the properties of a genuine gene. Our results suggest that during the evolution of Papillomaviridae, at least four events led to the presence of a long noncoding DNA stretch between the E2 and the L2 genes. In three of these events, the novel regions evolved coding capacity, becoming the extant E5 ORFs. We then focused on the evolution of the E5 genes in AlphaPVs infecting primates. The sharp match between the type of E5 protein encoded in AlphaPVs and the infection phenotype (cutaneous warts, genital warts, or anogenital cancers) supports the role of E5 in the differential oncogenic potential of these PVs. In our analyses, the best-supported scenario is that the five types of extant E5 proteins within the AlphaPV genomes may not have a common ancestor. However, the chemical similarities between E5s regarding amino acid composition prevent us from confidently rejecting the model of a common origin. Our evolutionary interpretation is that an originally noncoding region entered the genome of the ancestral AlphaPVs. This genetic novelty allowed the exploration of novel transcription potential, triggering an adaptive radiation that yielded three main viral lineages encoding different E5 proteins and displaying distinct infection phenotypes.
Overall, our results provide an evolutionary scenario for the de novo emergence of viral genes and illustrate the impact of such genotypic novelty in the phenotypic diversity of the viral infections.
Affiliation(s)
- Anouk Willemsen
- Laboratory MIVEGEC (UMR CNRS IRD Uni Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
| | - Marta Félez-Sánchez
- Infections and Cancer Laboratory, Catalan Institute of Oncology (ICO), Barcelona, Spain
| | - Ignacio G Bravo
- Laboratory MIVEGEC (UMR CNRS IRD Uni Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
| |
|
46
|
Dyachkova MS, Chekalin EV, Danilenko VN. Positive Selection in Bifidobacterium Genes Drives Species-Specific Host-Bacteria Communication. Front Microbiol 2019; 10:2374. [PMID: 31681231 PMCID: PMC6803598 DOI: 10.3389/fmicb.2019.02374] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/10/2019] [Accepted: 09/30/2019] [Indexed: 12/15/2022] Open
Abstract
Bifidobacteria are commensal microorganisms that inhabit a wide range of hosts, including insects, birds and mammals. The mechanisms responsible for the adaptation of bifidobacteria to various hosts during the evolutionary process remain poorly understood. Previously, we reported that the species-specific PFNA gene cluster is present in the genomes of various species of the Bifidobacterium genus. The cluster contains signal transduction and adhesion genes that are presumably involved in the communication between bifidobacteria and their hosts. The genes in the PFNA cluster show high sequence divergence between bifidobacterial species, which may be indicative of rapid evolution that drives species-specific adaptation to the host organism. We used the maximum likelihood approach to detect positive selection in the PFNA genes. We tested for both pervasive and episodic positive selection to identify codons that experienced adaptive evolution in all and individual branches of the Bifidobacterium phylogenetic tree, respectively. Our results provide evidence that episodic positive selection has played an important role in the divergence process and molecular evolution of sequences of the species-specific PFNA genes in most bifidobacterial species. Moreover, we found the signatures of pervasive positive selection in the molecular evolution of the tgm gene in all branches of the Bifidobacterium phylogenetic tree. These results are consistent with the suggested role of PFNA gene cluster in the process of specific adaptation of bifidobacterial species to various hosts.
Affiliation(s)
- Marina S Dyachkova: Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Evgeny V Chekalin: Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Valery N Danilenko: Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
|
47
|
Chen H, Zhang Z, Jiang S, Li R, Li W, Zhao C, Hong H, Huang X, Li H, Bo X. New insights on human essential genes based on integrated analysis and the construction of the HEGIAP web-based platform. Brief Bioinform 2019; 21:1397-1410. [PMID: 31504171 PMCID: PMC7373178 DOI: 10.1093/bib/bbz072] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/25/2019] [Revised: 05/13/2019] [Accepted: 05/24/2019] [Indexed: 12/13/2022] Open
Abstract
Essential genes are those whose loss of function compromises organism viability or results in profound loss of fitness. Recent gene-editing technologies have provided new opportunities to characterize essential genes. Here, we present an integrated analysis that comprehensively and systematically elucidates the genetic and regulatory characteristics of human essential genes. First, we found that essential genes act as ‘hubs’ in protein–protein interaction networks, chromatin structure and epigenetic modification. Second, essential genes represent conserved biological processes across species, although gene essentiality changes differently among species. Third, essential genes are important for cell development due to their discriminate transcription activity in embryo development and oncogenesis. In addition, we developed an interactive web server, the Human Essential Genes Interactive Analysis Platform (http://sysomics.com/HEGIAP/), which integrates abundant analytical tools to enable global, multidimensional interpretation of gene essentiality. Our study provides new insights that improve the understanding of human essential genes.
Affiliation(s)
- Hebing Chen: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Zhuo Zhang: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Shuai Jiang: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Ruijiang Li: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Wanying Li: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Chenghui Zhao: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Hao Hong: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xin Huang: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Hao Li: Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xiaochen Bo: Beijing Institute of Radiation Medicine, Beijing 100850, China
|
48
|
Song H, Sun J, Yang G. The characteristic of Arachis duranensis-specific genes and their potential function. Gene 2019; 705:60-66. [PMID: 31009681 DOI: 10.1016/j.gene.2019.04.052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/05/2018] [Revised: 03/12/2019] [Accepted: 04/18/2019] [Indexed: 11/17/2022]
Abstract
Arachis species produce flowers aerially, which then grow into the ground and develop into fruits, a feature that is unique to the genus. We hypothesized that Arachis species evolved genes specifically involved in the control of aerial flowers and the formation of underground fruits. Arachis duranensis is more resistant to biotic and abiotic stressors. Here, we compared different legume species and identified Arachis duranensis-specific genes. We analyzed gene expression patterns, base substitution patterns and sequence features between genes that are conserved across legume plants and A. duranensis-specific genes. Furthermore, we tested the role of A. duranensis-specific genes during seed development, response to nematode Meloidogyne arenaria infection and drought stress. We found that A. duranensis-specific genes had characteristics of young genes. The gene expression level and breadth were lower in the A. duranensis-specific genes compared to conserved genes. The A. duranensis-specific genes had higher codon usage bias than conserved genes, and the polypeptide length and GC content at the three codon sites were lower compared to conserved genes. Of the A. duranensis-specific genes, single-copy and duplicated genes had different features. The RNA-seq results showed that A. duranensis-specific genes were involved in seed development, as well as response to nematode infection and drought stress. In addition, we detected asymmetric functions in A. duranensis-specific duplicated genes in response to nematode infection and drought stress.
Affiliation(s)
- Hui Song: Grassland Agri-husbandry Research Center, Qingdao Agricultural University, Qingdao, China
- Juan Sun: Grassland Agri-husbandry Research Center, Qingdao Agricultural University, Qingdao, China
- Guofeng Yang: Grassland Agri-husbandry Research Center, Qingdao Agricultural University, Qingdao, China
|
49
|
Mello B, Schrago CG. The Estimated Pacemaker for Great Apes Supports the Hominoid Slowdown Hypothesis. Evol Bioinform Online 2019; 15:1176934319855988. [PMID: 31223232 PMCID: PMC6566470 DOI: 10.1177/1176934319855988] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/26/2019] [Accepted: 05/17/2019] [Indexed: 11/16/2022] Open
Abstract
The recent surge of genomic data has prompted the investigation of substitution rate variation across the genome, as well as among lineages. Evolutionary trees inferred from distinct genomic regions may display branch lengths that differ between loci by simple proportionality constants, indicating that rate variation follows a pacemaker model, which may be attributed to lineage effects. Analyses of genes from diverse biological clades produced contrasting results, supporting either this model or alternative scenarios where multiple pacemakers exist. So far, an evaluation of the pacemaker hypothesis for all great apes has never been carried out. In this work, we tested whether the evolutionary rates of hominids conform to pacemakers, which were inferred accounting for gene tree/species tree discordance. For higher precision, substitution rates in branches were estimated with a calibration-free approach, the relative rate framework. A predominant evolutionary trend in great apes was evidenced by the recovery of a large pacemaker, encompassing most hominid genomic regions. In addition, the majority of genes followed a pace of evolution that was closely related to the strict molecular clock. However, slight rate decreases were recovered in the internal branches leading to humans, corroborating the hominoid slowdown hypothesis. Our findings suggest that in great apes, life history traits were the major drivers of substitution rate variation across the genome.
Affiliation(s)
- Beatriz Mello: Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Carlos G Schrago: Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
|
50
|
Durand É, Gagnon-Arsenault I, Hallin J, Hatin I, Dubé AK, Nielly-Thibault L, Namy O, Landry CR. Turnover of ribosome-associated transcripts from de novo ORFs produces gene-like characteristics available for de novo gene emergence in wild yeast populations. Genome Res 2019; 29:932-943. [PMID: 31152050 PMCID: PMC6581059 DOI: 10.1101/gr.239822.118] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/25/2018] [Accepted: 05/13/2019] [Indexed: 12/17/2022]
Abstract
Little is known about the rate of emergence of de novo genes, what their initial properties are, and how they spread in populations. We examined wild yeast populations (Saccharomyces paradoxus) to characterize the diversity and turnover of intergenic ORFs over short evolutionary timescales. We find that hundreds of intergenic ORFs show translation signatures similar to canonical genes, and we experimentally confirmed the translation of many of these ORFs in laboratory conditions using a reporter assay. Compared with canonical genes, intergenic ORFs have lower translation efficiency, which could imply a lack of optimization for translation or a mechanism to reduce their production cost. Translated intergenic ORFs also tend to have sequence properties that are generally close to those of random intergenic sequences. However, some of the very recent translated intergenic ORFs, which appeared <110 kya, already show gene-like characteristics, suggesting that the raw material for functional innovations could appear over short evolutionary timescales.
Affiliation(s)
- Éléonore Durand: Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada
- Isabelle Gagnon-Arsenault: Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Johan Hallin: Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Isabelle Hatin: Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, 91190 Gif sur Yvette, France
- Alexandre K Dubé: Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Lou Nielly-Thibault: Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada
- Olivier Namy: Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, 91190 Gif sur Yvette, France
- Christian R Landry: Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
|