1
Gunasekera RS, Raja KKB, Hewapathirana S, Tundrea E, Gunasekera V, Galbadage T, Nelson PA. ORFanID: A web-based search engine for the discovery and identification of orphan and taxonomically restricted genes. PLoS One 2023; 18:e0291260. [PMID: 37879070 PMCID: PMC10599687 DOI: 10.1371/journal.pone.0291260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 08/24/2023] [Indexed: 10/27/2023]
Abstract
The many genomes sequenced to date have revealed that a noteworthy percentage of genes in a given taxon of organisms in the phylogenetic tree of life do not have orthologous sequences in other taxa. These sequences are commonly referred to as "orphans" or "ORFans" if found as single occurrences in a single species or as "taxonomically restricted genes" (TRGs) when found at higher taxonomic levels. Quantitative and collective studies of these genes are necessary for understanding their biological origins. However, the current software for identifying orphan genes is limited in its functionality and database search range and is algorithmically very complex. Thus, researchers studying orphan genes must harvest their data from many disparate sources. ORFanID is a graphical web-based search engine that facilitates the efficient identification of both orphan genes and TRGs at all taxonomic levels, from DNA or amino acid sequences in the NCBI database cluster and other large bioinformatics repositories. The software allows users to identify genes that are unique to any taxonomic rank, from species to domain, using NCBI systematic classifiers. It provides control over NCBI database search parameters, and the results are presented in a spreadsheet as well as a graphical display. The tables in the software are sortable, and results can be filtered using the fuzzy search functionality. The visual presentation can be expanded and collapsed along the branches of the taxonomic tree. Example results from searches on five species and gene expression data from specific orphan genes are provided in the Supplementary Information.
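The idea of a gene being "unique to a taxonomic rank" can be illustrated with a small sketch. This is not ORFanID's implementation; it only assumes that each homology hit for a query gene has already been annotated with a lineage, and it asks for the most specific rank at which every hit still shares the query's taxon. The rank list, the function name `restriction_rank`, and the dictionary layout are all hypothetical choices made for this example.

```python
# Ranks ordered from most specific to most general. The names are an
# assumption for this sketch; NCBI Taxonomy has more ranks than these.
RANKS = ["species", "genus", "family", "order",
         "class", "phylum", "kingdom", "domain"]


def restriction_rank(query_lineage, hit_lineages):
    """Return the most specific rank at which all hits share the query's taxon.

    query_lineage: dict mapping rank name -> taxon name for the query organism.
    hit_lineages:  list of such dicts, one per homology hit.

    A result of "species" means the gene looks like a species-level orphan;
    higher ranks correspond to taxonomically restricted genes. None means
    the hits are not confined to any of the listed ranks.
    An empty hit list is treated as a species-level orphan.
    """
    for rank in RANKS:
        taxon = query_lineage.get(rank)
        if taxon is None:
            continue  # rank missing from the query lineage, try the next one
        if all(hit.get(rank) == taxon for hit in hit_lineages):
            return rank
    return None


if __name__ == "__main__":
    ecoli = {"species": "Escherichia coli", "genus": "Escherichia",
             "family": "Enterobacteriaceae", "domain": "Bacteria"}
    salmonella = {"species": "Salmonella enterica", "genus": "Salmonella",
                  "family": "Enterobacteriaceae", "domain": "Bacteria"}
    print(restriction_rank(ecoli, [ecoli, salmonella]))
```

In practice the lineages would come from a taxonomy lookup on each hit's organism, and a significance threshold on the homology search decides what counts as a hit; both steps are outside this sketch.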
Affiliation(s)
- Richard S. Gunasekera
  - Department of Chemistry, Physics and Engineering, School of Science, Technology & Health, Biola University, La Mirada, CA, United States of America
- Komal K. B. Raja
  - Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, United States of America
- Suresh Hewapathirana
  - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Emanuel Tundrea
  - Griffiths School of Management and IT, Emanuel University of Oradea, Oradea, Romania
- Vinodh Gunasekera
  - Bioinformatics, Chesalon USA, Inc., Houston, TX, United States of America
- Thushara Galbadage
  - Department of Kinesiology and Public Health, School of Science, Technology & Health, Biola University, La Mirada, CA, United States of America
- Paul A. Nelson
  - Biola University, La Mirada, CA, United States of America
2
Fakhar AZ, Liu J, Pajerowska-Mukhtar KM, Mukhtar MS. The Lost and Found: Unraveling the Functions of Orphan Genes. J Dev Biol 2023; 11:27. [PMID: 37367481 PMCID: PMC10299390 DOI: 10.3390/jdb11020027] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/08/2023] [Revised: 05/19/2023] [Accepted: 05/26/2023] [Indexed: 06/28/2023]
Abstract
Orphan Genes (OGs) are a mysterious class of genes that have recently gained significant attention. Despite lacking a clear evolutionary history, they are found in nearly all living organisms, from bacteria to humans, and they play important roles in diverse biological processes. The discovery of OGs was first made through comparative genomics followed by the identification of unique genes across different species. OGs tend to be more prevalent in species with larger genomes, such as plants and animals, and their evolutionary origins remain unclear but potentially arise from gene duplication, horizontal gene transfer (HGT), or de novo origination. Although their precise function is not well understood, OGs have been implicated in crucial biological processes such as development, metabolism, and stress responses. To better understand their significance, researchers are using a variety of approaches, including transcriptomics, functional genomics, and molecular biology. This review offers a comprehensive overview of the current knowledge of OGs in all domains of life, highlighting the possible role of dark transcriptomics in their evolution. More research is needed to fully comprehend the role of OGs in biology and their impact on various biological processes.
Affiliation(s)
- M. Shahid Mukhtar
  - Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL 35294, USA
3
Nevers Y, Glover NM, Dessimoz C, Lecompte O. Protein length distribution is remarkably uniform across the tree of life. Genome Biol 2023; 24:135. [PMID: 37291671 PMCID: PMC10251718 DOI: 10.1186/s13059-023-02973-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/03/2022] [Accepted: 05/16/2023] [Indexed: 06/10/2023]
Abstract
BACKGROUND: In every living species, the function of a protein depends on its organization of structural domains, and the length of a protein is a direct reflection of this. Because every species evolved under different evolutionary pressures, the protein length distribution, much like other genomic features, is expected to vary across species but has so far been scarcely studied. RESULTS: Here we evaluate this diversity by comparing protein length distribution across 2326 species (1688 bacteria, 153 archaea, and 485 eukaryotes). We find that proteins tend to be on average slightly longer in eukaryotes than in bacteria or archaea, but that the variation of length distribution across species is low, especially compared to the variation of other genomic features (genome size, number of proteins, gene length, GC content, isoelectric points of proteins). Moreover, most cases of atypical protein length distribution appear to be due to artifactual gene annotation, suggesting the actual variation of protein length distribution across species is even smaller. CONCLUSIONS: These results open the way for developing a genome annotation quality metric based on protein length distribution to complement conventional quality measures. Overall, our findings show that protein length distribution between living species is more uniform than previously thought. Furthermore, we also provide evidence for a universal selection on protein length, yet its mechanism and fitness effect remain intriguing open questions.
Affiliation(s)
- Yannis Nevers
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Natasha M Glover
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland
  - Department of Computer Science, University College London, London, UK
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Odile Lecompte
  - Department of Computer Science, Centre de Recherche en Biomédecine de Strasbourg, ICube, UMR 7357, University of Strasbourg, CNRS, Strasbourg, France
4
Poretti M, Praz CR, Sotiropoulos AG, Wicker T. A survey of lineage-specific genes in Triticeae reveals de novo gene evolution from genomic raw material. PLANT DIRECT 2023; 7:e484. [PMID: 36937792 PMCID: PMC10020141 DOI: 10.1002/pld3.484] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/27/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 06/18/2023]
Abstract
Diploid plant genomes typically contain ~35,000 genes, almost all belonging to highly conserved gene families. Only a small fraction are lineage-specific, that is, found in only one or a few closely related species. Little is known about how genes arise de novo in plant genomes and how often this occurs; however, they are believed to be important for plant diversification and adaptation. We developed a pipeline to identify lineage-specific genes in Triticeae, using newly available genome assemblies of wheat, barley, and rye. Applying a set of stringent criteria, we identified 5942 candidate Triticeae-specific genes (TSGs), of which 2337 were validated as protein-coding genes in wheat. Differential gene expression analyses revealed that stress-induced wheat TSGs are strongly enriched in putative secreted proteins. Some were previously described to be involved in Triticeae non-host resistance and cold response. Additionally, we show that 1079 TSGs have sequence homology to transposable elements (TEs), ~68% of them deriving from regulatory non-coding regions of Gypsy retrotransposons. Most importantly, we demonstrate that these TSGs are enriched in transmembrane (TM) domains and are among the most highly expressed wheat genes overall. To summarize, we conclude that de novo gene formation is relatively rare and that Triticeae probably possess ~779 lineage-specific genes per haploid genome. TSGs, which respond to pathogen and environmental stresses, may be interesting candidates for future targeted resistance breeding in Triticeae. Finally, we propose that non-coding regions of TEs might provide important genetic raw material for the functional innovation of TM domains and the evolution of novel secreted proteins.
Affiliation(s)
- Manuel Poretti
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
- Coraline R. Praz
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM)–Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Madrid, Spain
- Thomas Wicker
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
5
Schossig P, Coskun E, Arsenic R, Horst D, Sehouli J, Bergmann E, Andresen N, Sigler C, Busse A, Keller U, Ochsenreither S. Target Selection for T-Cell Therapy in Epithelial Ovarian Cancer: Systematic Prioritization of Self-Antigens. Int J Mol Sci 2023; 24:ijms24032292. [PMID: 36768616 PMCID: PMC9916968 DOI: 10.3390/ijms24032292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 01/13/2023] [Accepted: 01/17/2023] [Indexed: 01/26/2023]
Abstract
Adoptive T cell-receptor therapy (ACT) could represent a promising approach in the targeted treatment of epithelial ovarian cancer (EOC). However, the identification of suitable tumor-associated antigens (TAAs) as targets is challenging. We identified and prioritized TAAs for ACT and other immunotherapeutic interventions in EOC. A comprehensive list of pre-described TAAs was created and candidates were prioritized, using predefined weighted criteria. Highly ranked TAAs were immunohistochemically stained in a tissue microarray of 58 EOC samples to identify associations of TAA expression with grade, stage, response to platinum, and prognosis. Preselection based on expression data resulted in 38 TAAs, which were prioritized. Along with already published Cyclin A1, the TAAs KIF20A, CT45, and LY6K emerged as most promising targets, with high expression in EOC samples and several identified peptides in ligandome analysis. Expression of these TAAs showed prognostic relevance independent of molecular subtypes. By using a systematic vetting algorithm, we identified KIF20A, CT45, and LY6K to be promising candidates for immunotherapy in EOC. Results are supported by IHC and HLA-ligandome data. The described method might be helpful for the prioritization of TAAs in other tumor entities.
Affiliation(s)
- Paul Schossig
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
- Ebru Coskun
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Ruza Arsenic
  - Department of Pathology, Universitätsklinikum Heidelberg, Heidelberg University, 69120 Heidelberg, Germany
- David Horst
  - Institute of Pathology, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
- Jalid Sehouli
  - Department of Gynecology, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
  - Tumorbank Ovarian Cancer Network, 13353 Berlin, Germany
- Eva Bergmann
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
- Nadine Andresen
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
- Christian Sigler
  - Charité Comprehensive Cancer Center, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
- Antonia Busse
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Max-Delbrück-Center for Molecular Medicine, 13125 Berlin, Germany
- Ulrich Keller
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Max-Delbrück-Center for Molecular Medicine, 13125 Berlin, Germany
- Sebastian Ochsenreither
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Charité Comprehensive Cancer Center, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
6
Multiplex PCR Identification of Aspergillus cristatus and Aspergillus chevalieri in Liupao Tea Based on Orphan Genes. Foods 2022; 11:foods11152217. [PMID: 35892804 PMCID: PMC9332452 DOI: 10.3390/foods11152217] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/07/2022] [Revised: 07/19/2022] [Accepted: 07/22/2022] [Indexed: 11/21/2022]
Abstract
“Golden flower” fungi in dark tea are beneficial to human health. A rapid method for identifying “golden flower” fungi can verify the quality of dark tea products and ensure food safety. In this study, six strains were isolated from Liupao tea and identified as A. cristatus, A. chevalieri, and A. pseudoglaucus. A. pseudoglaucus was reported as a Liupao tea “golden flower” fungus for the first time. The ITS and BenA sequences of A. cristatus and A. chevalieri were found to be highly conserved, so it is difficult to clearly distinguish these closely related species by ITS sequencing. To rapidly identify species, multiplex PCR species-specific primers were designed based on orphan genes screened by comparative genomics analysis. Multiplex PCR results showed that orphan genes were specific and effective for the identification of A. cristatus and A. chevalieri isolated from Liupao tea and Fu brick tea. We confirmed that orphan genes can be used for identification of closely related Aspergillus species. Validation showed that the method is convenient, rapid, robust, sequencing-free, and economical. This promising method will be greatly beneficial to the dark tea processing industry and consumers.
7
Jiang M, Li X, Dong X, Zu Y, Zhan Z, Piao Z, Lang H. Research Advances and Prospects of Orphan Genes in Plants. FRONTIERS IN PLANT SCIENCE 2022; 13:947129. [PMID: 35874010 PMCID: PMC9305701 DOI: 10.3389/fpls.2022.947129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 06/23/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs) are defined as genes having no sequence similarity with genes present in other lineages. OGs are regarded as playing a key role in the development of lineage-specific adaptations and can also serve as a constant source of evolutionary novelty. These genes have often been found to be related to various stress responses, species-specific traits, and special expression regulation, and they also participate in primary substance metabolism. Advances in sequencing tools and genome analysis methods have made the identification and characterization of OGs comparatively easier, and significant progress has been made in the study of OG functions in plants. We review recent advances in the fast-evolving characteristics, expression modulation, and functional analysis of OGs with a focus on their role in plant biology. We also emphasize current challenges and adoptable strategies and discuss possible future directions for the functional study of OGs.
Affiliation(s)
- Mingliang Jiang
  - School of Agriculture, Jilin Agricultural Science and Technology College, Jilin, China
- Xiaonan Li
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Xiangshu Dong
  - School of Agriculture, Yunnan University, Kunming, China
- Ye Zu
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Zongxiang Zhan
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Zhongyun Piao
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Hong Lang
  - School of Agriculture, Jilin Agricultural Science and Technology College, Jilin, China
8
Prabh N, Rödelsperger C. Multiple Pristionchus pacificus genomes reveal distinct evolutionary dynamics between de novo candidates and duplicated genes. Genome Res 2022; 32:1315-1327. [PMID: 35618417 PMCID: PMC9341508 DOI: 10.1101/gr.276431.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 05/20/2022] [Indexed: 01/03/2023]
Abstract
The birth of new genes is a major molecular innovation driving phenotypic diversity across all domains of life. Although repurposing of existing protein-coding material by duplication is considered the main process of new gene formation, recent studies have discovered thousands of transcriptionally active sequences as a rich source of new genes. However, differential loss rates have to be assumed to reconcile the high birth rates of these incipient de novo genes with the dominance of ancient gene families in individual genomes. Here, we test this rapid turnover hypothesis in the context of the nematode model organism Pristionchus pacificus. We extended the existing species-level phylogenomic framework by sequencing the genomes of six divergent P. pacificus strains. We used these data to study the evolutionary dynamics of different age classes and categories of origin at a population level. Contrasting de novo candidates with new families that arose by duplication and divergence from known genes, we find that de novo candidates are typically shorter, show less expression, and are overrepresented on the sex chromosome. Although the contribution of de novo candidates increases toward young age classes, multiple comparisons within the same age class showed significantly higher attrition in de novo candidates than in known genes. Similarly, young genes remain under weak evolutionary constraints with de novo candidates representing the fastest evolving subcategory. Altogether, this study provides empirical evidence for the rapid turnover hypothesis and highlights the importance of the evolutionary timescale when quantifying the contribution of different mechanisms toward new gene formation.
Affiliation(s)
- Neel Prabh
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, 72076 Tübingen, Germany
- Christian Rödelsperger
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, 72076 Tübingen, Germany
9
Cardoso-Silva CB, Aono AH, Mancini MC, Sforça DA, da Silva CC, Pinto LR, Adams KL, de Souza AP. Taxonomically Restricted Genes Are Associated With Responses to Biotic and Abiotic Stresses in Sugarcane (Saccharum spp.). FRONTIERS IN PLANT SCIENCE 2022; 13:923069. [PMID: 35845637 PMCID: PMC9280035 DOI: 10.3389/fpls.2022.923069] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/18/2022] [Accepted: 06/13/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs) are protein-coding genes that are restricted to particular clades or species and lack homology with genes from other organisms, making their biological functions difficult to predict. OGs can rapidly originate and become functional; consequently, they may support rapid adaptation to environmental changes. Extensive spread of mobile elements and whole-genome duplication occurred in the Saccharum group, which may have contributed to the origin and diversification of OGs in the sugarcane genome. Here, we identified and characterized OGs in sugarcane, examined their expression profiles across tissues and genotypes, and investigated their regulation under varying conditions. We identified 319 OGs in the Saccharum spontaneum genome without detected homology to protein-coding genes in green plants, except those belonging to Saccharinae. Transcriptomic analysis revealed 288 sugarcane OGs with detectable expression levels in at least one tissue or genotype. We observed similar expression patterns of OGs in sugarcane genotypes originating from the closest geographical locations. We also observed tissue-specific expression of some OGs, possibly indicating a complex regulatory process for maintaining diverse functional activity of these genes across sugarcane tissues and genotypes. Sixty-six OGs were differentially expressed under stress conditions, especially cold and osmotic stresses. Gene co-expression network and functional enrichment analyses suggested that sugarcane OGs are involved in several biological mechanisms, including stimulus response and defence mechanisms. These findings provide a valuable genomic resource for sugarcane researchers, especially those interested in selecting stress-responsive genes.
Affiliation(s)
- Cláudio Benício Cardoso-Silva
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Alexandre Hild Aono
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Melina Cristina Mancini
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Danilo Augusto Sforça
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Carla Cristina da Silva
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Agronomy Department, Federal University of Viçosa (UFV), Viçosa, Brazil
- Luciana Rossini Pinto
  - Sugarcane Research Advanced Centre, Agronomic Institute of Campinas (IAC/APTA), Ribeirão Preto, Brazil
- Keith L. Adams
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Anete Pereira de Souza
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
10
Dong XM, Pu XJ, Zhou SZ, Li P, Luo T, Chen ZX, Chen SL, Liu L. Orphan gene PpARDT positively involved in drought tolerance potentially by enhancing ABA response in Physcomitrium (Physcomitrella) patens. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2022; 319:111222. [PMID: 35487672 DOI: 10.1016/j.plantsci.2022.111222] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/10/2021] [Revised: 02/11/2022] [Accepted: 02/14/2022] [Indexed: 05/19/2023]
Abstract
Almost all genomes have orphan genes, the majority of which are not functionally annotated. Growing evidence shows that orphan genes may play important roles in the environmental stress response of Physcomitrium patens. We identified PpARDT (ABA-responsive drought tolerance) as a moss-specific and ABA-responsive orphan gene in P. patens. PpARDT is mainly expressed during the gametophytic stage of the life cycle, and its expression was induced by different abiotic stresses. A PpARDT knockout (Ppardt) mutant showed reduced dehydration-rehydration tolerance, and the phenotype could be rescued by exogenous ABA. Meanwhile, transgenic Arabidopsis lines exhibiting heterologous expression of PpARDT were more sensitive to exogenous ABA than wild-type (Col-0) plants and showed enhanced drought tolerance. These results indicate that PpARDT confers drought tolerance among land plants, potentially by enhancing ABA response. Further, we identified genes encoding abscisic acid receptor PYR/PYL family proteins and ADP-ribosylation factors (Arf) as hub genes associated with the Ppardt phenotype. Given the lineage-specific characteristics of PpARDT, our results provide insights into the roles of orphan genes in shaping lineage-specific adaptation, possibly by recruiting common pre-existing pathway components.
Affiliation(s)
- Xiu-Mei Dong
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
- Xiao-Jun Pu
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
- Shi-Zhao Zhou
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
- Ping Li
  - Key Laboratory for Forest Resources Conservation and Utilization in the Southwest Mountains of China Ministry of Education, Southwest Forestry University, Kunming, 650201, China
- Ting Luo
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
- Ze-Xi Chen
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
- Si-Lin Chen
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Li Liu
  - Key Laboratory for Economic Plants and Biotechnology, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, Hubei University, Wuhan, Hubei, China
11
Claverie JM, Santini S. Validation of predicted anonymous proteins simply using Fisher's exact test. BIOINFORMATICS ADVANCES 2021; 1:vbab034. [PMID: 36700095 PMCID: PMC9710694 DOI: 10.1093/bioadv/vbab034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Revised: 11/03/2021] [Accepted: 11/10/2021] [Indexed: 01/28/2023]
Abstract
Motivation: Genome sequencing has become the primary (and often the sole) experimental method to characterize newly discovered organisms, in particular from the microbial world (bacteria, archaea, viruses). This generates an ever-increasing number of predicted proteins whose existence is uncertain, in particular among those without homologs in model organisms. As a last resort, the selection pressure computed from pairwise alignments of the corresponding open reading frames (ORFs) can be used to validate their existence. However, this approach is error-prone, as it is not usually associated with a significance test. Results: We introduce the use of the straightforward Fisher's exact test as a postprocessing step for the results provided by the popular CODEML sequence comparison software. The respective rates of nucleotide changes at nonsynonymous versus synonymous positions (as determined by CODEML) are turned into entries of a 2 × 2 contingency table, the probability of which is computed under the null hypothesis that they should not behave differently if the ORFs do not encode actual proteins. Using the genome sequences of two recently isolated giant viruses, we show that strong negative selection pressures do not always provide a solid argument in favor of the existence of proteins.
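The contingency-table step described in this abstract can be sketched in a few lines of Python. The abstract does not spell out how the table is laid out, so the layout here is an assumption: rows are nonsynonymous and synonymous sites, columns are substituted and unchanged sites, and the counts are obtained by rounding the products N·dN and S·dS. The helper names `table_from_rates` and `fisher_exact_two_sided` are hypothetical, and the input numbers are made up for illustration rather than taken from CODEML output.

```python
from math import comb


def table_from_rates(n_sites, dn, s_sites, ds):
    """Build assumed 2x2 counts from site numbers and substitution rates.

    Returns (nonsyn_changed, nonsyn_unchanged, syn_changed, syn_unchanged).
    Rounding to integers is a simplification made for this sketch.
    """
    n_changed = round(n_sites * dn)
    s_changed = round(s_sites * ds)
    return (n_changed, round(n_sites) - n_changed,
            s_changed, round(s_sites) - s_changed)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Illustrative values only: 300 nonsynonymous sites with dN = 0.01,
    # 100 synonymous sites with dS = 0.10.
    counts = table_from_rates(300, 0.01, 100, 0.10)
    print(counts, fisher_exact_two_sided(*counts))
```

A small p-value here says only that nonsynonymous and synonymous sites change at different rates; whether that supports an ORF being a real protein-coding gene also depends on there being enough substitutions to test at all, which is the caution the abstract raises.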
Affiliation(s)
- Jean-Michel Claverie
  - Aix-Marseille University, CNRS, IGS (UMR7256), IMM (FR3479), Luminy, Marseille F-13288, France. To whom correspondence should be addressed.
- Sébastien Santini
  - Aix-Marseille University, CNRS, IGS (UMR7256), IMM (FR3479), Luminy, Marseille F-13288, France
12
Li J, Singh U, Arendsee Z, Wurtele ES. Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data. Front Genet 2021; 12:722981. [PMID: 34484307 PMCID: PMC8415361 DOI: 10.3389/fgene.2021.722981] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/09/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022]
Abstract
The "dark transcriptome" can be considered the multitude of sequences that are transcribed but not annotated as genes. We evaluated expression of 6,692 annotated genes and 29,354 unannotated open reading frames (ORFs) in the Saccharomyces cerevisiae genome across diverse environmental, genetic and developmental conditions (3,457 RNA-Seq samples). Over 30% of the highly transcribed ORFs have translation evidence. Phylostratigraphic analysis infers most of these transcribed ORFs would encode species-specific proteins ("orphan-ORFs"); hundreds have mean expression comparable to annotated genes. These data reveal unannotated ORFs most likely to be protein-coding genes. We partitioned a co-expression matrix by Markov Chain Clustering; the resultant clusters contain 2,468 orphan-ORFs. We provide the aggregated RNA-Seq yeast data with extensive metadata as a project in MetaOmGraph (MOG), a tool designed for interactive analysis and visualization. This approach enables reuse of public RNA-Seq data for exploratory discovery, providing a rich context for experimentalists to make novel, experimentally testable hypotheses about candidate genes.
Affiliation(s)
- Jing Li
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
| | - Urminder Singh
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
| | - Zebulun Arendsee
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
| | - Eve Syrkin Wurtele
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
|
13
|
Majic P, Payne JL. Enhancers Facilitate the Birth of De Novo Genes and Gene Integration into Regulatory Networks. Mol Biol Evol 2020; 37:1165-1178. [PMID: 31845961 PMCID: PMC7086177 DOI: 10.1093/molbev/msz300] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/12/2022] Open
Abstract
Regulatory networks control the spatiotemporal gene expression patterns that give rise to and define the individual cell types of multicellular organisms. In eumetazoa, distal regulatory elements called enhancers play a key role in determining the structure of such networks, particularly the wiring diagram of “who regulates whom.” Mutations that affect enhancer activity can therefore rewire regulatory networks, potentially causing adaptive changes in gene expression. Here, we use whole-tissue and single-cell transcriptomic and chromatin accessibility data from mouse to show that enhancers play an additional role in the evolution of regulatory networks: They facilitate network growth by creating transcriptionally active regions of open chromatin that are conducive to de novo gene evolution. Specifically, our comparative transcriptomic analysis with three other mammalian species shows that young, mouse-specific intergenic open reading frames are preferentially located near enhancers, whereas older open reading frames are not. Mouse-specific intergenic open reading frames that are proximal to enhancers are more highly and stably transcribed than those that are not proximal to enhancers or promoters, and they are transcribed in a limited diversity of cellular contexts. Furthermore, we report several instances of mouse-specific intergenic open reading frames proximal to promoters showing evidence of being repurposed enhancers. We also show that open reading frames gradually acquire interactions with enhancers over macroevolutionary timescales, helping integrate genes—those that have arisen de novo or by other means—into existing regulatory networks. Taken together, our results highlight a dual role of enhancers in expanding and rewiring gene regulatory networks.
Affiliation(s)
- Paco Majic
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Corresponding author: E-mail:
|
14
|
Rödelsperger C, Ebbing A, Sharma DR, Okumura M, Sommer RJ, Korswagen HC. Spatial Transcriptomics of Nematodes Identifies Sperm Cells as a Source of Genomic Novelty and Rapid Evolution. Mol Biol Evol 2021; 38:229-243. [PMID: 32785688 PMCID: PMC8480184 DOI: 10.1093/molbev/msaa207] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022] Open
Abstract
Divergence of gene function and expression during development can give rise to phenotypic differences at the level of cells, tissues, organs, and ultimately whole organisms. To gain insights into the evolution of gene expression and novel genes at spatial resolution, we compared the spatially resolved transcriptomes of two distantly related nematodes, Caenorhabditis elegans and Pristionchus pacificus, that diverged 60-90 Ma. The spatial transcriptomes of adult worms show little evidence for strong conservation at the level of single genes. Instead, regional expression is largely driven by recent duplication and emergence of novel genes. Estimation of gene ages across anatomical structures revealed an enrichment of novel genes in sperm-related regions. This provides first evidence in nematodes for the "out of testis" hypothesis that has been previously postulated based on studies in Drosophila and mammals. "Out of testis" genes represent a mix of products of pervasive transcription as well as fast evolving members of ancient gene families. Strikingly, numerous novel genes have known functions during meiosis in Caenorhabditis elegans indicating that even universal processes such as meiosis may be targets of rapid evolution. Our study highlights the importance of novel genes in generating phenotypic diversity and explicitly characterizes gene origination in sperm-related regions. Furthermore, it proposes new functions for previously uncharacterized genes and establishes the spatial transcriptome of Pristionchus pacificus as a catalog for future studies on the evolution of gene expression and function.
Affiliation(s)
- Christian Rödelsperger
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Annabel Ebbing
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, Utrecht, The Netherlands
| | - Devansh Raj Sharma
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Misako Okumura
- Program of Biomedical Science, Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Hiroshima, Japan
| | - Ralf J Sommer
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Hendrik C Korswagen
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, Utrecht, The Netherlands
- Developmental Biology, Department of Biology, Institute of Biodynamics and Biocomplexity, Utrecht University, Utrecht, The Netherlands
|
15
|
Dowling D, Schmitz JF, Bornberg-Bauer E. Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage. Genome Biol Evol 2020; 12:2183-2195. [PMID: 33210146 PMCID: PMC7674706 DOI: 10.1093/gbe/evaa194] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Accepted: 09/12/2020] [Indexed: 12/12/2022] Open
Abstract
In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine-scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity, which have been proposed to play a role in survival of de novo genes, remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.
Affiliation(s)
- Daniel Dowling
- Institute for Evolution and Biodiversity, University of Münster, Germany
| | - Jonathan F Schmitz
- Institute for Evolution and Biodiversity, University of Münster, Germany
|
16
|
Athanasouli M, Witte H, Weiler C, Loschko T, Eberhardt G, Sommer RJ, Rödelsperger C. Comparative genomics and community curation further improve gene annotations in the nematode Pristionchus pacificus. BMC Genomics 2020; 21:708. [PMID: 33045985 PMCID: PMC7552371 DOI: 10.1186/s12864-020-07100-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/03/2020] [Accepted: 09/23/2020] [Indexed: 02/07/2023] Open
Abstract
Background: Nematode model organisms such as Caenorhabditis elegans and Pristionchus pacificus are powerful systems for studying the evolution of gene function at a mechanistic level. However, the identification of P. pacificus orthologs of candidate genes known from C. elegans is complicated by the discrepancy in the quality of gene annotations, a common problem in nematode and invertebrate genomics. Results: Here, we combine comparative genomic screens for suspicious gene models with community-based curation to further improve the quality of gene annotations in P. pacificus. We extend previous curations of one-to-one orthologs to larger gene families and also orphan genes. Cross-species comparisons of protein lengths, screens for atypical domain combinations and species-specific orphan genes resulted in 4311 candidate genes that were subject to community-based curation. Corrections for 2946 gene models were implemented in a new version of the P. pacificus gene annotations. The new set of gene annotations contains 28,896 genes and has a single copy ortholog completeness level of 97.6%. Conclusions: Our work demonstrates the effectiveness of comparative genomic screens to identify suspicious gene models and the scalability of community-based approaches to improve the quality of thousands of gene models. Similar community-based approaches can help to improve the quality of gene annotations in other invertebrate species, including parasitic nematodes.
Affiliation(s)
- Marina Athanasouli
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Hanh Witte
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Christian Weiler
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Tobias Loschko
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Gabi Eberhardt
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Ralf J Sommer
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Christian Rödelsperger
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany.
|
17
|
Chen K, Tian Z, Chen P, He H, Jiang F, Long CA. Genome-wide identification, characterization and expression analysis of lineage-specific genes within Hanseniaspora yeasts. FEMS Microbiol Lett 2020; 367:5837084. [PMID: 32407480 DOI: 10.1093/femsle/fnaa077] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/27/2019] [Accepted: 05/12/2020] [Indexed: 12/13/2022] Open
Abstract
Lineage-specific genes (LSGs) are defined as genes with sequences that are not significantly similar to those in any other lineage. LSGs have been proposed, and sometimes shown, to have significant effects on the evolution of biological function. In this study, two sets of Hanseniaspora spp. LSGs were identified by comparing the sequences of the Kloeckera apiculata genome and of 80 other yeast genomes. This study identified 344 Hanseniaspora-specific genes (HSGs) and 109 genes ('orphan genes') specific to K. apiculata. A total of 3331 K. apiculata genes that showed significant similarity to at least one sequence outside Hanseniaspora were classified as evolutionarily conserved genes. We analyzed their sequence features, functional categories, gene origin, gene structure and gene expression. We also investigated the predicted cellular roles and Gene Ontology categories of the LSGs using functional inference. The patterns of the functions of LSGs do not deviate significantly from the genome-wide average. The results showed that a few LSGs were formed by gene duplication, followed by rapid sequence divergence. Many of the HSGs and orphan genes exhibited altered expression in response to abiotic stress. Studying these LSGs might be helpful for understanding the molecular mechanism of yeast adaptation.
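The three-way split described in this abstract (orphan, lineage-specific, conserved) can be sketched as a simple set comparison. The sketch below assumes that a similarity search has already been run and summarized as, for each gene, the set of species with a significant hit; the function name `classify_gene`, the placeholder species names, and the input format are hypothetical and are not taken from the study.

```python
def classify_gene(hit_species, focal_species, focal_lineage):
    """Classify one gene from the species in which it has significant hits.

    hit_species   -- set of species names with a significant similarity hit
    focal_species -- name of the species the gene comes from
    focal_lineage -- set of species names in the lineage of interest
                     (for example, all sampled members of one genus)
    """
    outside_species = set(hit_species) - {focal_species}
    if not outside_species:
        # No hit anywhere except the focal species itself.
        return "orphan"
    if outside_species <= set(focal_lineage):
        # Hits exist, but only inside the focal lineage.
        return "lineage-specific"
    # At least one hit outside the lineage.
    return "conserved"


# Placeholder names for illustration only.
lineage = {"species_A", "species_B", "species_C"}
hits_per_gene = {
    "gene1": set(),
    "gene2": {"species_A", "species_B"},
    "gene3": {"species_B", "distant_yeast"},
}
labels = {gene: classify_gene(hits, "species_A", lineage)
          for gene, hits in hits_per_gene.items()}
print(labels)
```

In practice the outcome depends strongly on the similarity threshold and on which genomes are included in the comparison, so the labels should be read relative to the chosen search settings.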
Affiliation(s)
- Kai Chen
- School of Biological Engineering and Food, Hubei University of Technology, Wuhan 430068, China
| | - Zhonghuan Tian
- Key Laboratory of Horticultural Plant Biology of the Ministry of Education, National Centre of Citrus Breeding, Huazhong Agricultural University, Wuhan 430070, China
| | - Ping Chen
- Department of Pediatric Hematology, Tongji Hospital Affiliated to Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
| | - Hua He
- School of Landscape Architecture and Horticulture, Wuhan Institute of Bioengineering, Wuhan 430415, China
| | - Fatang Jiang
- School of Biological Engineering and Food, Hubei University of Technology, Wuhan 430068, China
| | - Chao-An Long
- Key Laboratory of Horticultural Plant Biology of the Ministry of Education, National Centre of Citrus Breeding, Huazhong Agricultural University, Wuhan 430070, China
|
18
|
Rödelsperger C, Athanasouli M, Lenuzzi M, Theska T, Sun S, Dardiry M, Wighard S, Hu W, Sharma DR, Han Z. Crowdsourcing and the feasibility of manual gene annotation: A pilot study in the nematode Pristionchus pacificus. Sci Rep 2019; 9:18789. [PMID: 31827189 PMCID: PMC6906410 DOI: 10.1038/s41598-019-55359-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/01/2019] [Accepted: 11/20/2019] [Indexed: 01/15/2023] Open
Abstract
Nematodes such as Caenorhabditis elegans are powerful systems to study basically all aspects of biology. Their species richness together with tremendous genetic knowledge from C. elegans facilitate the evolutionary study of biological functions using reverse genetics. However, the ability to identify orthologs of candidate genes in other species can be hampered by erroneous gene annotations. To improve gene annotation in the nematode model organism Pristionchus pacificus, we performed a genome-wide screen for C. elegans genes with potentially incorrectly annotated P. pacificus orthologs. We initiated a community-based project to manually inspect more than two thousand candidate loci and to propose new gene models based on recently generated Iso-seq and RNA-seq data. In most cases, misannotation of C. elegans orthologs was due to artificially fused gene predictions and completely missing gene models. The community-based curation raised the gene count from 25,517 to 28,036 and increased the single copy ortholog completeness level from 86% to 97%. This pilot study demonstrates how even small-scale crowdsourcing can drastically improve gene annotations. In future, similar approaches can be used for other species, gene sets, and even larger communities thus making manual annotation of large parts of the genome feasible.
Affiliation(s)
- Christian Rödelsperger
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany.
| | - Marina Athanasouli
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Maša Lenuzzi
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Tobias Theska
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Shuai Sun
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Mohannad Dardiry
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Sara Wighard
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Wen Hu
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Devansh Raj Sharma
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
| | - Ziduan Han
- Max Planck Institute for Developmental Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
|
19
|
Rödelsperger C, Prabh N, Sommer RJ. New Gene Origin and Deep Taxon Phylogenomics: Opportunities and Challenges. Trends Genet 2019; 35:914-922. [DOI: 10.1016/j.tig.2019.08.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/27/2019] [Revised: 08/07/2019] [Accepted: 08/29/2019] [Indexed: 01/22/2023]
|
20
|
Perochon A, Kahla A, Vranić M, Jia J, Malla KB, Craze M, Wallington E, Doohan FM. A wheat NAC interacts with an orphan protein and enhances resistance to Fusarium head blight disease. Plant Biotechnology Journal 2019; 17:1892-1904. [PMID: 30821405 PMCID: PMC6737021 DOI: 10.1111/pbi.13105] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 07/27/2018] [Revised: 02/19/2019] [Accepted: 02/21/2019] [Indexed: 05/05/2023]
Abstract
Taxonomically restricted orphan genes play an important role in environmental adaptation, as recently demonstrated by the fact that the Pooideae-specific orphan TaFROG (Triticum aestivum Fusarium Resistance Orphan Gene) enhanced wheat resistance to the economically devastating Fusarium head blight (FHB) disease. As with most orphan genes, little is known about the cellular function of the encoded protein TaFROG, other than that it interacts with the central stress regulator TaSnRK1α. Here, we functionally characterized a wheat (T. aestivum) NAC-like transcription factor, TaNACL-D1, that interacts with TaFROG and investigated its role in FHB using motif analyses, yeast transactivation, protein-protein interaction, gene expression and the disease response of wheat lines overexpressing TaNACL-D1. TaNACL-D1 is a Poaceae-divergent NAC transcription factor that encodes a Triticeae-specific protein C-terminal region with transcriptional activity and a nuclear localisation signal. The TaNACL-D1/TaFROG interaction was detected in yeast and confirmed in planta, within the nucleus. Analysis of multi-protein interactions indicated that TaFROG could simultaneously form distinct protein complexes with TaNACL-D1 and TaSnRK1α in planta. TaNACL-D1 and TaFROG are co-expressed as an early response to both the causal fungal agent of FHB, Fusarium graminearum, and its virulence factor deoxynivalenol (DON). Wheat lines overexpressing TaNACL-D1 were more resistant to FHB disease than wild type plants. Thus, we conclude that the orphan protein TaFROG interacts with TaNACL-D1, a NAC transcription factor that forms part of the disease response evolved within the Triticeae.
Affiliation(s)
- Alexandre Perochon
- UCD School of Biology and Environmental Science and Earth Institute, College of Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Amal Kahla
- UCD School of Biology and Environmental Science and Earth Institute, College of Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Monika Vranić
- UCD School of Biology and Environmental Science and Earth Institute, College of Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Jianguang Jia
- UCD School of Biology and Environmental Science and Earth Institute, College of Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Keshav B. Malla
- UCD School of Biology and Environmental Science and Earth Institute, College of Science, University College Dublin, Belfield, Dublin 4, Ireland
| | | | | | - Fiona M. Doohan
- UCD School of Biology and Environmental Science and Earth Institute, College of Science, University College Dublin, Belfield, Dublin 4, Ireland
|
21
|
Prabh N, Rödelsperger C. De Novo, Divergence, and Mixed Origin Contribute to the Emergence of Orphan Genes in Pristionchus Nematodes. G3 (Bethesda, Md.) 2019; 9:2277-2286. [PMID: 31088903 PMCID: PMC6643871 DOI: 10.1534/g3.119.400326] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/04/2019] [Accepted: 05/11/2019] [Indexed: 12/30/2022]
Abstract
Homology is a fundamental concept in comparative biology. It is extensively used at the sequence level to make phylogenetic hypotheses and functional inferences. Nonetheless, the majority of eukaryotic genomes contain large numbers of orphan genes lacking homologs in other taxa. Generally, the fraction of orphan genes is higher in genomically undersampled clades, and in the absence of closely related genomes any hypothesis about their origin and evolution remains untestable. Previously, we sequenced ten genomes with an underlying ladder-like phylogeny to establish a phylogenomic framework for studying genome evolution in diplogastrid nematodes. Here, we use this deeply sampled data set to understand the processes that generate orphan genes in our focal species Pristionchus pacificus. Based on phylostratigraphic analysis and additional bioinformatic filters, we obtained 29 high-confidence candidate genes for which mechanisms of orphan origin were proposed based on manual inspection. This revealed diverse mechanisms including annotation artifacts, chimeric origin, alternative reading frame usage, and gene splitting with subsequent gain of de novo exons. In addition, we present two cases of complete de novo origination from non-coding regions, which represent one of the first reports of de novo genes in nematodes. Thus, we conclude that de novo emergence, divergence, and mixed mechanisms contribute to novel gene formation in Pristionchus nematodes.
Affiliation(s)
- Neel Prabh
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, August Thienemann Str. 2, 24306 Plön, Germany
| | - Christian Rödelsperger
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
|
22
|
Li G, Wu X, Hu Y, Muñoz-Amatriaín M, Luo J, Zhou W, Wang B, Wang Y, Wu X, Huang L, Lu Z, Xu P. Orphan genes are involved in drought adaptations and ecoclimatic-oriented selections in domesticated cowpea. Journal of Experimental Botany 2019; 70:3101-3110. [PMID: 30949664 DOI: 10.1093/jxb/erz145] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/13/2018] [Accepted: 03/20/2019] [Indexed: 05/19/2023]
Abstract
Orphan genes (OGs) are genes that are restricted to a single species or a particular taxonomic group. To date, little is known about the functions of OGs in domesticated crops. Here, we report our findings on the relationships between OGs and environmental adaptation in cowpea (Vigna unguiculata). We identified 578 expressed OGs, of which 73.2% were predicted to be non-coding. Transcriptomic analyses revealed a high rate of OGs that were drought inducible in roots when compared with conserved genes. Co-expression analysis further revealed the possible involvement of OGs in stress response pathways. Overexpression of UP12_8740, a drought-inducible OG, conferred enhanced tolerance to osmotic stresses and soil drought. By combining Capture-Seq and fluorescence-based Kompetitive allele-specific PCR (KASP), we efficiently genotyped single nucleotide polymorphisms (SNPs) on OGs across a 223 accession cowpea germplasm collection. Population genomic parameters, including polymorphism information content (PIC), expected heterozygosity (He), nucleotide diversity (π), and Tajima's D statistics, that were calculated based on these SNPs, showed distinct signatures between the grain- and vegetable-type subpopulations of cowpea. This study reinforces the idea that OGs are a valuable resource for identifying new genes related to species-specific environmental adaptations and fosters new insights that artificial selection on OGs might have contributed to balancing the adaptive and agronomic traits in domesticated crops in various ecoclimatic conditions.
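Two of the per-locus statistics named in this abstract, expected heterozygosity (He) and polymorphism information content (PIC), have widely used textbook definitions that can be computed directly from allele frequencies. The sketch below uses those standard definitions (He = 1 − Σp², PIC = 1 − Σp² − Σ 2p_i²p_j² over allele pairs); whether the study used exactly these estimators is an assumption, and the function names are illustrative.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(2 * squares[i] * squares[j]
                for i in range(len(squares))
                for j in range(i + 1, len(squares)))
    return 1.0 - sum(squares) - cross


# A biallelic SNP with both alleles at equal frequency.
snp = [0.5, 0.5]
he = expected_heterozygosity(snp)
pic = polymorphism_information_content(snp)
print(he, pic)
```

For a biallelic SNP both statistics peak when the two alleles are equally frequent, which is why they are often reported together as measures of how informative a marker is.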
Affiliation(s)
- Guojing Li
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- State Key Lab Breeding Base for Sustainable Control of Plant Pest and Disease, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Xinyi Wu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Yaowen Hu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Maria Muñoz-Amatriaín
- Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA, USA
| | - Jie Luo
- Central Laboratory of Zhejiang Academy of Agricultural Sciences, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Wen Zhou
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Baogen Wang
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Ying Wang
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Xiaohua Wu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Lijuan Huang
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- College of Horticulture, Northwest Agriculture and Forestry University, Yangling, China
| | - Zhongfu Lu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Pei Xu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- State Key Lab Breeding Base for Sustainable Control of Plant Pest and Disease, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
|
23
|
McLean F, Berger D, Laetsch DR, Schwartz HT, Blaxter M. Improving the annotation of the Heterorhabditis bacteriophora genome. Gigascience 2018; 7:4958981. [PMID: 29617768 PMCID: PMC5906903 DOI: 10.1093/gigascience/giy034] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/05/2017] [Accepted: 03/23/2018] [Indexed: 12/03/2022] Open
Abstract
Background: Genome assembly and annotation remain exacting tasks. As the tools available for these tasks improve, it is useful to return to data produced with earlier techniques to assess their credibility and correctness. The entomopathogenic nematode Heterorhabditis bacteriophora is widely used to control insect pests in horticulture. The genome sequence for this species was reported to encode an unusually high proportion of unique proteins and a paucity of secreted proteins compared to other related nematodes. Findings: We revisited the H. bacteriophora genome assembly and gene predictions to determine whether these unusual characteristics were biological or methodological in origin. We mapped an independent resequencing dataset to the genome and used the blobtools pipeline to identify potential contaminants. While present (0.2% of the genome span, 0.4% of predicted proteins), assembly contamination was not significant. Conclusions: Re-prediction of the gene set using BRAKER1 and published transcriptome data generated a predicted proteome that was very different from the published one. The new gene set had a much reduced complement of unique proteins, better completeness values that were in line with other related species’ genomes, and an increased number of proteins predicted to be secreted. It is thus likely that methodological issues drove the apparent uniqueness of the initial H. bacteriophora genome annotation and that similar contamination and misannotation issues affect other published genome assemblies.
Affiliation(s)
- Florence McLean
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
| | - Duncan Berger
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
| | - Dominik R Laetsch
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
| | - Hillel T Schwartz
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA
| | - Mark Blaxter
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
|
24
|
Rödelsperger C, Röseler W, Prabh N, Yoshida K, Weiler C, Herrmann M, Sommer RJ. Phylotranscriptomics of Pristionchus Nematodes Reveals Parallel Gene Loss in Six Hermaphroditic Lineages. Curr Biol 2018; 28:3123-3127.e5. [DOI: 10.1016/j.cub.2018.07.041] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/21/2018] [Revised: 06/12/2018] [Accepted: 07/12/2018] [Indexed: 11/28/2022]
|
25
|
Werner MS, Sieriebriennikov B, Prabh N, Loschko T, Lanz C, Sommer RJ. Young genes have distinct gene structure, epigenetic profiles, and transcriptional regulation. Genome Res 2018; 28:1675-1687. [PMID: 30232198 PMCID: PMC6211652 DOI: 10.1101/gr.234872.118] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 01/24/2018] [Accepted: 09/05/2018] [Indexed: 12/22/2022]
Abstract
Species-specific, new, or "orphan" genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus. Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibit distinct chromatin states from similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers.
Affiliation(s)
- Michael S Werner
  - Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Bogdan Sieriebriennikov
  - Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Neel Prabh
  - Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Tobias Loschko
  - Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Christa Lanz
  - Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Ralf J Sommer
  - Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
|
26
|
Prabh N, Roeseler W, Witte H, Eberhardt G, Sommer RJ, Rödelsperger C. Deep taxon sampling reveals the evolutionary dynamics of novel gene families in Pristionchus nematodes. Genome Res 2018; 28:1664-1674. [PMID: 30232197 PMCID: PMC6211646 DOI: 10.1101/gr.234971.118] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 01/23/2018] [Accepted: 09/05/2018] [Indexed: 01/20/2023]
Abstract
The widespread identification of genes without detectable homology in related taxa is a hallmark of genome sequencing projects in animals, together with the abundance of gene duplications. Such genes have been called novel, young, taxon-restricted, or orphans, but little is known about the mechanisms accounting for their origin, age, and mode of evolution. Phylogenomic studies relying on deep and systematic taxon sampling and using the comparative method can provide insight into the evolutionary dynamics acting on novel genes. We used a phylogenomic approach for the nematode model organism Pristionchus pacificus and sequenced six additional Pristionchus and two outgroup species. This resulted in 10 genomes with a ladder-like phylogeny, sequenced in one laboratory using the same platform and analyzed by the same bioinformatic procedures. Our analysis revealed that 68%-81% of genes are assignable to orthologous gene families, the majority of which defined nine age classes with presence/absence patterns that can be explained by single evolutionary events. Contrasting different age classes, we find that older age classes are concentrated at chromosome centers, whereas novel gene families preferentially arise at the periphery, are weakly expressed, evolve rapidly, and have a high propensity of being lost. Over time, they increase in expression and become more constrained. Thus, the detailed phylogenetic resolution allowed a comprehensive characterization of the evolutionary dynamics of Pristionchus genomes indicating that distribution of age classes and their associated differences shape chromosomal divergence. This study establishes the Pristionchus system for future research on the mechanisms that drive the formation of novel genes.
Affiliation(s)
- Neel Prabh
  - Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
- Waltraud Roeseler
  - Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
- Hanh Witte
  - Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
- Gabi Eberhardt
  - Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
- Ralf J Sommer
  - Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
- Christian Rödelsperger
  - Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
|
27
|
Çelen İ, Doh JH, Sabanayagam CR. Effects of liquid cultivation on gene expression and phenotype of C. elegans. BMC Genomics 2018; 19:562. [PMID: 30064382 PMCID: PMC6069985 DOI: 10.1186/s12864-018-4948-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/09/2018] [Accepted: 07/19/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Liquid cultures have been commonly used in space, toxicology, and pharmacology studies of Caenorhabditis elegans. However, knowledge about the transcriptomic alterations caused by liquid cultivation remains limited. Moreover, the impact of different genotypes in rapid adaptive responses to environmental changes (e.g., liquid cultivation) is often overlooked. Here, we report the transcriptomic and phenotypic responses of laboratory N2 and the wild-isolate AB1 strains after culturing P0 worms on agar plates, F1 in liquid cultures, and F2 back on agar plates. RESULTS Significant variations were found in gene expression between the N2 and AB1 strains in response to liquid cultivation. The results demonstrated that 8-34% of the environmental change-induced transcriptional responses are transmitted to the subsequent generation. By categorizing gene expression by genotype, environment, and genotype-environment interactions, we identified that the genotype has a substantial impact on the adaptive responses. Functional analysis of the transcriptome showed correlation with phenotypical changes. For example, the N2 strain exhibited alterations in both phenotype and gene expression for germline and cuticle in axenic liquid cultivation. We found transcript evidence for approximately 21% of the computationally predicted genes in C. elegans by exposing the worms to environmental changes. CONCLUSIONS The presented study reveals substantial differences between N2 and AB1 strains for transcriptomic and phenotypical responses to rapid environmental changes. Our data can provide standard controls for future studies for the liquid cultivation of C. elegans and enable the discovery of condition-specific genes.
Affiliation(s)
- İrem Çelen
  - Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711 USA
  - Delaware Biotechnology Institute, University of Delaware, 15 Innovation Way, Newark, DE 19711 USA
- Jung H. Doh
  - Delaware Biotechnology Institute, University of Delaware, 15 Innovation Way, Newark, DE 19711 USA
- Chandran R. Sabanayagam
  - Delaware Biotechnology Institute, University of Delaware, 15 Innovation Way, Newark, DE 19711 USA
|
28
|
Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet 2018; 50:285-296. [DOI: 10.1038/s41588-018-0040-0] [Citation(s) in RCA: 289] [Impact Index Per Article: 48.2] [Received: 10/29/2017] [Accepted: 12/18/2017] [Indexed: 11/08/2022]
|
29
|
Abstract
Nematodes, such as Caenorhabditis elegans, form one of the most species-rich animal phyla. By now more than 30 nematode genomes have been published, allowing for comparative genomic analyses at various time scales. The majority of a nematode's gene repertoire is represented by either duplicated or so-called orphan genes of unknown origin. This indicates the importance of mechanisms that generate new genes during the course of evolution. While it is certain that nematodes have acquired genes by horizontal gene transfer from various donors, this process only explains a small portion of the nematode gene content. As evolutionary genomic analyses strongly support that most orphan genes are indeed protein-coding, future studies will have to decide whether they result from extreme divergence or evolved de novo from previously noncoding sequences. In this contribution, I summarize several studies investigating gene loss and gain in nematodes and discuss the strengths and weaknesses of individual approaches and datasets. These approaches can be used to ask nematode-specific questions, such as those associated with the evolution of parasitism or with switches in mating systems, but can also complement studies in other animal phyla like vertebrates and insects to broaden our general view on genome evolution.
Affiliation(s)
- Christian Rödelsperger
  - Department for Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstr. 35, 72076, Tübingen, Germany.
|
30
|
Carlson DE, Hedin M. Comparative transcriptomics of Entelegyne spiders (Araneae, Entelegynae), with emphasis on molecular evolution of orphan genes. PLoS One 2017; 12:e0174102. [PMID: 28379977 PMCID: PMC5381867 DOI: 10.1371/journal.pone.0174102] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/15/2016] [Accepted: 03/04/2017] [Indexed: 11/18/2022] Open
Abstract
Next-generation sequencing technology is rapidly transforming the landscape of evolutionary biology, and has become a cost-effective and efficient means of collecting exome information for non-model organisms. Due to their taxonomic diversity, production of interesting venom and silk proteins, and the relative scarcity of existing genomic resources, spiders in particular are excellent targets for next-generation sequencing (NGS) methods. In this study, the transcriptomes of six entelegyne spider species from three genera (Cicurina travisae, C. vibora, Habronattus signatus, H. ustulatus, Nesticus bishopi, and N. cooperi) were sequenced and de novo assembled. Each assembly was assessed for quality and completeness and functionally annotated using gene ontology information. Approximately 100 transcripts with evidence of homology to venom proteins were discovered. After identifying more than 3,000 putatively orthologous genes across all six taxa, we used comparative analyses to identify 24 instances of positively selected genes. In addition, between ~550 and 1,100 unique orphan genes were found in each genus. These unique, uncharacterized genes exhibited elevated rates of amino acid substitution, potentially consistent with lineage-specific adaptive evolution. The data generated for this study represent a valuable resource for future phylogenetic and molecular evolutionary research, and our results provide new insight into the forces driving genome evolution in taxa that span the root of entelegyne spider phylogeny.
Affiliation(s)
- David E. Carlson
  - Department of Biology, San Diego State University, San Diego, California, United States of America
  - Department of Ecology & Evolution, Stony Brook University, Stony Brook, New York, United States of America
- Marshal Hedin
  - Department of Biology, San Diego State University, San Diego, California, United States of America
|
31
|
Armero A, Baudouin L, Bocs S, This D. Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut. PLoS One 2017; 12:e0173300. [PMID: 28334050 PMCID: PMC5363918 DOI: 10.1371/journal.pone.0173300] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/31/2016] [Accepted: 02/17/2017] [Indexed: 01/20/2023] Open
Abstract
The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).
Affiliation(s)
- Alix Armero
  - Montpellier SupAgro, UMR AGAP, Montpellier, France
- Stéphanie Bocs
  - CIRAD, UMR AGAP, Montpellier, France
  - South Green Bioinformatics Platform, Montpellier, France
|
32
|
González C, Lazcano M, Valdés J, Holmes DS. Bioinformatic Analyses of Unique (Orphan) Core Genes of the Genus Acidithiobacillus: Functional Inferences and Use As Molecular Probes for Genomic and Metagenomic/Transcriptomic Interrogation. Front Microbiol 2016; 7:2035. [PMID: 28082953 PMCID: PMC5186765 DOI: 10.3389/fmicb.2016.02035] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 10/10/2016] [Accepted: 12/02/2016] [Indexed: 01/06/2023] Open
Abstract
Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD).
Affiliation(s)
- Carolina González
  - Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida, Santiago, Chile; Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
- Marcelo Lazcano
  - Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida, Santiago, Chile; Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
- Jorge Valdés
  - Center for Genomics and Bioinformatics, Faculty of Sciences, Universidad Mayor Santiago, Chile
- David S Holmes
  - Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida, Santiago, Chile; Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
|
33
|
Rödelsperger C, Menden K, Serobyan V, Witte H, Baskaran P. First insights into the nature and evolution of antisense transcription in nematodes. BMC Evol Biol 2016; 16:165. [PMID: 27549405 PMCID: PMC4994411 DOI: 10.1186/s12862-016-0740-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 03/14/2016] [Accepted: 08/11/2016] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND The development of multicellular organisms is coordinated by various gene regulatory mechanisms that ensure correct spatio-temporal patterns of gene expression. Recently, the role of antisense transcription in gene regulation has moved into the focus of research. To characterize genome-wide patterns of antisense transcription and to study their evolutionary conservation, we sequenced a strand-specific RNA-seq library of the nematode Pristionchus pacificus. RESULTS We identified 1112 antisense configurations, of which the largest group represents 465 antisense transcripts (ASTs) that are fully embedded in introns of their host genes. We find that most ASTs show homology to protein-coding genes and are overrepresented in proteomic data. Together with the finding that expression levels of ASTs and host genes are uncorrelated, this indicates that most ASTs in P. pacificus do not represent non-coding RNAs and do not exhibit regulatory functions on their host genes. We studied the evolution of antisense gene pairs across 20 nematode genomes, showing that the majority of pairs are lineage-specific and even the highly conserved vps-4, ddx-27, and sel-2 loci show abundant structural changes including duplications, deletions, intron gains and loss of antisense transcription. In contrast, host genes, in general, are remarkably conserved and encode exceptionally long introns leading to unusually large blocks of conserved synteny. CONCLUSIONS Our study has shown that in P. pacificus antisense transcription as such does not define non-coding RNAs but is rather a feature of highly conserved genes with long introns. We hypothesize that the presence of regulatory elements imposes evolutionary constraint on the intron length, but simultaneously, their large size makes them a likely target for translocation of genomic elements including protein-coding genes that eventually end up as ASTs.
Affiliation(s)
- Christian Rödelsperger
  - Department for Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstr. 35, Tübingen, 72076, Germany.
- Kevin Menden
  - Department for Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstr. 35, Tübingen, 72076, Germany; Eberhard Karls University, Tübingen, Germany
- Vahan Serobyan
  - Department for Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstr. 35, Tübingen, 72076, Germany
- Hanh Witte
  - Department for Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstr. 35, Tübingen, 72076, Germany
- Praveen Baskaran
  - Department for Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstr. 35, Tübingen, 72076, Germany
|