51
Dowling D, Schmitz JF, Bornberg-Bauer E. Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage. Genome Biol Evol 2020; 12:2183-2195. [PMID: 33210146 PMCID: PMC7674706 DOI: 10.1093/gbe/evaa194] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Accepted: 09/12/2020] [Indexed: 12/12/2022]
Abstract
In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine-scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity, which have been proposed to play a role in survival of de novo genes, remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.
Affiliation(s)
- Daniel Dowling
- Institute for Evolution and Biodiversity, University of Münster, Germany
- Jonathan F Schmitz
- Institute for Evolution and Biodiversity, University of Münster, Germany
52
Unfried JP, Fortes P. SMIM30, a tiny protein with a big role in liver cancer. J Hepatol 2020; 73:1010-1012. [PMID: 32843211 DOI: 10.1016/j.jhep.2020.07.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/23/2020] [Revised: 07/03/2020] [Accepted: 07/06/2020] [Indexed: 12/04/2022]
Affiliation(s)
- Juan Pablo Unfried
- University of Navarra (UNAV), Center for Applied Medical Research (CIMA), Program of Gene Therapy and Hepatology, Pamplona, Spain
- Puri Fortes
- University of Navarra (UNAV), Center for Applied Medical Research (CIMA), Program of Gene Therapy and Hepatology, Pamplona, Spain; Navarra Institute for Health Research (IdiSNA), Pamplona, Spain; Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Spain.
53
Khitun A, Slavoff SA. Proteomic Detection and Validation of Translated Small Open Reading Frames. Curr Protoc Chem Biol 2020; 11:e77. [PMID: 31750990 DOI: 10.1002/cpch.77] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022]
Abstract
Small open reading frames (smORFs) encode previously unannotated polypeptides or short proteins that regulate translation in cis (eukaryotes) and/or are independently functional (prokaryotes and eukaryotes). Ongoing efforts for complete annotation and functional characterization of smORF-encoded proteins have yielded novel regulators and therapeutic targets. However, because they are excluded from protein databases, initiate at non-AUG start codons, and produce few unique tryptic peptides, unannotated small proteins cannot be detected with standard proteomic methods. Here, we outline a procedure for mass spectrometry-based detection of translated smORFs in cultured human cells, from protein extraction, digestion, and LC-MS/MS to database preparation and data analysis. Following proteomic detection, translation from a unique smORF may be validated via siRNA-based silencing or overexpression and epitope tagging. This is necessary to unambiguously assign a peptide to a smORF within a specific transcript isoform or genomic locus. Provided that sufficient starting material is available, this workflow can be applied to any cell type/organism and adjusted to study specific (patho)physiological contexts including, but not limited to, development, stress, and disease. © 2019 by John Wiley & Sons, Inc.
- Basic Protocol 1: Protein extraction, size selection, and trypsin digestion
- Alternate Protocol 1: In-solution C8 column size selection
- Support Protocol 1: Chloroform/methanol precipitation
- Support Protocol 2: Reduction, alkylation, and in-solution protease digestion
- Support Protocol 3: Peptide de-salting
- Basic Protocol 2: Two-dimensional LC-MS/MS with ERLIC fractionation
- Basic Protocol 3: Transcriptomic database construction
- Alternate Protocol 2: Transcriptomics database generation with gffread
- Basic Protocol 4: Non-annotated peptide identification from LC-MS/MS data
- Basic Protocol 5: Validation using isotopically labeled synthetic peptide standards and siRNA
- Basic Protocol 6: Transcript validation using transient overexpression
Affiliation(s)
- Alexandra Khitun
- Department of Chemistry, Yale University, New Haven, Connecticut; Chemical Biology Institute, Yale University, West Haven, Connecticut
- Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, Connecticut; Chemical Biology Institute, Yale University, West Haven, Connecticut; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut
54
Orr MW, Mao Y, Storz G, Qian SB. Alternative ORFs and small ORFs: shedding light on the dark proteome. Nucleic Acids Res 2020; 48:1029-1042. [PMID: 31504789 DOI: 10.1093/nar/gkz734] [Citation(s) in RCA: 146] [Impact Index Per Article: 36.5] [Received: 07/04/2019] [Revised: 08/03/2019] [Accepted: 08/15/2019] [Indexed: 02/06/2023]
Abstract
Traditional annotation of protein-encoding genes relied on assumptions, such as one open reading frame (ORF) encodes one protein and minimal lengths for translated proteins. With the serendipitous discoveries of translated ORFs encoded upstream and downstream of annotated ORFs, from alternative start sites nested within annotated ORFs and from RNAs previously considered noncoding, it is becoming clear that these initial assumptions are incorrect. The findings have led to the realization that genetic information is more densely coded and that the proteome is more complex than previously anticipated. As such, interest in the identification and characterization of the previously ignored 'dark proteome' is increasing, though we note that research in eukaryotes and bacteria has largely progressed in isolation. To bridge this gap and illustrate exciting findings emerging from studies of the dark proteome, we highlight recent advances in both eukaryotic and bacterial cells. We discuss progress in the detection of alternative ORFs as well as in the understanding of functions and the regulation of their expression and posit questions for future work.
Affiliation(s)
- Mona Wu Orr
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
- Yuanhui Mao
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
- Gisela Storz
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
- Shu-Bing Qian
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
55
Reynolds JC, Bwiza CP, Lee C. Mitonuclear genomics and aging. Hum Genet 2020; 139:381-399. [PMID: 31997134 PMCID: PMC7147958 DOI: 10.1007/s00439-020-02119-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/05/2019] [Accepted: 01/17/2020] [Indexed: 12/25/2022]
Abstract
Our cells operate based on two distinct genomes that are enclosed in the nucleus and mitochondria. The mitochondrial genome presumably originates from endosymbiotic bacteria. With time, a large portion of the original genes in the bacterial genome is considered to have been lost or transferred to the nuclear genome, leaving a reduced 16.5 Kb circular mitochondrial DNA (mtDNA). Although traditionally only 37 genes, including 13 that encode proteins, were thought to be encoded within mtDNA, its genetic repertoire is expanding with the identification of mitochondrial-derived peptides (MDPs). Aging has largely been shown to be regulated by genes that are encoded in the nuclear genome, whereas the role of the mitochondrial genome has remained more cryptic. However, recent studies position mitochondria and mtDNA as an important counterpart to the nuclear genome, whereby the two organelles constantly regulate each other. Thus, the genomic network that regulates lifespan and/or healthspan is likely constituted by two unique, yet co-evolved, genomes. Here, we will discuss aspects of mitochondrial biology, especially mitochondrial communication, that may add substantial momentum to aging research by accounting for both mitonuclear genomes to more comprehensively and inclusively map the genetic and molecular networks that govern aging and age-related diseases.
Affiliation(s)
- Joseph C Reynolds
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
- Conscience P Bwiza
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
- Changhan Lee
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA.
- USC Norris Comprehensive Cancer Center, Los Angeles, CA, 90089, USA.
- Biomedical Sciences, Graduate School, Ajou University, Suwon, 16499, South Korea.
56
Camargo AP, Sourkov V, Pereira G, Carazzolle M. RNAsamba: neural network-based assessment of the protein-coding potential of RNA sequences. NAR Genom Bioinform 2020; 2:lqz024. [PMID: 33575571 PMCID: PMC7671399 DOI: 10.1093/nargab/lqz024] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 08/09/2019] [Revised: 11/15/2019] [Accepted: 12/17/2019] [Indexed: 02/06/2023]
Abstract
The advent of high-throughput sequencing technologies made it possible to obtain large volumes of genetic information quickly and inexpensively. Thus, many efforts are devoted to unveiling the biological roles of genomic elements, and distinguishing protein-coding from long non-coding RNAs is one of the most important tasks. We describe RNAsamba, a tool to predict the coding potential of RNA molecules from sequence information using a neural network-based model that considers both the whole sequence and the ORF to identify patterns that distinguish coding from non-coding transcripts. We evaluated RNAsamba's classification performance using transcripts from humans and several other model organisms and show that it recurrently outperforms other state-of-the-art methods. Our results also show that RNAsamba can identify coding signals in partial-length ORFs and UTR sequences, indicating that its algorithm is not dependent on complete transcript sequences. Furthermore, RNAsamba can also predict small ORFs, traditionally identified with ribosome profiling experiments. We believe that RNAsamba will enable faster and more accurate biological findings from genomic data of species that are being sequenced for the first time. A user-friendly web interface, the documentation containing instructions for local installation and usage, and the source code of RNAsamba can be found at https://rnasamba.lge.ibi.unicamp.br/.
Affiliation(s)
- Antonio P Camargo
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, Campinas, SP, 13083-862, Brazil
- Vsevolod Sourkov
- Department of Computer Science, ReDNA Labs, Pattaya, Chonburi, 20150, Thailand
- Gonçalo A G Pereira
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, Campinas, SP, 13083-862, Brazil
- Marcelo F Carazzolle
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, Campinas, SP, 13083-862, Brazil
57
Wang Y, Yang C, Liu X, Zheng J, Zhang F, Wang D, Xue Y, Li X, Shen S, Shao L, Yang Y, Liu L, Ma J, Liu Y. Transcription factor AP-4 (TFAP4)-upstream ORF coding 66 aa inhibits the malignant behaviors of glioma cells by suppressing the TFAP4/long noncoding RNA 00520/microRNA-520f-3p feedback loop. Cancer Sci 2020; 111:891-906. [PMID: 31943575 PMCID: PMC7060482 DOI: 10.1111/cas.14308] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/22/2019] [Revised: 12/27/2019] [Accepted: 01/02/2020] [Indexed: 02/06/2023]
Abstract
Upstream ORF (uORF) is a translational initiation element located in the 5′UTR of eukaryotic mRNAs. Studies have found that uORFs play an important regulatory role in many diseases. Based on The Cancer Genome Atlas database, the results of our experiments and previous research evidence, we investigated transcription factor AP‐4 (TFAP4) and its uORF, LIM and SH3 protein 1 (LASP1), long noncoding RNA 00520 (LINC00520), and microRNA (miR)‐520f‐3p as candidates involved in glioma malignancy, which is a poorly understood process. Both TFAP4‐66aa‐uORF and miR‐520f‐3p were downregulated, and TFAP4, LASP1, and LINC00520 were highly expressed in glioma tissues and cells. TFAP4‐66aa‐uORF or miR‐520f‐3p overexpression or TFAP4, LASP1, or LINC00520 knockdown inhibited glioma cell proliferation, migration, and invasion, but promoted apoptosis. TFAP4‐66aa‐uORF inhibited the translation of TFAP4 by binding to the TFAP4 mRNA. MicroRNA‐520f‐3p inhibited TFAP4 expression by binding to its 3′UTR. However, LINC00520 could promote the expression of TFAP4 by competitively binding to miR‐520f‐3p. In addition, TFAP4 transcriptionally activated LASP1 and LINC00520 expression by binding to their promoter regions, forming a positive feedback loop of TFAP4/LINC00520/miR‐520f‐3p. Our findings together indicated that TFAP4‐66aa‐uORF inhibited the TFAP4/LINC00520/miR‐520f‐3p feedback loop by directly inhibiting TFAP4 expression, subsequently leading to inhibition of glioma malignancy. This provides a basis for developing new therapeutic approaches for glioma treatment.
Affiliation(s)
- Yipeng Wang
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Chunqing Yang
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Xiaobai Liu
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Jian Zheng
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Fangfang Zhang
- Department of Neurobiology, School of Life Sciences, China Medical University, Shenyang, China; Key Laboratory of Cell Biology, Ministry of Public Health of China, China Medical University, Shenyang, China; Key Laboratory of Medical Cell Biology, Ministry of Education of China, China Medical University, Shenyang, China
- Di Wang
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Yixue Xue
- Department of Neurobiology, School of Life Sciences, China Medical University, Shenyang, China; Key Laboratory of Cell Biology, Ministry of Public Health of China, China Medical University, Shenyang, China; Key Laboratory of Medical Cell Biology, Ministry of Education of China, China Medical University, Shenyang, China
- Xiaozhi Li
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Shuyuan Shen
- Department of Neurobiology, School of Life Sciences, China Medical University, Shenyang, China; Key Laboratory of Cell Biology, Ministry of Public Health of China, China Medical University, Shenyang, China; Key Laboratory of Medical Cell Biology, Ministry of Education of China, China Medical University, Shenyang, China
- Lianqi Shao
- Department of Neurobiology, School of Life Sciences, China Medical University, Shenyang, China; Key Laboratory of Cell Biology, Ministry of Public Health of China, China Medical University, Shenyang, China; Key Laboratory of Medical Cell Biology, Ministry of Education of China, China Medical University, Shenyang, China
- Yang Yang
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
- Libo Liu
- Department of Neurobiology, School of Life Sciences, China Medical University, Shenyang, China; Key Laboratory of Cell Biology, Ministry of Public Health of China, China Medical University, Shenyang, China; Key Laboratory of Medical Cell Biology, Ministry of Education of China, China Medical University, Shenyang, China
- Jun Ma
- Department of Neurobiology, School of Life Sciences, China Medical University, Shenyang, China; Key Laboratory of Cell Biology, Ministry of Public Health of China, China Medical University, Shenyang, China; Key Laboratory of Medical Cell Biology, Ministry of Education of China, China Medical University, Shenyang, China
- Yunhui Liu
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China; Liaoning Clinical Medical Research Center in Nervous System Disease, Shenyang, China; Key Laboratory of Neuro-oncology in Liaoning Province, Shenyang, China
58
Rödelsperger C, Prabh N, Sommer RJ. New Gene Origin and Deep Taxon Phylogenomics: Opportunities and Challenges. Trends Genet 2019; 35:914-922. [DOI: 10.1016/j.tig.2019.08.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/27/2019] [Revised: 08/07/2019] [Accepted: 08/29/2019] [Indexed: 01/22/2023]
59
Fesenko I, Kirov I, Kniazev A, Khazigaleeva R, Lazarev V, Kharlampieva D, Grafskaia E, Zgoda V, Butenko I, Arapidi G, Mamaeva A, Ivanov V, Govorun V. Distinct types of short open reading frames are translated in plant cells. Genome Res 2019; 29:1464-1477. [PMID: 31387879 PMCID: PMC6724668 DOI: 10.1101/gr.253302.119] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 06/04/2019] [Accepted: 08/01/2019] [Indexed: 02/07/2023]
Abstract
Genomes contain millions of short (<100 codons) open reading frames (sORFs), which are usually dismissed during gene annotation. Nevertheless, peptides encoded by such sORFs can play important biological roles, and their impact on cellular processes has long been underestimated. Here, we analyzed approximately 70,000 transcribed sORFs in the model plant Physcomitrella patens (moss). Several distinct classes of sORFs that differ in terms of their position on transcripts and the level of evolutionary conservation are present in the moss genome. Over 5000 sORFs were conserved in at least one of 10 plant species examined. Mass spectrometry analysis of proteomic and peptidomic data sets suggested that tens of sORFs located on distinct parts of mRNAs and long noncoding RNAs (lncRNAs) are translated, including conserved sORFs. Translational analysis of the sORFs and main ORFs at a single locus suggested the existence of genes that code for multiple proteins and peptides with tissue-specific expression. Functional analysis of four lncRNA-encoded peptides showed that sORFs-encoded peptides are involved in regulation of growth and differentiation in moss. Knocking out lncRNA-encoded peptides resulted in a decrease of moss growth. In contrast, the overexpression of these peptides resulted in a diverse range of phenotypic effects. Our results thus open new avenues for discovering novel, biologically active peptides in the plant kingdom.
Affiliation(s)
- Igor Fesenko
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russian Federation
- Ilya Kirov
- Laboratory of marker-assisted and genomic selection of plants, All-Russian Research Institute of Agricultural Biotechnology, 127550 Moscow, Russian Federation
- Andrey Kniazev
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russian Federation
- Regina Khazigaleeva
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russian Federation
- Vassili Lazarev
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russian Federation; Moscow Institute of Physics and Technology (National Research University), 141701 Dolgoprudny, Moscow Region, Russian Federation
- Daria Kharlampieva
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russian Federation
- Ekaterina Grafskaia
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russian Federation; Moscow Institute of Physics and Technology (National Research University), 141701 Dolgoprudny, Moscow Region, Russian Federation
- Viktor Zgoda
- Laboratory of System Biology, Institute of Biomedical Chemistry, 119121 Moscow, Russian Federation
- Ivan Butenko
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russian Federation
- Georgy Arapidi
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russian Federation; Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russian Federation
- Anna Mamaeva
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russian Federation
- Vadim Ivanov
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russian Federation
- Vadim Govorun
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russian Federation
60
De Novo, Divergence, and Mixed Origin Contribute to the Emergence of Orphan Genes in Pristionchus Nematodes. G3-GENES GENOMES GENETICS 2019; 9:2277-2286. [PMID: 31088903 PMCID: PMC6643871 DOI: 10.1534/g3.119.400326] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/30/2022]
Abstract
Homology is a fundamental concept in comparative biology. It is extensively used at the sequence level to make phylogenetic hypotheses and functional inferences. Nonetheless, the majority of eukaryotic genomes contain large numbers of orphan genes lacking homologs in other taxa. Generally, the fraction of orphan genes is higher in genomically undersampled clades, and in the absence of closely related genomes any hypothesis about their origin and evolution remains untestable. Previously, we sequenced ten genomes with an underlying ladder-like phylogeny to establish a phylogenomic framework for studying genome evolution in diplogastrid nematodes. Here, we use this deeply sampled data set to understand the processes that generate orphan genes in our focal species Pristionchus pacificus. Based on phylostratigraphic analysis and additional bioinformatic filters, we obtained 29 high-confidence candidate genes for which mechanisms of orphan origin were proposed based on manual inspection. This revealed diverse mechanisms including annotation artifacts, chimeric origin, alternative reading frame usage, and gene splitting with subsequent gain of de novo exons. In addition, we present two cases of complete de novo origination from non-coding regions, which represent one of the first reports of de novo genes in nematodes. Thus, we conclude that de novo emergence, divergence, and mixed mechanisms contribute to novel gene formation in Pristionchus nematodes.
61
Ruiz-Orera J, Albà MM. Conserved regions in long non-coding RNAs contain abundant translation and protein-RNA interaction signatures. NAR Genom Bioinform 2019; 1:e2. [PMID: 33575549 PMCID: PMC7671363 DOI: 10.1093/nargab/lqz002] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/24/2019] [Revised: 06/14/2019] [Accepted: 07/04/2019] [Indexed: 02/06/2023]
Abstract
The mammalian transcriptome includes thousands of transcripts that do not correspond to annotated protein-coding genes and that are known as long non-coding RNAs (lncRNAs). A handful of lncRNAs have well-characterized regulatory functions but the biological significance of the majority of them is not well understood. LncRNAs that are conserved between mice and humans are likely to be enriched in functional sequences. Here, we investigate the presence of different types of ribosome profiling signatures in lncRNAs and how they relate to sequence conservation. We find that lncRNA-conserved regions contain three times more ORFs with translation evidence than non-conserved ones, and identify nine cases that display significant sequence constraints at the amino acid sequence level. The study also reveals that conserved regions in intergenic lncRNAs are significantly enriched in protein–RNA interaction signatures when compared to non-conserved ones; this includes sites in well-characterized lncRNAs, such as Cyrano, Malat1, Neat1 and Meg3, as well as in tens of lncRNAs of unknown function. This work illustrates how the analysis of ribosome profiling data coupled with evolutionary analysis provides new opportunities to explore the lncRNA functional landscape.
Affiliation(s)
- Jorge Ruiz-Orera
- Evolutionary Genomics Group, Research Programme on Biomedical Informatics, Hospital del Mar Research Institute, Universitat Pompeu Fabra, Dr Aiguader 88, Barcelona 08003, Spain
- M Mar Albà
- Evolutionary Genomics Group, Research Programme on Biomedical Informatics, Hospital del Mar Research Institute, Universitat Pompeu Fabra, Dr Aiguader 88, Barcelona 08003, Spain; Catalan Institution for Research and Advanced Studies, Passeig Lluís Companys 23, Barcelona 08010, Spain
62
Function and Evolution of Upstream ORFs in Eukaryotes. Trends Biochem Sci 2019; 44:782-794. [PMID: 31003826 DOI: 10.1016/j.tibs.2019.03.002] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 01/21/2019] [Revised: 03/08/2019] [Accepted: 03/19/2019] [Indexed: 12/18/2022]
Abstract
There is growing interest in the role of translational regulation in cellular homeostasis during organismal development. Translation initiation is the rate-limiting step in mRNA translation and is central to translational regulation. Upstream open reading frames (uORFs) are regulatory elements that are prevalent in eukaryotic mRNAs. uORFs modulate the translation initiation rate of downstream coding sequences (CDSs) by sequestering ribosomes. Over the past several years, genome-wide studies have revealed the widespread regulatory functions of uORFs in different species in different biological contexts. Here, we review the current understanding of uORF-mediated translational regulation from the perspective of functional and evolutionary genomics and address remaining gaps that deserve further study.