1
Stasevich EM, Simonova AV, Bogomolova EA, Murashko MM, Uvarova AN, Zheremyan EA, Korneev KV, Schwartz AM, Kuprash DV, Demin DE. Cut from the same cloth: RNAs transcribed from regulatory elements. Biochim Biophys Acta Gene Regul Mech 2024; 1867:195049. [PMID: 38964653 DOI: 10.1016/j.bbagrm.2024.195049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 06/25/2024] [Accepted: 06/28/2024] [Indexed: 07/06/2024]
Abstract
A certain degree of chromatin openness is necessary for the activity of transcription-regulating regions within the genome, facilitating accessibility to RNA polymerases and subsequent synthesis of regulatory element RNAs (regRNAs) from these regions. The rapidly increasing number of studies underscores the significance of regRNAs across diverse cellular processes and diseases, challenging the paradigm that these transcripts are non-functional transcriptional noise. This review explores the multifaceted roles of regRNAs in human cells, encompassing rather well-studied entities such as promoter RNAs and enhancer RNAs (eRNAs), while also providing insights into overshadowed silencer RNAs and insulator RNAs. Furthermore, we assess notable examples of shorter regRNAs, like miRNAs, snRNAs, and snoRNAs, playing important roles. Expanding our discourse, we deliberate on the potential usage of regRNAs as biomarkers and novel targets for cancer and other human diseases.
Affiliation(s)
- E M Stasevich
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- A V Simonova
- Laboratory of Intracellular Signaling in Health and Disease, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- E A Bogomolova
- Laboratory of Intracellular Signaling in Health and Disease, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia; Moscow Center for Advanced Studies, Moscow, Russia
- M M Murashko
- Laboratory of Intracellular Signaling in Health and Disease, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia; Moscow Center for Advanced Studies, Moscow, Russia
- A N Uvarova
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- E A Zheremyan
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- K V Korneev
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- A M Schwartz
- Department of Human Biology, University of Haifa, Haifa, Israel
- D V Kuprash
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- D E Demin
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia.
2
Adjeroh DA, Zhou X, Paschoal AR, Dimitrova N, Derevyanchuk EG, Shkurat TP, Loeb JA, Martinez I, Lipovich L. Challenges in LncRNA Biology: Views and Opinions. Noncoding RNA 2024; 10:43. [PMID: 39195572 DOI: 10.3390/ncrna10040043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Revised: 06/26/2024] [Accepted: 07/04/2024] [Indexed: 08/29/2024] [Open Access]
Abstract
This is a mini-review capturing the views and opinions of selected participants at the 2021 IEEE BIBM 3rd Annual LncRNA Workshop, held in Dubai, UAE. The views and opinions are expressed on five broad themes related to problems in lncRNA, namely, challenges in the computational analysis of lncRNAs, lncRNAs and cancer, lncRNAs in sports, lncRNAs and COVID-19, and lncRNAs in human brain activity.
Affiliation(s)
- Donald A Adjeroh
- Lane Department of Computer Science and Electrical Engineering, West Virginia University (WVU), Morgantown, WV 26506, USA
- Xiaobo Zhou
- Department of Bioinformatics and Systems Medicine, University of Texas Health Science Center, Houston, TX 77030, USA
- Alexandre Rossi Paschoal
- Department of Computer Science, Bioinformatics and Pattern Recognition Group, Federal University of Technology-Paraná-UTFPR, Curitiba 86300-000, Brazil
- Rosalind Franklin Institute, Harwell Science and Innovation Campus, Didcot OX11 0FA, UK
- Nadya Dimitrova
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06520, USA
- Tatiana P Shkurat
- Department of Genetics, Southern Federal University, Rostov-on-Don 344090, Russia
- Jeffrey A Loeb
- Department of Neurology and Rehabilitation, The Center for Clinical and Translational Science, The University of Illinois NeuroRepository, University of Illinois, Chicago, IL 60607, USA
- Ivan Martinez
- Department of Microbiology, Immunology & Cell Biology, WVU Cancer Institute, West Virginia University (WVU) School of Medicine, Morgantown, WV 26505, USA
- Leonard Lipovich
- Shenzhen Huayuan Biological Science Research Institute, Shenzhen Huayuan Biotechnology Co., Ltd., Shenzhen 518000, China
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, MI 48201, USA
- College of Science, Mathematics and Technology, Wenzhou-Kean University, Wenzhou 325060, China
3
Rich A, Acar O, Carvunis AR. Massively integrated coexpression analysis reveals transcriptional regulation, evolution and cellular implications of the yeast noncanonical translatome. Genome Biol 2024; 25:183. [PMID: 38978079 PMCID: PMC11232214 DOI: 10.1186/s13059-024-03287-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 05/20/2024] [Indexed: 07/10/2024] [Open Access]
Abstract
BACKGROUND: Recent studies uncovered pervasive transcription and translation of thousands of noncanonical open reading frames (nORFs) outside of annotated genes. The contribution of nORFs to cellular phenotypes is difficult to infer using conventional approaches because nORFs tend to be short, of recent de novo origins, and lowly expressed. Here we develop a dedicated coexpression analysis framework that accounts for low expression to investigate the transcriptional regulation, evolution, and potential cellular roles of nORFs in Saccharomyces cerevisiae.
RESULTS: Our results reveal that nORFs tend to be preferentially coexpressed with genes involved in cellular transport or homeostasis but rarely with genes involved in RNA processing. Mechanistically, we discover that young de novo nORFs located downstream of conserved genes tend to leverage their neighbors' promoters through transcription readthrough, resulting in high coexpression and high expression levels. Transcriptional piggybacking also influences the coexpression profiles of young de novo nORFs located upstream of genes, but to a lesser extent and without detectable impact on expression levels. Transcriptional piggybacking influences, but does not determine, the transcription profiles of de novo nORFs emerging near genes. About 40% of nORFs are not strongly coexpressed with any gene but are transcriptionally regulated nonetheless and tend to form entirely new transcription modules. We offer a web browser interface (https://carvunislab.csb.pitt.edu/shiny/coexpression/) to efficiently query, visualize, and download our coexpression inferences.
CONCLUSIONS: Our results suggest that nORF transcription is highly regulated. Our coexpression dataset serves as an unprecedented resource for unraveling how nORFs integrate into cellular networks, contribute to cellular phenotypes, and evolve.
Affiliation(s)
- April Rich
- Joint Carnegie Mellon University-University of Pittsburgh, University of Pittsburgh Computational Biology PhD Program, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), University of Pittsburgh, Pittsburgh, PA, USA
- Omer Acar
- Joint Carnegie Mellon University-University of Pittsburgh, University of Pittsburgh Computational Biology PhD Program, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), University of Pittsburgh, Pittsburgh, PA, USA
- Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), University of Pittsburgh, Pittsburgh, PA, USA.
4
Sanejouand YH. Are Most Human-Specific Proteins Encoded by Long Noncoding RNAs? J Mol Evol 2024:10.1007/s00239-024-10174-z. [PMID: 38916610 DOI: 10.1007/s00239-024-10174-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/03/2024] [Indexed: 06/26/2024]
Abstract
By looking for a lack of homologs in a reference database of 27 well-annotated proteomes of primates and 52 well-annotated proteomes of other mammals, 170 putative human-specific proteins were identified. While most of them are deemed uncertain, 2 are known at the protein level and 23 at the transcript level, according to UniProt. Interestingly, 23 of these 25 proteins are found to be encoded or to have close homologs in an open reading frame of a long noncoding human RNA. However, half of them are predicted to be at least 80% globular, with a single structural domain, according to IUPred, and with at least 80% of ordered residues, according to flDPnn. Strikingly, there is a near-complete lack of structural knowledge about these proteins, with no tertiary structure presently available in the Protein Data Bank and a fair prediction for one of them in the AlphaFold Protein Structure Database. Moreover, knowledge about the function of these possibly key proteins remains scarce.
Affiliation(s)
- Yves-Henri Sanejouand
- US2B, UMR 6286 of CNRS, Nantes University, 2 rue de la Houssinière, Nantes, 44322, Pays de la Loire, France.
5
Wen K, Chen X, Gu J, Chen Z, Wang Z. Beyond traditional translation: ncRNA derived peptides as modulators of tumor behaviors. J Biomed Sci 2024; 31:63. [PMID: 38877495 PMCID: PMC11177406 DOI: 10.1186/s12929-024-01047-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 05/24/2024] [Indexed: 06/16/2024] [Open Access]
Abstract
Within the intricate tapestry of molecular research, noncoding RNAs (ncRNAs) were historically overshadowed by a pervasive presumption of their inability to encode proteins or peptides. However, groundbreaking revelations have challenged this notion, unveiling select ncRNAs that surprisingly encode peptides, specifically those approaching 100 amino acids in length. At the forefront of this epiphany stand lncRNAs and circRNAs, distinctively characterized by their embedded small open reading frames (sORFs). Increasing evidence has revealed different functions and mechanisms of peptides/proteins encoded by ncRNAs in cancer, including promotion or inhibition of cancer cell proliferation, cellular metabolism (glucose metabolism and lipid metabolism), and promotion or concerted metastasis of cancer cells. The discoveries not only accentuate the depth of ncRNA functionality but also open novel avenues for oncological research and therapeutic innovations. The main difficulties in the study of these ncRNA-derived peptides hinge crucially on precise peptide detection and sORF identification. Here, we illuminate cutting-edge methodologies, essential instrumentation, and dedicated databases tailored for unearthing sORFs and peptides. In addition, we discuss the potential for clinical applications in cancer therapy.
Affiliation(s)
- Kang Wen
- Cancer Medical Center, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, 210011, P.R. China
- Xin Chen
- Cancer Medical Center, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, 210011, P.R. China
- Jingyao Gu
- Cancer Medical Center, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, 210011, P.R. China
- Zhenyao Chen
- Department of Respiratory Endoscopy, Shanghai Chest Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, P.R. China.
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, China.
- Zhaoxia Wang
- Cancer Medical Center, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, 210011, P.R. China.
6
Linnenbrink M, Breton G, Misra P, Pfeifle C, Dutheil JY, Tautz D. Experimental Evaluation of a Direct Fitness Effect of the De Novo Evolved Mouse Gene Pldi. Genome Biol Evol 2024; 16:evae084. [PMID: 38742287 PMCID: PMC11091481 DOI: 10.1093/gbe/evae084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/16/2024] [Indexed: 05/16/2024] [Open Access]
Abstract
De novo evolved genes emerge from random parts of noncoding sequences and have, therefore, no homologs from which a function could be inferred. While expression analysis and knockout experiments can provide insights into the function, they do not directly test whether the gene is beneficial for its carrier. Here, we have used a seminatural environment experiment to test the fitness of the previously identified de novo evolved mouse gene Pldi, which has been implicated in sperm differentiation. We used a knockout mouse strain for this gene and competed it against its parental wildtype strain for several generations of free reproduction. We found that the knockout (ko) allele frequency decreased consistently across three replicates of the experiment. Using an approximate Bayesian computation framework that simulated the data under a demographic scenario mimicking the experiment's demography, we could estimate a selection coefficient ranging between 0.21 and 0.61 for the wildtype allele compared to the ko allele in males, under various models. This implies a relatively strong selective advantage, which would fix the new gene in fewer than a hundred generations after its emergence.
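As a rough illustration of why selection coefficients of this size imply fixation within about a hundred generations, the sketch below iterates a textbook deterministic haploid selection recursion. It is not the approximate Bayesian computation framework used in the study, and it ignores drift, diploidy, and the male-specific effect. The starting frequency `p0` and the near-fixation `threshold` are assumed values chosen for illustration only.

```python
def generations_to_fixation(s, p0=0.001, threshold=0.999, max_generations=10_000):
    """Count generations for a favoured allele to rise from p0 to threshold.

    Deterministic haploid selection (an assumed simplification): each
    generation the favoured allele's frequency p becomes
    p * (1 + s) / (1 + p * s), where 1 + p * s is the mean fitness.
    """
    p = p0
    generations = 0
    while p < threshold:
        if generations >= max_generations:
            raise RuntimeError("threshold not reached within max_generations")
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations


if __name__ == "__main__":
    # Lower and upper selection coefficients quoted in the abstract.
    for s in (0.21, 0.61):
        print(f"s = {s}: {generations_to_fixation(s)} generations")
```

Under these assumptions both ends of the reported range reach the threshold in well under a hundred generations. A rarer starting allele or a stricter threshold would lengthen the estimate.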
Affiliation(s)
- Miriam Linnenbrink
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Present address: Max Planck Institute for Biological Intelligence, 82152 Martinsried, Germany
- Gwenna Breton
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Present address: Clinical Genomics Gothenburg, Science for Life Laboratory, Sahlgrenska Academy, University of Gothenburg, and Center for Medical Genomics, Department of Clinical Genetic and Genomics, Sahlgrenska University Hospital, Sweden
- Pallavi Misra
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Present address: Laboratory Corporation of America (LabCorp), Westborough, MA 01581, USA
- Christine Pfeifle
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Julien Y Dutheil
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Diethard Tautz
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
7
Westemeier-Rice ES, Winters MT, Rawson TW, Martinez I. More than the SRY: The Non-Coding Landscape of the Y Chromosome and Its Importance in Human Disease. Noncoding RNA 2024; 10:21. [PMID: 38668379 PMCID: PMC11054740 DOI: 10.3390/ncrna10020021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 03/31/2024] [Accepted: 04/08/2024] [Indexed: 04/29/2024] [Open Access]
Abstract
Historically, the Y chromosome has presented challenges to classical methodology and philosophy of understanding the differences between males and females. An unsolved genetic puzzle, the Y chromosome was the last chromosome to be fully sequenced. With the advent of the Human Genome Project came a realization that the human genome is more than just genes encoding proteins, and an entire universe of RNA was discovered. This dark matter of biology and the black box surrounding the Y chromosome have collided over the last few years, as increasing numbers of non-coding RNAs have been identified across the length of the Y chromosome, many of which have played significant roles in disease. In this review, we will uncover what is known about the connections between the Y chromosome and the non-coding RNA universe that originates from it, particularly as it relates to long non-coding RNAs, microRNAs and circular RNAs.
Affiliation(s)
- Emily S. Westemeier-Rice
- West Virginia University Cancer Institute, West Virginia University School of Medicine, Morgantown, WV 26506, USA
- Michael T. Winters
- Department of Microbiology, Immunology and Cell Biology, West Virginia University School of Medicine, Morgantown, WV 26506, USA
- Travis W. Rawson
- Department of Microbiology, Immunology and Cell Biology, West Virginia University School of Medicine, Morgantown, WV 26506, USA
- Ivan Martinez
- West Virginia University Cancer Institute, West Virginia University School of Medicine, Morgantown, WV 26506, USA
- Department of Microbiology, Immunology and Cell Biology, West Virginia University School of Medicine, Morgantown, WV 26506, USA
8
Martinez-Castillo M, Elsayed AM, López-Berestein G, Amero P, Rodríguez-Aguayo C. An Overview of the Immune Modulatory Properties of Long Non-Coding RNAs and Their Potential Use as Therapeutic Targets in Cancer. Noncoding RNA 2023; 9:70. [PMID: 37987366 PMCID: PMC10660772 DOI: 10.3390/ncrna9060070] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/27/2023] [Revised: 10/25/2023] [Accepted: 11/08/2023] [Indexed: 11/22/2023] [Open Access]
Abstract
Long non-coding RNAs (lncRNAs) play pivotal roles in regulating immune responses, immune cell differentiation, activation, and inflammatory processes. In cancer, they are gaining prominence as potential therapeutic targets due to their ability to regulate immune checkpoint molecules and immune-related factors, suggesting avenues for bolstering anti-tumor immune responses. Here, we explore the mechanistic insights into lncRNA-mediated immune modulation, highlighting their impact on immunity. Additionally, we discuss their potential to enhance cancer immunotherapy, augmenting the effectiveness of immune checkpoint inhibitors and adoptive T cell therapies. LncRNAs as therapeutic targets hold the promise of revolutionizing cancer treatments, inspiring further research in this field with substantial clinical implications.
Affiliation(s)
- Moises Martinez-Castillo
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
- Liver, Pancreas and Motility Laboratory, Unit of Research in Experimental Medicine, School of Medicine, Universidad Nacional Autónoma de México (UNAM), Mexico City 06726, Mexico
- Abdelrahman M. Elsayed
- Department of Pharmacology & Toxicology, Faculty of Pharmacy, Al-Azhar University, Cairo 11754, Egypt
- Havener Eye Institute, Department of Ophthalmology and Visual Science, The Ohio State University Wexner Medical Center, Columbus, OH 43210, USA
- Gabriel López-Berestein
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
- Center for RNA Interference and Non-Coding RNA, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030, USA
- Paola Amero
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
- Cristian Rodríguez-Aguayo
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
- Center for RNA Interference and Non-Coding RNA, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030, USA
9
Zhang M, Zhao J, Wu J, Wang Y, Zhuang M, Zou L, Mao R, Jiang B, Liu J, Song X. In-depth characterization and identification of translatable lncRNAs. Comput Biol Med 2023; 164:107243. [PMID: 37453378 DOI: 10.1016/j.compbiomed.2023.107243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 06/16/2023] [Accepted: 07/07/2023] [Indexed: 07/18/2023]
Abstract
Long non-coding RNAs (lncRNAs) are non-protein-coding transcripts more than 200 nucleotides in length. Deep sequencing technologies have unveiled that lncRNAs can harbor translatable short open reading frames (sORFs). Yet the regulatory mechanisms governing lncRNA translation events remain poorly understood. Here, we exhaustively detected the sequence, functional element, and structure features relevant to lncRNA translation in humans. Extensive identification and analysis reveal that translatable lncRNAs contain richer protein-coding related sequence features, cap-dependent and cap-independent translation initiation mechanisms, and more stable secondary structures, as compared to untranslatable lncRNAs. These findings strongly support the view that lncRNAs serve as a repository for the production of new small peptides. Based on the fusion of features affecting translation and the extreme gradient boosting (XGBoost) algorithm, we developed the first computational tool dedicated to predicting translatable lncRNAs, named TransLncPred. Benchmark experimental results show that our method outperforms several state-of-the-art RNA coding potential prediction tools on the same training and testing datasets. The 100-time 10-fold cross-validation tests also demonstrate that regulatory element-derived features, especially N7-methylguanosine (m7G) and internal ribosome entry site (IRES), contribute to the improvement in predictive performance.
Affiliation(s)
- Meng Zhang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Jian Zhao
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China.
- Jing Wu
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, China
- Yulan Wang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Minhui Zhuang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Lingxiao Zou
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Renlong Mao
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Bin Jiang
- College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Jingjing Liu
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
- Xiaofeng Song
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China.
10
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023] [Open Access]
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
- Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA.
- Leron W Kok
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
- Robert L Moritz
- Institute for Systems Biology (ISB), Seattle, Washington, USA
- Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington, USA
11
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.
Affiliation(s)
- Roderic Guigó
- Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia
12
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief
The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | | | - Leron W. Kok
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
| | - Karl R. Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
- Agora Cancer Research Centre, 1011 Lausanne, Switzerland
| | - Eric W. Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
|
13
|
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties, for instance driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
|
14
|
Álvarez-Urdiola R, Borràs E, Valverde F, Matus JT, Sabidó E, Riechmann JL. Peptidomics Methods Applied to the Study of Flower Development. Methods Mol Biol 2023; 2686:509-536. [PMID: 37540375 DOI: 10.1007/978-1-0716-3299-4_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023]
Abstract
Understanding the global and dynamic nature of plant developmental processes requires not only the study of the transcriptome, but also of the proteome, including its largely uncharacterized peptidome fraction. Recent advances in proteomics and high-throughput analyses of translating RNAs (ribosome profiling) have begun to address this issue, evidencing the existence of novel, uncharacterized, and possibly functional peptides. To validate the accumulation in tissues of sORF-encoded polypeptides (SEPs), the basic setup of proteomic analyses (i.e., LC-MS/MS) can be followed. However, the detection of peptides that are small (up to ~100 aa, 6-7 kDa) and novel (i.e., not annotated in reference databases) presents specific challenges that need to be addressed both experimentally and with computational biology resources. Several methods have been developed in recent years to isolate and identify peptides from plant tissues. In this chapter, we outline two different peptide extraction protocols and the subsequent peptide identification by mass spectrometry using the database search or the de novo identification methods.
Affiliation(s)
- Raquel Álvarez-Urdiola
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Eva Borràs
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Federico Valverde
- Institute for Plant Biochemistry and Photosynthesis CSIC - University of Seville, Seville, Spain
| | - José Tomás Matus
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
- Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, Valencia, Spain
| | - Eduard Sabidó
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - José Luis Riechmann
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain.
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
|
15
|
de Lorenzo V. Innovation versus novelty in microbial systems. Environ Microbiol 2023; 25:167-170. [PMID: 36335556 PMCID: PMC10098617 DOI: 10.1111/1462-2920.16278] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2022] [Accepted: 11/02/2022] [Indexed: 11/08/2022]
Affiliation(s)
- Víctor de Lorenzo
- Systems Biology Department, Centro Nacional de Biotecnología (CNB-CSIC), Madrid, Spain
|
16
|
The Modular Architecture of Metallothioneins Facilitates Domain Rearrangements and Contributes to Their Evolvability in Metal-Accumulating Mollusks. Int J Mol Sci 2022; 23:ijms232415824. [PMID: 36555472 PMCID: PMC9781358 DOI: 10.3390/ijms232415824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 12/05/2022] [Accepted: 12/10/2022] [Indexed: 12/15/2022] Open
Abstract
Protein domains are independent structural and functional modules that can rearrange to create new proteins. While the evolution of multidomain proteins through the shuffling of different preexisting domains has been well documented, the evolution of domain repeat proteins and the origin of new domains are less understood. Metallothioneins (MTs) provide a good case study considering that they consist of metal-binding domain repeats, some of them with a likely de novo origin. In mollusks, for instance, most MTs are bidomain proteins that arose by lineage-specific rearrangements between six putative domains: α, β1, β2, β3, γ and δ. Some domains have been characterized in bivalves and gastropods, but nothing is known about the MTs and their domains of other Mollusca classes. To fill this gap, we investigated the metal-binding features of NpoMT1 of Nautilus pompilius (Cephalopoda class) and FcaMT1 of Falcidens caudatus (Caudofoveata class). Interestingly, whereas NpoMT1 consists of α and β1 domains and has a prototypical Cd2+ preference, FcaMT1 has a singular preference for Zn2+ ions and a distinct domain composition, including a new Caudofoveata-specific δ domain. Overall, our results suggest that the modular architecture of MTs has contributed to MT evolution during mollusk diversification, and exemplify how modularity increases MT evolvability.
|
17
|
Brunet MA, Leblanc S, Roucou X. OpenVar: functional annotation of variants in non-canonical open reading frames. Cell Biosci 2022; 12:130. [PMID: 35965322 PMCID: PMC9375913 DOI: 10.1186/s13578-022-00871-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 08/03/2022] [Indexed: 11/12/2022] Open
Abstract
Background Recent technological advances have revealed thousands of functional open reading frames (ORF) that have eluded reference genome annotations. These overlooked ORFs are found throughout the genome, in any reading frame of transcripts, mature or non-coding, and can overlap annotated ORFs in a different reading frame. The exploration of these novel ORFs in genomic datasets and of their role in genetic traits is hindered by a lack of software. Results Here, we present OpenVar, a genomic variant annotator that mends that gap and fosters meaningful discoveries. To illustrate the potential of OpenVar, we analysed all variants within SynMicDB, a database of cancer-associated synonymous mutations. By including non-canonical ORFs in the analysis, OpenVar yields a 33.6-fold, 13.8-fold and 8.3-fold increase in high impact variants over Annovar, SnpEff and VEP respectively. We highlighted an overlapping non-canonical ORF in the HEY2 gene where variants significantly clustered. Conclusions OpenVar integrates non-canonical ORFs in the analysis of genomic variants, unveiling new research avenues to better understand the genotype–phenotype relationships.
|
18
|
Pan J, Wang R, Shang F, Ma R, Rong Y, Zhang Y. Functional Micropeptides Encoded by Long Non-Coding RNAs: A Comprehensive Review. Front Mol Biosci 2022; 9:817517. [PMID: 35769907 PMCID: PMC9234465 DOI: 10.3389/fmolb.2022.817517] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/18/2021] [Accepted: 05/24/2022] [Indexed: 12/03/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) were originally defined as non-coding RNAs (ncRNAs) that lack protein-coding ability. However, with the emergence of technologies such as ribosome profiling sequencing and ribosome-nascent chain complex sequencing, it has been demonstrated that most lncRNAs have short open reading frames and hence the potential to encode functional micropeptides. Such micropeptides have been described as widely involved in life-sustaining activities in several organisms, such as homeostasis regulation, disease and tumor occurrence and development, and the morphological development of animals and plants. In this review, we focus on the latest developments in the field of lncRNA-encoded micropeptides, and describe the relevant computational tools and techniques for micropeptide prediction and identification. This review aims to serve as a reference for future research studies on lncRNA-encoded micropeptides.
Affiliation(s)
- Jianfeng Pan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Ruijun Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Key Laboratory of Mutton Sheep Genetics and Breeding, Ministry of Agriculture, Hohhot, China
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Hohhot, China
- Engineering Research Center for Goat Genetics and Breeding, Hohhot, China
| | - Fangzheng Shang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Rong Ma
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Youjun Rong
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Yanjun Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Key Laboratory of Mutton Sheep Genetics and Breeding, Ministry of Agriculture, Hohhot, China
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Hohhot, China
- Engineering Research Center for Goat Genetics and Breeding, Hohhot, China
- *Correspondence: Yanjun Zhang
|
19
|
Identification and characterisation of sPEPs in Cryptococcus neoformans. Fungal Genet Biol 2022; 160:103688. [PMID: 35339703 DOI: 10.1016/j.fgb.2022.103688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Revised: 03/02/2022] [Accepted: 03/21/2022] [Indexed: 11/24/2022]
Abstract
Short open reading frame (sORF)-encoded peptides (sPEPs) have been found across a wide range of genomic locations in a variety of species. To date, their identification, validation, and characterisation in the human fungal pathogen Cryptococcus neoformans has been limited due to a lack of standardised protocols. We have developed an enrichment process that enables sPEP detection within a protein sample from this polysaccharide-encapsulated yeast, and implemented proteogenomics to provide insights into the validity of predicted and hypothetical sORFs annotated in the C. neoformans genome. Novel sORFs were discovered within the 5' and 3' UTRs of known transcripts as well as in "non-coding" RNAs. One novel candidate, dubbed NPB1, that resided in an RNA annotated as "non-coding", was chosen for characterisation. Through the creation of both specific point mutations and a full deletion allele, the function of the new sPEP, Npb1, was shown to resemble that of the bacterial trans-translation protein SmpB.
|
20
|
Leong AZX, Lee PY, Mohtar MA, Syafruddin SE, Pung YF, Low TY. Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures. J Biomed Sci 2022; 29:19. [PMID: 35300685 PMCID: PMC8928697 DOI: 10.1186/s12929-022-00802-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 12/31/2021] [Accepted: 03/09/2022] [Indexed: 12/17/2022] Open
Abstract
A short open reading frame (sORF) comprises ≤ 300 bases, encoding a microprotein or sORF-encoded protein (SEP) which comprises ≤ 100 amino acids. Traditionally dismissed by genome annotation pipelines as meaningless noise, sORFs were found to possess coding potential with ribosome profiling (RIBO-Seq), which unveiled sORF-based transcripts at various genome locations. Nonetheless, the existence of corresponding microproteins that are stable and functional was initially little substantiated by experimental evidence. With recent advancements in multi-omics, the identification, validation, and functional characterisation of sORFs and microproteins have become feasible. In this review, we discuss the history and development of an emerging research field of sORFs and microproteins. In particular, we focus on an array of bioinformatics and OMICS approaches used for predicting, sequencing, validating, and characterizing these recently discovered entities. These strategies include RIBO-Seq which detects sORF transcripts via ribosome footprints, and mass spectrometry (MS)-based proteomics for sequencing the resultant microproteins. Subsequently, our discussion extends to the functional characterisation of microproteins by incorporating CRISPR/Cas9 screen and protein–protein interaction (PPI) studies. Our review discusses not only detection methodologies, but we also highlight the challenges and potential solutions in identifying and validating sORFs and their microproteins. The novelty of this review lies within its validation for the functional role of microproteins, which could contribute towards the future landscape of microproteomics.
Affiliation(s)
- Alyssa Zi-Xin Leong
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - Pey Yee Lee
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - M Aiman Mohtar
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - Saiful Effendi Syafruddin
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - Yuh-Fen Pung
- Division of Biomedical Science, School of Pharmacy, University of Nottingham Malaysia, Semenyih, 43500, Selangor, Malaysia
| | - Teck Yew Low
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia.
|
21
|
The microprotein Nrs1 rewires the G1/S transcriptional machinery during nitrogen limitation in budding yeast. PLoS Biol 2022; 20:e3001548. [PMID: 35239649 PMCID: PMC8893695 DOI: 10.1371/journal.pbio.3001548] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/21/2020] [Accepted: 01/19/2022] [Indexed: 12/01/2022] Open
Abstract
Commitment to cell division at the end of G1 phase, termed Start in the budding yeast Saccharomyces cerevisiae, is strongly influenced by nutrient availability. To identify new dominant activators of Start that might operate under different nutrient conditions, we screened a genome-wide ORF overexpression library for genes that bypass a Start arrest caused by absence of the G1 cyclin Cln3 and the transcriptional activator Bck2. We recovered a hypothetical gene YLR053c, renamed NRS1 for Nitrogen-Responsive Start regulator 1, which encodes a poorly characterized 108 amino acid microprotein. Endogenous Nrs1 was nuclear-localized, restricted to poor nitrogen conditions, induced upon TORC1 inhibition, and cell cycle-regulated with a peak at Start. NRS1 interacted genetically with SWI4 and SWI6, which encode subunits of the main G1/S transcription factor complex SBF. Correspondingly, Nrs1 physically interacted with Swi4 and Swi6 and was localized to G1/S promoter DNA. Nrs1 exhibited inherent transactivation activity, and fusion of Nrs1 to the SBF inhibitor Whi5 was sufficient to suppress other Start defects. Nrs1 appears to be a recently evolved microprotein that rewires the G1/S transcriptional machinery under poor nitrogen conditions.
Commitment to cell division at the end of G1 phase in the budding yeast Saccharomyces cerevisiae is strongly influenced by nutrient availability. This study identifies a micro-protein that promotes G1/S transcription activation and cell cycle entry in yeast under nitrogen-limited conditions.
|
22
|
Melo ESD, Wallau GL. Mosquito long non-coding RNAs are enriched with Transposable Elements. Genet Mol Biol 2022; 45:e20210215. [PMID: 35088819 PMCID: PMC8796034 DOI: 10.1590/1678-4685-gmb-2021-0215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2021] [Accepted: 11/29/2021] [Indexed: 11/22/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) lack coding capacity and mounting evidence suggests that they have a regulatory role in diverse organisms. Most knowledge about lncRNAs comes from studies on vertebrates, including a structural association between lncRNAs and transposable elements (TEs). TE sequences are genomic parasites found in all branches of life and are particularly active and abundant in insect genomes. Here we investigate the contribution of TEs to lncRNA biogenesis in Aedes albopictus and Culex quinquefasciatus. We found that a large fraction of lncRNA loci co-occurs with TE loci in both species. Around 40% of A. albopictus and 52% of C. quinquefasciatus lncRNAs show some association with TEs. Most of the lncRNA/TE associations are represented by TE-derived sequences that are expressed as one or all exons of lncRNAs, including five lncRNAs that seem to influence immune-related genes involved in antiviral response. The contribution of TEs to lncRNAs also varies among the different types of TEs. The Gypsy superfamily is particularly enriched in lncRNA sequences. In sum, this study demonstrates that transposable elements substantially contribute to lncRNA biogenesis in A. albopictus and C. quinquefasciatus and may have an impact on regulatory modulation in these species.
Affiliation(s)
- Elverson Soares de Melo
- Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Entomologia e Núcleo de Bioinformática, Recife, PE, Brazil
| | - Gabriel Luz Wallau
- Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Entomologia e Núcleo de Bioinformática, Recife, PE, Brazil
|
23
|
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
|
24
|
Bhave D, Tautz D. Effects of the Expression of Random Sequence Clones on Growth and Transcriptome Regulation in Escherichia coli. Genes (Basel) 2021; 13:genes13010053. [PMID: 35052392 PMCID: PMC8775113 DOI: 10.3390/genes13010053] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/23/2021] [Revised: 12/21/2021] [Accepted: 12/21/2021] [Indexed: 02/04/2023] Open
Abstract
Comparative genomic analyses have provided evidence that new genetic functions can emerge out of random nucleotide sequences. Here, we apply a direct experimental approach to study the effects of plasmids harboring random sequence inserts under the control of an inducible promoter. Based on data from previously described experiments dealing with the growth of clones within whole libraries, we extracted specific clones that had shown either negative, neutral or positive effects on relative cell growth. We analyzed these individually with respect to growth characteristics and the impact on the transcriptome. We find that candidate clones for negative peptides lead to growth arrest by eliciting a general stress response. Overexpression of positive clones, on the other hand, does not change the exponential growth rates of hosts, and they show a growth advantage over a neutral clone when tested in direct competition experiments. Transcriptomic changes in positive clones are relatively moderate and specific to each clone. We conclude from our experiments that random sequence peptides are indeed a suitable source for the de novo evolution of genetic functions.
|
25
|
Bonilauri B, Holetz FB, Dallagiovanna B. Long Non-Coding RNAs Associated with Ribosomes in Human Adipose-Derived Stem Cells: From RNAs to Microproteins. Biomolecules 2021; 11:1673. [PMID: 34827671 PMCID: PMC8615451 DOI: 10.3390/biom11111673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/06/2021] [Revised: 10/15/2021] [Accepted: 10/25/2021] [Indexed: 12/12/2022] Open
Abstract
Ribosome profiling reveals the translational dynamics of mRNAs by capturing a ribosomal footprint snapshot. Growing evidence shows that several long non-coding RNAs (lncRNAs) contain small open reading frames (smORFs) that are translated into functional peptides. The difficulty in identifying bona-fide translated smORFs is a constant challenge in experimental and bioinformatics fields due to their unconventional characteristics. This motivated us to isolate human adipose-derived stem cells (hASC) from adipose tissue and perform a ribosome profiling followed by bioinformatics analysis of transcriptome, translatome, and ribosome-protected fragments of lncRNAs. Here, we demonstrated that 222 lncRNAs were associated with the translational machinery in hASC, including the already demonstrated lncRNAs coding microproteins. The ribosomal occupancy of some transcripts was consistent with the translation of smORFs. In conclusion, we were able to identify a subset of 15 lncRNAs containing 35 smORFs that likely encode functional microproteins, including four previously demonstrated smORF-derived microproteins, suggesting a possible dual role of these lncRNAs in hASC self-renewal.
Affiliation(s)
- Bernardo Bonilauri
- Laboratory of Basic Biology of Stem Cells (LABCET), Carlos Chagas Institute-Fiocruz-Paraná, Curitiba 81350-010, Brazil
| | - Fabiola Barbieri Holetz
- Laboratory of Gene Expression Regulation (LABREG), Carlos Chagas Institute-Fiocruz-Paraná, Curitiba 81350-010, Brazil
| | - Bruno Dallagiovanna
- Laboratory of Basic Biology of Stem Cells (LABCET), Carlos Chagas Institute-Fiocruz-Paraná, Curitiba 81350-010, Brazil
|
26
|
Lei CS, Kung HJ, Shih JW. Long Non-Coding RNAs as Functional Codes for Oral Cancer: Translational Potential, Progress and Promises. Int J Mol Sci 2021; 22:4903. [PMID: 34063159 PMCID: PMC8124393 DOI: 10.3390/ijms22094903] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/28/2021] [Revised: 04/30/2021] [Accepted: 05/03/2021] [Indexed: 12/24/2022] Open
Abstract
Oral cancer is one of the leading malignant tumors worldwide. Despite the advent of multidisciplinary approaches, the overall prognosis of patients with oral cancer is poor, mainly due to late diagnosis. There is an urgent need to develop valid biomarkers for early detection and effective therapies. Long non-coding RNAs (lncRNAs) are recognized as key elements of gene regulation, with pivotal roles in various physiological and pathological processes, including cancer. Over the past few years, an exponentially growing number of lncRNAs have been identified and linked to tumorigenesis and prognosis outcomes in oral cancer, illustrating their emerging roles in oral cancer progression and the associated signaling pathways. Herein, we aim to summarize the most recent advances made concerning oral cancer-associated lncRNA, and their expression, involvement, and potential clinical impact, reported to date, with a specific focus on the lncRNA-mediated molecular regulation in oncogenic signaling cascades and oral malignant progression, while exploring their potential, and challenges, for clinical applications as biomarkers or therapeutic targets for oral cancer.
Affiliation(s)
- Cing-Syuan Lei
- Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan; (C.-S.L.); (H.-J.K.)
| | - Hsing-Jien Kung
- Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan; (C.-S.L.); (H.-J.K.)
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
- Institute of Molecular and Genomic Medicine, National Health Research Institutes, Zhunan, Miaoli County 35053, Taiwan
- Comprehensive Cancer Center, Department of Biochemistry and Molecular Medicine, University of California at Davis, Sacramento, CA 95817, USA
- TMU Research Center of Cancer Translational Medicine, Taipei Medical University, Taipei 11031, Taiwan
| | - Jing-Wen Shih
- Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan; (C.-S.L.); (H.-J.K.)
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
- TMU Research Center of Cancer Translational Medicine, Taipei Medical University, Taipei 11031, Taiwan
- Ph.D. Program for Translational Medicine, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
|
27
|
Steinberg R, Koch HG. The largely unexplored biology of small proteins in pro- and eukaryotes. FEBS J 2021; 288:7002-7024. [PMID: 33780127 DOI: 10.1111/febs.15845] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/27/2021] [Revised: 03/11/2021] [Accepted: 03/26/2021] [Indexed: 12/29/2022]
Abstract
The large abundance of small open reading frames (smORFs) in prokaryotic and eukaryotic genomes and the plethora of smORF-encoded small proteins became only apparent with the constant advancements in bioinformatic, genomic, proteomic, and biochemical tools. Small proteins are typically defined as proteins of < 50 amino acids in prokaryotes and of less than 100 amino acids in eukaryotes, and their importance for cell physiology and cellular adaptation is only beginning to emerge. In contrast to antimicrobial peptides, which are secreted by prokaryotic and eukaryotic cells for combatting pathogens and competitors, small proteins act within the producing cell mainly by stabilizing protein assemblies and by modifying the activity of larger proteins. Production of small proteins is frequently linked to stress conditions or environmental changes, and therefore, cells seem to use small proteins as intracellular modifiers for adjusting cell metabolism to different intra- and extracellular cues. However, the size of small proteins imposes a major challenge for the cellular machinery required for protein folding and intracellular trafficking and recent data indicate that small proteins can engage distinct trafficking pathways. In the current review, we describe the diversity of small proteins in prokaryotes and eukaryotes, highlight distinct and common features, and illustrate how they are handled by the protein trafficking machineries in prokaryotic and eukaryotic cells. Finally, we also discuss future topics of research on this fascinating but largely unexplored group of proteins.
Affiliation(s)
- Ruth Steinberg
- Institute for Biochemistry and Molecular Biology, Zentrum für Biochemie und Molekulare Medizin (ZMBZ), Faculty of Medicine, Albert-Ludwigs-Universität Freiburg, Germany
| | - Hans-Georg Koch
- Institute for Biochemistry and Molecular Biology, Zentrum für Biochemie und Molekulare Medizin (ZMBZ), Faculty of Medicine, Albert-Ludwigs-Universität Freiburg, Germany
|
28
|
Rutley N, Poidevin L, Doniger T, Tillett RL, Rath A, Forment J, Luria G, Schlauch KA, Ferrando A, Harper JF, Miller G. Characterization of novel pollen-expressed transcripts reveals their potential roles in pollen heat stress response in Arabidopsis thaliana. PLANT REPRODUCTION 2021; 34:61-78. [PMID: 33459869 PMCID: PMC7902599 DOI: 10.1007/s00497-020-00400-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/16/2020] [Accepted: 11/17/2020] [Indexed: 05/27/2023]
Abstract
Arabidopsis pollen transcriptome analysis revealed new intergenic transcripts of unknown function, many of which are long non-coding RNAs, that may function in pollen-specific processes, including the heat stress response. The male gametophyte is the most heat-sensitive of all plant tissues. In recent years, long non-coding RNAs (lncRNAs) have emerged as important components of cellular regulatory networks involved in most biological processes, including the response to stress. While examining RNAseq datasets of developing and germinating Arabidopsis thaliana pollen exposed to heat stress (HS), we identified 66 novel and 246 recently annotated intergenic expressed loci (XLOCs) of unknown function, with the majority encoding lncRNAs. Comparison with HS in cauline leaves and other RNAseq experiments indicated that 74% of the 312 XLOCs are pollen-specific, and at least 42% are HS-responsive. Phylogenetic analysis revealed that 96% of the genes evolved recently in Brassicaceae. We found that 50 genes are putative targets of microRNAs and that 30% of the XLOCs contain small open reading frames (ORFs) with homology to protein sequences. Finally, RNAseq of ribosome-protected RNA fragments, together with predictions of the periodic footprint of the ribosome P-sites, indicated that 23 of these ORFs are likely to be translated. Our findings indicate that many of the 312 unknown genes might be functional and play a significant role in pollen biology, including the HS response.
Affiliation(s)
- Nicholas Rutley
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 5290002, Ramat-Gan, Israel
| | - Laetitia Poidevin
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas-Universitat Politècnica de València, Valencia, Spain
| | - Tirza Doniger
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 5290002, Ramat-Gan, Israel
| | - Richard L Tillett
- Department of Biochemistry and Molecular Biology, University of Nevada at Reno, Reno, NV, 89557, USA
- Nevada INBRE Bioinformatics Core, University of Nevada at Reno, Reno, NV, 89557, USA
| | - Abhishek Rath
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 5290002, Ramat-Gan, Israel
| | - Javier Forment
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas-Universitat Politècnica de València, Valencia, Spain
| | - Gilad Luria
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 5290002, Ramat-Gan, Israel
| | - Karen A Schlauch
- Institute of Health Innovation, Desert Research Institute, Department of Pharmacology, University of Nevada at Reno, Reno, NV, 89557, USA
| | - Alejandro Ferrando
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas-Universitat Politècnica de València, Valencia, Spain
| | - Jeffery F Harper
- Department of Biochemistry and Molecular Biology, University of Nevada at Reno, Reno, NV, 89557, USA
| | - Gad Miller
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 5290002, Ramat-Gan, Israel.
|
29
|
The hidden world of non-canonical ORFs. Exp Cell Res 2020; 396:112267. [PMID: 32926940 DOI: 10.1016/j.yexcr.2020.112267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
30
|
Palazzo AF, Koonin EV. Functional Long Non-coding RNAs Evolve from Junk Transcripts. Cell 2020; 183:1151-1161. [PMID: 33068526 DOI: 10.1016/j.cell.2020.09.047] [Citation(s) in RCA: 128] [Impact Index Per Article: 32.0] [Received: 05/19/2020] [Revised: 08/20/2020] [Accepted: 09/17/2020] [Indexed: 12/30/2022]
Abstract
Transcriptome studies reveal pervasive transcription of complex genomes, such as those of mammals. Despite popular arguments for the functionality of most, if not all, of these transcripts, genome-wide analysis of selective constraints indicates that most of the produced RNAs are junk. However, junk is not garbage. On the contrary, junk transcripts provide the raw material for the evolution of diverse long non-coding (lnc) RNAs by non-adaptive mechanisms, such as constructive neutral evolution. The generation of many novel functional entities, such as lncRNAs, that fuels organismal complexity does not seem to be driven by strong positive selection. Rather, the weak selection regime that dominates the evolution of most multicellular eukaryotes provides ample material for functional innovation with relatively little adaptation involved.
Affiliation(s)
- Alexander F Palazzo
- Department of Biochemistry, University of Toronto, Toronto, ON M5G 1M1, Canada.
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|