1
|
Affiliation(s)
- Thomas R Cech
- Department of Biochemistry, BioFrontiers Institute, and Howard Hughes Medical Institute, University of Colorado Boulder, Boulder, Colorado 80309, USA
| |
|
2
|
Isakova A, Neff N, Quake SR. Single-cell quantification of a broad RNA spectrum reveals unique noncoding patterns associated with cell types and states. Proc Natl Acad Sci U S A 2021; 118:e2113568118. [PMID: 34911763 PMCID: PMC8713755 DOI: 10.1073/pnas.2113568118] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Accepted: 11/10/2021] [Indexed: 12/22/2022] Open
Abstract
The ability to interrogate total RNA content of single cells would enable better mapping of the transcriptional logic behind emerging cell types and states. However, current single-cell RNA-sequencing (RNA-seq) methods are unable to simultaneously monitor all forms of RNA transcripts at the single-cell level, and thus deliver only a partial snapshot of the cellular RNAome. Here we describe Smart-seq-total, a method capable of assaying a broad spectrum of coding and noncoding RNA from a single cell. Smart-seq-total does not require splitting the RNA content of a cell and allows the incorporation of unique molecular identifiers into short and long RNA molecules for absolute quantification. It outperforms current poly(A)-independent total RNA-seq protocols by capturing transcripts of a broad size range, thus enabling simultaneous analysis of protein-coding, long-noncoding, microRNA, and other noncoding RNA transcripts from single cells. We used Smart-seq-total to analyze the total RNAome of human primary fibroblasts, HEK293T, and MCF7 cells, as well as that of induced murine embryonic stem cells differentiated into embryoid bodies. By analyzing the coexpression patterns of both noncoding RNA and mRNA from the same cell, we were able to discover new roles of noncoding RNA throughout essential processes, such as cell cycle and lineage commitment during embryonic development. Moreover, we show that independent classes of short-noncoding RNA can be used to determine cell-type identity.
Affiliation(s)
- Alina Isakova
- Department of Bioengineering, Stanford University, Stanford, CA 94305
| | - Norma Neff
- Chan Zuckerberg Biohub, San Francisco, CA 94158
| | - Stephen R Quake
- Department of Bioengineering, Stanford University, Stanford, CA 94305;
- Chan Zuckerberg Biohub, San Francisco, CA 94158
- Department of Applied Physics, Stanford University, Stanford, CA 94305
| |
|
3
|
Abstract
Non-coding RNAs are important regulators of differentiation during embryogenesis and key players in the fine-tuning of transcription; furthermore, they control the post-transcriptional regulation of mRNAs under physiological conditions. Deregulated expression of non-coding RNAs is often identified as a major contributor to a number of pathological conditions. Non-coding RNAs are a heterogeneous group of RNAs, and they represent the majority of nuclear transcripts in eukaryotes. An evolutionarily highly conserved sub-group of non-coding RNAs is represented by vault RNAs, so named because they were first discovered as components of the largest known ribonucleoprotein complexes, called "vaults". Although they were initially described 30 years ago, vault RNAs remain largely unknown and their molecular role is still under investigation. In this review we summarize the known functions of vault RNAs and their involvement in cellular mechanisms.
Affiliation(s)
- Jens Claus Hahne
- Division of Molecular Pathology, The Institute of Cancer Research, London, UK.
- Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK.
| | - Andrea Lampis
- Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
| | - Nicola Valeri
- Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Department of Medicine, The Royal Marsden NHS Foundation Trust, London, UK
| |
|
4
|
Fang L, Li Y, Ma L, Xu Q, Tan F, Chen G. GRNdb: decoding the gene regulatory networks in diverse human and mouse conditions. Nucleic Acids Res 2021; 49:D97-D103. [PMID: 33151298 PMCID: PMC7779055 DOI: 10.1093/nar/gkaa995] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 07/26/2020] [Revised: 10/10/2020] [Accepted: 10/13/2020] [Indexed: 12/11/2022] Open
Abstract
Gene regulatory networks (GRNs) formed by transcription factors (TFs) and their downstream target genes play essential roles in gene expression regulation. Moreover, GRNs can change dynamically across different conditions, and these changes are crucial for understanding the underlying mechanisms of disease pathogenesis. However, no existing database provides comprehensive GRN information for various human and mouse normal tissues and diseases at the single-cell level. Based on the known TF-target relationships and the large-scale single-cell RNA-seq data collected from public databases as well as the bulk data of The Cancer Genome Atlas and the Genotype-Tissue Expression project, we systematically predicted the GRNs of 184 different physiological and pathological conditions of human and mouse involving >633 000 cells and >27 700 bulk samples. We further developed GRNdb, a freely accessible and user-friendly database (http://www.grndb.com/) for searching, comparing, browsing, visualizing, and downloading the predicted information of 77 746 GRNs, 19 687 841 TF-target pairs, and related binding motifs at single-cell/bulk resolution. GRNdb also allows users to explore the gene expression profile, correlations, and the associations between expression levels and the patient survival of diverse cancers. Overall, GRNdb provides a valuable and timely resource to the scientific community to elucidate the functions and mechanisms of gene expression regulation in various conditions.
Affiliation(s)
- Li Fang
- Center for Bioinformatics and Computational Biology, and Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Yunjin Li
- Center for Bioinformatics and Computational Biology, and Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Lu Ma
- Center for Bioinformatics and Computational Biology, and Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Qiyue Xu
- Center for Bioinformatics and Computational Biology, and Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Fei Tan
- Shanghai Skin Disease Hospital, School of Medicine, Tongji University, Shanghai 200443, China
| | - Geng Chen
- Center for Bioinformatics and Computational Biology, and Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Shanghai Applied Protein Technology Co., Ltd. (APTBIO), Shanghai 200233, China
| |
|
5
|
Yao J, Wu DC, Nottingham RM, Lambowitz AM. Identification of protein-protected mRNA fragments and structured excised intron RNAs in human plasma by TGIRT-seq peak calling. eLife 2020; 9:e60743. [PMID: 32876046 PMCID: PMC7518892 DOI: 10.7554/elife.60743] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/06/2020] [Accepted: 09/01/2020] [Indexed: 12/18/2022] Open
Abstract
Human plasma contains > 40,000 different coding and non-coding RNAs that are potential biomarkers for human diseases. Here, we used thermostable group II intron reverse transcriptase sequencing (TGIRT-seq) combined with peak calling to simultaneously profile all RNA biotypes in apheresis-prepared human plasma pooled from healthy individuals. Extending previous TGIRT-seq analysis, we found that human plasma contains largely fragmented mRNAs from > 19,000 protein-coding genes, abundant full-length, mature tRNAs and other structured small non-coding RNAs, and less abundant tRNA fragments and mature and pre-miRNAs. Many of the mRNA fragments identified by peak calling correspond to annotated protein-binding sites and/or have stable predicted secondary structures that could afford protection from plasma nucleases. Peak calling also identified novel repeat RNAs, miRNA-sized RNAs, and putatively structured intron RNAs of potential biological, evolutionary, and biomarker significance, including a family of full-length excised intron RNAs, subsets of which correspond to mirtron pre-miRNAs or agotrons.
Affiliation(s)
- Jun Yao
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| | - Douglas C Wu
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| | - Ryan M Nottingham
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| | - Alan M Lambowitz
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| |
|
6
|
Černý J, Božíková P, Svoboda J, Schneider B. A unified dinucleotide alphabet describing both RNA and DNA structures. Nucleic Acids Res 2020; 48:6367-6381. [PMID: 32406923 PMCID: PMC7293047 DOI: 10.1093/nar/gkaa383] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/11/2020] [Revised: 04/11/2020] [Accepted: 04/30/2020] [Indexed: 12/13/2022] Open
Abstract
By analyzing almost 120 000 dinucleotides in over 2000 nonredundant nucleic acid crystal structures, we define 96+1 diNucleotide Conformers, NtCs, which describe the geometry of RNA and DNA dinucleotides. NtC classes are grouped into 15 codes of the structural alphabet CANA (Conformational Alphabet of Nucleic Acids) to simplify symbolic annotation of the prominent structural features of NAs and their intuitive graphical display. The search for nontrivial patterns of NtCs resulted in the identification of several types of RNA loops, some of them observed for the first time. Over 30% of the nearly six million dinucleotides in the PDB cannot be assigned to any NtC class, but we demonstrate that up to half of them can be re-refined with the help of proper refinement targets. A statistical analysis of the preferences of NtCs and CANA codes for the 16 dinucleotide sequences showed that neither the NtC class AA00, which forms the scaffold of RNA structures, nor BB00, the most populated DNA class, is sequence neutral; their distributions are significantly biased. The reported automated assignment of the NtC classes and CANA codes available at dnatco.org provides a powerful tool for unbiased analysis of nucleic acid structures by structural and molecular biologists.
Affiliation(s)
- Jiří Černý
- Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
| | - Paulína Božíková
- Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
| | - Jakub Svoboda
- Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
| | - Bohdan Schneider
- Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
| |
|
7
|
Neueder A. RNA-Mediated Disease Mechanisms in Neurodegenerative Disorders. J Mol Biol 2018; 431:1780-1791. [PMID: 30597161 DOI: 10.1016/j.jmb.2018.12.012] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/25/2018] [Revised: 12/14/2018] [Accepted: 12/16/2018] [Indexed: 12/16/2022]
Abstract
RNA is accurately entangled in virtually all pathways that maintain cellular homeostasis. To name but a few, RNA is the "messenger" between DNA encoded information and the resulting proteins. Furthermore, RNAs regulate diverse processes by forming DNA::RNA or RNA::RNA interactions. Finally, RNA itself can be the scaffold for ribonucleoprotein complexes, for example, ribosomes or cellular bodies. Consequently, disruption of any of these processes can lead to disease. This review describes known and emerging RNA-based disease mechanisms like interference with regular splicing, the anomalous appearance of RNA-protein complexes and uncommon RNA species, as well as non-canonical translation. Due to the complexity and entanglement of the above-mentioned pathways, only a few drugs are available that target RNA-based disease mechanisms. However, advances in our understanding of how RNA is involved in and modulates cellular homeostasis might pave the way to novel treatments.
Affiliation(s)
- Andreas Neueder
- Experimental Neurology, Department of Neurology, Ulm University, 89081 Ulm, Germany.
| |
|
8
|
Greco S, Cardinali B, Falcone G, Martelli F. Circular RNAs in Muscle Function and Disease. Int J Mol Sci 2018; 19:ijms19113454. [PMID: 30400273 PMCID: PMC6274904 DOI: 10.3390/ijms19113454] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 10/17/2018] [Revised: 10/30/2018] [Accepted: 10/31/2018] [Indexed: 12/11/2022] Open
Abstract
Circular RNAs (circRNAs) are a class of RNA produced during pre-mRNA splicing that are emerging as new members of the gene regulatory network. In addition to being spliced in a linear fashion, exons of pre-mRNAs can be circularized by use of the 3' acceptor splice site of upstream exons, leading to the formation of circular RNA species. In this way, genetic information can be re-organized, increasing gene expression potential. Expression of circRNAs is developmentally regulated, tissue and cell-type specific, and shared across eukaryotes. The importance of circRNAs in gene regulation is now beginning to be recognized and some putative functions have been assigned to them, such as the sequestration of microRNAs or proteins, the modulation of transcription, the interference with splicing, and translation of small proteins. In accordance with an important role in normal cell biology, circRNA deregulation has been reported to be associated with diseases. Recent evidence demonstrated that circRNAs are highly expressed in striated muscle tissue, both skeletal and cardiac, which is also one of the body tissues showing the highest levels of alternative splicing. Moreover, initial studies revealed altered circRNA expression in diseases involving striated muscle, suggesting important functions of these molecules in the pathogenetic mechanisms of both heart and skeletal muscle diseases. The recent findings in this field will be described and discussed.
Affiliation(s)
- Simona Greco
- Molecular Cardiology Laboratory, IRCCS-Policlinico San Donato, San Donato Milanese, 20097 Milan, Italy.
| | - Beatrice Cardinali
- Institute of Cell Biology and Neurobiology, National Research Council, Monterotondo, 00015 Rome, Italy.
| | - Germana Falcone
- Institute of Cell Biology and Neurobiology, National Research Council, Monterotondo, 00015 Rome, Italy.
| | - Fabio Martelli
- Molecular Cardiology Laboratory, IRCCS-Policlinico San Donato, San Donato Milanese, 20097 Milan, Italy.
| |
|
9
|
Zhao R, Li FQ, Tian LL, Shang DS, Guo Y, Zhang JR, Liu M. Comprehensive analysis of the whole coding and non-coding RNA transcriptome expression profiles and construction of the circRNA-lncRNA co-regulated ceRNA network in laryngeal squamous cell carcinoma. Funct Integr Genomics 2018; 19:109-121. [PMID: 30128795 DOI: 10.1007/s10142-018-0631-y] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 02/18/2018] [Revised: 07/21/2018] [Accepted: 08/03/2018] [Indexed: 02/07/2023]
Abstract
Recently, accumulating evidence has demonstrated that non-coding RNAs (ncRNAs) play a vital role in oncogenicity. Nevertheless, the regulatory mechanisms and functions remain poorly understood, especially for lncRNAs and circRNAs. In this study, we simultaneously detected, for the first time, the expression profiles of the whole transcriptome, including miRNA, circRNA and lncRNA + mRNA, in five pairs of laryngeal squamous cell carcinoma (LSCC) and matched non-carcinoma tissues by microarrays. Five miRNAs, four circRNAs, three lncRNAs and five mRNAs that were dysregulated were selected to verify the microarray data by quantitative real-time PCR (qRT-PCR) in 20 pairs of LSCC samples. We constructed LSCC-related competing endogenous RNA (ceRNA) networks of lncRNAs and circRNAs (circRNA or lncRNA-miRNA-mRNA) respectively. Functional annotation revealed that the lncRNA-mediated ceRNA network was enriched for genes involved in tumor-associated pathways. Hsa_circ_0033988, with the highest degree in the circRNA-mediated ceRNA network, was associated with fatty acid degradation, which was responsible for the depletion of fat in tumor-associated cachexia. Finally, to clarify the ncRNA co-regulation mechanism, we constructed a circRNA-lncRNA co-regulated network by integrating the above two networks and identified 9 modules for further study. A subnetwork of module 2 with the most dysregulated microRNAs was extracted to establish the ncRNA-involved TGF-β-associated pathway. In conclusion, our findings provide high-throughput microarray data of the coding and non-coding RNAs and establish the foundation for further functional research on the ceRNA regulatory mechanism of non-coding RNAs in LSCC.
MESH Headings
- Carcinoma, Squamous Cell/genetics
- Carcinoma, Squamous Cell/metabolism
- Carcinoma, Squamous Cell/pathology
- Computational Biology
- Gene Expression Profiling
- Gene Expression Regulation, Neoplastic
- Gene Ontology
- Gene Regulatory Networks
- Humans
- Laryngeal Neoplasms/genetics
- Laryngeal Neoplasms/metabolism
- Laryngeal Neoplasms/pathology
- MicroRNAs/classification
- MicroRNAs/genetics
- MicroRNAs/metabolism
- Microarray Analysis
- Molecular Sequence Annotation
- RNA/classification
- RNA/genetics
- RNA/metabolism
- RNA, Circular
- RNA, Long Noncoding/classification
- RNA, Long Noncoding/genetics
- RNA, Long Noncoding/metabolism
- RNA, Messenger/classification
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Transcriptome
Affiliation(s)
- Rui Zhao
- Department of Otolaryngology-Head and Neck Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Feng-Qing Li
- Department of Gerontology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Lin-Li Tian
- Department of Otolaryngology-Head and Neck Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - De-Si Shang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yan Guo
- Department of Otolaryngology-Head and Neck Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Jia-Rui Zhang
- Department of Otolaryngology-Head and Neck Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Ming Liu
- Department of Otolaryngology-Head and Neck Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, China.
| |
|
10
|
Abstract
Circular RNA (circRNA) is a group of endogenous noncoding RNA characterized by a covalently closed cyclic structure lacking poly-adenylated tails. Recent studies have suggested that circRNAs play a crucial role in regulating gene expression by acting as a microRNA sponge, RNA binding protein sponge and translational regulator. CircRNAs have become a research hotspot because of their close association with the development of diseases. Some circRNAs are reportedly expressed in a tissue- and development stage-specific manner. Furthermore, due to other features of circRNAs including stability, conservation and high abundance in body fluids, circRNAs are believed to be potential biomarkers for various diseases. In the present review, we provide the current understanding of biogenesis and gene regulatory mechanisms of circRNAs, summarize the recent studies on circRNAs as potential diagnostic and prognostic biomarkers, and highlight the major advantages and limitations of circRNAs as novel biomarkers based on existing knowledge.
Affiliation(s)
- Zhongrong Zhang
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, School of Life Science, Shanghai University, Shanghai 200444, China
| | - Tingting Yang
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, School of Life Science, Shanghai University, Shanghai 200444, China
| | - Junjie Xiao
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, School of Life Science, Shanghai University, Shanghai 200444, China.
| |
|
11
|
Abstract
Exosomes are small extracellular vesicles of around 100 nm in diameter produced by most cell types. These vesicles carry nucleic acids, proteins, lipids, and other biomolecules and function as carriers of biological information in processes of extracellular communication. The content of exosomes is regulated by the external and internal microenvironment of the parent cell, but the intrinsic mechanisms of loading of molecules into exosomes are still not completely elucidated. In this study, using next-generation sequencing, we have characterized in depth the RNA composition of healthy endothelial cells and exosomes and provided an accurate profile of the different coding and noncoding RNA species found per compartment. We have also discovered a set of unique genes preferentially included in (or excluded from) vesicles. Moreover, after studying the enrichment of RNA motifs in the genes unequally distributed between cells and exosomes, we have detected a set of enriched sequences for several classes of RNA. In conclusion, our results provide the basis for studying the involvement of RNA-binding proteins capable of recognizing RNA sequences and their role in the export of RNAs into exosomes.
Affiliation(s)
- Jennifer Pérez-Boza
- Laboratory of Molecular Angiogenesis, GIGA-R, University of Liège, 4000 Liège, Belgium
| | - Michelle Lion
- Laboratory of Molecular Angiogenesis, GIGA-R, University of Liège, 4000 Liège, Belgium
| | - Ingrid Struman
- Laboratory of Molecular Angiogenesis, GIGA-R, University of Liège, 4000 Liège, Belgium
| |
|
12
|
Abstract
As more biological activities of ribonucleic acids continue to emerge, the development of efficient analytical tools for RNA identification and characterization is necessary to acquire an in-depth understanding of their functions and chemical properties. Herein, we demonstrate the capacity of label-free direct surface-enhanced Raman scattering (SERS) analysis to access highly specific structural information on RNAs at the ultrasensitive level. This includes the recognition of distinctive vibrational features of RNAs organized into a variety of conformations (micro-, fully complementary duplex-, small interfering- and short hairpin-RNAs) or characterized by subtle chemical differences (single-base variances, nucleobase modifications and backbone composition). This method represents a key advance in ribonucleic acid analysis and will have a direct impact on a wide range of fields, including medical diagnosis, drug design, and biotechnology, by enabling the rapid, high-throughput, simple, and low-cost identification and classification of structurally similar RNAs.
Affiliation(s)
- Judit Morla-Folch
- Medcom Advance , Viladecans Business Park, Edificio Brasil, Bertran i Musitu 83-85, 08840 Viladecans, Barcelona, Spain
- Universitat Rovira i Virgili and Centro Tecnológico de la Química de Catalunya , Carrer de Marcel·lí Domingo s/n, 43007 Tarragona, Spain
| | - Hai-nan Xie
- Medcom Advance , Viladecans Business Park, Edificio Brasil, Bertran i Musitu 83-85, 08840 Viladecans, Barcelona, Spain
| | - Ramon A Alvarez-Puebla
- Medcom Advance , Viladecans Business Park, Edificio Brasil, Bertran i Musitu 83-85, 08840 Viladecans, Barcelona, Spain
- Universitat Rovira i Virgili and Centro Tecnológico de la Química de Catalunya , Carrer de Marcel·lí Domingo s/n, 43007 Tarragona, Spain
- ICREA , Passeig Lluís Companys 23, 08010 Barcelona, Spain
| | - Luca Guerrini
- Medcom Advance , Viladecans Business Park, Edificio Brasil, Bertran i Musitu 83-85, 08840 Viladecans, Barcelona, Spain
| |
|
13
|
Housman G, Ulitsky I. Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs. Biochim Biophys Acta 2015; 1859:31-40. [PMID: 26265145 DOI: 10.1016/j.bbagrm.2015.07.017] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 03/31/2015] [Revised: 06/18/2015] [Accepted: 07/19/2015] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) are a diverse class of RNAs with increasingly appreciated functions in vertebrates, yet much of their biology remains poorly understood. In particular, it is unclear to what extent the current catalog of over 10,000 annotated lncRNAs is indeed devoid of genes coding for proteins. Here we review the available computational and experimental schemes for distinguishing between coding and noncoding transcripts and assess the conclusions from their recent genome-wide applications. We conclude that the model most consistent with the available data is that a large number of mammalian lncRNAs undergo translation, but only a very small minority of such translation events results in stable and functional peptides. The outcomes of the majority of the translation events and their potential biological purposes remain an intriguing topic for future investigation. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
Affiliation(s)
- Gali Housman
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel.
| |
|
14
|
Abstract
RNAs that do not encode proteins, increasing evidence shows, are the rule rather than the exception. What should we call these RNAs? The term non-coding RNA should be rejected, we argue, since it constitutes a contradiction in terms: most if not all RNAs carry a code, even though that code may not specify an amino acid sequence. In naming these RNAs, we suggest following a natural distinction between two broad classes of RNAs. Class I RNAs are those that are transcribed but not translated, i.e., do not contain a translatable Open Reading Frame (ORF). Class II RNAs are transcribed and subsequently translated into amino acid sequences by the ribosomal translational apparatus. Class II RNAs comprise the familiar mRNAs, including peptide-coding RNAs. Class I RNAs, we suggest, are most fittingly called utRNAs (untranslated RNAs). The term npcRNAs (non-peptide/protein coding) can be used synonymously.
Affiliation(s)
- Jürgen Brosius
- Institute of Experimental Pathology, ZMBE, University of Münster, Münster, Germany.
| | | |
|
15
|
Abstract
Transcriptomics is one of the most developed fields in the post-genomic era. The transcriptome is the complete set of RNA transcripts in a specific cell type or tissue at a certain developmental stage and/or under a specific physiological condition, including messenger RNA, transfer RNA, ribosomal RNA, and other non-coding RNAs. Transcriptomics focuses on gene expression at the RNA level and offers genome-wide information on gene structure and gene function in order to reveal the molecular mechanisms involved in specific biological processes. With the development of next-generation high-throughput sequencing technology, transcriptome analysis has been progressively improving our understanding of RNA-based gene regulatory networks. Here, we discuss the concept, history, and especially the recent advances in this inspiring field of study.
Affiliation(s)
- Zhicheng Dong
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
| | | |
|
16
|
Abstract
The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson-Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.
Affiliation(s)
- Anton I. Petrov
- Department of Chemistry, Bowling Green State University, Bowling Green, Ohio 43403, USA
- Craig L. Zirbel
- Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, Ohio 43403, USA
- Neocles B. Leontis
- Department of Chemistry, Bowling Green State University, Bowling Green, Ohio 43403, USA
- Corresponding author
|
17
|
Abstract
During the development, progression and dissemination of neoplastic lesions, cancer cells hijack normal pathways and mechanisms, especially those involved in repair and embryologic development. These pathways include those involved in intercellular communication, control of transcription, post-transcriptional regulation of protein production including translation of mRNAs, post-translational protein modifications, e.g., acetylation of proteins, and protein degradation. Small, non-translatable RNAs, especially microRNAs (miRs), are important components of post-transcriptional control. MiRs are produced from areas of the genome that are not translated into proteins, but may be co-regulated with their associated genes. MiRs bind to the 3' untranslated regions of mRNAs and regulate the expression of genes in most cases by either promoting the degradation of mRNA and/or inhibiting the translation of mRNAs into proteins; thus, miRs usually cause a decrease in protein levels that would be expected if the mRNAs were translated normally. It is early in our understanding of how miRs affect neoplastic processes, but miRs are expressed differentially in most cancers and have been associated with tumor progression, chemoresistance and metastasis. MiRs are present in nanovesicles, such as exosomes, and thus are likely involved in intercellular communication, especially in neoplasia. MiRs are attractive targets for novel therapies of cancer as well as potential biomarkers that might be useful for early detection and diagnosis, and for prediction of therapeutic efficacy. MiRs could also aid in determining prognosis, evaluating novel therapies, and developing preventive strategies by their use as surrogate end points.
Affiliation(s)
- Lr McNally
- James Graham Brown Cancer Center, University of Louisville, Louisville, Kentucky
|
18
|
Laborde J, Robinson D, Srivastava A, Klassen E, Zhang J. RNA global alignment in the joint sequence-structure space using elastic shape analysis. Nucleic Acids Res 2013; 41:e114. [PMID: 23585278 PMCID: PMC3675459 DOI: 10.1093/nar/gkt187] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2012] [Revised: 02/26/2013] [Accepted: 02/27/2013] [Indexed: 01/22/2023] Open
Abstract
The functions of RNAs, like proteins, are determined by their structures, which, in turn, are determined by their sequences. Comparison/alignment of RNA molecules provides an effective means to predict their functions and understand their evolutionary relationships. For RNA sequence alignment, most methods developed for protein and DNA sequence alignment can be directly applied. RNA 3-dimensional structure alignment, on the other hand, tends to be more difficult than protein structure alignment due to the lack of regular secondary structures as observed in proteins. Most of the existing RNA 3D structure alignment methods use only the backbone geometry and ignore the sequence information. Using both the sequence and backbone geometry information in RNA alignment may not only produce more accurate classification, but also deepen our understanding of the sequence-structure-function relationship of RNA molecules. In this study, we developed a new RNA alignment method based on elastic shape analysis (ESA). ESA treats RNA structures as three dimensional curves with sequence information encoded on additional dimensions so that the alignment can be performed in the joint sequence-structure space. The similarity between two RNA molecules is quantified by a formal distance, geodesic distance. Based on ESA, a rigorous mathematical framework can be built for RNA structure comparison. Means and covariances of full structures can be defined and computed, and probability distributions on spaces of such structures can be constructed for a group of RNAs. Our method was further applied to predict functions of RNA molecules and showed superior performance compared with previous methods when tested on benchmark datasets. The programs are available at http://stat.fsu.edu/~jinfeng/ESA.html.
Affiliation(s)
- Jose Laborde
- Department of Statistics, Florida State University, FL, USA and Department of Mathematics, Florida State University, FL, USA
- Daniel Robinson
- Department of Statistics, Florida State University, FL, USA and Department of Mathematics, Florida State University, FL, USA
- Anuj Srivastava
- Department of Statistics, Florida State University, FL, USA and Department of Mathematics, Florida State University, FL, USA
- Eric Klassen
- Department of Statistics, Florida State University, FL, USA and Department of Mathematics, Florida State University, FL, USA
- Jinfeng Zhang
- Department of Statistics, Florida State University, FL, USA and Department of Mathematics, Florida State University, FL, USA
|
19
|
Li Y, Podlevsky JD, Marz M, Qi X, Hoffmann S, Stadler PF, Chen JJL. Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach. RNA 2013; 19:852-860. [PMID: 23584428 PMCID: PMC3683918 DOI: 10.1261/rna.039131.113] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/14/2013] [Accepted: 03/19/2013] [Indexed: 06/02/2023]
Abstract
Telomerase is a ribonucleoprotein (RNP) enzyme essential for telomere maintenance and chromosome stability. While the catalytic telomerase reverse transcriptase (TERT) protein is well conserved across eukaryotes, telomerase RNA (TR) is extensively divergent in size, sequence, and structure. This diversity prohibits TR identification from many important organisms. Here we report a novel approach for TR discovery that combines in vitro TR enrichment from total RNA, next-generation sequencing, and a computational screening pipeline. With this approach, we have successfully identified TR from Strongylocentrotus purpuratus (purple sea urchin) from the phylum Echinodermata. Reconstitution of activity in vitro confirmed that this RNA is an integral component of sea urchin telomerase. Comparative phylogenetic analysis against vertebrate TR sequences revealed that the purple sea urchin TR contains vertebrate-like template-pseudoknot and H/ACA domains. While lacking a vertebrate-like CR4/5 domain, sea urchin TR has a unique central domain critical for telomerase activity. This is the first TR identified from the previously unexplored invertebrate clade and provides the first glimpse of TR evolution in the deuterostome lineage. Moreover, our TR discovery approach is a significant step toward the comprehensive understanding of telomerase RNP evolution.
Affiliation(s)
- Yang Li
- Department of Chemistry & Biochemistry
- Joshua D. Podlevsky
- School of Life Sciences, Arizona State University, Tempe, Arizona 85287, USA
- Manja Marz
- Department of Bioinformatics, Friedrich Schiller University of Jena, D-07743 Jena, Germany
- Steve Hoffmann
- LIFE Center
- Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
- Peter F. Stadler
- Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
|
20
|
Abstract
The RNA infrastructure connects RNA-based functions. With transcription-to-translation processing forming the core of the network, we can visualise how RNA-based regulation, cleavage and modification are the backbone of cellular function. The key to interpreting the RNA-infrastructure is in understanding how core RNAs (tRNA, mRNA and rRNA) and other ncRNAs operate in a spatial-temporal manner, moving around the nucleus, cytoplasm and organelles during processing, or in response to environmental cues. This chapter summarises the concept of the RNA-infrastructure, and highlights examples of RNA-based networking within prokaryotes and eukaryotes. It describes how transcription-to-translation processes are tightly connected, and explores some similarities and differences between prokaryotic and eukaryotic RNA networking.
Affiliation(s)
- Lesley J Collins
- Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand.
|
21
|
Zhao Y, Wang Z. [RNA secondary structure prediction based on support vector machine classification]. Sheng Wu Gong Cheng Xue Bao 2008; 24:1140-1148. [PMID: 18837386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
Comparative sequence analysis is the most reliable method for RNA secondary structure prediction, and many algorithms based on it have been developed in the last several decades. This paper considers RNA structure prediction as a two-class classification problem: given a sequence alignment, decide whether or not two columns of the alignment form a base pair. We employed a Support Vector Machine (SVM) to predict potential paired sites, and selected co-variation information, thermodynamic information and the fraction of complementary bases as feature vectors. Considering the effect of sequence similarity upon the co-variation score, we introduced a similarity weight factor, which could adjust the contribution of co-variation and thermodynamic information toward prediction according to sequence similarity. The test on 49 Rfam-seed alignments showed the effectiveness of our method, and the accuracy was better than many similar algorithms. Furthermore, this method could predict simple pseudoknots.
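As a rough illustration of two of the features named in this abstract, the Python sketch below scores a pair of alignment columns by the fraction of complementary bases and by a mutual-information covariation score. The function names, the inclusion of G-U wobble pairs, and the use of mutual information as the covariation measure are assumptions made for this example; the abstract does not specify how the authors computed these features, and the thermodynamic feature and the SVM training step are not shown.

```python
from collections import Counter
from math import log2

# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def complementary_fraction(alignment, i, j):
    """Fraction of sequences whose bases at columns i and j can pair."""
    pairs = list(zip(column(alignment, i), column(alignment, j)))
    return sum(p in PAIRS for p in pairs) / len(pairs)


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j.

    Used here as a hypothetical stand-in for the co-variation score.
    """
    ci, cj = column(alignment, i), column(alignment, j)
    n = len(ci)
    fi, fj, fij = Counter(ci), Counter(cj), Counter(zip(ci, cj))
    return sum(
        (c / n) * log2((c / n) / ((fi[a] / n) * (fj[b] / n)))
        for (a, b), c in fij.items()
    )


# Toy alignment: columns 0 and 4 co-vary while staying complementary.
toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
features_0_4 = (complementary_fraction(toy, 0, 4), mutual_information(toy, 0, 4))
features_1_2 = (complementary_fraction(toy, 1, 2), mutual_information(toy, 1, 2))
```

In a pipeline of the kind the abstract describes, feature tuples like these would be computed for candidate column pairs and passed to a classifier; how the features are weighted by sequence similarity is left out of this sketch.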
Affiliation(s)
- Yingjie Zhao
- College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073, China.
|
22
|
Abstract
Small RNAs ranging in size between 20 and 32 nt regulate gene expression through chromatin modification, mRNA degradation, and translational repression. Three major classes of small RNAs have been characterized: microRNAs (miRNAs), short interfering RNAs (siRNAs), and Piwi-interacting RNAs (piRNAs). miRNAs are expressed in a developmentally regulated and tissue-specific manner and are involved in development and cell differentiation. siRNAs are mainly involved in defense against transposons and viruses. piRNAs are expressed in germ cells and stem cells and are thought to repress transposition of retrotransposons. In this chapter, we describe the methods of small RNA cloning, annotation and classification, and their expression analysis during development.
Affiliation(s)
- Toshiaki Watanabe
- Division of Human Genetics, Department of Integrated Genetics, National Institute of Genetics, Research Organization of Information and Systems, Mishima, Japan
|
23
|
Kawaji H, Nakamura M, Takahashi Y, Sandelin A, Katayama S, Fukuda S, Daub CO, Kai C, Kawai J, Yasuda J, Carninci P, Hayashizaki Y. Hidden layers of human small RNAs. BMC Genomics 2008. [PMID: 18402656 DOI: 10.1186/1471-12164-9-157] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/15/2023] Open
Abstract
BACKGROUND Small RNA attracts increasing interest based on the discovery of RNA silencing and the rapid progress of our understanding of these phenomena. Although recent studies suggest the possible existence of yet undiscovered types of small RNAs in higher organisms, many studies to profile small RNA have focused on miRNA and/or siRNA rather than on the exploration of additional classes of RNAs. RESULTS Here, we explored human small RNAs by unbiased sequencing of RNAs with sizes of 19-40 nt. We provide substantial evidence for the existence of independent classes of small RNAs. Our data show that well-characterized non-coding RNAs, such as tRNA, snoRNA, and snRNA, are cleaved at sites specific to the class of ncRNA. In particular, tRNA cleavage is regulated depending on tRNA type and tissue expression. We also found small RNAs mapped to genomic regions that are transcribed in both directions by bidirectional promoters, indicating that the small RNAs are a product of dsRNA formation and their subsequent cleavage. Their partial similarity with ribosomal RNAs (rRNAs) suggests unrevealed functions of ribosomal DNA or interstitial rRNA. Further examination revealed six novel miRNAs. CONCLUSION Our results underscore the complexity of the small RNA world and the biogenesis of small RNAs.
Affiliation(s)
- Hideya Kawaji
- Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Main Campus, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
|
24
|
Abstract
Microarray-based screening technologies have revealed a larger than expected diversity of gene expression profiles for many cells, tissues, and organisms. The complexity of RNA species, defined by their molecular structure, represents a major new development in biology. RNA not only carries genetic information in the form of templates and components of the translational machinery for protein synthesis but also directly regulates gene expression as exemplified by micro-RNAs (miRNAs). Recent evidence has demonstrated that 5' capped and 3' polyadenylated ends are not restricted to mRNAs, but that they are also present in precursors of both miRNAs and some antisense RNA transcripts. In addition, as many as 40% of transcribed RNAs may lack 3' poly(A) ends. In concert with the presence of a 5' cap (m7GpppN), the length of the 3' poly(A) end plays a critical role in determining the translational efficiency, stability, and the cellular distribution of a specific mRNA. RNAs with short or lacking 3' poly(A) ends, that escape isolation and amplification with oligo(dT)-based methods, provide a challenge in RNA biology and gene expression studies. To circumvent the limitations of 3' poly(A)-dependent RNA isolation methods, we developed an efficient RNA purification system that binds the 5' cap of RNA with a high-affinity variant of the cap-binding protein eIF4E. This system can be used in differential selection approaches to isolate subsets of RNAs, including those with short 3' poly(A) ends that are likely targets of post-transcriptional regulation of gene expression. The length of the 3' poly(A) ends can be defined using a rapid polymerase chain reaction (PCR)-based approach.
Affiliation(s)
- Edyta Z Bajak
- Department of Medicine and Pharmacology, University of Kansas Medical Center, Kansas City, KS, USA
|
25
|
Abstract
Telomerase maintains the integrity of telomeres, the ends of linear chromosomes, by adding G-rich repeats to their 3′-ends. Telomerase RNA is an integral component of telomerase. It contains a template for the synthesis of the telomeric repeats by the telomerase reverse transcriptase. Although telomerase RNAs of different organisms are very diverse in their sequences, a functional non-template element, a pseudoknot, was predicted in all of them. Pseudoknot elements in human and the budding yeast Kluyveromyces lactis telomerase RNAs contain unusual triple-helical segments with AUU base triples, which are critical for telomerase function. Such base triples in ciliates have not been previously reported. We analyzed the pseudoknot sequences in 28 ciliate species and classified them in six different groups based on the lengths of the stems and loops composing the pseudoknot. Using miniCarlo, a helical parameter-based modeling program, we calculated 3D models for a representative of each morphological group. In all cases, the predicted structure contains at least one AUU base triple in stem 2, except for that of Colpidium colpoda, which contains unconventional GCG and AUA triples. These results suggest that base triples in a pseudoknot element are a conserved feature of all telomerases.
Affiliation(s)
- Nikolai B. Ulyanov
- Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158-2517, USA and Department of Genetics, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Givat Ram, 91904 Jerusalem, Israel
- To whom correspondence should be addressed: +1 415 476 0707, +1 415 502 8298. Correspondence may also be addressed to Yehuda Tzfati: +972 2 6584902, +972 2 6586975
- Kinneret Shefer
- Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158-2517, USA and Department of Genetics, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Givat Ram, 91904 Jerusalem, Israel
- Thomas L. James
- Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158-2517, USA and Department of Genetics, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Givat Ram, 91904 Jerusalem, Israel
- Yehuda Tzfati
- Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158-2517, USA and Department of Genetics, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Givat Ram, 91904 Jerusalem, Israel
|
26
|
Kong L, Zhang Y, Ye ZQ, Liu XQ, Zhao SQ, Wei L, Gao G. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res 2007; 35:W345-9. [PMID: 17631615 PMCID: PMC1933232 DOI: 10.1093/nar/gkm391] [Citation(s) in RCA: 1880] [Impact Index Per Article: 110.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open
Abstract
Recent transcriptome studies have revealed that a large number of transcripts in mammals and other organisms do not encode proteins but function as noncoding RNAs (ncRNAs) instead. As millions of transcripts are generated by large-scale cDNA and EST sequencing projects every year, there is a need for automatic methods to distinguish protein-coding RNAs from noncoding RNAs accurately and quickly. We developed a support vector machine-based classifier, named Coding Potential Calculator (CPC), to assess the protein-coding potential of a transcript based on six biologically meaningful sequence features. Tenfold cross-validation on the training dataset and further testing on several large datasets showed that CPC can discriminate coding from noncoding transcripts with high accuracy. Furthermore, CPC also runs an order-of-magnitude faster than a previous state-of-the-art tool and has higher accuracy. We developed a user-friendly web-based interface of CPC at http://cpc.cbi.pku.edu.cn. In addition to predicting the coding potential of the input transcripts, the CPC web server also graphically displays detailed sequence features and additional annotations of the transcript that may facilitate users’ further investigation.
Affiliation(s)
- Liping Wei
- To whom correspondence should be addressed: +86-10-6275-5206, +86-10-62759001. Correspondence may also be addressed to Ge Gao: +86-10-6275-1861, +86-10-6275-1861
- Ge Gao
|
27
|
Abstract
The Y genes encode small noncoding RNAs whose functions remain elusive, whose numbers vary between species, and whose major property is to be bound by the Ro60 protein (or its ortholog in other species). To better understand the evolution of the Y gene family, we performed a homology search in 27 different genomes along with a structural search using Y RNA specific motifs. These searches confirmed that Y RNAs are well conserved in the animal kingdom and resulted in the detection of several new Y RNA genes, including the first Y RNAs in insects and a second Y RNA detected in Caenorhabditis elegans. Unexpectedly, Y5 genes were retrieved almost as frequently as Y1 and Y3 genes, and consequently are not the result of a relatively recent appearance, as is generally believed. Investigation of the organization of the Y genes demonstrated that the synteny was conserved among species. Interestingly, it revealed the presence of six putative "fossil" Y genes, all of which were Y4 and Y5 related. Sequence analysis led to inference of the ancestral sequences for all Y RNAs. In addition, the evolution of existing Y RNAs was deduced for many families, orders and classes. Moreover, a consensus sequence and secondary structure for each Y species was determined. Further evolutionary insight was obtained from the analysis of several thousand Y retropseudogenes among various species. Taken together, these results confirm the rich and diversified evolutionary history of Y RNAs.
Affiliation(s)
- Jonathan Perreault
- Département de Biochimie, Université de Sherbrooke, Sherbrooke, Québec, Canada
|
28
|
Backofen R, Bernhart SH, Flamm C, Fried C, Fritzsch G, Hackermüller J, Hertel J, Hofacker IL, Missal K, Mosig A, Prohaska SJ, Rose D, Stadler PF, Tanzer A, Washietl S, Will S. RNAs everywhere: genome-wide annotation of structured RNAs. J Exp Zool B Mol Dev Evol 2007; 308:1-25. [PMID: 17171697 DOI: 10.1002/jez.b.21130] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Starting with the discovery of microRNAs and the advent of genome-wide transcriptomics, non-protein-coding transcripts have moved from a fringe topic to a central field of research in molecular biology. In this contribution we review the state of the art of "computational RNomics", i.e., the bioinformatics approaches to genome-wide RNA annotation. Instead of rehashing results from recently published surveys in detail, we focus here on the open problem in the field, namely (functional) annotation of the plethora of putative RNAs. A series of exploratory studies are used to provide non-trivial examples for the discussion of some of the difficulties.
|
29
|
Pyysalo S, Ginter F, Heimonen J, Björne J, Boberg J, Järvinen J, Salakoski T. BioInfer: a corpus for information extraction in the biomedical domain. BMC Bioinformatics 2007; 8:50. [PMID: 17291334 PMCID: PMC1808065 DOI: 10.1186/1471-2105-8-50] [Citation(s) in RCA: 160] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2006] [Accepted: 02/09/2007] [Indexed: 12/22/2022] Open
Abstract
Background Lately, there has been a great interest in the application of information extraction methods to the biomedical domain, in particular, to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora. Results We present BioInfer (Bio Information Extraction Resource), a new public resource providing an annotated corpus of biomedical English. We describe an annotation scheme capturing named entities and their relationships along with a dependency analysis of sentence syntax. We further present ontologies defining the types of entities and relationships annotated in the corpus. Currently, the corpus contains 1100 sentences from abstracts of biomedical research articles annotated for relationships, named entities, as well as syntactic dependencies. Supporting software is provided with the corpus. The corpus is unique in the domain in combining these annotation types for a single set of sentences, and in the level of detail of the relationship annotation. Conclusion We introduce a corpus targeted at protein, gene, and RNA relationships which serves as a resource for the development of information extraction systems and their components such as parsers and domain analyzers. The corpus will be maintained and further developed with a current version being available at .
Affiliation(s)
- Sampo Pyysalo
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
- Filip Ginter
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
- Juho Heimonen
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
- Jari Björne
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
- Jorma Boberg
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
- Jouni Järvinen
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
- Tapio Salakoski
- Turku Centre for Computer Science (TUCS), and the Department of IT, University of Turku, Lemminkäisenkatu 14a, 20520 Turku, Finland
|
30
|
Abstract
Although group I and group II introns were discovered more than 25 years ago, they are still difficult to identify. Modeling their RNA structure also remains particularly challenging for organelle sequences, owing to their great diversity. In fact, accelerated evolution in organelles often results in a reduced RNA structure and a loss of autocatalytic splicing and intron mobility. We set out to identify all mitochondrial group I and II introns in published sequences, and, to this end, we developed and applied a new search approach: RNAweasel. On the basis of the results, we focus here on building a comprehensive picture of mitochondrial group I introns, including a modified (reduced) consensus RNA secondary structure and a concise phylogeny-based subclassification.
Affiliation(s)
- B Franz Lang
- Robert Cedergren Centre, Program in Evolutionary Biology, Canadian Institute for Advanced Research, Département de Biochimie, Université de Montréal, Montréal, Québec, Canada.
|
31
|
Grivna ST, Pyhtila B, Lin H. MIWI associates with translational machinery and PIWI-interacting RNAs (piRNAs) in regulating spermatogenesis. Proc Natl Acad Sci U S A 2006; 103:13415-20. [PMID: 16938833 PMCID: PMC1569178 DOI: 10.1073/pnas.0605506103] [Citation(s) in RCA: 294] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Noncoding small RNAs have emerged as important regulators of gene expression at both transcriptional and posttranscriptional levels. Particularly, microRNA (miRNA)-mediated translational repression involving PIWI/Argonaute family proteins has been widely recognized as a novel mechanism of gene regulation. We previously reported that MIWI, a murine PIWI family member, is required for initiating spermiogenesis, a process that transforms round spermatids into mature sperm. MIWI is a cytoplasmic protein present in spermatocytes and round spermatids, and it is required for the expression of its target mRNAs involved in spermiogenesis. Most recently, we discovered a class of noncoding small RNAs called PIWI-interacting RNAs (piRNAs) that are abundantly expressed during spermiogenesis in a MIWI-dependent fashion. Here, we show that MIWI associates with both piRNAs and mRNAs in cytosolic ribonucleoprotein and polysomal fractions. As polysomes increase in early spermiogenesis, MIWI increases in polysome fractions. Moreover, MIWI associates with the mRNA cap-binding complex. Interestingly, MIWI is required for the expression of not only piRNAs but also a subset of miRNAs, despite the presence of Dicer. These results suggest that MIWI has a complicated role in the biogenesis and/or maintenance of two distinct types of small RNAs. Together, our results indicate that MIWI, a PIWI subfamily protein, uses piRNA as the major, but not exclusive, binding partner, and it is associated with translational machinery.
Affiliation(s)
- Shane T. Grivna
- Departments of *Cell Biology and
- Pharmacology and Molecular Cancer Biology, Duke University Medical School, Durham, NC 27710
- Haifan Lin
- Departments of *Cell Biology and
- To whom correspondence should be addressed. E-mail:
|
32
|
Affiliation(s)
- Rachel Green
- Howard Hughes Medical Institute, Johns Hopkins School of Medicine, Baltimore, Maryland 21205, USA.
|
33
|
Abstract
RNA atomic resolution structures have revealed the existence of different families of basepair interactions, each with its own isosteric sub-families. Ribostral (Ribonucleic Structural Aligner) is a user-friendly framework for analyzing, evaluating and viewing RNA sequence alignments with at least one available atomic resolution structure. It is the first of its kind that makes direct and easy-to-understand superposition of the isostericity matrices of basepairs observed in the structure onto sequence alignments, easily indicating allowed and unallowed substitutions at each BP position. Potential mistakes in the alignments can then be corrected using other sequence editing software. Ribostral has been developed and tested under Windows XP, and is capable of running on any PC or Mac platform with MATLAB 7.1 (SP3) or higher installed. A stand-alone version is also available for the PC platform. AVAILABILITY http://rna.bgsu.edu/ribostral.
Affiliation(s)
- Ali Mokdad
- Department of Chemistry and Center for Biomolecular Sciences, Bowling Green State University, OH 43403, USA.
|
34
|
Watanabe T, Takeda A, Tsukiyama T, Mise K, Okuno T, Sasaki H, Minami N, Imai H. Identification and characterization of two novel classes of small RNAs in the mouse germline: retrotransposon-derived siRNAs in oocytes and germline small RNAs in testes. Genes Dev 2006; 20:1732-43. [PMID: 16766679 PMCID: PMC1522070 DOI: 10.1101/gad.1425706] [Citation(s) in RCA: 418] [Impact Index Per Article: 23.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2006] [Accepted: 05/11/2006] [Indexed: 01/24/2023]
Abstract
Small RNAs ranging in size between 18 and 30 nucleotides (nt) are found in many organisms including yeasts, plants, and animals. Small RNAs are involved in the regulation of gene expression through translational repression, mRNA degradation, and chromatin modification. In mammals, microRNAs (miRNAs) are the only small RNAs that have been well characterized. Here, we have identified two novel classes of small RNAs in the mouse germline. One class consists of approximately 20- to 24-nt small interfering RNAs (siRNAs) from mouse oocytes, which are derived from retroelements including LINE, SINE, and LTR retrotransposons. Addition of retrotransposon-derived sequences to the 3' untranslated region (UTR) of a reporter mRNA destabilizes the mRNA significantly when injected into full-grown oocytes. These results suggest that retrotransposons are suppressed through the RNAi pathway in mouse oocytes. The other novel class of small RNAs is 26- to 30-nt germline small RNAs (gsRNAs) from testes. gsRNAs are expressed during spermatogenesis in a developmentally regulated manner, are mapped to the genome in clusters, and have strong strand bias. These features are reminiscent of Tetrahymena approximately 23- to 24-nt small RNAs and Caenorhabditis elegans X-cluster small RNAs. A conserved novel small RNA pathway may be present in diverse animals.
Affiliation(s)
- Toshiaki Watanabe
- Laboratory of Reproductive Biology, Graduate School of Agriculture, Kyoto University, Kyoto 606-8502, Japan.
|
35
|
Abstract
Small noncoding RNAs, including small interfering RNAs (siRNAs) and micro RNAs (miRNAs) of approximately 21 nucleotides (nt) in length, have emerged as potent regulators of gene expression at both transcriptional and post-transcriptional levels in diverse organisms. Here we report the identification of a novel class of small RNAs in the mouse male germline termed piwi-interacting RNAs (piRNAs). piRNAs are approximately 30 nt in length. They are expressed during spermatogenesis, mostly in spermatids. piRNAs are associated with MIWI, a spermatogenesis-specific PIWI subfamily member of the Argonaute protein family, and depend on MIWI for their biogenesis and/or stability. Furthermore, a subpopulation of piRNAs are associated with polysomes, suggesting their potential role in translational regulation.
Affiliation(s)
- Shane T Grivna
- Department of Cell Biology, Duke University, Durham, North Carolina 27710, USA
|
36
|
Aravin A, Gaidatzis D, Pfeffer S, Lagos-Quintana M, Landgraf P, Iovino N, Morris P, Brownstein MJ, Kuramochi-Miyagawa S, Nakano T, Chien M, Russo JJ, Ju J, Sheridan R, Sander C, Zavolan M, Tuschl T. A novel class of small RNAs bind to MILI protein in mouse testes. Nature 2006; 442:203-7. [PMID: 16751777 DOI: 10.1038/nature04916] [Citation(s) in RCA: 1056] [Impact Index Per Article: 58.7] [Received: 02/18/2006] [Accepted: 05/17/2006] [Indexed: 11/09/2022]
Abstract
Small RNAs bound to Argonaute proteins recognize partially or fully complementary nucleic acid targets in diverse gene-silencing processes. A subgroup of the Argonaute proteins--known as the 'Piwi family'--is required for germ- and stem-cell development in invertebrates, and two Piwi members--MILI and MIWI--are essential for spermatogenesis in mouse. Here we describe a new class of small RNAs that bind to MILI in mouse male germ cells, where they accumulate at the onset of meiosis. The sequences of the over 1,000 identified unique molecules share a strong preference for a 5' uridine, but otherwise cannot be readily classified into sequence families. Genomic mapping of these small RNAs reveals a limited number of clusters, suggesting that these RNAs are processed from long primary transcripts. The small RNAs are 26-31 nucleotides (nt) in length--clearly distinct from the 21-23 nt of microRNAs (miRNAs) or short interfering RNAs (siRNAs)--and we refer to them as 'Piwi-interacting RNAs' or piRNAs. Orthologous human chromosomal regions also give rise to small RNAs with the characteristics of piRNAs, but the cloned sequences are distinct. The identification of this new class of small RNAs provides an important starting point to determine the molecular function of Piwi proteins in mammalian spermatogenesis.
Affiliation(s)
- Alexei Aravin
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, New York 10021, USA
|
37
|
Abstract
RIKEN's FANTOM project has revealed many previously unknown coding sequences, as well as an unexpected degree of variation in transcripts resulting from alternative promoter usage and splicing. Ever more transcripts that do not code for proteins have been identified by transcriptome studies, in general. Increasing evidence points to the important cellular roles of such non-coding RNAs (ncRNAs). The distinction of protein-coding RNA transcripts from ncRNA transcripts is therefore an important problem in understanding the transcriptome and carrying out its annotation. Very few in silico methods have specifically addressed this problem. Here, we introduce CONC (for “coding or non-coding”), a novel method based on support vector machines that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, predicted secondary structure content, predicted percentage of exposed residues, compositional entropy, number of homologs from database searches, and alignment entropy. Nucleotide frequencies are also incorporated into the method. Confirmed coding cDNAs for eukaryotic proteins from the Swiss-Prot database constituted the set of true positives, ncRNAs from RNAdb and NONCODE the true negatives. Ten-fold cross-validation suggested that CONC distinguished coding RNAs from ncRNAs at about 97% specificity and 98% sensitivity. Applied to 102,801 mouse cDNAs from the FANTOM3 dataset, our method reliably identified over 14,000 ncRNAs and estimated the total number of ncRNAs to be about 28,000. There are two types of RNA: messenger RNAs (mRNAs), which are translated into proteins, and non-coding RNAs (ncRNAs), which function as RNA molecules. Besides textbook examples such as tRNAs and rRNAs, non-coding RNAs have been found to carry out very diverse functions, from mRNA splicing and RNA modification to translational regulation. 
It has been estimated that non-coding RNAs make up the vast majority of transcription output of higher eukaryotes. Discriminating mRNA from ncRNA has become an important biological and computational problem. The authors describe a computational method based on a machine learning algorithm known as a support vector machine (SVM) that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, secondary structure content, and protein alignment information. The method is applied to the dataset from the FANTOM3 large-scale mouse cDNA sequencing project; it identifies over 14,000 ncRNAs in mouse and estimates the total number of ncRNAs in the FANTOM3 data to be about 28,000.
Affiliation(s)
- Jinfeng Liu
- Columbia University Bioinformatics Center, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America.
|
38
|
Abstract
New structural analysis methods, and a tree formalism re-define and expand the RNA motif concept, unifying what previously appeared to be disparate groups of structures. We find RNA tetraloops at high frequencies, in new contexts, with unexpected lengths, and in novel topologies. The results, with broad implications for RNA structure in general, show that even at this most elementary level of organization, RNA tolerates astounding variation in conformation, length, sequence and context. However the variation is not random; it is well-described by four distinct modes, which are 3-2 switches (backbone topology variations), insertions, deletions and strand clips.
Affiliation(s)
- Eli Hershkovitz
- Departments of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Allen Tannenbaum
- Departments of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Loren Dean Williams
- To whom correspondence should be addressed. Tel: +1 404 894 9752; Fax: +1 404 894 7452;
|
39
|
Hofacker IL. RNAs everywhere: genome-wide annotation of structured RNAs. Genome Inform 2006; 17:281-2. [PMID: 17514830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/15/2023]
Affiliation(s)
- Ivo L Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Austria.
|
40
|
Kaplan JC. Ni gènes, ni junk, mais des TAR/TUF ! Med Sci (Paris) 2005; 21:1005. [PMID: 16274656 DOI: 10.1051/medsci/200521111005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
|
41
|
Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest ARR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, Chiu KP, Choudhary V, Christoffels A, Clutterbuck DR, Crowe ML, Dalla E, Dalrymple BP, de Bono B, Della Gatta G, di Bernardo D, Down T, Engstrom P, Fagiolini M, Faulkner G, Fletcher CF, Fukushima T, Furuno M, Futaki S, Gariboldi M, Georgii-Hemming P, Gingeras TR, Gojobori T, Green RE, Gustincich S, Harbers M, Hayashi Y, Hensch TK, Hirokawa N, Hill D, Huminiecki L, Iacono M, Ikeo K, Iwama A, Ishikawa T, Jakt M, Kanapin A, Katoh M, Kawasawa Y, Kelso J, Kitamura H, Kitano H, Kollias G, Krishnan SPT, Kruger A, Kummerfeld SK, Kurochkin IV, Lareau LF, Lazarevic D, Lipovich L, Liu J, Liuni S, McWilliam S, Madan Babu M, Madera M, Marchionni L, Matsuda H, Matsuzawa S, Miki H, Mignone F, Miyake S, Morris K, Mottagui-Tabar S, Mulder N, Nakano N, Nakauchi H, Ng P, Nilsson R, Nishiguchi S, Nishikawa S, Nori F, Ohara O, Okazaki Y, Orlando V, Pang KC, Pavan WJ, Pavesi G, Pesole G, Petrovsky N, Piazza S, Reed J, Reid JF, Ring BZ, Ringwald M, Rost B, Ruan Y, Salzberg SL, Sandelin A, Schneider C, Schönbach C, Sekiguchi K, Semple CAM, Seno S, Sessa L, Sheng Y, Shibata Y, Shimada H, Shimada K, Silva D, Sinclair B, Sperling S, Stupka E, Sugiura K, Sultana R, Takenaka Y, Taki K, Tammoja K, Tan SL, Tang S, Taylor MS, Tegner J, Teichmann SA, Ueda HR, van Nimwegen E, Verardo R, Wei CL, Yagi K, Yamanishi H, Zabarovsky E, Zhu S, Zimmer A, Hide W, Bult C, Grimmond SM, Teasdale RD, Liu ET, Brusic V, Quackenbush J, Wahlestedt C, Mattick JS, Hume DA, Kai C, Sasaki D, Tomaru Y, Fukuda S, Kanamori-Katayama M, Suzuki M, Aoki J, Arakawa T, Iida J, Imamura K, Itoh M, Kato T, Kawaji H, Kawagashira N, Kawashima T, Kojima M, Kondo S, Konno H, Nakano K, Ninomiya N, 
Nishio T, Okada M, Plessy C, Shibata K, Shiraki T, Suzuki S, Tagami M, Waki K, Watahiki A, Okamura-Oho Y, Suzuki H, Kawai J, Hayashizaki Y. The transcriptional landscape of the mammalian genome. Science 2005; 309:1559-63. [PMID: 16141072 DOI: 10.1126/science.1112014] [Citation(s) in RCA: 2607] [Impact Index Per Article: 137.2] [Indexed: 12/29/2022]
Abstract
This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
|
42
|
Abstract
This week's issues of Science, Science's STKE, and SAGE KE (www.sciencemag.org/sciext/rna/) focus on the increasing complexity that RNA brings to cellular biology. STKE resources include Reviews, Perspectives, and Teaching Resources. STKE specifically looks at the roles for small RNAs in regulation of gene expression and how mRNAs can be selectively activated from cytoplasmic macromolecular structures, RNA granules, to contribute to such processes as synaptic plasticity.
|
43
|
Abstract
RNA is the only molecule known to recapitulate all biochemical functions of life: definition, control and transmission of genetic information, creation of defined three-dimensional structures, enzymatic activities and storage of energy. Because of its versatility and thanks to several recent scientific breakthroughs, RNA became the focus of intense research in molecular medicine at the beginning of the millennium. In particular, mRNA can be seen as a safe and efficient alternative to protein-, recombinant virus- or DNA-based therapies in the field of vaccination. This review summarises the most remarkable advances in this area and presents the advantages and limits of the five different mRNA-based vaccination methods. The paper will present the official, industrial and financial aspects of mRNA-based vaccination that are paving the way for therapeutic and prophylactic drugs with mRNA as the active component.
Affiliation(s)
- Steve Pascolo
- CureVac GmbH, Paul Ehrlich Strasse 15, 72076 Tübingen, Germany.
|
44
|
Yamauchi T, Miyoshi D, Kubodera T, Nishimura A, Nakai S, Sugimoto N. Roles of Mg2+ in TPP-dependent riboswitch. FEBS Lett 2005; 579:2583-8. [PMID: 15862294 DOI: 10.1016/j.febslet.2005.03.074] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.5] [Received: 12/28/2004] [Revised: 03/10/2005] [Accepted: 03/21/2005] [Indexed: 10/25/2022]
Abstract
We quantified the effect of Mg²⁺ on thiamine pyrophosphate (TPP) binding to TPP-dependent thiA riboswitch RNA. The association constant of TPP binding to the riboswitch at 20 °C increased from 1.2 × 10⁶ to 50 × 10⁶ M⁻¹ as the Mg²⁺ concentration increased from 0 to 1 mM. Furthermore, circular dichroic spectra under various conditions showed that 1 mM Mg²⁺ induced a local structural change of the riboswitch, which might be pivotal for TPP binding. These results indicate that a physiological concentration of Mg²⁺ can regulate TPP binding to the thiA riboswitch.
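As a rough worked example of what the reported change in association constant means energetically, the sketch below converts the two constants from the abstract into a fold change and a difference in standard binding free energy, using the textbook relation ΔG° = -RT ln K_a. Only the two constants and the 20 °C temperature come from the abstract; the function name and the choice of kcal/mol units are illustrative assumptions.

```python
import math

# Gas constant in kcal/(mol*K); units chosen for illustration only.
R_KCAL = 1.987204e-3


def binding_free_energy(ka_per_molar: float, temp_celsius: float) -> float:
    """Standard binding free energy (kcal/mol) from an association constant."""
    temp_kelvin = temp_celsius + 273.15
    return -R_KCAL * temp_kelvin * math.log(ka_per_molar)


# Values quoted in the abstract (association constants in M^-1 at 20 degrees C).
ka_no_mg = 1.2e6
ka_1mm_mg = 50e6

fold_change = ka_1mm_mg / ka_no_mg
delta_delta_g = (
    binding_free_energy(ka_1mm_mg, 20.0) - binding_free_energy(ka_no_mg, 20.0)
)

print(f"fold change in Ka: {fold_change:.1f}")
print(f"change in binding free energy: {delta_delta_g:.2f} kcal/mol")
```

Under these assumptions the constant rises about 42-fold, which corresponds to binding that is more favorable by roughly 2.2 kcal/mol.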
Affiliation(s)
- Takahiro Yamauchi
- Frontier Institute for Biomolecular Engineering, Research (FIBER), Konan University, Higashinada-ku, Kobe, Japan
|
45
|
Abstract
A report of the Keystone Symposium 'Diverse roles for RNA in gene regulation', Breckenridge, USA, 8-15 January 2005.
Affiliation(s)
- Nelson C Lau
- Massachusetts General Hospital, Harvard Medical School, 55 Fruit St, Boston, MA 02114, USA
- Eric C Lai
- 545 Life Sciences Addition, University of California, Berkeley, CA 94720-3200, USA
|
46
|
Abstract
MOTIVATION Since the whole genome sequences of many species have been determined, computational prediction of RNA secondary structures and computational identification of those non-coding RNA regions by comparative genomics become important. Therefore, more advanced alignment methods are required. Recently, an approach of structural alignment for RNA sequences has been introduced to solve these problems. Pair hidden Markov models on tree structures (PHMMTSs) proposed by Sakakibara are efficient automata-theoretic models for structural alignment of RNA secondary structures, although PHMMTSs are incapable of handling pseudoknots. On the other hand, tree adjoining grammars (TAGs), a subclass of context-sensitive grammars, are suitable for modeling pseudoknots. Our goal is to extend PHMMTSs by incorporating TAGs to be able to handle pseudoknots. RESULTS We propose pair stochastic TAGs (PSTAGs) for aligning and predicting RNA secondary structures including a simple type of pseudoknot which can represent most known pseudoknot structures. First, we extend PHMMTSs defined on alignment of 'trees' to PSTAGs defined on alignment of 'TAG trees' which represent derivation processes of TAGs and are functionally equivalent to derived trees of TAGs. Then, we develop an efficient dynamic programming algorithm of PSTAGs for obtaining an optimal structural alignment including pseudoknots. We implement the PSTAG algorithm and demonstrate the properties of the algorithm by using it to align and predict several small pseudoknot structures. We believe that our implemented program based on PSTAGs is the first grammar-based and practically executable software for comparative analyses of RNA pseudoknot structures, and, further, non-coding RNAs.
Affiliation(s)
- Hiroshi Matsui
- Keio University, Department of Biosciences and Informatics, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
|
47
|
Gorodkin J. Comparing two K-category assignments by a K-category correlation coefficient. Comput Biol Chem 2005; 28:367-74. [PMID: 15556477 DOI: 10.1016/j.compbiolchem.2004.09.006] [Citation(s) in RCA: 210] [Impact Index Per Article: 11.1] [Received: 09/01/2004] [Revised: 09/16/2004] [Accepted: 09/16/2004] [Indexed: 10/26/2022]
Abstract
Predicted assignments of biological sequences are often evaluated by Matthews correlation coefficient. However, Matthews correlation coefficient applies only to cases where the assignments belong to two categories, and cases with more than two categories are often artificially forced into two categories by considering what belongs and what does not belong to one of the categories, leading to the loss of information. Here, an extended correlation coefficient that applies to K-categories is proposed, and this measure is shown to be highly applicable for evaluating prediction of RNA secondary structure in cases where some predicted pairs go into the category "unknown" due to lack of reliability in predicted pairs or unpaired residues. Hence, predicting base pairs of RNA secondary structure can be a three-category problem. The measure is further shown to be well in agreement with existing performance measures used for ranking protein secondary structure predictions. Server and software is available at http://rk.kvl.dk/.
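The abstract does not state the formula, so the following is only a minimal sketch of the confusion-matrix form in which a K-category generalization of the Matthews correlation coefficient is commonly written. The function name, the row/column convention (rows are true categories, columns are predicted categories), and the choice to return 0.0 when the denominator vanishes are all assumptions made for illustration.

```python
import math
from typing import Sequence


def k_category_correlation(confusion: Sequence[Sequence[int]]) -> float:
    """K-category correlation from a K x K confusion matrix.

    confusion[i][j] counts items whose true category is i and whose
    predicted category is j (an assumed convention).
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    true_counts = [sum(confusion[i]) for i in range(k)]
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts)
    )
    spread_pred = total ** 2 - sum(p * p for p in pred_counts)
    spread_true = total ** 2 - sum(t * t for t in true_counts)
    denominator = math.sqrt(spread_pred * spread_true)
    # Assumed convention: report 0.0 when either assignment uses one category.
    return numerator / denominator if denominator else 0.0


# Three categories, e.g. "paired", "unpaired", "unknown" (made-up counts).
example = [[50, 3, 2], [4, 30, 6], [1, 2, 2]]
print(round(k_category_correlation(example), 3))
```

With two categories this expression reduces to the ordinary Matthews correlation coefficient, which is a convenient sanity check on any implementation.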
Affiliation(s)
- J Gorodkin
- Center for Bioinformatics and Division of Genetics, IBHV, The Royal Veterinary and Agricultural University, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
|
48
|
Helfenbein KG, Fourcade HM, Vanjani RG, Boore JL. The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister group to protostomes. Proc Natl Acad Sci U S A 2004; 101:10639-43. [PMID: 15249679 PMCID: PMC489987 DOI: 10.1073/pnas.0400941101] [Citation(s) in RCA: 108] [Impact Index Per Article: 5.4] [Received: 02/09/2004] [Indexed: 11/18/2022]
Abstract
We report the complete mtDNA sequence from a member of the phylum Chaetognatha (arrow worms). The Paraspadella gotoi mtDNA is highly unusual, missing 23 of the genes commonly found in animal mtDNAs, including atp6, which has otherwise been found universally to be present. Its 14 genes are unusually arranged into two groups, one on each strand. One group is punctuated by numerous noncoding intergenic nucleotides although the other group is tightly packed, having no noncoding nucleotides, leading to speculation that there are two transcription units with differing modes of expression. The phylogenetic position of the Chaetognatha within the Metazoa has long been uncertain, with conflicting or equivocal results from various morphological analyses and rRNA sequence comparisons. Comparisons here of amino acid sequences from mitochondrially encoded proteins give a single most parsimonious tree that supports a position of Chaetognatha as sister to the protostomes studied here. From this analysis, one can more clearly interpret the patterns of evolution of various developmental features, especially regarding the embryological fate of the blastopore.
Affiliation(s)
- Kevin G Helfenbein
- Department of Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109, USA
|
49
|
Abstract
MOTIVATION Pseudoknots have generally been excluded from the prediction of RNA secondary structures due to its difficulty in modeling. Although, several dynamic programming algorithms exist for the prediction of pseudoknots using thermodynamic approaches, they are neither reliable nor efficient. On the other hand, comparative methods are more reliable, but are often done in an ad hoc manner and require expert intervention. Maximum weighted matching, an algorithm for pseudoknot prediction with comparative analysis, suffers from low-prediction accuracy in many cases. RESULTS Here we present an algorithm, iterated loop matching, for reliably and efficiently predicting RNA secondary structures including pseudoknots. The method can utilize either thermodynamic or comparative information or both, thus is able to predict pseudoknots for both aligned and individual sequences. We have tested the algorithm on a number of RNA families. Using 8-12 homologous sequences, the algorithm correctly identifies more than 90% of base-pairs for short sequences and 80% overall. It correctly predicts nearly all pseudoknots and produces very few spurious base-pairs for sequences without pseudoknots. Comparisons show that our algorithm is both more sensitive and more specific than the maximum weighted matching method. In addition, our algorithm has high-prediction accuracy on individual sequences, comparable with the PKNOTS algorithm, while using much less computational resources. AVAILABILITY The program has been implemented in ANSI C and is freely available for academic use at http://www.cse.wustl.edu/~zhang/projects/rna/ilm/ SUPPLEMENTARY INFORMATION http://www.cse.wustl.edu/~zhang/projects/rna/ilm/
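To make concrete what separates a pseudoknot from an ordinary nested secondary structure, the sketch below checks a list of base pairs for crossing pairs, that is, pairs (i, j) and (k, l) with i < k < j < l. This is not the iterated loop matching algorithm described in the abstract; it only illustrates the standard definition, and the function name and 0-based indexing are assumptions.

```python
from itertools import combinations
from typing import Iterable, Tuple


def has_pseudoknot(pairs: Iterable[Tuple[int, int]]) -> bool:
    """Return True if any two base pairs cross (i < k < j < l)."""
    # Normalize each pair so the 5' index comes first, then sort by it.
    normalized = sorted(tuple(sorted(pair)) for pair in pairs)
    for (i, j), (k, l) in combinations(normalized, 2):
        if i < k < j < l:
            return True
    return False


# A plain hairpin: all pairs are nested.
hairpin = [(0, 9), (1, 8), (2, 7)]
# Two stems whose pairs interleave, as in an H-type pseudoknot.
h_type = [(0, 6), (1, 5), (3, 10), (4, 9)]

print(has_pseudoknot(hairpin), has_pseudoknot(h_type))
```

Nested structures can be decomposed into independent substructures, which is what standard dynamic programming relies on; crossing pairs break that decomposition.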
Affiliation(s)
- Jianhua Ruan
- Department of Computer Science, Washington University in St. Louis, St. Louis, MO 63130, USA.
|
50
|
Klosterman PS, Hendrix DK, Tamura M, Holbrook SR, Brenner SE. Three-dimensional motifs from the SCOR, structural classification of RNA database: extruded strands, base triples, tetraloops and U-turns. Nucleic Acids Res 2004; 32:2342-52. [PMID: 15121895 PMCID: PMC419439 DOI: 10.1093/nar/gkh537] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022]
Abstract
Release 2.0.1 of the Structural Classification of RNA (SCOR) database, http://scor.lbl.gov, contains a classification of the internal and hairpin loops in a comprehensive collection of 497 NMR and X-ray RNA structures. This report discusses findings of the classification that have not been reported previously. The SCOR database contains multiple examples of a newly described RNA motif, the extruded helical single strand. Internal loop base triples are classified in SCOR according to their three-dimensional context. These internal loop triples contain several examples of a frequently found motif, the minor groove AGC triple. SCOR also presents the predominant and alternate conformations of hairpin loops, as shown in the most well represented tetraloops, with consensus sequences GNRA, UNCG and ANYA. The ubiquity of the GNRA hairpin turn motif is illustrated by its presence in complex internal loops.
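The consensus sequences named above use IUPAC nucleotide codes (N is any base, R is a purine, Y is a pyrimidine). As a small illustration, the sketch below matches a four-nucleotide loop sequence against the three consensus patterns. SCOR classifies loops by three-dimensional structure, so a sequence match is only a rough proxy; the function names are hypothetical and do not correspond to any SCOR interface.

```python
import re
from typing import List

# IUPAC codes needed for the three consensus patterns in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "N": "[ACGU]",  # any nucleotide
    "R": "[AG]",    # purine
    "Y": "[CU]",    # pyrimidine
}
CONSENSUS = ("GNRA", "UNCG", "ANYA")


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate an IUPAC consensus string into a compiled regex."""
    return re.compile("".join(IUPAC[code] for code in consensus))


def classify_tetraloop(loop: str) -> List[str]:
    """Return the consensus families that a 4-nt loop sequence matches."""
    rna = loop.upper().replace("T", "U")  # accept DNA-style input
    if len(rna) != 4:
        raise ValueError("a tetraloop has exactly four nucleotides")
    return [c for c in CONSENSUS if consensus_to_regex(c).fullmatch(rna)]


for loop in ("GAAA", "UUCG", "AACA", "CCCC"):
    print(loop, classify_tetraloop(loop))
```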
Affiliation(s)
- Peter S Klosterman
- Department of Plant and Microbial Biology, University of California at Berkeley, 111 Koshland Hall, Berkeley, CA 94720-3102, USA
|