1
|
Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
- Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Universidad Nacional de Colombia, Bogotá, Colombia.
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.
- Santa Fe Institute, Santa Fe, NM, USA.
|
2
|
Tieng FYF, Abdullah-Zawawi MR, Md Shahri NAA, Mohamed-Hussein ZA, Lee LH, Mutalib NSA. A Hitchhiker's guide to RNA-RNA structure and interaction prediction tools. Brief Bioinform 2023; 25:bbad421. [PMID: 38040490 PMCID: PMC10753535 DOI: 10.1093/bib/bbad421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/31/2023] [Revised: 10/16/2023] [Accepted: 10/26/2023] [Indexed: 12/03/2023] Open
Abstract
RNA biology has risen to prominence following the discovery of the diverse functions of noncoding RNA (ncRNA). Most untranslated transcripts exert their regulatory functions by forming RNA-RNA complexes via base pairing with complementary sequences in other RNAs. This interplay between RNAs is essential, as it underlies various functional roles in human cells, including genetic translation, RNA splicing, editing, ribosomal RNA maturation, RNA degradation and the regulation of metabolic pathways/riboswitches. Moreover, the pervasive transcription of the human genome allows for the discovery of novel genomic functions via RNA interactome investigation. The advancement of experimental procedures has resulted in an explosion of documented data, necessitating the development of efficient and precise computational tools and algorithms. This review provides an extensive update on RNA-RNA interaction (RRI) analysis via thermodynamic- and comparative-based RNA secondary structure prediction (RSP) and RNA-RNA interaction prediction (RIP) tools and their general functions. We also highlight the current knowledge of RRIs and the limitations of RNA interactome mapping via experimental data. Then, the gap between RSP and RIP, the importance of RNA homologues, the relationship between pseudoknots, and RNA folding thermodynamics are discussed. It is hoped that these emerging prediction tools will deepen the understanding of RNA-associated interactions in human diseases and hasten treatment processes.
Affiliation(s)
- Francis Yew Fu Tieng
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Nur Alyaa Afifah Md Shahri
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Zeti-Azura Mohamed-Hussein
- Institute of Systems Biology (INBIOSIS), UKM, Selangor 43600, Malaysia
- Department of Applied Physics, Faculty of Science and Technology, UKM, Selangor 43600, Malaysia
- Learn-Han Lee
- Sunway Microbiomics Centre, School of Medical and Life Sciences, Sunway University, Sunway City 47500, Malaysia
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
- Nurul-Syakima Ab Mutalib
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
- Faculty of Health Sciences, UKM, Kuala Lumpur 50300, Malaysia
|
3
|
Ai L, Jiang X, Zhang K, Cui C, Liu B, Tan W. Tools and techniques for the discovery of therapeutic aptamers: recent advances. Expert Opin Drug Discov 2023; 18:1393-1411. [PMID: 37840268 DOI: 10.1080/17460441.2023.2264187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 09/25/2023] [Indexed: 10/17/2023]
Abstract
INTRODUCTION The pursuit of novel therapeutic agents for serious diseases such as cancer has been a global endeavor. Aptamers, characterized by high affinity, programmability, low immunogenicity, and rapid permeability, hold great promise for the treatment of diseases. Yet obtaining approval for therapeutic aptamers remains challenging. Consequently, researchers are increasingly devoted to exploring innovative strategies and technologies to advance the development of these therapeutic aptamers. AREAS COVERED The authors provide a comprehensive summary of the recent progress of the SELEX (Systematic Evolution of Ligands by EXponential enrichment) technique, and how the integration of modern tools has facilitated the identification of therapeutic aptamers. Additionally, the engineering of aptamers to enhance their functional attributes, such as inhibiting and targeting, is discussed, demonstrating the potential to broaden their scope of utility. EXPERT OPINION The grand potential of aptamers and the insufficient development of relevant drugs have spurred countless efforts to stimulate their discovery and application in the therapeutic field. While SELEX techniques have undergone significant developments with the aid of advanced analysis instruments and ingeniously updated aptameric engineering strategies, several challenges still impede their clinical translation. A key challenge lies in the insufficient understanding of binding conformation and susceptibility to degradation under physiological conditions. Despite the hurdles, our opinion is optimistic. With continued progress in overcoming these obstacles, the widespread utilization of aptamers for clinical therapy is envisioned to become a reality soon.
Affiliation(s)
- Lili Ai
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, College of Biology, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, Hunan, The People's Republic of China
- Xinyi Jiang
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, College of Biology, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, Hunan, The People's Republic of China
- Kejing Zhang
- Department of Geriatrics and Department of General Surgery, Xiangya Hospital, Central South University, Changsha, Hunan, The People's Republic of China
- The Key Laboratory of Zhejiang Province for Aptamers and Theranostics, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, The People's Republic of China
- Cheng Cui
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, College of Biology, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, Hunan, The People's Republic of China
- Bo Liu
- Department of Geriatrics and Department of General Surgery, Xiangya Hospital, Central South University, Changsha, Hunan, The People's Republic of China
- Weihong Tan
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, College of Biology, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, Hunan, The People's Republic of China
- The Key Laboratory of Zhejiang Province for Aptamers and Theranostics, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, The People's Republic of China
- Institute of Molecular Medicine (IMM), Renji Hospital, School of Medicine and College of Chemistry and Chemical Engineering, Shanghai Jiao Tong University, Shanghai, The People's Republic of China
|
4
|
Justyna M, Antczak M, Szachniuk M. Machine learning for RNA 2D structure prediction benchmarked on experimental data. Brief Bioinform 2023; 24:7140288. [PMID: 37096592 DOI: 10.1093/bib/bbad153] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 01/13/2023] [Revised: 03/15/2023] [Accepted: 03/29/2023] [Indexed: 04/26/2023] Open
Abstract
Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL), and 6 are control methods based on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.
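Benchmarks of secondary structure predictors such as the one summarized above are commonly scored at the base-pair level. The following is a minimal sketch of that idea, assuming pseudoknot-free dot-bracket strings as input and precision, recall and F1 as metrics; the abstract does not state which metrics the authors used, and the function names `parse_dot_bracket` and `base_pair_scores` are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j), i < j, encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {pos}")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def base_pair_scores(predicted, reference):
    """Precision, recall and F1 of predicted base pairs against a reference structure."""
    pred = parse_dot_bracket(predicted)
    ref = parse_dot_bracket(reference)
    true_pos = len(pred & ref)  # pairs present in both structures
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative structures only, not data from the review.
    print(base_pair_scores("(((.....)))", "((((...))))"))
```

In this toy case the prediction misses the innermost pair of the reference, so precision stays at 1 while recall drops to 3/4.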
Affiliation(s)
- Marek Justyna
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Maciej Antczak
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Marta Szachniuk
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
|
5
|
Genome-Wide RNA Secondary Structure Prediction. Methods Mol Biol 2023; 2586:35-48. [PMID: 36705897 DOI: 10.1007/978-1-0716-2768-6_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Information on RNA secondary structure has been widely applied to the inference of RNA function. However, classical prediction methods are not feasible for long RNAs such as mRNAs because of computational time and numerical errors. To overcome these problems, sliding window methods have been applied, but their results are not directly comparable to global RNA structure prediction. In this chapter, we introduce ParasoR, a method designed for parallel computation of genome-wide RNA secondary structures. To enable genome-wide prediction, ParasoR distributes the dynamic programming (DP) matrices required for structure prediction to multiple computational nodes. Using a database that stores not the original DP variables but ratios of variables, ParasoR can locally compute structure scores such as stem probability or accessibility on demand. A comprehensive analysis of local secondary structures by ParasoR is expected to be a promising way to detect the statistical constraints on long RNAs.
|
6
|
Ono Y, Asai K. Rtools: A Web Server for Various Secondary Structural Analyses on Single RNA Sequences. Methods Mol Biol 2023; 2586:1-14. [PMID: 36705895 DOI: 10.1007/978-1-0716-2768-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Predicting the secondary structures of RNA molecules is an essential step to characterize their functions, but the thermodynamic probability of any prediction is generally small. On the other hand, there are a few tools for calculating and visualizing various secondary structural information from RNA sequences. We implemented a web server that calculates in parallel various features of secondary structures: different types of secondary structure predictions, the marginal probabilities for local structural contexts, accessibilities of the subsequences, the energy changes by arbitrary base mutations, and the measures for validations of the predicted secondary structures. The web server, which integrates the software tools CentroidFold, CentroidHomfold, IPknot, CapR, Raccess, Rchange, RintD, and RintW, is available at http://rtools.cbrc.jp.
Affiliation(s)
- Yukiteru Ono
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan.
|
7
|
Xia K, Liu X, Wee J. Persistent Homology for RNA Data Analysis. Methods Mol Biol 2023; 2627:211-229. [PMID: 36959450 DOI: 10.1007/978-1-0716-2974-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Molecular representations are of great importance for machine learning models in RNA data analysis. Essentially, efficient molecular descriptors or fingerprints that characterize the intrinsic structural and interactional information of RNAs can significantly boost the performance of all learning models. In this paper, we introduce two persistent models, persistent homology and persistent spectral models, for RNA structure and interaction representations and their applications in RNA data analysis. Different from traditional geometric and graph representations, persistent homology is built on simplicial complexes, which are a generalization of graph models to higher-dimensional situations. Hypergraphs are a further generalization of simplicial complexes, and hypergraph-based embedded persistent homology has been proposed recently. Moreover, persistent spectral models, which combine a filtration process with spectral models, including spectral graph, spectral simplicial complex, and spectral hypergraph, are proposed for molecular representation. The persistent attributes for RNAs can be obtained from these two persistent models and further combined with machine learning models for RNA structure, flexibility, dynamics, and function analysis.
Affiliation(s)
- Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore.
- Xiang Liu
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
- Chern Institute of Mathematics and LPMC, Nankai University, Tianjin, China
- JunJie Wee
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
|
8
|
Fukunaga T, Hamada M. LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods. BIOINFORMATICS ADVANCES 2022; 2:vbac078. [PMID: 36699418 PMCID: PMC9710674 DOI: 10.1093/bioadv/vbac078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 10/13/2022] [Accepted: 10/21/2022] [Indexed: 11/05/2022]
Abstract
Motivation RNA consensus secondary structure prediction from aligned sequences is a powerful approach for improving the secondary structure prediction accuracy. However, because the computational complexities of conventional prediction tools scale with the cube of the alignment lengths, their application to long RNA sequences, such as viral RNAs or long non-coding RNAs, requires significant computational time. Results In this study, we developed LinAliFold and CentroidLinAliFold, fast RNA consensus secondary structure prediction tools based on minimum free energy and maximum expected accuracy principles, respectively. We achieved software acceleration using beam search methods that were successfully used for fast secondary structure prediction from a single RNA sequence. Benchmark analyses showed that LinAliFold and CentroidLinAliFold were much faster than the existing methods while preserving the prediction accuracy. As an empirical application, we predicted the consensus secondary structure of coronaviruses with approximately 30 000 nt in 5 and 79 min by LinAliFold and CentroidLinAliFold, respectively. We confirmed that the predicted consensus secondary structure of coronaviruses was consistent with the experimental results. Availability and implementation The source codes of LinAliFold and CentroidLinAliFold are freely available at https://github.com/fukunagatsu/LinAliFold-CentroidLinAliFold. Supplementary information Supplementary data are available at Bioinformatics Advances online.
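To make the cubic scaling mentioned in this abstract concrete, here is a minimal sketch of a classic Nussinov-style dynamic program for a single sequence. It is not the LinAliFold or CentroidLinAliFold algorithm and uses no energy model, alignment, or beam search; it simply maximizes the number of nested base pairs and is offered as an assumption-laden illustration of where the three nested loops in conventional folding recursions come from. The names `can_pair` and `nussinov_max_pairs` and the `min_loop` default are hypothetical choices.

```python
# Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """True if nucleotides a and b can form a canonical or wobble pair."""
    return (a, b) in ALLOWED_PAIRS


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, keeping at least min_loop unpaired bases per hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # loop 1: subsequence length
        for i in range(n - span):                # loop 2: start position
            j = i + span
            score = best[i][j - 1]               # case: position j is unpaired
            for k in range(i, j - min_loop):     # loop 3: partner k of position j
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    # Illustrative sequence only.
    print(nussinov_max_pairs("GGGAAACCC"))
```

The three nested loops over `span`, `i` and `k` give the O(n^3) running time; beam search methods of the kind the abstract describes avoid filling the whole table by keeping only a bounded number of candidate states per position.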
Affiliation(s)
- Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University, Tokyo 1698555, Japan
- Computational Bio Big-Data Open Innovation Laboratory, AIST-Waseda University, Tokyo 1698555, Japan
|
9
|
Zhang J, Fei Y, Sun L, Zhang QC. Advances and opportunities in RNA structure experimental determination and computational modeling. Nat Methods 2022; 19:1193-1207. [PMID: 36203019 DOI: 10.1038/s41592-022-01623-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 06/10/2022] [Accepted: 08/23/2022] [Indexed: 11/09/2022]
Abstract
Beyond transferring genetic information, RNAs are molecules with diverse functions that include catalyzing biochemical reactions and regulating gene expression. Most of these activities depend on RNAs' specific structures. Therefore, accurately determining RNA structure is integral to advancing our understanding of RNA functions. Here, we summarize the state-of-the-art experimental and computational technologies developed to evaluate RNA secondary and tertiary structures. We also highlight how the rapid increase of experimental data facilitates the integrative modeling approaches for better resolving RNA structures. Finally, we provide our thoughts on the latest advances and challenges in RNA structure determination methods, as well as on future directions for both experimental approaches and artificial intelligence-based computational tools to model RNA structure. Ultimately, we hope the technological advances will deepen our understanding of RNA biology and facilitate RNA structure-based biomedical research such as designing specific RNA structures for therapeutics and deploying RNA-targeting small-molecule drugs.
Affiliation(s)
- Jinsong Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
- Yuhan Fei
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
- Lei Sun
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
- Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
|
10
|
RNA secondary structure packages evaluated and improved by high-throughput experiments. Nat Methods 2022; 19:1234-1242. [PMID: 36192461 PMCID: PMC9839360 DOI: 10.1038/s41592-022-01605-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 05/29/2020] [Accepted: 08/10/2022] [Indexed: 01/17/2023]
Abstract
Despite the popularity of computer-aided study and design of RNA molecules, little is known about the accuracy of commonly used structure modeling packages in tasks sensitive to ensemble properties of RNA. Here, we demonstrate that the EternaBench dataset, a set of more than 20,000 synthetic RNA constructs designed on the RNA design platform Eterna, provides incisive discriminative power in evaluating current packages in ensemble-oriented structure prediction tasks. We find that CONTRAfold and RNAsoft, packages with parameters derived through statistical learning, achieve consistently higher accuracy than more widely used packages in their standard settings, which derive parameters primarily from thermodynamic experiments. We hypothesized that training a multitask model with the varied data types in EternaBench might improve inference on ensemble-based prediction tasks. Indeed, the resulting model, named EternaFold, demonstrated improved performance that generalizes to diverse external datasets including complete messenger RNAs, viral genomes probed in human cells and synthetic designs modeling mRNA vaccines.
|
11
|
Matarrese MAG, Loppini A, Nicoletti M, Filippi S, Chiodo L. Assessment of tools for RNA secondary structure prediction and extraction: a final-user perspective. J Biomol Struct Dyn 2022:1-20. [DOI: 10.1080/07391102.2022.2116110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Margherita A. G. Matarrese
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Jane and John Justin Neurosciences Center, Cook Children’s Health Care System, TX, USA
- Department of Bioengineering, The University of Texas at Arlington, Arlington, TX, USA
- Alessandro Loppini
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Martina Nicoletti
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Simonetta Filippi
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Letizia Chiodo
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
|
12
|
Shen C, Chen Y, Xiao F, Yang T, Wang X, Chen S, Tang J, Liao Z. BAT-Net: An enhanced RNA Secondary Structure prediction via bidirectional GRU-based network with attention mechanism. Comput Biol Chem 2022; 101:107765. [DOI: 10.1016/j.compbiolchem.2022.107765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 08/24/2022] [Indexed: 11/03/2022]
|
13
|
Gray M, Chester S, Jabbari H. KnotAli: informed energy minimization through the use of evolutionary information. BMC Bioinformatics 2022; 23:159. [PMID: 35505276 PMCID: PMC9063079 DOI: 10.1186/s12859-022-04673-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2021] [Accepted: 04/05/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Improving the prediction of structures, especially those containing pseudoknots (structures with crossing base pairs) is an ongoing challenge. Homology-based methods utilize structural similarities within a family to predict the structure. However, their prediction is limited to the consensus structure, and by the quality of the alignment. Minimum free energy (MFE) based methods, on the other hand, do not rely on familial information and can predict structures of novel RNA molecules. Their prediction normally suffers from inaccuracies due to their underlying energy parameters. RESULTS We present a new method for prediction of RNA pseudoknotted secondary structures that combines the strengths of MFE prediction and alignment-based methods. KnotAli takes a multiple RNA sequence alignment as input and uses covariation and thermodynamic energy minimization to predict possibly pseudoknotted secondary structures for each individual sequence in the alignment. We compared KnotAli's performance to that of three other alignment-based programs, two that can handle pseudoknotted structures and one control, on a large data set of 3034 RNA sequences with varying lengths and levels of sequence conservation from 10 families with pseudoknotted and pseudoknot-free reference structures. We produced sequence alignments for each family using two well-known sequence aligners (MUSCLE and MAFFT). CONCLUSIONS We found KnotAli's performance to be superior in 6 of the 10 families for MUSCLE and 7 of the 10 for MAFFT. While both KnotAli and Cacofold use background noise correction strategies, we found KnotAli's predictions to be less dependent on the alignment quality. KnotAli can be found online at the Zenodo image: https://doi.org/10.5281/zenodo.5794719.
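Covariation between alignment columns, which this abstract names as one of KnotAli's inputs, is often quantified with mutual information. The sketch below assumes that measure for illustration only; the abstract does not specify how KnotAli scores covariation or corrects for background noise, and the function name `mutual_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j, gap="-"):
    """Mutual information (in bits) between columns i and j, skipping rows with a gap in either column."""
    pairs = [(row[i], row[j]) for row in alignment if row[i] != gap and row[j] != gap]
    total = len(pairs)
    if total == 0:
        return 0.0
    joint = Counter(pairs)                    # counts of (base_i, base_j)
    left = Counter(a for a, _ in pairs)       # counts of base_i alone
    right = Counter(b for _, b in pairs)      # counts of base_j alone
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / total
        # p_ab / (p_a * p_b), written with raw counts.
        ratio = count * total / (left[a] * right[b])
        mi += p_ab * math.log2(ratio)
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 4 change together and stay complementary.
    toy_alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
    print(mutual_information(toy_alignment, 0, 4))
    print(mutual_information(toy_alignment, 1, 2))
```

Columns that mutate in a coordinated way while preserving complementarity score high, whereas fully conserved columns score zero, which is one reason covariation-based methods depend on alignment depth and quality.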
Affiliation(s)
- Mateo Gray
- Department of Computer Science, University of Victoria, Victoria, Canada
- Sean Chester
- Department of Computer Science, University of Victoria, Victoria, Canada
- Hosna Jabbari
- Department of Computer Science, University of Victoria, Victoria, Canada
- Institute on Aging and Lifelong Health, University of Victoria, Victoria, Canada
|
14
|
Zambrano RAI, Hernandez-Perez C, Takahashi MK. RNA Structure Prediction, Analysis, and Design: An Introduction to Web-Based Tools. Methods Mol Biol 2022; 2518:253-269. [PMID: 35666450 DOI: 10.1007/978-1-0716-2421-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Understanding RNA structure has become critical in the study of RNA in their roles as mediators of biological processes. To aid in these studies, computational algorithms that utilize thermodynamics have been developed to predict RNA secondary structure. Due to the importance of intermolecular interactions, the algorithms have been expanded to determine and predict RNA-RNA hybridization. This chapter discusses popular webservers with the tools for RNA secondary structure prediction, RNA-RNA hybridization, and design. We address key features that distinguish common-functioning programs and their purposes for the interests of the user. Ultimately, we hope this review elucidates web-based tools researchers may take advantage of in their investigations of RNA structure and function.
Affiliation(s)
- Melissa K Takahashi
- Department of Biology, California State University Northridge, Northridge, CA, USA.
|
15
|
Fairman CW, Lever AML, Kenyon JC. Evaluating RNA Structural Flexibility: Viruses Lead the Way. Viruses 2021; 13:v13112130. [PMID: 34834937 PMCID: PMC8624864 DOI: 10.3390/v13112130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/08/2021] [Revised: 10/12/2021] [Accepted: 10/18/2021] [Indexed: 12/11/2022] Open
Abstract
Our understanding of RNA structure has lagged behind that of proteins and most other biological polymers, largely because of its ability to adopt multiple, and often very different, functional conformations within a single molecule. Flexibility and multifunctionality appear to be its hallmarks. Conventional biochemical and biophysical techniques all have limitations in solving RNA structure. To address this, recent years have seen the emergence of a wide diversity of techniques applied to RNA structural analysis and an accompanying appreciation of its ubiquity and versatility. Viral RNA is a particularly productive area to study in that this economy of function within a single molecule admirably suits the minimalist lifestyle of viruses. Here, we review the major techniques that are being used to elucidate RNA conformational flexibility and exemplify how the structure and function are, as in all biology, tightly linked.
Affiliation(s)
- Andrew M. L. Lever
- Department of Medicine, Cambridge University, Level 5, Addenbrookes’ Hospital (Box 157), Cambridge CB2 0QQ, UK
- Correspondence: (A.M.L.L.); (J.C.K.); Tel.: +44-(0)-1223-747308 (A.M.L.L. & J.C.K.)
- Julia C. Kenyon
- Homerton College, University of Cambridge, Cambridge CB2 8PH, UK;
- Department of Medicine, Cambridge University, Level 5, Addenbrookes’ Hospital (Box 157), Cambridge CB2 0QQ, UK
- Correspondence: (A.M.L.L.); (J.C.K.); Tel.: +44-(0)-1223-747308 (A.M.L.L. & J.C.K.)
16
Learning the Fastest RNA Folding Path Based on Reinforcement Learning and Monte Carlo Tree Search. Molecules 2021; 26:molecules26154420. [PMID: 34361572 PMCID: PMC8347524 DOI: 10.3390/molecules26154420] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/07/2021] [Revised: 07/17/2021] [Accepted: 07/20/2021] [Indexed: 11/17/2022] Open
Abstract
RNA molecules participate in many important biological processes, and they need to fold into well-defined secondary and tertiary structures to realize their functions. Like the well-known protein folding problem, there is also an RNA folding problem. The folding problem includes two aspects: structure prediction and folding mechanism. Although the former has been widely studied, the latter is still not well understood. Here we present a deep reinforcement learning algorithm, 2dRNA-Fold, to study the fastest folding paths of RNA secondary structure. 2dRNA-Fold uses a neural network combined with Monte Carlo tree search to select residue pairings step by step for a given RNA sequence until the final secondary structure is formed. We apply 2dRNA-Fold to several short RNA molecules and one longer RNA, 1Y26, and find that their fastest folding paths show some interesting features. 2dRNA-Fold is further trained using a set of RNA molecules from the bpRNA dataset and is used to predict RNA secondary structure. Since the scoring that 2dRNA-Fold uses to determine the next step is based on possible base pairings, the learned or predicted fastest folding path may not agree with the actual folding paths determined by free energy according to physical laws.
17
Singh J, Paliwal K, Singh J, Zhou Y. RNA Backbone Torsion and Pseudotorsion Angle Prediction Using Dilated Convolutional Neural Networks. J Chem Inf Model 2021; 61:2610-2622. [PMID: 34037398 DOI: 10.1021/acs.jcim.1c00153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022]
Abstract
RNA three-dimensional structure prediction has relied on a predicted or experimentally determined secondary structure as a restraint to reduce the conformational sampling space. However, the secondary-structure restraints are limited to paired bases, and the conformational space of the ribose-phosphate backbone is still too large to be sampled efficiently. Here, we employed a dilated convolutional neural network to predict backbone torsion and pseudotorsion angles using a single RNA sequence as input. The method, called SPOT-RNA-1D, was trained on a high-resolution training data set and tested on three independent, nonredundant, and high-resolution test sets. The proposed method yields substantially smaller mean absolute errors than baseline predictors based on random predictions and on helix conformations according to actual angle distributions. The mean absolute errors for the three test sets range from 14° to 44° for different angles, compared to 17° to 62° for random prediction and 14° to 58° for helix prediction. The method also accurately recovers the overall patterns of single or pairwise angle distributions. In general, torsion angles further away from the bases, and those associated with unpaired bases or with paired bases involved in tertiary interactions, are more difficult to predict. Compared to the best models in RNA-puzzles experiments, SPOT-RNA-1D yielded more accurate dihedral angles and, thus, is potentially useful as a model quality indicator and as a source of restraints for RNA structure prediction, as in protein structure prediction.
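One practical detail behind error figures such as these is that torsion angles are periodic, so 179° and -179° differ by 2°, not 358°. The following is a minimal Python sketch of a mean absolute error that respects this wrap-around. It is an illustration only: the function names are hypothetical, and the abstract does not state exactly how SPOT-RNA-1D computes its errors.

```python
def angular_difference(pred_deg, true_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(pred_deg - true_deg) % 360.0
    # Going the "other way round" the circle may be shorter.
    return min(diff, 360.0 - diff)


def mean_absolute_angular_error(pred, true):
    """Mean of the wrap-around differences over paired angle lists (degrees)."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(angular_difference(p, t) for p, t in zip(pred, true))
    return total / len(pred)
```

A naive `abs(pred - true)` would report an error of 358° for the pair (179°, -179°), which is why the wrap-around step matters when comparing angle predictors.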
Affiliation(s)
- Jaswinder Singh
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Jaspreet Singh
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland 4222, Australia; Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China; Peking University Shenzhen Graduate School, Shenzhen 518055, P.R. China
18
Schmidt CM, Smolke CD. A convolutional neural network for the prediction and forward design of ribozyme-based gene-control elements. eLife 2021; 10:59697. [PMID: 33860764 PMCID: PMC8128436 DOI: 10.7554/elife.59697] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/05/2020] [Accepted: 04/15/2021] [Indexed: 12/12/2022] Open
Abstract
Ribozyme switches are a class of RNA-encoded genetic switch that support conditional regulation of gene expression across diverse organisms. An improved elucidation of the relationships between sequence, structure, and activity can improve our capacity for de novo rational design of ribozyme switches. Here, we generated data on the activity of hundreds of thousands of ribozyme sequences. Using automated structural analysis and machine learning, we leveraged these large data sets to develop predictive models that estimate the in vivo gene-regulatory activity of a ribozyme sequence. These models supported the de novo design of ribozyme libraries with low mean basal gene-regulatory activities and new ribozyme switches that exhibit changes in gene-regulatory activity in the presence of a target ligand, producing functional switches for four out of five aptamers. Our work examines how biases in the model and the data set that affect prediction accuracy can arise and demonstrates that machine learning can be applied to RNA sequences to predict gene-regulatory activity, providing the basis for design tools for functional RNAs.
Affiliation(s)
- Calvin M Schmidt
- Department of Bioengineering, Stanford University, Stanford, United States
- Christina D Smolke
- Department of Bioengineering, Stanford University, Stanford, United States; Chan Zuckerberg Biohub, San Francisco, United States
19
Abstract
Ribonucleic acid (RNA) molecules perform a variety of vital roles in all living cells. Their biological function depends on their structure and dynamics, both of which are difficult to determine experimentally but can be theoretically inferred from the RNA sequence. SimRNA is one of the computational methods for molecular simulations of RNA 3D structure formation. The method is based on a simplified (coarse-grained) representation of nucleotide chains, a statistically derived model of interactions (statistical potential), and the Monte Carlo method as a conformational sampling scheme. The current version of SimRNA (3.22) is able to predict basic topologies of RNA molecules with sizes up to about 50-70 nucleotides, based on their sequences only, and larger molecules if supplied with appropriate distance restraints. The user can specify various types of restraints, including secondary structure, pairwise atom-atom distances, and positions of atoms. SimRNA can also be used for studying systems composed of several RNA chains. Because SimRNA is a folding simulation method, it allows folding pathways to be examined and gives an approximate view of the energy landscape.
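Monte Carlo conformational sampling of the kind mentioned in this abstract can be illustrated with the standard Metropolis acceptance rule. The sketch below is generic Python, not SimRNA code: the function name, the energy units, and the temperature parameter are assumptions made for illustration, and the abstract does not specify which acceptance scheme SimRNA uses.

```python
import math
import random


def metropolis_accept(delta_energy, temperature, rng=random):
    """Decide whether to accept a proposed conformational move.

    delta_energy: energy of the proposed state minus the current one
                  (arbitrary units, hypothetical for this sketch).
    temperature:  positive scaling factor in the same units.
    rng:          object with a random() method returning a float in [0, 1).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Moves that lower (or keep) the energy are always accepted.
    if delta_energy <= 0:
        return True
    # Uphill moves are accepted with probability exp(-dE / T).
    return rng.random() < math.exp(-delta_energy / temperature)
```

Accepting some uphill moves is what lets such a sampler escape local minima, which is one reason simulation trajectories can give an approximate picture of an energy landscape rather than a single end state.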
20
Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, Griffiths-Jones S, Toffano-Nioche C, Gautheret D, Weinberg Z, Rivas E, Eddy SR, Finn RD, Bateman A, Petrov AI. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res 2021; 49:D192-D200. [PMID: 33211869 PMCID: PMC7779021 DOI: 10.1093/nar/gkaa1047] [Citation(s) in RCA: 392] [Impact Index Per Article: 130.7] [Received: 09/15/2020] [Revised: 10/14/2020] [Accepted: 10/21/2020] [Indexed: 12/15/2022] Open
Abstract
Rfam is a database of RNA families where each of the 3444 families is represented by a multiple sequence alignment of known RNA sequences and a covariance model that can be used to search for additional members of the family. Recent developments have involved expert collaborations to improve the quality and coverage of Rfam data, focusing on microRNAs, viral and bacterial RNAs. We have completed the first phase of synchronising microRNA families in Rfam and miRBase, creating 356 new Rfam families and updating 40. We established a procedure for comprehensive annotation of viral RNA families starting with Flavivirus and Coronaviridae RNAs. We have also increased the coverage of bacterial and metagenome-based RNA families from the ZWD database. These developments have enabled a significant growth of the database, with the addition of 759 new families in Rfam 14. To facilitate further community contribution to Rfam, expert users are now able to build and submit new families using the newly developed Rfam Cloud family curation system. New Rfam website features include a new sequence similarity search powered by RNAcentral, as well as search and visualisation of families with pseudoknots. Rfam is freely available at https://rfam.org.
Affiliation(s)
- Ioanna Kalvari
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eric P Nawrocki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Nancy Ontiveros-Palacios
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joanna Argasinska
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kevin Lamkiewicz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
- Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
- Sam Griffiths-Jones
- Faculty of Biology, Medicine and Health, University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Claire Toffano-Nioche
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Daniel Gautheret
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Zasha Weinberg
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Centre for Bioinformatics, Leipzig University, 04107 Leipzig, Germany
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Sean R Eddy
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA; John A. Paulson School of Engineering and Applied Science, Harvard University, Cambridge, MA 02138, USA
- Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anton I Petrov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
21
Bossanyi MA, Carpentier V, Glouzon JPS, Ouangraoua A, Anselmetti Y. aliFreeFoldMulti: alignment-free method to predict secondary structures of multiple RNA homologs. NAR Genom Bioinform 2020; 2:lqaa086. [PMID: 33575631 PMCID: PMC7671329 DOI: 10.1093/nargab/lqaa086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2020] [Accepted: 10/19/2020] [Indexed: 11/18/2022] Open
Abstract
Predicting RNA structure is crucial for understanding RNA’s mechanism of action. Comparative approaches for the prediction of RNA structures can be classified into four main strategies. The first three (align-and-fold, align-then-fold and fold-then-align) exploit multiple sequence alignments to improve the accuracy of conserved RNA-structure prediction. Align-and-fold methods generally perform better, but are also typically slower than the other alignment-based methods. The fourth strategy, alignment-free, consists of predicting the conserved RNA structure without relying on sequence alignment. This strategy has the advantage of being the fastest, while predicting accurate structures through the use of latent representations of the candidate structures for each sequence. This paper presents aliFreeFoldMulti, an extension of the aliFreeFold algorithm. This algorithm predicts a representative secondary structure of multiple RNA homologs by using a vector representation of their suboptimal structures. aliFreeFoldMulti improves on aliFreeFold by additionally computing the conserved structure for each sequence. aliFreeFoldMulti is assessed by comparing its prediction performance and time efficiency with a set of leading RNA-structure prediction methods. aliFreeFoldMulti has the lowest computing times and the highest maximum accuracy scores. It achieves average structure prediction accuracy comparable to that of the other methods, except TurboFoldII, which is the best in terms of average accuracy but has the highest computing times. We present aliFreeFoldMulti as an illustration of the potential of alignment-free approaches to provide fast and accurate RNA-structure prediction methods.
Affiliation(s)
- Marc-André Bossanyi
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Valentin Carpentier
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Jean-Pierre S Glouzon
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Aïda Ouangraoua
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Yoann Anselmetti
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
22
Mao K, Wang J, Xiao Y. Prediction of RNA secondary structure with pseudoknots using coupled deep neural networks. Biophysics Reports 2020. [DOI: 10.1007/s41048-020-00114-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/17/2023] Open
23
Schwarz M, Vohradský J, Modrák M, Pánek J. rboAnalyzer: A Software to Improve Characterization of Non-coding RNAs From Sequence Database Search Output. Front Genet 2020; 11:675. [PMID: 32849767 PMCID: PMC7401326 DOI: 10.3389/fgene.2020.00675] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/02/2019] [Accepted: 06/02/2020] [Indexed: 12/12/2022] Open
Abstract
Searching for similar sequences in a database via BLAST or a similar tool is one of the most common bioinformatics tasks, both in general and for non-coding RNAs in particular. However, the results of the search might be difficult to interpret due to the presence of partial matches to the database subject sequences. Here, we present rboAnalyzer, a tool that helps with interpreting sequence search results by (1) extending partial matches into plausible full-length subject sequences, (2) predicting homology of RNAs represented by full-length subject sequences to the query RNA, (3) pooling information across homologous RNAs found in the search results and public databases such as Rfam to predict more reliable secondary structures for all matches, and (4) contextualizing the matches by providing the prediction results and other relevant information in a rich graphical output. Using predicted full-length matches improves secondary structure prediction and makes rboAnalyzer robust with regard to identification of homology. The output of the tool should help the user to reliably characterize non-coding RNAs in BLAST output. The usefulness of rboAnalyzer and its ability to correctly extend partial matches to full length are demonstrated on known homologous RNAs. To allow the user to use custom databases and search options, rboAnalyzer accepts any search results as a text file in the BLAST format. The main output is an interactive HTML page displaying the computed characteristics and other context of the matches. The output can also be exported in appropriate sequence and/or secondary structure formats.
Affiliation(s)
- Marek Schwarz
- Laboratory of Bioinformatics, Institute of Microbiology, Czech Academy of Sciences, Prague, Czechia
- Jiří Vohradský
- Laboratory of Bioinformatics, Institute of Microbiology, Czech Academy of Sciences, Prague, Czechia
- Martin Modrák
- Laboratory of Bioinformatics, Institute of Microbiology, Czech Academy of Sciences, Prague, Czechia
- Josef Pánek
- Laboratory of Bioinformatics, Institute of Microbiology, Czech Academy of Sciences, Prague, Czechia
24
Müller T, Miladi M, Hutter F, Hofacker I, Will S, Backofen R. The locality dilemma of Sankoff-like RNA alignments. Bioinformatics 2020; 36:i242-i250. [PMID: 32657398 PMCID: PMC7355259 DOI: 10.1093/bioinformatics/btaa431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
Motivation: Elucidating the functions of non-coding RNAs by homology has been strongly limited due to fundamental computational and modeling issues. While existing simultaneous alignment and folding (SA&F) algorithms successfully align homologous RNAs with precisely known boundaries (global SA&F), the more pressing problem of identifying new classes of homologous RNAs in the genome (local SA&F) is intrinsically more difficult and much less understood. Typically, the length of local alignments is strongly overestimated and alignment boundaries are dramatically mispredicted. We hypothesize that local SA&F approaches are compromised this way due to a score bias, which is caused by the contribution of RNA structure similarity to their overall alignment score.
Results: In the light of this hypothesis, we study pairwise local SA&F for the first time systematically, based on a novel local RNA alignment benchmark set and quality measure. First, we vary the relative influence of structure similarity compared to sequence similarity. Putting more emphasis on the structure component leads to overestimating the length of local alignments. This clearly shows the bias of current scores and strongly hints at the structure component as its origin. Second, we study the interplay of several important scoring parameters by learning parameters for local and global SA&F. The divergence of these optimized parameter sets underlines the fundamental obstacles for local SA&F. Third, by introducing a position-wise correction term in local SA&F, we constructively solve its principal issues.
Availability and implementation: The benchmark data, detailed results and scripts are available at https://github.com/BackofenLab/local_alignment. The RNA alignment tool LocARNA, including the modifications proposed in this work, is available at https://github.com/s-will/LocARNA/releases/tag/v2.0.0RC6.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Teresa Müller
- Bioinformatics Group, University of Freiburg, Freiburg 79110, Germany
- Milad Miladi
- Bioinformatics Group, University of Freiburg, Freiburg 79110, Germany
- Frank Hutter
- Machine Learning Lab, Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
- Ivo Hofacker
- Theoretical Biochemistry Group (TBI), Institute for Theoretical Chemistry, University of Vienna, Vienna, Wien 1090, Austria
- Sebastian Will
- Theoretical Biochemistry Group (TBI), Institute for Theoretical Chemistry, University of Vienna, Vienna, Wien 1090, Austria; Bioinformatics Group AMIBio, LIX-Laboratoire d'Informatique d'École Polytechnique, IPP, Palaiseau 91120, France
- Rolf Backofen
- Bioinformatics Group, University of Freiburg, Freiburg 79110, Germany; Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg 79104, Germany
25
Yu B, Lu Y, Zhang QC, Hou L. Prediction and differential analysis of RNA secondary structure. Quantitative Biology 2020. [DOI: 10.1007/s40484-020-0205-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]
26
Wright ES. RNAconTest: comparing tools for noncoding RNA multiple sequence alignment based on structural consistency. RNA (New York, N.Y.) 2020; 26:531-540. [PMID: 32005745 PMCID: PMC7161358 DOI: 10.1261/rna.073015.119] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/20/2019] [Accepted: 01/28/2020] [Indexed: 05/05/2023]
Abstract
The importance of noncoding RNA sequences has become increasingly clear over the past decade. New RNA families are often detected and analyzed using comparative methods based on multiple sequence alignments. Accordingly, a number of programs have been developed for aligning and deriving secondary structures from sets of RNA sequences. Yet, the best tools for these tasks remain unclear because existing benchmarks contain too few sequences belonging to only a small number of RNA families. RNAconTest (RNA consistency test) is a new benchmarking approach relying on the observation that secondary structure is often conserved across highly divergent RNA sequences from the same family. RNAconTest scores multiple sequence alignments based on the level of consistency among known secondary structures belonging to reference sequences in their output alignment. Similarly, consensus secondary structure predictions are scored according to their agreement with one or more known structures in a family. Comparing the performance of 10 popular alignment programs using RNAconTest revealed that DAFS, DECIPHER, LocARNA, and MAFFT created the most structurally consistent alignments. The best consensus secondary structure predictions were generated by DAFS and LocARNA (via RNAalifold). Many of the methods specific to noncoding RNAs exhibited poor scalability as the number or length of input sequences increased, and several programs displayed substantial declines in score as more sequences were aligned. Overall, RNAconTest provides a means of testing and improving tools for comparative RNA analysis, as well as highlighting the best available approaches. RNAconTest is available from the DECIPHER website (http://DECIPHER.codes/Downloads.html).
Affiliation(s)
- Erik S Wright
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania 15219, USA
27
Emami N, Pakchin PS, Ferdousi R. Computational predictive approaches for interaction and structure of aptamers. J Theor Biol 2020; 497:110268. [PMID: 32311376 DOI: 10.1016/j.jtbi.2020.110268] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/10/2020] [Revised: 03/27/2020] [Accepted: 04/02/2020] [Indexed: 02/07/2023]
Abstract
Aptamers are short single-stranded sequences that can bind to their specific targets with high affinity and specificity. Usually, aptamers are selected experimentally via systematic evolution of ligands by exponential enrichment (SELEX), an evolutionary process that consists of multiple cycles of selection and amplification. The SELEX process is expensive and time-consuming, and its success rates are relatively low. To overcome these difficulties, several computational techniques that bring together different disciplines and branches of technology have been developed in aptamer science in recent years. This paper presents a complementary review of computational approaches to aptamer prediction. In general, these approaches fall into two main categories: interaction-based prediction and structure-based prediction. Furthermore, the available software packages and toolkits in this area are reviewed. The aim of describing computational methods and tools in aptamer science is to help aptamer scientists take advantage of these techniques to develop more accurate and more sensitive aptamers.
Affiliation(s)
- Neda Emami
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Parvin Samadi Pakchin
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran; Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
28
Zhou G, Loper J, Geman S. Base-pair ambiguity and the kinetics of RNA folding. BMC Bioinformatics 2019; 20:666. [PMID: 31830902 PMCID: PMC6909616 DOI: 10.1186/s12859-019-3303-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/06/2019] [Accepted: 12/02/2019] [Indexed: 01/28/2023] Open
Abstract
Background: A folding RNA molecule encounters multiple opportunities to form non-native yet energetically favorable pairings of nucleotide sequences. Given this forbidding free-energy landscape, mechanisms have evolved that contribute to a directed and efficient folding process, including catalytic proteins and error-detecting chaperones. Among structural RNA molecules we make a distinction between “bound” molecules, which are active as part of ribonucleoprotein (RNP) complexes, and “unbound” molecules, with physiological functions performed without necessarily being bound in RNP complexes. We hypothesized that unbound molecules, lacking the partnering structure of a protein, would be more vulnerable than bound molecules to kinetic traps that compete with native stem structures. We defined an “ambiguity index”, a normalized function of the primary and secondary structure of an individual molecule that measures the number of kinetic traps available to nucleotide sequences that are paired in the native structure, presuming that unbound molecules would have lower indexes. The ambiguity index depends on the purported secondary structure, and was computed under both the comparative (“gold standard”) structure and an equilibrium-based prediction that approximates the minimum free energy (MFE) structure. Arguing that kinetically accessible metastable structures might be more biologically relevant than thermodynamic equilibrium structures, we also hypothesized that MFE-derived ambiguities would be less effective in separating bound and unbound molecules.
Results: We have introduced an intuitive and easily computed function of primary and secondary structures, the ambiguity index, that measures the availability of complementary sequences that could disrupt the formation of native stems on a given molecule. Using comparative secondary structures, the ambiguity index is systematically smaller among unbound than bound molecules, as expected. Furthermore, the effect is lost when the presumably more accurate comparative structure is replaced by the MFE structure.
Conclusions: A statistical analysis of the relationship between the primary and secondary structures of non-coding RNA molecules suggests that stem-disrupting kinetic traps are substantially less prevalent in molecules not participating in RNP complexes. In that this distinction is apparent under the comparative but not the MFE secondary structure, the results highlight a possible deficiency in structure predictions when based upon assumptions of thermodynamic equilibrium.
Affiliation(s)
- Jackson Loper
- Data Science Institute, Columbia University, New York, NY, USA
- Stuart Geman
- Division of Applied Mathematics, Brown University, Providence, RI, USA
29
RNAdemocracy: an ensemble method for RNA secondary structure prediction using consensus scoring. Comput Biol Chem 2019; 83:107151. [DOI: 10.1016/j.compbiolchem.2019.107151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2018] [Revised: 06/05/2019] [Accepted: 10/15/2019] [Indexed: 11/18/2022]
30
Magnus M, Kappel K, Das R, Bujnicki JM. RNA 3D structure prediction guided by independent folding of homologous sequences. BMC Bioinformatics 2019; 20:512. [PMID: 31640563 PMCID: PMC6806525 DOI: 10.1186/s12859-019-3120-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/28/2019] [Accepted: 10/01/2019] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND The understanding of the importance of RNA has dramatically changed over recent years. As in the case of proteins, the function of an RNA molecule is encoded in its tertiary structure, which in turn is determined by the molecule's sequence. The prediction of tertiary structures of complex RNAs is still a challenging task. RESULTS Using the observation that RNA sequences from the same RNA family fold into conserved structure, we test herein whether parallel modeling of RNA homologs can improve ab initio RNA structure prediction. EvoClustRNA is a multi-step modeling process, in which homologous sequences for the target sequence are selected using the Rfam database. Subsequently, independent folding simulations using Rosetta FARFAR and SimRNA are carried out. The model of the target sequence is selected based on the most common structural arrangement of the common helical fragments. As a test, on two blind RNA-Puzzles challenges, EvoClustRNA predictions ranked as the first of all submissions for the L-glutamine riboswitch and as the second for the ZMP riboswitch. Moreover, through a benchmark of known structures, we discovered several cases in which particular homologs were unusually amenable to structure recovery in folding simulations compared to the single original target sequence. CONCLUSION This work, for the first time to our knowledge, demonstrates the importance of the selection of the target sequence from an alignment of an RNA family for the success of RNA 3D structure prediction. These observations prompt investigations into a new direction of research for checking 3D structure "foldability" or "predictability" of related RNA sequences to obtain accurate predictions. To support new research in this area, we provide all relevant scripts in a documented and ready-to-use form. 
By exploring new ideas and identifying limitations of the current RNA 3D structure prediction methods, this work is bringing us closer to the near-native computational RNA 3D models.
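The abstract above says the final model is chosen from "the most common structural arrangement" across independent folding runs. A minimal Python sketch of one way such a consensus pick could work follows. It is an assumption, not the EvoClustRNA implementation: the distance matrix, the cutoff, and the function name `pick_consensus_model` are hypothetical, and the distances are taken to have been computed beforehand over the helical fragments shared by all homologs.

```python
from typing import List


def pick_consensus_model(dist: List[List[float]], cutoff: float) -> int:
    """Return the index of the model with the most neighbours within `cutoff`.

    `dist[i][j]` is an assumed, precomputed structural distance between
    models i and j (for example an RMSD restricted to shared helices).
    Ties are resolved in favour of the lowest index.
    """
    best_index, best_count = -1, -1
    for i, row in enumerate(dist):
        # Count the other models that lie within the cutoff of model i.
        count = sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        if count > best_count:
            best_index, best_count = i, count
    return best_index


if __name__ == "__main__":
    # Toy symmetric matrix for four hypothetical models.
    toy = [
        [0.0, 2.0, 3.0, 9.0],
        [2.0, 0.0, 2.5, 8.0],
        [3.0, 2.5, 0.0, 7.5],
        [9.0, 8.0, 7.5, 0.0],
    ]
    print(pick_consensus_model(toy, cutoff=2.5))
```

Counting neighbours rather than averaging distances keeps a single outlier model from shifting the choice. The published pipeline may well use a different clustering criterion.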
Affiliation(s)
- Marcin Magnus
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
| | - Kalli Kappel
- Biophysics Program, Stanford University, Stanford, CA USA
| | - Rhiju Das
- Biophysics Program, Stanford University, Stanford, CA USA
- Department of Biochemistry, Stanford University, Stanford, CA USA
- Department of Physics, Stanford University, Stanford, CA USA
| | - Janusz M. Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
- Laboratory of Bioinformatics, Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, Poznan, Poland
| |
|
31
|
Bahrami AA, Payandeh Z, Khalili S, Zakeri A, Bandehpour M. Immunoinformatics: In Silico Approaches and Computational Design of a Multi-epitope, Immunogenic Protein. Int Rev Immunol 2019; 38:307-322. [PMID: 31478759 DOI: 10.1080/08830185.2019.1657426] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Indexed: 10/26/2022]
Abstract
Immunoinformatics is an emerging and critical field whose tools and databases guide experimental selection, facilitate analysis of the large volume of immunologic data produced by experimental research, and help to design and introduce new hypotheses. Given these features, immunoinformatics appears to be a way to develop and advance immunological research. Bioinformatics methods and applications are successfully employed in vaccine informatics to assist different parts of the preclinical, clinical, and post-licensure vaccine enterprises. At the same time, progress in molecular biology and immunology has made epitope vaccines the focus of research on molecular vaccines. Moreover, reverse vaccinology could improve vaccine production and vaccination protocols by in silico prediction of protein vaccine candidates from genome sequences. B- and T-cell immune epitopes can be predicted by immunoinformatics algorithms and computational methods to improve vaccine design, protective immunity analysis, assessment of vaccine safety and efficacy, and immunization modeling. This review discusses the power of computational approaches in vaccine design and their relevance to the development of effective vaccines. Furthermore, the various divisions of this field and the tools available in each are introduced and reviewed.
Affiliation(s)
- Armina Alagheband Bahrami
- Department of Biotechnology, School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Zahra Payandeh
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Saeed Khalili
- Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
| | - Alireza Zakeri
- Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
| | - Mojgan Bandehpour
- Department of Biotechnology, School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| |
|
32
|
Glouzon JPS, Ouangraoua A. aliFreeFold: an alignment-free approach to predict secondary structure from homologous RNA sequences. Bioinformatics 2019; 34:i70-i78. [PMID: 29949960 PMCID: PMC6022685 DOI: 10.1093/bioinformatics/bty234] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022] Open
Abstract
Motivation: Predicting the conserved secondary structure of homologous ribonucleic acid (RNA) sequences is crucial for understanding RNA functions. However, fast and accurate RNA structure prediction is challenging, especially when the number and the divergence of homologous RNAs increase. To address this challenge, we propose aliFreeFold, based on a novel alignment-free approach which computes a representative structure from a set of homologous RNA sequences using sub-optimal secondary structures generated for each sequence. It is based on a vector representation of sub-optimal structures capturing structure conservation signals by weighting structural motifs according to their conservation across the sub-optimal structures. Results: We demonstrate that aliFreeFold provides a good balance between speed and accuracy regarding predictions of representative structures for sets of homologous RNAs compared to traditional methods based on sequence and structure alignment. We show that aliFreeFold is capable of uncovering conserved structural features quickly and effectively thanks to its weighting scheme, which gives more importance to common structural motifs and less to uncommon ones. The weighting scheme is also shown to be capable of capturing the conservation signal as the number of homologous RNAs increases. These results demonstrate the ability of aliFreeFold to efficiently and accurately provide interesting structural representatives of RNA families. Availability and implementation: aliFreeFold was implemented in C++. Source code and Linux binary are freely available at https://github.com/UdeS-CoBIUS/aliFreeFold. Supplementary information: Supplementary data are available at Bioinformatics online.
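The weighting idea in the abstract above, where motifs shared by many homologs count for more, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: motifs are plain string labels, a motif's weight is the fraction of sequences in which it appears in at least one sub-optimal structure, and a structure's score is the sum of its motif weights. The actual aliFreeFold vector representation and scoring are not specified in the abstract and may differ.

```python
from collections import Counter
from typing import Dict, List, Optional, Set, Tuple

# families[s][k] is the set of motif labels found in the k-th
# sub-optimal structure of sequence s (hypothetical encoding).
Families = List[List[Set[str]]]


def motif_weights(families: Families) -> Dict[str, float]:
    """Weight each motif by the fraction of sequences that contain it."""
    counts: Counter = Counter()
    for suboptimals in families:
        # A motif is counted once per sequence, however many of that
        # sequence's sub-optimal structures contain it.
        counts.update(set().union(*suboptimals))
    n = len(families)
    return {motif: c / n for motif, c in counts.items()}


def best_structure(families: Families) -> Optional[Tuple[float, int, int]]:
    """Return (score, sequence index, structure index) of the top-scoring structure."""
    weights = motif_weights(families)
    best: Optional[Tuple[float, int, int]] = None
    for s_idx, suboptimals in enumerate(families):
        for k_idx, motifs in enumerate(suboptimals):
            score = sum(weights[m] for m in motifs)
            if best is None or score > best[0]:
                best = (score, s_idx, k_idx)
    return best
```

With this scoring a structure built from widely shared motifs outranks one built from motifs seen in a single sequence, which is the qualitative behaviour the abstract describes.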
Affiliation(s)
| | - Aïda Ouangraoua
- Department of Computer Science, University of Sherbrooke, Sherbrooke, QC, Canada
| |
|
33
|
Kimchi O, Cragnolini T, Brenner MP, Colwell LJ. A Polymer Physics Framework for the Entropy of Arbitrary Pseudoknots. Biophys J 2019; 117:520-532. [PMID: 31353036 PMCID: PMC6697467 DOI: 10.1016/j.bpj.2019.06.037] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/08/2018] [Revised: 06/21/2019] [Accepted: 06/27/2019] [Indexed: 11/18/2022] Open
Abstract
The accurate prediction of RNA secondary structure from primary sequence has had enormous impact on research over the past 40 years. Although many algorithms are available to make these predictions, the inclusion of non-nested loops, termed pseudoknots, still poses challenges arising from two main factors: 1) no physical model exists to estimate the loop entropies of complex intramolecular pseudoknots, and 2) their NP-complete enumeration has impeded their study. Here, we address both challenges. First, we develop a polymer physics model that can address arbitrarily complex pseudoknots using only two parameters corresponding to concrete physical quantities, over an order of magnitude fewer than the sparsest state-of-the-art phenomenological methods. Second, by coupling this model to exhaustive enumeration of the set of possible structures, we compute the entire free energy landscape of secondary structures resulting from a primary RNA sequence. We demonstrate that for RNA structures of ∼80 nucleotides, with minimal heuristics, the complete enumeration of possible secondary structures can be accomplished quickly despite the NP-complete nature of the problem. We further show that despite our loop entropy model's parametric sparsity, it performs better than or on par with previously published methods in predicting both pseudoknotted and non-pseudoknotted structures on a benchmark data set of RNA structures of ≤80 nucleotides. We suggest ways in which the accuracy of the model can be further improved.
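The abstract above defines pseudoknots as non-nested loops. In terms of base pairs, two pairs (i, j) and (k, l) are non-nested, or crossing, when i < k < j < l. The Python sketch below lists such crossings for a set of pairs. It only illustrates the definition, with 0-based indices and the name `crossing_pairs` chosen for this example, and says nothing about the paper's entropy model or its enumeration procedure.

```python
from itertools import combinations
from typing import List, Tuple

Pair = Tuple[int, int]


def crossing_pairs(pairs: List[Pair]) -> List[Tuple[Pair, Pair]]:
    """Return every pair of base pairs that cross each other.

    Two base pairs (i, j) and (k, l) with i < j and k < l cross
    when i < k < j < l; a structure with no crossings is nested.
    """
    # Normalise each pair so the 5' index comes first, then sort so that
    # for any two pairs a, b taken in order we have a[0] <= b[0].
    normalized = sorted((min(i, j), max(i, j)) for i, j in pairs)
    return [
        (a, b)
        for a, b in combinations(normalized, 2)
        if a[0] < b[0] < a[1] < b[1]
    ]


if __name__ == "__main__":
    nested = [(0, 9), (1, 8), (2, 7)]
    h_type = [(0, 6), (1, 5), (3, 9), (4, 8)]  # two interleaved helices
    print(crossing_pairs(nested))
    print(crossing_pairs(h_type))
```

The check is quadratic in the number of base pairs, which is adequate for the short (≤80 nt) structures discussed in the abstract.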
Affiliation(s)
- Ofer Kimchi
- Harvard Graduate Program in Biophysics, Harvard University, Cambridge, Massachusetts.
| | - Tristan Cragnolini
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Michael P Brenner
- School of Engineering and Applied Sciences, Cambridge, Massachusetts; Kavli Institute for Bionano Science and Technology, Harvard University, Cambridge, Massachusetts
| | - Lucy J Colwell
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom.
| |
|
34
|
Danaee P, Rouches M, Wiley M, Deng D, Huang L, Hendrix D. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res 2019; 46:5381-5394. [PMID: 29746666 PMCID: PMC6009582 DOI: 10.1093/nar/gky285] [Citation(s) in RCA: 85] [Impact Index Per Article: 17.0] [Received: 02/07/2018] [Accepted: 04/11/2018] [Indexed: 01/04/2023] Open
Abstract
While RNA secondary structure prediction from sequence data has made remarkable progress, there is a need for improved strategies for annotating the features of RNA secondary structures. Here, we present bpRNA, a novel annotation tool capable of parsing RNA structures, including complex pseudoknot-containing RNAs, to yield an objective, precise, compact, unambiguous, easily-interpretable description of all loops, stems, and pseudoknots, along with the positions, sequence, and flanking base pairs of each such structural feature. We also introduce several new informative representations of RNA structure types to improve structure visualization and interpretation. We have further used bpRNA to generate a web-accessible meta-database, ‘bpRNA-1m’, of over 100 000 single-molecule, known secondary structures; this is both more fully and accurately annotated and over 20-times larger than existing databases. We use a subset of the database with highly similar (≥90% identical) sequences filtered out to report on statistical trends in sequence, flanking base pairs, and length. Both the bpRNA method and the bpRNA-1m database will be valuable resources both for specific analysis of individual RNA molecules and large-scale analyses such as are useful for updating RNA energy parameters for computational thermodynamic predictions, improving machine learning models for structure prediction, and for benchmarking structure-prediction algorithms.
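To make the annotation task in the abstract above concrete, the following Python sketch parses a pseudoknot-free dot-bracket string into base pairs and reports hairpin loops. It is a deliberately small illustration and not the bpRNA algorithm: it assumes the plain `(`, `)`, `.` alphabet, uses 0-based positions, and does not handle the pseudoknot brackets or the other loop types that bpRNA annotates. The function names are hypothetical.

```python
from typing import List, Tuple


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return the sorted (i, j) base pairs encoded by a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unsupported character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unbalanced '(' at position {stack[-1]}")
    return sorted(pairs)


def hairpin_loops(structure: str) -> List[Tuple[int, int]]:
    """Return (first, last) unpaired positions of each hairpin loop.

    A hairpin loop is taken here to be a run of unpaired bases
    enclosed directly by a single base pair.
    """
    loops = []
    for i, j in parse_dot_bracket(structure):
        inner = structure[i + 1 : j]
        if inner and set(inner) == {"."}:
            loops.append((i + 1, j - 1))
    return loops


if __name__ == "__main__":
    example = "((..((...))..))"
    print(parse_dot_bracket(example))
    print(hairpin_loops(example))
```

A stack is the natural structure here because nested base pairs close in the reverse order in which they open.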
Affiliation(s)
| | | | | | - Dezhong Deng
- School of Electrical Engineering and Computer Science
| | - Liang Huang
- School of Electrical Engineering and Computer Science
| | - David Hendrix
- School of Electrical Engineering and Computer Science.,Department of Biochemistry and Biophysics
| |
|
35
|
Zaucker A, Nagorska A, Kumari P, Hecker N, Wang Y, Huang S, Cooper L, Sivashanmugam L, VijayKumar S, Brosens J, Gorodkin J, Sampath K. Translational co-regulation of a ligand and inhibitor by a conserved RNA element. Nucleic Acids Res 2019; 46:104-119. [PMID: 29059375 PMCID: PMC5758872 DOI: 10.1093/nar/gkx938] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/29/2017] [Accepted: 10/03/2017] [Indexed: 12/20/2022] Open
Abstract
In many organisms, transcriptional and post-transcriptional regulation of components of pathways or processes has been reported. However, to date, there are few reports of translational co-regulation of multiple components of a developmental signaling pathway. Here, we show that an RNA element which we previously identified as a dorsal localization element (DLE) in the 3'UTR of zebrafish nodal-related1/squint (ndr1/sqt) ligand mRNA, is shared by the related ligand nodal-related2/cyclops (ndr2/cyc) and the nodal inhibitors, lefty1 (lft1) and lefty2 mRNAs. We investigated the activity of the DLEs through functional assays in live zebrafish embryos. The lft1 DLE localizes fluorescently labeled RNA similarly to the ndr1/sqt DLE. Similar to the ndr1/sqt 3'UTR, the lft1 and lft2 3'UTRs are bound by the RNA-binding protein (RBP) and translational repressor, Y-box binding protein 1 (Ybx1), whereas deletions in the DLE abolish binding to Ybx1. Analysis of zebrafish ybx1 mutants shows that Ybx1 represses lefty1 translation in embryos. CRISPR/Cas9-mediated inactivation of human YBX1 also results in human NODAL translational de-repression, suggesting broader conservation of the DLE RNA element/Ybx1 RBP module in regulation of Nodal signaling. Our findings demonstrate translational co-regulation of components of a signaling pathway by an RNA element conserved in both sequence and structure and an RBP, revealing a 'translational regulon'.
Affiliation(s)
- Andreas Zaucker
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Agnieszka Nagorska
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Pooja Kumari
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Nikolai Hecker
- Center for non-coding RNAs in Technology and Health, Department of Veterinary and Animal Sciences, Faculty for Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
| | - Yin Wang
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Sizhou Huang
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Ledean Cooper
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Lavanya Sivashanmugam
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Shruthi VijayKumar
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Jan Brosens
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - Jan Gorodkin
- Center for non-coding RNAs in Technology and Health, Department of Veterinary and Animal Sciences, Faculty for Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
| | - Karuna Sampath
- Cell & Developmental Biology Unit, Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| |
|
36
|
Su C, Weir JD, Zhang F, Yan H, Wu T. ENTRNA: a framework to predict RNA foldability. BMC Bioinformatics 2019; 20:373. [PMID: 31269893 PMCID: PMC6610807 DOI: 10.1186/s12859-019-2948-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/23/2018] [Accepted: 06/12/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem in which free energy has played a key role. RESULTS: In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. In addition to free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated using 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs from the RNASTRAND database, respectively. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNA, we collect 5 laboratory synthetic RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 of these RNAs. CONCLUSION: In this article, we reformulate the RNA design problem as a foldability prediction problem, which is to predict the likelihood of the co-existence of a sequence-structure pair. This new construct has the potential for both RNA structure prediction and the inverse folding problem. In addition, this new construct enables us to explore data-driven approaches in RNA research.
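The abstract above introduces Sequence Segment Entropy as a measure of sequence diversity without giving its formula. As a rough illustration of what an entropy over sequence segments can look like, the Python sketch below computes the Shannon entropy of the overlapping length-k segments of a sequence. The definition used here (overlapping k-mers, base-2 logarithm, default k = 3) is an assumption made for this example and may not match the SSE feature defined in the paper.

```python
import math
from collections import Counter


def segment_entropy(seq: str, k: int = 3) -> float:
    """Shannon entropy (in bits) of the overlapping length-k segments of `seq`.

    This is an illustrative stand-in for a segment-based diversity
    measure, not the published SSE definition.
    """
    if k <= 0 or len(seq) < k:
        raise ValueError("k must be positive and no longer than the sequence")
    segments = [seq[i : i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(segments)
    total = len(segments)
    # H = sum over distinct segments of p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    print(segment_entropy("AAAAAAAA"))          # a homopolymer has no diversity
    print(segment_entropy("GGGAAACUUCGGUUCCC"))  # a mixed sequence scores higher
```

A repetitive sequence yields few distinct segments and therefore low entropy, while a sequence whose segments are all different reaches the maximum of log2 of the segment count.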
Affiliation(s)
- Congzhe Su
- School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
| | - Jeffery D. Weir
- Department of Operational Sciences, Graduate School of Engineering and Management, Air Force Institute of Technology, Wright-Patterson AFB, Dayton, OH 45433 USA
| | - Fei Zhang
- Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
| | - Hao Yan
- Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
| | - Teresa Wu
- School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
| |
|
37
|
Wang J, Williams B, Chirasani VR, Krokhotin A, Das R, Dokholyan NV. Limits in accuracy and a strategy of RNA structure prediction using experimental information. Nucleic Acids Res 2019; 47:5563-5572. [PMID: 31106330 PMCID: PMC6582333 DOI: 10.1093/nar/gkz427] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/13/2019] [Revised: 05/03/2019] [Accepted: 05/08/2019] [Indexed: 01/22/2023] Open
Abstract
RNA structural complexity and flexibility present a challenge for computational modeling efforts. Experimental information and bioinformatics data can be used as restraints to improve the accuracy of RNA tertiary structure prediction. Regarding the utilization of restraints, the fundamental questions are: (i) What is the limit in prediction accuracy that one can achieve with an arbitrary number of restraints? (ii) Is there a strategy for selecting the minimal number of restraints that would result in the best structural model? We address the first question by testing the limits in prediction accuracy using native contacts as restraints. To address the second question, we develop an algorithm based on the distance variation allowed by secondary structure (DVASS), which ranks restraints according to their importance to RNA tertiary structure prediction. We find that, due to kinetic traps, the greatest improvement in structure prediction accuracy is achieved when we utilize only 40-60% of the total number of native contacts as restraints. When the restraints are sorted by the DVASS algorithm, using only the top-ranked 20% of restraints can greatly improve the prediction accuracy. Our findings suggest that only a limited number of strategically selected distance restraints can significantly assist in RNA structure modeling.
Affiliation(s)
- Jian Wang
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA 17033, USA
| | - Benfeard Williams
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Venkata R Chirasani
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA 17033, USA
| | - Andrey Krokhotin
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Rajeshree Das
- Weinberg College of Arts and Sciences, Northwestern University, Evanston, IL 60208, USA
| | - Nikolay V Dokholyan
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA 17033, USA
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
- Department of Biochemistry and Molecular Biology, Penn State University College of Medicine, Hershey, PA 17033, USA
- Department of Chemistry, Penn State University, University Park, PA 16802, USA
- Department of Biomedical Engineering, Penn State University, University Park, PA 16802, USA
| |
|
38
|
RNApolis: Computational Platform for RNA Structure Analysis. Foundations of Computing and Decision Sciences 2019. [DOI: 10.2478/fcds-2019-0012] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/20/2022] Open
Abstract
In the 1970s, computer scientists began to engage in research in the field of structural biology. The first structural databases, as well as models and methods supporting the analysis of biomolecule structures, started to be created. RNA was put at the centre of scientific interest quite late. However, more and more methods dedicated to this molecule are currently being developed. This paper presents RNApolis, a new computing platform that offers access to seven bioinformatic tools developed to support the study of RNA structure. The set of tools includes a structural database and systems for predicting, modelling, annotating and evaluating RNA structure. RNApolis supports research at different structural levels and allows the discovery, establishment, and validation of relationships between the primary, secondary and tertiary structure of RNAs. The platform is freely available at http://rnapolis.pl
|
39
|
Mangul S, Martin LS, Hill BL, Lam AKM, Distler MG, Zelikovsky A, Eskin E, Flint J. Systematic benchmarking of omics computational tools. Nat Commun 2019; 10:1393. [PMID: 30918265 PMCID: PMC6437167 DOI: 10.1038/s41467-019-09406-4] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 07/23/2018] [Accepted: 03/06/2019] [Indexed: 01/11/2023] Open
Abstract
Computational omics methods packaged as software have become essential to modern biological research. The increasing dependence of scientists on these powerful software tools creates a need for systematic assessment of these methods, known as benchmarking. Adopting a standardized benchmarking practice could help researchers who use omics data to better leverage recent technological innovations. Our review summarizes benchmarking practices from 25 recent studies and discusses the challenges, advantages, and limitations of benchmarking across various domains of biology. We also propose principles that can make computational biology benchmarking studies more sustainable and reproducible, ultimately increasing the transparency of biomedical data and results. Benchmarking studies are important for comprehensively understanding and evaluating different computational omics methods. Here, the authors review practices from 25 recent studies and propose principles to improve the quality of benchmarking studies.
Affiliation(s)
- Serghei Mangul
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA.,Institute for Quantitative and Computational Biosciences, University of California Los Angeles, 611 Charles E Young Drive East, Los Angeles, CA, 90095, USA.
| | - Lana S Martin
- Institute for Quantitative and Computational Biosciences, University of California Los Angeles, 611 Charles E Young Drive East, Los Angeles, CA, 90095, USA
| | - Brian L Hill
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA
| | - Angela Ka-Mei Lam
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA
| | - Margaret G Distler
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, GA, 30303, USA.,The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
| | - Eleazar Eskin
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA.,Department of Human Genetics, University of California Los Angeles, 695 Charles E. Young, Los Angeles, CA, USA
| | - Jonathan Flint
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
| |
|
40
|
Abstract
In addition to coding for protein sequences, RNA molecules encode a diverse set of gene-regulatory elements. RNA switches are one class of gene-regulatory elements that control protein expression in a manner that depends on the concentration of specific ligand molecules. These allosteric gene-regulatory elements have been shown to be useful tools for engineering diverse cell types to display novel functions. In particular, RNA switches have been used as genetically encoded biosensors and conditional controllers to direct cellular decisions based on the system's changing environment. A significant focus in the field has been the generation of novel RNA switches that are tailored for different biological systems. We review approaches that have been used to generate RNA switches, which leverage the unique physical properties of RNA and the myriad ways in which RNA can modulate gene expression.
Affiliation(s)
- Calvin M Schmidt
- Department of Bioengineering, Stanford University, Stanford, California 94305
| | - Christina D Smolke
- Department of Bioengineering, Stanford University, Stanford, California 94305.,Chan Zuckerberg Biohub, San Francisco, California 94158
| |
|
41
|
Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures. Genes (Basel) 2018; 9:genes9120604. [PMID: 30518121 PMCID: PMC6315940 DOI: 10.3390/genes9120604] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/25/2018] [Revised: 11/28/2018] [Accepted: 11/29/2018] [Indexed: 02/03/2023] Open
Abstract
Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes, because these methods do not necessarily provide precise information about the boundaries or the location of the RNA structure inside the predicted ncRNA. Even without a structure prediction, it is of interest to search for structured domains, for example to find common RNA motifs in RNA-protein binding assays. The precise definition of the boundaries is essential for downstream analyses such as RNA structure modelling, e.g., through covariance models, and RNA structure clustering for the search of common motifs. Such efforts have so far focused on single sequences; here we present a comparison of boundary definition between single sequences and multiple sequence alignments. We also present a novel approach, named RNAbound, for finding boundaries based on the probabilities of evolutionarily conserved base pairings. We tested the performance of two different methods on a limited number of Rfam families using the annotated structured RNA regions in the human genome and their multiple sequence alignments created from 14 species. The results show that multiple sequence alignments improve the boundary prediction for branched structures compared to single sequences, independent of the chosen method. The actual performance of the two methods differs on single hairpin structures and branched structures. For the RNA families with branched structures, including transfer RNA (tRNA) and small nucleolar RNAs (snoRNAs), RNAbound improves the boundary predictions using multiple sequence alignments to median differences of −6 and −11.5 nucleotides (nts) for the left and right boundary, respectively (window size of 200 nts).
|
42
|
Discovering Structural Motifs in miRNA Precursors from the Viridiplantae Kingdom. Molecules 2018; 23:molecules23061367. [PMID: 29882777 PMCID: PMC6100135 DOI: 10.3390/molecules23061367] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/29/2018] [Revised: 06/01/2018] [Accepted: 06/04/2018] [Indexed: 11/17/2022] Open
Abstract
MicroRNA (miRNA), a small non-coding molecule (19–24 nt), controls almost every biological process, cellular and physiological, in the lives of various organisms. The amount of miRNA produced within an organism is highly correlated with the organism’s key processes and determines whether the system works properly. A crucial factor in plant miRNA biogenesis is the Dicer Like 1 (DCL1) enzyme, which is responsible for performing the cleavages in the miRNA maturation process. Despite everything already known about the last phase of plant miRNA creation, how DCL1 recognizes miRNA within plant pre-miRNA structures remains an enigma. Herein, we present the bioinformatic procedure we followed to discover structure patterns that could guide DCL1 to cleave in front of or behind an miRNA:miRNA* duplex. Patterns in the closest vicinity of the microRNA are searched for within pre-miRNA sequences as well as secondary and tertiary structures. The dataset consists of structures of plant pre-miRNAs from the Viridiplantae kingdom. The results confirm our previous observations based on Arabidopsis thaliana precursor analysis. Our hypothesis was tested on pre-miRNAs collected from the miRBase database, showing secondary structure patterns of small symmetric internal loops 1-1 and 2-2 at a distance of 1–10 nt from the miRNA:miRNA* duplex.
|
43
|
Jabbari H, Wark I, Montemagno C. RNA secondary structure prediction with pseudoknots: Contribution of algorithm versus energy model. PLoS One 2018; 13:e0194583. [PMID: 29621250 PMCID: PMC5886407 DOI: 10.1371/journal.pone.0194583] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 09/27/2017] [Accepted: 03/06/2018] [Indexed: 11/25/2022] Open
Abstract
Motivation: RNA is a biopolymer with various applications inside the cell and in biotechnology. The structure of an RNA molecule mainly determines its function and is essential to guide nanostructure design. Since experimental structure determination is time-consuming and expensive, accurate computational prediction of RNA structure is of great importance. Prediction of RNA secondary structure is simpler than prediction of its tertiary structure and provides information about the tertiary structure; RNA secondary structure prediction has therefore received attention in the past decades. Numerous methods with different folding approaches have been developed for RNA secondary structure prediction. While methods for prediction of RNA pseudoknot-free structure (structures with no crossing base pairs) have greatly improved in terms of their accuracy, methods for prediction of RNA pseudoknotted secondary structure (structures with crossing base pairs) still have room for improvement. A long-standing question for improving the prediction accuracy of RNA pseudoknotted secondary structure is whether to focus on the prediction algorithm or the underlying energy model, as there is a trade-off between the computational cost of the prediction algorithm and the generality of the method. Results: The aim of this work is to argue that, when comparing different methods for RNA pseudoknotted structure prediction, the combination of algorithm and energy model should be considered, and a method should not be considered superior or inferior to others if they do not use the same scoring model. We demonstrate that while the folding approach is important in structure prediction, it is not the only important factor in the prediction accuracy of a given method, as the underlying energy model is also of great value. Therefore we encourage researchers to pay particular attention when comparing methods with different energy models.
Affiliation(s)
- Hosna Jabbari
- Department of Computer Science, University of Vermont, Burlington, Vermont, United States of America
| | - Ian Wark
- Ingenuity Lab, Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
| | - Carlo Montemagno
- Southern Illinois University Carbondale, Carbondale, Illinois, United States of America
| |
|
44
|
Churkin A, Retwitzer MD, Reinharz V, Ponty Y, Waldispühl J, Barash D. Design of RNAs: comparing programs for inverse RNA folding. Brief Bioinform 2018; 19:350-358. [PMID: 28049135 PMCID: PMC6018860 DOI: 10.1093/bib/bbw120] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/26/2022] Open
Abstract
Computational programs for predicting RNA sequences with desired folding properties have been extensively developed and expanded in the past several years. Given a secondary structure, these programs aim to predict sequences that fold into a target minimum free energy secondary structure, while considering various constraints. This procedure is called inverse RNA folding. Inverse RNA folding has been traditionally used to design optimized RNAs with favorable properties, an application that is expected to grow considerably in the future in light of advances in the expanding new fields of synthetic biology and RNA nanostructures. Moreover, it was recently demonstrated that inverse RNA folding can successfully be used as a valuable preprocessing step in computational detection of novel noncoding RNAs. This review describes the most popular freeware programs that have been developed for such purposes, starting from RNAinverse, which was devised when the inverse RNA folding problem was formulated. The most recently published ones that consider RNA secondary structure as input are antaRNA, RNAiFold and incaRNAfbinv, each having different features that could be beneficial to specific biological problems in practice. The various programs also use distinct approaches, ranging from ant colony optimization to constraint programming, in addition to adaptive walk, simulated annealing and Boltzmann sampling. This review compares the various programs and provides a simple description of the various possibilities that would benefit practitioners in selecting the most suitable program. It is geared for specific tasks requiring RNA design based on input secondary structure, with an outlook toward the future of RNA design programs.
Affiliation(s)
- Alexander Churkin
- Shamoon College of Engineering and Physics Department at Ben-Gurion University, Beer-Sheva, Israel
| | | | - Vladimir Reinharz
- Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
- School of Computer Science, McGill University, Montréal QC, Canada
| | - Yann Ponty
- Laboratoire d’informatique, École Polytechnique, Palaiseau, France
| | | | - Danny Barash
- Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
| |
|
45
|
Lim CS, Brown CM. Know Your Enemy: Successful Bioinformatic Approaches to Predict Functional RNA Structures in Viral RNAs. Front Microbiol 2018; 8:2582. [PMID: 29354101 PMCID: PMC5758548 DOI: 10.3389/fmicb.2017.02582] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/09/2017] [Accepted: 12/11/2017] [Indexed: 12/14/2022] Open
Abstract
Structured RNA elements may control virus replication, transcription and translation, and their distinct features are being exploited by novel antiviral strategies. Viral RNA elements continue to be discovered using combinations of experimental and computational analyses. However, the wealth of sequence data, notably from deep viral RNA sequencing, viromes, and metagenomes, makes computational approaches an essential discovery tool. In this review, we describe practical approaches being used to discover functional RNA elements in viral genomes. In addition to success stories in new and emerging viruses, these approaches have revealed some surprising new features of well-studied viruses, e.g., human immunodeficiency virus, hepatitis C virus, influenza, and dengue viruses. Some notable discoveries were facilitated by new comparative analyses of diverse viral genome alignments. Importantly, comparative approaches for finding RNA elements embedded in coding and non-coding regions differ. With the exponential growth of computer power we have progressed from stem-loop prediction on single sequences to cutting-edge 3D prediction, and from command line to user-friendly web interfaces. Despite these advances, many powerful, user-friendly prediction tools and resources are underutilized by the virology community.
Affiliation(s)
- Chun Shen Lim
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| | - Chris M Brown
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| |
|
46
|
Abstract
Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a necessarily incomplete overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany; Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
| | - Jan Gorodkin
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
| | - Ivo L Hofacker
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark; Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria; Bioinformatics and Computational Biology Research Group, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
| | - Peter F Stadler
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark; Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria; Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany; Fraunhofer Institute for Cell Therapy and Immunology, Perlickstraße 1, D-04103 Leipzig, Germany; Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA
| |
|
47
|
Ruscito A, McConnell EM, Koudrina A, Velu R, Mattice C, Hunt V, McKeague M, DeRosa MC. In Vitro Selection and Characterization of DNA Aptamers to a Small Molecule Target. Curr Protoc Chem Biol 2017; 9:233-268. [PMID: 29241295 DOI: 10.1002/cpch.28] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/15/2022]
Abstract
Aptamers, synthetic oligonucleotide-based molecular recognition probes, have found use in a wide array of biosensing technologies based on their tight and highly selective binding to a variety of molecular targets. However, the inherent challenges associated with the selection and characterization of aptamers for small molecule targets have resulted in their underrepresentation, despite the need for small molecule detection in fields such as medicine, the environment, and agriculture. This protocol describes the steps in the selection, sequencing, affinity characterization, and truncation of DNA aptamers that are specific for small molecule targets. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
| | - Erin M McConnell
- Chemistry Department, Carleton University, Ottawa, Ontario, Canada
| | - Anna Koudrina
- Chemistry Department, Carleton University, Ottawa, Ontario, Canada
| | - Ranganathan Velu
- Chemistry Department, Carleton University, Ottawa, Ontario, Canada
| | | | - Vernon Hunt
- Chemistry Department, Carleton University, Ottawa, Ontario, Canada
| | - Maureen McKeague
- Department of Health Sciences and Technology, ETH Zürich, Zurich, Switzerland
| | - Maria C DeRosa
- Chemistry Department, Carleton University, Ottawa, Ontario, Canada
| |
|
48
|
Gong S, Wang Y, Wang Z, Zhang W. Computational Methods for Modeling Aptamers and Designing Riboswitches. Int J Mol Sci 2017; 18:E2442. [PMID: 29149090 PMCID: PMC5713409 DOI: 10.3390/ijms18112442] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 10/19/2017] [Revised: 11/12/2017] [Accepted: 11/14/2017] [Indexed: 02/04/2023] Open
Abstract
Riboswitches, which are located within certain noncoding RNA regions, function as genetic "switches", regulating when and where genes are expressed in response to certain ligands. Understanding the numerous functions of riboswitches requires computational models to predict structures and structural changes of the aptamer domains. Although aptamers often form a complex structure, computational approaches, such as RNAComposer and Rosetta, have already been applied to model the tertiary (three-dimensional (3D)) structure of several aptamers. As structural changes in aptamers must be achieved within a certain time window for effective regulation, kinetics is another key point for understanding aptamer function in riboswitch-mediated gene regulation. The coarse-grained self-organized polymer (SOP) model using Langevin dynamics simulation has been successfully developed to investigate folding kinetics of aptamers, while their co-transcriptional folding kinetics can be modeled by the helix-based computational method and the BarMap approach. Based on the known aptamers, the web server Riboswitch Calculator and other theoretical methods provide new tools to design synthetic riboswitches. This review presents an overview of these computational methods for modeling structure and kinetics of riboswitch aptamers and for designing riboswitches.
Affiliation(s)
- Sha Gong
- Hubei Key Laboratory of Economic Forest Germplasm Improvement and Resources Comprehensive Utilization, Hubei Collaborative Innovation Center for the Characteristic Resources Exploitation of Dabie Mountains, Huanggang Normal University, Huanggang 438000, China.
| | - Yanli Wang
- Department of Physics, Wuhan University, Wuhan 430072, China.
| | - Zhen Wang
- Department of Physics, Wuhan University, Wuhan 430072, China.
| | - Wenbing Zhang
- Department of Physics, Wuhan University, Wuhan 430072, China.
| |
|
49
|
Impact of the structural integrity of the three-way junction of adenovirus VAI RNA on PKR inhibition. PLoS One 2017; 12:e0186849. [PMID: 29053745 PMCID: PMC5650172 DOI: 10.1371/journal.pone.0186849] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/05/2017] [Accepted: 10/09/2017] [Indexed: 02/06/2023] Open
Abstract
Highly structured RNA derived from viral genomes is a key cellular indicator of viral infection. In response, cells produce the interferon inducible RNA-dependent protein kinase (PKR) that, when bound to viral dsRNA, phosphorylates eukaryotic initiation factor 2α and attenuates viral protein translation. Adenovirus can evade this line of defence through transcription of a non-coding RNA, VAI, an inhibitor of PKR. VAI consists of three base-paired regions that meet at a three-way junction; an apical stem responsible for the interaction with PKR, a central stem required for inhibition, and a terminal stem. Recent studies have highlighted the potential importance of the tertiary structure of the three-way junction to PKR inhibition by enabling interaction between regions of the central and terminal stems. To further investigate the role of the three-way junction, we characterized the binding affinity and inhibitory potential of central stem mutants designed to introduce subtle alterations. These results were then correlated with small-angle X-ray scattering solution studies and computational tertiary structural models. Our results demonstrate that while mutations to the central stem have no observable effect on binding affinity to PKR, mutations that appear to disrupt the structure of the three-way junction prevent inhibition of PKR. Therefore, we propose that instead of simply sequestering PKR, a specific structural conformation of the PKR-VAI complex may be required for inhibition.
|
50
|
Piao M, Sun L, Zhang QC. RNA Regulations and Functions Decoded by Transcriptome-wide RNA Structure Probing. Genomics Proteomics Bioinformatics 2017; 15:267-278. [PMID: 29031843 PMCID: PMC5673676 DOI: 10.1016/j.gpb.2017.05.002] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/27/2017] [Revised: 05/09/2017] [Accepted: 05/27/2017] [Indexed: 01/07/2023]
Abstract
RNA folds into intricate structures that are crucial for its functions and regulations. To date, a multitude of approaches for probing structures of the whole transcriptome, i.e., RNA structuromes, have been developed. Applications of these approaches to different cell lines and tissues have generated a rich resource for the study of RNA structure–function relationships at a systems biology level. In this review, we first introduce the designs of these methods and their applications to study different RNA structuromes. We emphasize their technological differences, especially their unique advantages and caveats. We then summarize the structural insights into RNA functions and regulations obtained from the studies of RNA structuromes. Finally, we propose potential directions for future improvements and studies.
Affiliation(s)
- Meiling Piao
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Lei Sun
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China.
| |
|