1
Antczak M, Szachniuk M. Toward Increasing the Credibility of RNA Design. Methods Mol Biol 2025; 2847:137-151. [PMID: 39312141 DOI: 10.1007/978-1-0716-4079-1_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
In the problem of RNA design, also known as inverse folding, RNA sequences are predicted that achieve the desired secondary structure at the lowest possible free energy and under certain constraints. The designed sequences have applications in synthetic biology and RNA-based nanotechnologies. There are also known cases of the successful use of inverse folding to discover previously unknown noncoding RNAs. Several computational methods have been dedicated to the problem of RNA design. They differ by algorithm and additional parameters, e.g., those determining the goal function in the sequence optimization process. Users can obtain many promising RNA sequences quite easily. The more difficult issue is to critically evaluate them and select the most favorable and reliable sequence that forms the expected RNA structure. The latter problem is addressed in this paper. We propose an RNA design protocol extended to include sequence evaluation, for which a 3D structure is used. Experiments show that the accuracy of RNA design can be improved by adding a 3D structure prediction and analysis step.
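As an illustration of the inverse folding setting described in this abstract, the following minimal Python sketch checks whether a candidate sequence is at least compatible with a target secondary structure given in dot-bracket notation. It is not the authors' protocol: the function names `base_pairs` and `is_compatible` are hypothetical, pseudoknots are not handled, and only Watson-Crick and G-U wobble pairs are assumed to be allowed. Compatibility is a necessary condition only; it says nothing about free energy or 3D structure.

```python
# Allowed pairings: Watson-Crick plus G-U wobble (an assumption of this sketch).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure):
    """Return sorted (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unsupported symbol %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


def is_compatible(sequence, structure):
    """True if every paired position in the target can form an allowed pair."""
    if len(sequence) != len(structure):
        return False
    seq = sequence.upper()
    return all((seq[i], seq[j]) in CANONICAL for i, j in base_pairs(structure))
```

A design tool would typically use such a check only as a cheap filter before energy-based or 3D evaluation of the candidates.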
Affiliation(s)
- Maciej Antczak
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
2
Joshi CK, Liò P. gRNAde: A Geometric Deep Learning Pipeline for 3D RNA Inverse Design. Methods Mol Biol 2025; 2847:121-135. [PMID: 39312140 DOI: 10.1007/978-1-0716-4079-1_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
Fundamental to the diverse biological functions of RNA are its 3D structure and conformational flexibility, which enable single sequences to adopt a variety of distinct 3D states. Currently, computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. In this tutorial, we present gRNAde, a geometric RNA design pipeline operating on sets of 3D RNA backbone structures to design sequences that explicitly account for RNA 3D structure and dynamics. gRNAde is a graph neural network that uses an SE(3)-equivariant encoder-decoder framework for generating RNA sequences conditioned on backbone structures where the identities of the bases are unknown. We demonstrate the utility of gRNAde for fixed-backbone re-design of existing RNA structures of interest from the PDB, including riboswitches, aptamers, and ribozymes. gRNAde is more accurate in terms of native sequence recovery while being significantly faster compared to existing physics-based tools for 3D RNA inverse design, such as Rosetta.
Affiliation(s)
- Chaitanya K Joshi
- Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
- Pietro Liò
- Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
3
Joshi CK, Jamasb AR, Viñas R, Harris C, Mathis S, Morehead A, Anand R, Liò P. gRNAde: Geometric Deep Learning for 3D RNA inverse design. bioRxiv 2024:2024.03.31.587283. [PMID: 38826198 PMCID: PMC11142113 DOI: 10.1101/2024.03.31.587283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. gRNAde uses a multi-state Graph Neural Network and autoregressive decoding to generate candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. (2010), gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent ribozyme. Open source code: github.com/chaitjo/geometric-rna-design.
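The native sequence recovery metric quoted in this abstract can be sketched as the fraction of positions at which a designed sequence matches the native one. The Python function below is an assumption about how such a metric is commonly computed, not code taken from the gRNAde repository; the name `sequence_recovery` is hypothetical.

```python
def sequence_recovery(designed, native):
    """Fraction of positions where the designed base equals the native base."""
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed.upper(), native.upper()))
    return matches / len(native)
```

For example, `sequence_recovery("GGAC", "GGAU")` evaluates to 0.75; a benchmark average would then be the mean of this value over all target structures.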
Affiliation(s)
- Arian R Jamasb
- University of Cambridge, UK
- Prescient Design, Genentech, Roche
4
Gao X, Ma J, Li F, Zhou Q, Gao D. Optimization of the extraction process of total steroids from Phillinus gilvus (Schwein.) Pat. by artificial neural network (ANN)-response surface methodology and identification of extract constituents. Prep Biochem Biotechnol 2024:1-14. [PMID: 39178290 DOI: 10.1080/10826068.2024.2394449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
Phillinus gilvus (Schwein.) Pat. has pharmacological effects such as tonifying the spleen, dispelling dampness, and strengthening the stomach, and sterols are among its main compounds. However, their extraction and the detailed identification of their composition have not yet been studied in detail. In the present study, we used an artificial neural network (ANN) and response surface methodology (RSM) to optimize the conditions of ultrasonic-assisted extraction, and the independent and interaction effects of the parameters were evaluated. Ultra-performance liquid chromatography-quadrupole-time-of-flight mass spectrometry (UPLC-Q-TOF-MS/MS) was used to identify the major components in the purified extract. The results showed that the optimal extraction process conditions were: ultrasonic time 96 min, ultrasonic power 140 W, liquid to material ratio 1:25 g/ml, and ultrasonic temperature 30.7 °C. The compliance rates of the predicted and experimental values for the artificial neural network model and the response surface model were 98.3% and 96.12%, respectively, indicating that both models have the potential to be used for optimizing the extraction process of P. gilvus in industry. A total of 120 compounds and 30 major steroids were identified by comparison with the reference compounds. These findings will contribute to the isolation and utilization of active ingredients in P. gilvus.
Affiliation(s)
- Xusheng Gao
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Junxia Ma
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- College of Traditional Chinese Medicine and Key Laboratory of Edible Fungi Resources and Utilization, Ministry of Agriculture and Rural Affairs, Jilin Agricultural University, Changchun, China
- Fengfu Li
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Qian Zhou
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- College of Traditional Chinese Medicine and Key Laboratory of Edible Fungi Resources and Utilization, Ministry of Agriculture and Rural Affairs, Jilin Agricultural University, Changchun, China
- Dan Gao
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
5
Wang X, Terashi G, Kihara D. CryoREAD: de novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Nat Methods 2023; 20:1739-1747. [PMID: 37783885 PMCID: PMC10841814 DOI: 10.1038/s41592-023-02032-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/28/2022] [Accepted: 08/24/2023] [Indexed: 10/04/2023]
Abstract
DNA and RNA play fundamental roles in various cellular processes, where their three-dimensional structures provide information critical to understanding the molecular mechanisms of their functions. Although an increasing number of nucleic acid structures and their complexes with proteins are determined by cryogenic electron microscopy (cryo-EM), structure modeling for DNA and RNA remains challenging particularly when the map is determined at a resolution coarser than atomic level. Moreover, computational methods for nucleic acid structure modeling are relatively scarce. Here, we present CryoREAD, a fully automated de novo DNA/RNA atomic structure modeling method using deep learning. CryoREAD identifies phosphate, sugar and base positions in a cryo-EM map using deep learning, which are traced and modeled into a three-dimensional structure. When tested on cryo-EM maps determined at 2.0 to 5.0 Å resolution, CryoREAD built substantially more accurate models than existing methods. We also applied the method to cryo-EM maps of biomolecular complexes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
Affiliation(s)
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
6
Najeh S, Zandi K, Kharma N, Perreault J. Computational design and experimental verification of pseudoknotted ribozymes. RNA 2023; 29:764-776. [PMID: 36868786 PMCID: PMC10187678 DOI: 10.1261/rna.079148.122] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/02/2022] [Accepted: 05/27/2022] [Indexed: 05/18/2023]
Abstract
The design of new RNA sequences that retain the function of a model RNA structure is a challenge in bioinformatics because of the structural complexity of these molecules. RNA can fold into its secondary and tertiary structures by forming stem-loops and pseudoknots. A pseudoknot is a set of base pairs between a region within a stem-loop and nucleotides outside of this stem-loop; this motif is very important for numerous functional structures. It is important for any computational design algorithm to take into account these interactions to give a reliable result for any structures that include pseudoknots. In our study, we experimentally validated synthetic ribozymes designed by Enzymer, which implements algorithms allowing for the design of pseudoknots. Enzymer is a program that uses an inverse folding approach to design pseudoknotted RNAs; we used it in this study to design two types of ribozymes. The ribozymes tested were the hammerhead and the glmS, which have a self-cleaving activity that allows them to liberate the new RNA genome copy during rolling-circle replication or to control the expression of the downstream genes, respectively. We demonstrated the efficiency of Enzymer by showing that the pseudoknotted hammerhead and glmS ribozymes sequences it designed were extensively modified compared to wild-type sequences and were still active.
Affiliation(s)
- Sabrine Najeh
- INRS - Institut Armand-Frappier, Laval, QC H7V 1B7, Canada
- Kasra Zandi
- Software Engineering and Computer Science Department, Concordia University, Montreal, Quebec H3G 1M8, Canada
- Nawwaf Kharma
- Electrical and Computer Engineering Department, Concordia University, Montreal, Quebec H3G 1M8, Canada
7
Moussa S, Kilgour M, Jans C, Hernandez-Garcia A, Cuperlovic-Culf M, Bengio Y, Simine L. Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning. J Phys Chem B 2023; 127:62-68. [PMID: 36574492 DOI: 10.1021/acs.jpcb.2c05660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
Abstract
Inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria. Relevant criteria may be, for example, the presence of specific folding motifs, binding to molecular ligands, sensing properties, and so on. Most practical approaches to aptamer design identify a small set of promising candidate sequences using high-throughput experiments (e.g., SELEX) and then optimize performance by introducing only minor modifications to the empirically found candidates. Sequences that possess the desired properties but differ drastically in chemical composition will add diversity to the search space and facilitate the discovery of useful nucleic acid aptamers. Systematic diversification protocols are needed. Here we propose to use an unsupervised machine learning model known as the Potts model to discover new, useful sequences with controllable sequence diversity. We start by training a Potts model using the maximum entropy principle on a small set of empirically identified sequences unified by a common feature. To generate new candidate sequences with a controllable degree of diversity, we take advantage of the model's spectral feature: an "energy" bandgap separating sequences that are similar to the training set from those that are distinct. By controlling the Potts energy range that is sampled, we generate sequences that are distinct from the training set yet still likely to have the encoded features. To demonstrate performance, we apply our approach to design diverse pools of sequences with specified secondary structure motifs in 30-mer RNA and DNA aptamers.
Affiliation(s)
- Siba Moussa
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, Quebec H3A 0B8, Canada
- Michael Kilgour
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, Quebec H3A 0B8, Canada
- Clara Jans
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, Quebec H3A 0B8, Canada
- Alex Hernandez-Garcia
- Montreal Institute for Learning Algorithms, 6666 St. Urbain, #200, Montreal, Quebec H2S 3H1, Canada
- Miroslava Cuperlovic-Culf
- Digital Technologies Research Centre, National Research Council of Canada, 1200 Montreal Road, Ottawa, Ontario K1A 0R6, Canada
- Yoshua Bengio
- Montreal Institute for Learning Algorithms, 6666 St. Urbain, #200, Montreal, Quebec H2S 3H1, Canada
- Lena Simine
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, Quebec H3A 0B8, Canada
8
Zambrano RAI, Hernandez-Perez C, Takahashi MK. RNA Structure Prediction, Analysis, and Design: An Introduction to Web-Based Tools. Methods Mol Biol 2022; 2518:253-269. [PMID: 35666450 DOI: 10.1007/978-1-0716-2421-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Understanding RNA structure has become critical in the study of RNA in their roles as mediators of biological processes. To aid in these studies, computational algorithms that utilize thermodynamics have been developed to predict RNA secondary structure. Due to the importance of intermolecular interactions, the algorithms have been expanded to determine and predict RNA-RNA hybridization. This chapter discusses popular webservers with the tools for RNA secondary structure prediction, RNA-RNA hybridization, and design. We address key features that distinguish common-functioning programs and their purposes for the interests of the user. Ultimately, we hope this review elucidates web-based tools researchers may take advantage of in their investigations of RNA structure and function.
Affiliation(s)
- Melissa K Takahashi
- Department of Biology, California State University Northridge, Northridge, CA, USA
9
Minuesa G, Alsina C, Garcia-Martin JA, Oliveros J, Dotu I. MoiRNAiFold: a novel tool for complex in silico RNA design. Nucleic Acids Res 2021; 49:4934-4943. [PMID: 33956139 PMCID: PMC8136780 DOI: 10.1093/nar/gkab331] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/12/2021] [Revised: 04/09/2021] [Accepted: 04/21/2021] [Indexed: 12/23/2022]
Abstract
Novel tools for in silico design of RNA constructs such as riboregulators are required in order to reduce time and cost to production for the development of diagnostic and therapeutic advances. Here, we present MoiRNAiFold, a versatile and user-friendly tool for de novo synthetic RNA design. MoiRNAiFold is based on Constraint Programming and it includes novel variable types, heuristics and restart strategies for Large Neighborhood Search. Moreover, this software can handle dozens of design constraints and quality measures and improves features for RNA regulation control of gene expression, such as Translation Efficiency calculation. We demonstrate that MoiRNAiFold outperforms any previous software in benchmarking structural RNA puzzles from EteRNA. Importantly, with regard to biologically relevant RNA designs, we focus on RNA riboregulators, demonstrating that the designed RNA sequences are functional both in vitro and in vivo. Overall, we have generated a powerful tool for de novo complex RNA design that we make freely available as a web server (https://moiraibiodesign.com/design/).
Affiliation(s)
- Gerard Minuesa
- Moirai Biodesign, c/ Baldiri Reixach s/n, Parc Científic de Barcelona (PCB), 08028 Barcelona, Spain
- Cristina Alsina
- Moirai Biodesign, c/ Baldiri Reixach s/n, Parc Científic de Barcelona (PCB), 08028 Barcelona, Spain
- Juan Antonio Garcia-Martin
- Bioinformatics for Genomics and Proteomics, National Centre for Biotechnology (CNB-CSIC), c/ Darwin 3, 28049 Madrid, Spain
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Universidad Carlos III de Madrid, 28911 Madrid, Spain
- Juan Carlos Oliveros
- Bioinformatics for Genomics and Proteomics, National Centre for Biotechnology (CNB-CSIC), c/ Darwin 3, 28049 Madrid, Spain
- Ivan Dotu
- Moirai Biodesign, c/ Baldiri Reixach s/n, Parc Científic de Barcelona (PCB), 08028 Barcelona, Spain
10
RNA origami design tools enable cotranscriptional folding of kilobase-sized nanoscaffolds. Nat Chem 2021; 13:549-558. [PMID: 33972754 PMCID: PMC7610888 DOI: 10.1038/s41557-021-00679-1] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 10/19/2019] [Accepted: 03/08/2021] [Indexed: 12/18/2022]
Abstract
RNA origami is a framework for the modular design of nanoscaffolds that can be folded from a single strand of RNA, and used to organize molecular components with nanoscale precision. Design of genetically expressible RNA origami, which must cotranscriptionally fold, requires modeling and design tools that simultaneously consider thermodynamics, folding pathway, sequence constraints, and pseudoknot optimization. Here, we describe RNA Origami Automated Design software (ROAD), which builds origami models from a library of structural modules, identifies potential folding barriers, and designs optimized sequences. Using ROAD, we extend the scale and functional diversity of RNA scaffolds, creating 32 designs of up to 2360 nucleotides, five that scaffold two proteins, and seven that scaffold two small molecules at precise distances. Micrographic and chromatographic comparison of optimized and nonoptimized structures validates that our principles for strand routing and sequence design substantially improve yield. By providing efficient design of RNA origami, ROAD may simplify construction of custom RNA scaffolds for nanomedicine and synthetic biology.
11
Inverse RNA Folding Workflow to Design and Test Ribozymes that Include Pseudoknots. Methods Mol Biol 2021; 2167:113-143. [PMID: 32712918 DOI: 10.1007/978-1-0716-0716-9_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
Ribozymes are RNAs that catalyze reactions. They occur in nature, and can also be evolved in vitro to catalyze novel reactions. This chapter provides detailed protocols for using inverse folding software to design a ribozyme sequence that will fold to a known ribozyme secondary structure and for testing the catalytic activity of the sequence experimentally. This protocol is able to design sequences that include pseudoknots, which is important as all naturally occurring full-length ribozymes have pseudoknots. The starting point is the known pseudoknot-containing secondary structure of the ribozyme and knowledge of any nucleotides whose identity is required for function. The output of the protocol is a set of sequences that have been tested for function. Using this protocol, we were previously successful at designing highly active double-pseudoknotted HDV ribozymes.
12
Taneda A, Sato K. A Web Server for Designing Molecular Switches Composed of Two Interacting RNAs. Int J Mol Sci 2021; 22:ijms22052720. [PMID: 33800268 PMCID: PMC7962656 DOI: 10.3390/ijms22052720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2021] [Revised: 03/04/2021] [Accepted: 03/04/2021] [Indexed: 11/16/2022]
Abstract
The programmability of RNA–RNA interactions through intermolecular base-pairing has been successfully exploited to design a variety of RNA devices that artificially regulate gene expression. An in silico design for interacting structured RNA sequences that satisfies multiple design criteria becomes a complex multi-objective problem. Although multi-objective optimization is a powerful technique that explores a vast solution space without empirical weights between design objectives, to date, no web service for multi-objective design of RNA switches that utilizes RNA–RNA interaction has been proposed. We developed a web server, which is based on a multi-objective design algorithm called MODENA, to design two interacting RNAs that form a complex in silico. By predicting the secondary structures with RactIP during the design process, we can design RNAs that form a joint secondary structure with an external pseudoknot. The energy barrier upon the complex formation is modeled by an interaction seed that is optimized in the design algorithm. We benchmarked the RNA switch design approaches (MODENA+RactIP and MODENA+RNAcofold) for the target structures based on natural RNA-RNA interactions. As a result, MODENA+RactIP showed high design performance for the benchmark datasets.
Affiliation(s)
- Akito Taneda
- Graduate School of Science and Technology, Hirosaki University, Hirosaki, Aomori 036-8561, Japan
- Kengo Sato
- Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
13
Gupta A, Bansal M. RNA-mediated translation regulation in viral genomes: computational advances in the recognition of sequences and structures. Brief Bioinform 2020; 21:1151-1163. [PMID: 31204430 PMCID: PMC7109810 DOI: 10.1093/bib/bbz054] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/25/2018] [Revised: 03/24/2019] [Accepted: 04/15/2019] [Indexed: 12/30/2022]
Abstract
RNA structures are widely distributed across all life forms. The global conformation of these structures is defined by a variety of constituent structural units such as helices, hairpin loops, kissing-loop motifs and pseudoknots, which often behave in a modular way. Their ubiquitous distribution is associated with a variety of functions in biological processes. The location of these structures in the genomes of RNA viruses is often coordinated with specific processes in the viral life cycle, where the presence of the structure acts as a checkpoint for deciding the eventual fate of the process. These structures have been found to adopt complex conformations and exert their effects by interacting with ribosomes, multiple host translation factors and small RNA molecules like miRNA. A number of such RNA structures have also been shown to regulate translation in viruses at the level of initiation, elongation or termination. The role of various computational studies in the preliminary identification of such sequences and/or structures and subsequent functional analysis has not been fully appreciated. This review aims to summarize the processes in which viral RNA structures have been found to play an active role in translational regulation, their global conformational features and the bioinformatics/computational tools available for the identification and prediction of these structures.
Affiliation(s)
- Asmita Gupta
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Manju Bansal
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
14
Inverse folding with RNA-As-Graphs produces a large pool of candidate sequences with target topologies. J Struct Biol 2019; 209:107438. [PMID: 31874236 DOI: 10.1016/j.jsb.2019.107438] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/04/2019] [Revised: 12/18/2019] [Accepted: 12/19/2019] [Indexed: 02/07/2023]
Abstract
We present an RNA-As-Graphs (RAG) based inverse folding algorithm, RAG-IF, to design novel RNA sequences that fold onto target tree graph topologies. The algorithm can be used to enhance our recently reported computational design pipeline (Jain et al., NAR 2018). The RAG approach represents RNA secondary structures as tree and dual graphs, where RNA loops and helices are coarse-grained as vertices and edges, opening the usage of graph theory methods to study, predict, and design RNA structures. Our recently developed computational pipeline for design utilizes graph partitioning (RAG-3D) and atomic fragment assembly (F-RAG) to design sequences to fold onto RNA-like tree graph topologies; the atomic fragments are taken from existing RNA structures that correspond to tree subgraphs. Because F-RAG may not produce the target folds for all designs, automated mutations by RAG-IF algorithm enhance the candidate pool markedly. The crucial residues for mutation are identified by differences between the predicted and the target topology. A genetic algorithm then mutates the selected residues, and the successful sequences are optimized to retain only the minimal or essential mutations. Here we evaluate RAG-IF for 6 RNA-like topologies and generate a large pool of successful candidate sequences with a variety of minimal mutations. We find that RAG-IF adds robustness and efficiency to our RNA design pipeline, making inverse folding motivated by graph topology rather than secondary structure more productive.
15
Yamagami R, Kayedkhordeh M, Mathews DH, Bevilacqua PC. Design of highly active double-pseudoknotted ribozymes: a combined computational and experimental study. Nucleic Acids Res 2019; 47:29-42. [PMID: 30462314 PMCID: PMC6326823 DOI: 10.1093/nar/gky1118] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/26/2018] [Accepted: 10/24/2018] [Indexed: 01/02/2023]
Abstract
Design of RNA sequences that adopt functional folds establishes principles of RNA folding and applications in biotechnology. Inverse folding for RNAs, which allows computational design of sequences that adopt specific structures, can be utilized for unveiling RNA functions and developing genetic tools in synthetic biology. Although many algorithms for inverse RNA folding have been developed, the pseudoknot, which plays a key role in folding of ribozymes and riboswitches, is not addressed in most algorithms. For the few algorithms that attempt to predict pseudoknot-containing ribozymes, self-cleavage activity has not been tested. Herein, we design double-pseudoknot HDV ribozymes using an inverse RNA folding algorithm and test their kinetic mechanisms experimentally. More than 90% of the positively designed ribozymes possess self-cleaving activity, whereas more than 70% of negative control ribozymes, which are predicted to fold to the necessary structure but with low fidelity, do not possess it. Kinetic and mutation analyses reveal that these RNAs cleave site-specifically and with the same mechanism as the WT ribozyme. Most ribozymes react just 50- to 80-fold slower than the WT ribozyme, and this rate can be improved to near WT by modification of a junction. Thus, fast-cleaving functional ribozymes with multiple pseudoknots can be designed computationally.
Affiliation(s)
- Ryota Yamagami
- Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA
- Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- Mohammad Kayedkhordeh
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York, NY 14642, USA
- David H Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York, NY 14642, USA
- Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, New York, NY 14642, USA
- Philip C Bevilacqua
- Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA
- Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
16
Raden M, Ali SM, Alkhnbashi OS, Busch A, Costa F, Davis JA, Eggenhofer F, Gelhausen R, Georg J, Heyne S, Hiller M, Kundu K, Kleinkauf R, Lott SC, Mohamed MM, Mattheis A, Miladi M, Richter AS, Will S, Wolff J, Wright PR, Backofen R. Freiburg RNA tools: a central online resource for RNA-focused research and teaching. Nucleic Acids Res 2018; 46:W25-W29. [PMID: 29788132 PMCID: PMC6030932 DOI: 10.1093/nar/gky329] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Received: 01/30/2018] [Revised: 04/03/2018] [Accepted: 05/18/2018] [Indexed: 12/20/2022]
Abstract
The Freiburg RNA tools webserver is a well-established online resource for RNA-focused research. It provides a unified user interface and comprehensive result visualization for efficient command line tools. The webserver includes RNA-RNA interaction prediction (IntaRNA, CopraRNA, metaMIR), sRNA homology search (GLASSgo), sequence-structure alignments (LocARNA, MARNA, CARNA, ExpaRNA), CRISPR repeat classification (CRISPRmap), sequence design (antaRNA, INFO-RNA, SECISDesign), structure aberration evaluation of point mutations (RaSE), RNA/protein-family model visualization (CMV), and other methods. Open education resources offer interactive visualizations of RNA structure and RNA-RNA interaction prediction as well as basic and advanced sequence alignment algorithms. The services are freely available at http://rna.informatik.uni-freiburg.de.
Affiliation(s)
- Martin Raden
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Syed M Ali
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Omer S Alkhnbashi
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Anke Busch
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128 Mainz, Germany
| | - Fabrizio Costa
- Department of Computer Science, University of Exeter, Exeter EX4 4QF, UK
| | - Jason A Davis
- Coreva Scientific, Kaiser-Joseph-Str 198-200, 79098 Freiburg, Germany
| | - Florian Eggenhofer
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Rick Gelhausen
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Jens Georg
- Genetics and Experimental Bioinformatics, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
| | - Steffen Heyne
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, 79108 Freiburg, Germany
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
| | - Kousik Kundu
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK
- Department of Human Genetics, The Wellcome Trust Sanger Institute, Hinxton Cambridge CB10 1HH, UK
| | - Robert Kleinkauf
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Steffen C Lott
- Genetics and Experimental Bioinformatics, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
| | - Mostafa M Mohamed
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Alexander Mattheis
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Milad Miladi
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | | | - Sebastian Will
- Theoretical Biochemistry Group, University of Vienna, Währingerstraße 17, 1090 Vienna, Austria
| | - Joachim Wolff
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Patrick R Wright
- Department of Clinical Research, Clinical Trial Unit, University of Basel Hospital, Schanzenstrasse 55, 4031 Basel, Switzerland
| | - Rolf Backofen
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Centre for Biological Signalling Studies (BIOSS), University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
| |
|
17
|
Lotfi M, Zare-Mirakabad F, Montaseri S. RNA design using simulated SHAPE data. Genes Genet Syst 2018; 92:257-265. [PMID: 28757510 DOI: 10.1266/ggs.16-00067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
It has long been established that in addition to being involved in protein translation, RNA plays essential roles in numerous other cellular processes, including gene regulation and DNA replication. Such roles are known to be dictated by higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that can fold to acquire a particular function that is desirable for use in pharmaceuticals and basic research. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are several algorithms to solve this problem, they mainly consider hard constraints, such as minimum free energy, to evaluate the predicted sequences. Recently, SHAPE data has emerged as a new soft constraint for RNA secondary structure prediction. To take advantage of this new experimental constraint, we report here a new method for accurate design of RNA sequences based on their secondary structures using SHAPE data as pseudo-free energy. We then compare our algorithm with four others: INFO-RNA, ERD, MODENA and RNAiFold 2.0. Our algorithm precisely predicts 26 out of 29 new sequences for the structures extracted from the Rfam dataset, while the other four algorithms predict no more than 22 out of 29. The proposed algorithm is comparable to the above algorithms on RNA-SSD datasets, where they predict appropriate sequences for up to 33 of 34 RNA secondary structures.
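The abstract does not state how SHAPE reactivities are turned into a pseudo-free energy. A widely used form, sketched below, is the log-linear term of Deigan et al., dG(i) = m * ln(reactivity_i + 1) + b. Whether this paper uses exactly this form is not stated in the abstract, so treat it as an assumption. The function names, the convention that negative values mean missing data, and the default parameters (m = 2.6, b = -0.8 kcal/mol) are also assumptions.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Pseudo-free energy (kcal/mol) for one nucleotide from its SHAPE reactivity.

    m and b are assumed slope and intercept values; negative reactivities are
    treated here as missing data and contribute no energy.
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def total_pseudo_energy(reactivities, paired_positions, m=2.6, b=-0.8):
    """Sum the pseudo-energy over the positions that a structure marks as paired."""
    return sum(shape_pseudo_energy(reactivities[i], m, b) for i in paired_positions)
```

With these parameters, an unreactive nucleotide (reactivity 0) is rewarded with -0.8 kcal/mol when paired, while a highly reactive one is penalized, which is what makes the term act as a soft constraint rather than a hard one.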
Affiliation(s)
- Mohadeseh Lotfi
- Faculty of Mathematics and Computer Science, Amirkabir University of Technology
| | | | - Soheila Montaseri
- School of Mathematics, Statistics and Computer Science, College of Science, Enghelab Avenue, University of Tehran
| |
|
18
|
Findeiß S, Hammer S, Wolfinger MT, Kühnl F, Flamm C, Hofacker IL. In silico design of ligand triggered RNA switches. Methods 2018; 143:90-101. [PMID: 29660485 DOI: 10.1016/j.ymeth.2018.04.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/11/2017] [Revised: 03/06/2018] [Accepted: 04/06/2018] [Indexed: 02/06/2023]
Abstract
This contribution sketches a workflow to design an RNA switch that is able to adopt two structural conformations in a ligand-dependent way. A well-characterized RNA aptamer, i.e., one with known Kd and adaptive structural features, is an essential ingredient of the described design process. We exemplify the principles using the well-known theophylline aptamer throughout this work. The aptamer in its ligand-binding competent structure represents one structural conformation of the switch while an alternative fold that disrupts the binding-competent structure forms the other conformation. To keep it simple we do not incorporate any regulatory mechanism to control transcription or translation. We elucidate a commonly used design process by explicitly dissecting and explaining the necessary steps in detail. We developed a novel objective function which specifies the mechanism of this simple, ligand-triggered riboswitch and describe an extensive in silico analysis pipeline to evaluate important kinetic properties of the designed sequences. This protocol and the developed software can be easily extended or adapted to fit novel design scenarios and thus can serve as a template for future needs.
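The role of the aptamer's Kd in such a design can be sketched with two textbook relations: the standard binding free energy dG = RT ln(Kd), taken relative to a 1 M reference concentration, and the Boltzmann populations of two competing conformations. This is not the objective function developed in the paper. The function names, the temperature of 37 °C, and the two-state simplification are assumptions for illustration.

```python
import math

R_KCAL = 1.98720425864083e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=310.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def two_state_fractions(e_bound, e_alt, temperature_k=310.15):
    """Boltzmann populations of two conformations given their energies in kcal/mol."""
    rt = R_KCAL * temperature_k
    e_min = min(e_bound, e_alt)  # shift energies to avoid overflow in exp()
    w_bound = math.exp(-(e_bound - e_min) / rt)
    w_alt = math.exp(-(e_alt - e_min) / rt)
    z = w_bound + w_alt
    return w_bound / z, w_alt / z
```

In this simplified picture, adding the binding free energy to the energy of the binding-competent fold shifts the populations toward that fold when ligand is present, which is the switching behavior a design objective has to capture.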
Affiliation(s)
- Sven Findeiß
- Bioinformatics, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany; University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstraße 29, 1090 Vienna, Austria; University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstraße 17, 1090 Vienna, Austria.
| | - Stefan Hammer
- Bioinformatics, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany; University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstraße 29, 1090 Vienna, Austria; University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstraße 17, 1090 Vienna, Austria
| | - Michael T Wolfinger
- University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstraße 17, 1090 Vienna, Austria; Medical University of Vienna, Center for Anatomy and Cell Biology, Währingerstraße 13, 1090 Vienna, Austria
| | - Felix Kühnl
- Bioinformatics, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany
| | - Christoph Flamm
- University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstraße 17, 1090 Vienna, Austria
| | - Ivo L Hofacker
- University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstraße 29, 1090 Vienna, Austria; University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstraße 17, 1090 Vienna, Austria
| |
|
19
|
Ledda M, Aviran S. PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures. Genome Biol 2018; 19:28. [PMID: 29495968 PMCID: PMC5833111 DOI: 10.1186/s13059-018-1399-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 09/06/2017] [Accepted: 01/30/2018] [Indexed: 02/08/2023]
Abstract
Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.
Affiliation(s)
- Mirko Ledda
- Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
- Integrative Genetics and Genomics Graduate Group, UC Davis, 1 Shields Ave, Davis, 95616 USA
| | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
| |
|
20
|
Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes. Nat Commun 2018; 9:606. [PMID: 29426922 PMCID: PMC5807309 DOI: 10.1038/s41467-018-02923-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 06/09/2017] [Accepted: 01/09/2018] [Indexed: 11/23/2022]
Abstract
RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies. Different experimental and computational approaches can be used to study RNA structures. Here, the authors present a computational method for data-directed reconstruction of complex RNA structure landscapes, which predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data.
|
21
|
Martinez-Salas E, Francisco-Velilla R, Fernandez-Chamorro J, Embarek AM. Insights into Structural and Mechanistic Features of Viral IRES Elements. Front Microbiol 2018; 8:2629. [PMID: 29354113 PMCID: PMC5759354 DOI: 10.3389/fmicb.2017.02629] [Citation(s) in RCA: 98] [Impact Index Per Article: 14.0] [Received: 10/31/2017] [Accepted: 12/15/2017] [Indexed: 01/19/2023]
Abstract
Internal ribosome entry site (IRES) elements are cis-acting RNA regions that promote internal initiation of protein synthesis using cap-independent mechanisms. However, distinct types of IRES elements present in the genome of various RNA viruses perform the same function despite lacking conservation of sequence and secondary RNA structure. Likewise, IRES elements differ in host factor requirement to recruit the ribosomal subunits. In spite of this diversity, evolutionarily conserved motifs in each family of RNA viruses preserve sequences that affect RNA structure and RNA–protein interactions important for IRES activity. Indeed, IRES elements adopting remarkably different structural organizations contain RNA structural motifs that play an essential role in recruiting ribosomes, initiation factors and/or RNA-binding proteins using different mechanisms. Therefore, given that a universal IRES motif remains elusive, it is critical to understand how diverse structural motifs deliver functions relevant for IRES activity. This will be useful for understanding the molecular mechanisms behind cap-independent translation, as well as the evolutionary history of these regulatory elements. Moreover, it could improve the accuracy of predicting IRES-like motifs hidden in genome sequences. This review summarizes recent advances in understanding the diversity and biological relevance of RNA structural motifs for viral IRES elements.
Affiliation(s)
- Encarnacion Martinez-Salas
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Madrid, Spain
| | - Rosario Francisco-Velilla
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Madrid, Spain
| | - Javier Fernandez-Chamorro
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Madrid, Spain
| | - Azman M Embarek
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Madrid, Spain
| |
|