1
Runge F, Franke J, Fertmann D, Backofen R, Hutter F. Partial RNA design. Bioinformatics 2024; 40:i437-i445. [PMID: 38940170 PMCID: PMC11256918 DOI: 10.1093/bioinformatics/btae222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] [Open Access]
Abstract
MOTIVATION: RNA design is a key technique for achieving new functionality in fields such as synthetic biology and biotechnology. Computational tools could help to find such RNA sequences, but they are often limited in how they formulate the search space. RESULTS: In this work, we propose partial RNA design, a novel RNA design paradigm that addresses the limitations of current RNA design formulations. Partial RNA design describes the problem of designing RNAs from arbitrary RNA sequences and structure motifs with multiple design goals. By separating the design space from the objectives, our formulation enables the design of RNAs with variable lengths and desired properties, while still allowing precise control over sequence and structure constraints at individual positions. Based on this formulation, we introduce a new algorithm, libLEARNA, capable of efficiently solving different constrained RNA design tasks. A comprehensive analysis of various problems, including a realistic riboswitch design task, reveals the outstanding performance of libLEARNA and its robustness. AVAILABILITY AND IMPLEMENTATION: libLEARNA is open-source and publicly available at https://github.com/automl/learna_tools.
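To make the idea of position-wise sequence and structure constraints concrete, the following minimal sketch checks a candidate design against a partial specification. The wildcard conventions (`N` for any nucleotide, `?` for any structure symbol) and the function name `satisfies_partial_spec` are assumptions made for illustration and are not libLEARNA's actual input format. The sketch also assumes a fixed length, whereas the formulation described in the abstract allows variable lengths.

```python
def satisfies_partial_spec(seq, struct, seq_spec, struct_spec):
    """Return True if (seq, struct) agrees with the spec at every constrained position.

    Hypothetical conventions: 'N' in seq_spec and '?' in struct_spec are wildcards.
    """
    # All four strings must describe the same number of positions.
    if not (len(seq) == len(struct) == len(seq_spec) == len(struct_spec)):
        return False
    # A position passes if the spec is a wildcard or matches exactly.
    seq_ok = all(c == "N" or c == s for s, c in zip(seq, seq_spec))
    struct_ok = all(c == "?" or c == s for s, c in zip(struct, struct_spec))
    return seq_ok and struct_ok
```

For example, under these assumed conventions the specification `GNNNNC` / `((??))` fixes the two terminal nucleotides and the outer helix while leaving the four inner positions free.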
Affiliation(s)
- Frederic Runge
  - Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
- Jörg Franke
  - Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
- Daniel Fertmann
  - Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
- Rolf Backofen
  - Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
- Frank Hutter
  - Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
2
Huang H, Lin Z, He D, Hong L, Li Y. RiboDiffusion: tertiary structure-based RNA inverse folding with generative diffusion models. Bioinformatics 2024; 40:i347-i356. [PMID: 38940178 PMCID: PMC11211841 DOI: 10.1093/bioinformatics/btae259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] [Open Access]
Abstract
MOTIVATION: RNA design shows growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches have emerged to address this problem based on secondary structures. However, designing RNA sequences directly from 3D structures is still challenging, due to the scarcity of data, the nonunique structure-sequence mapping, and the flexibility of RNA conformation. RESULTS: In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that can learn the conditional distribution of RNA sequences given 3D backbone structures. Our model consists of a graph neural network-based structure module and a Transformer-based sequence module, which iteratively transforms random sequences into desired sequences. By tuning the sampling weight, our model allows for a trade-off between sequence recovery and diversity to explore more candidates. We split test sets based on RNA clustering with different cut-offs for sequence or structure similarity. Our model outperforms baselines in sequence recovery, with an average relative improvement of 11% for sequence similarity splits and 16% for structure similarity splits. Moreover, RiboDiffusion performs consistently well across various RNA length categories and RNA types. We also apply in silico folding to validate whether the generated sequences can fold into the given 3D RNA backbones. Our method could be a powerful tool for RNA design that explores the vast sequence space and finds novel solutions to 3D structural constraints. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/ml4bio/RiboDiffusion.
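The abstract reports results in terms of sequence recovery without defining it. A minimal sketch follows, assuming the usual inverse-folding meaning: the fraction of positions at which the designed sequence matches the native one. The function name is hypothetical, and the paper's exact evaluation protocol may differ.

```python
def sequence_recovery(designed, native):
    """Fraction of positions where the designed nucleotide equals the native one."""
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        return 0.0
    # Count position-wise matches and normalize by sequence length.
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)
```

Sampling several designs per backbone and comparing them with a function like this is one way to observe the recovery versus diversity trade-off the abstract mentions.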
Affiliation(s)
- Han Huang
  - Department of Computer Science and Engineering, CUHK, Hong Kong SAR, 999077, China
  - School of Computer Science and Engineering, Beihang University, Beijing, 100191, China
- Ziqian Lin
  - Department of Computer Science and Engineering, CUHK, Hong Kong SAR, 999077, China
  - School of Artificial Intelligence, Nanjing University, Nanjing, 210023, China
- Dongchen He
  - Department of Computer Science and Engineering, CUHK, Hong Kong SAR, 999077, China
- Liang Hong
  - Department of Computer Science and Engineering, CUHK, Hong Kong SAR, 999077, China
- Yu Li
  - Department of Computer Science and Engineering, CUHK, Hong Kong SAR, 999077, China
3
Zuber J, Mathews DH. Estimating RNA Secondary Structure Folding Free Energy Changes with efn2. Methods Mol Biol 2024; 2726:1-13. [PMID: 38780725 DOI: 10.1007/978-1-0716-3519-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
A number of analyses require estimates of the folding free energy changes of specific RNA secondary structures. These predictions are often based on a set of nearest neighbor parameters that models the folding stability of an RNA secondary structure as the sum of the folding stabilities of the structural elements that comprise it. In the software suite RNAstructure, the free energy change calculation is implemented in the program efn2. The efn2 program estimates the folding free energy change and the experimental uncertainty in the folding free energy change. It can be run through the graphical user interface for RNAstructure, from the command line, or through a web server. This chapter provides detailed protocols for using efn2.
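The additive assumption described in this abstract can be written as a sum over the structural elements of a secondary structure (helix stacks, hairpin loops, internal loops, and so on). The notation below is chosen here for illustration and is not taken from the chapter: $S$ is the secondary structure and $\mathcal{E}(S)$ is its set of structural elements.

```latex
\Delta G^{\circ}_{\mathrm{fold}}(S) \;=\; \sum_{e \,\in\, \mathcal{E}(S)} \Delta G^{\circ}(e)
```

Each term $\Delta G^{\circ}(e)$ would be looked up or computed from the nearest neighbor parameter set.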
Affiliation(s)
- Jeffrey Zuber
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, USA
- David H Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, USA
  - Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
4
Zhou T, Dai N, Li S, Ward M, Mathews DH, Huang L. RNA design via structure-aware multifrontier ensemble optimization. Bioinformatics 2023; 39:i563-i571. [PMID: 37387188 DOI: 10.1093/bioinformatics/btad252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] [Open Access]
Abstract
MOTIVATION: RNA design is the search for a sequence or set of sequences that will fold into a desired structure, also known as the inverse problem of RNA folding. However, the sequences designed by existing algorithms often suffer from low ensemble stability, which worsens for long sequence design. Additionally, for many methods only a small number of sequences satisfying the MFE criterion can be found by each design run. These drawbacks limit their use cases. RESULTS: We propose an innovative optimization paradigm, SAMFEO, which optimizes ensemble objectives (equilibrium probability or ensemble defect) by iterative search and yields a very large number of successfully designed RNA sequences as byproducts. We develop a search method that leverages structure-level and ensemble-level information at different stages of the optimization: initialization, sampling, mutation, and updating. Our work, while being less complicated than others, is the first algorithm that is able to design thousands of RNA sequences for the puzzles from the Eterna100 benchmark. In addition, our algorithm solves the most Eterna100 puzzles among all the general optimization-based methods in our study. The only baseline solving more puzzles than our work depends on handcrafted heuristics designed for a specific folding model. Surprisingly, our approach shows superiority in designing long sequences for structures adapted from the database of 16S ribosomal RNAs. AVAILABILITY AND IMPLEMENTATION: Our source code and the data used in this article are available at https://github.com/shanry/SAMFEO.
Affiliation(s)
- Tianshuo Zhou
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97330, United States
- Ning Dai
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97330, United States
- Sizhen Li
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97330, United States
- Max Ward
  - Department of Computer Science and Software Engineering, The University of Western Australia, Perth, Australia
- David H Mathews
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY 14642, United States
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, United States
  - Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, United States
- Liang Huang
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97330, United States
5
Fong JHC, Chu HY, Zhou P, Wong ASL. Parallel engineering and activity profiling of a base editor system. Cell Syst 2023; 14:392-403.e4. [PMID: 37164010 DOI: 10.1016/j.cels.2023.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/29/2022] [Revised: 02/14/2023] [Accepted: 03/29/2023] [Indexed: 05/12/2023]
Abstract
Selecting the most suitable existing base editors and engineering new variants for installing specific base conversions with maximal efficiency and minimal undesired edits are pivotal for precise genome editing applications. Here, we present a platform for creating and analyzing a library of engineered base editor variants to enable head-to-head evaluation of their editing performance at scale. Our comprehensive comparison provides quantitative measures on each variant's editing efficiency, purity, motif preference, and bias in generating single and multiple base conversions, while uncovering undesired higher indel generation rate and noncanonical base conversion for some of the existing base editors. In addition to engineering the base editor protein, we further applied this platform to investigate a hitherto underexplored engineering route and created guide RNA scaffold variants that augment the editor's base-editing activity. With the unknown performance and compatibility of the growing number of engineered parts including deaminase, CRISPR-Cas enzyme, and guide RNA scaffold variants for assembling the expanding collection of base editor systems, our platform addresses the unmet need for an unbiased, scalable method to benchmark their editing outcomes and accelerate the engineering of next-generation precise genome editors.
Affiliation(s)
- John H C Fong
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
- Hoi Yee Chu
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
  - Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Peng Zhou
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
  - Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Alan S L Wong
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
  - Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
  - Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, China
6
Merleau NSC, Smerlak M. aRNAque: an evolutionary algorithm for inverse pseudoknotted RNA folding inspired by Lévy flights. BMC Bioinformatics 2022; 23:335. [PMID: 35964008 PMCID: PMC9375295 DOI: 10.1186/s12859-022-04866-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2022] [Accepted: 07/29/2022] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: In this work, we study the inverse folding problem for RNA, which is the discovery of sequences that fold into given target secondary structures. RESULTS: We implement a Lévy mutation scheme in an updated version of aRNAque, an evolutionary inverse folding algorithm, and apply it to the design of RNAs with and without pseudoknots. We find that the Lévy mutation scheme increases the diversity of designed RNA sequences and reduces the average number of evaluations of the evolutionary algorithm. Compared to antaRNA, aRNAque requires more CPU time but is more successful in finding designed sequences that fold correctly into the target structures. CONCLUSION: We propose that a Lévy flight offers a better standard mutation scheme for optimizing RNA design. Our new version of aRNAque is available on GitHub as a Python script, and the benchmark results show improved performance on both the Pseudobase++ and the Eterna100 datasets, compared to existing inverse folding tools.
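The core idea of a Lévy-style mutation scheme is that the number of point mutations per step is heavy-tailed: most steps change one or two positions, while occasional steps change many. The sketch below illustrates this with a truncated power law. The exponent, the function name `levy_mutate`, and the independent treatment of each position are assumptions for illustration. The abstract does not specify aRNAque's actual distribution or how it handles base-paired positions.

```python
import random

def levy_mutate(seq, exponent=1.5, rng=None):
    """Mutate k positions, with k drawn from P(k) proportional to k**(-exponent)."""
    rng = rng or random.Random()
    n = len(seq)
    if n == 0:
        return seq
    # Truncated power law over 1..n: small k is common, large k is rare.
    ks = list(range(1, n + 1))
    weights = [k ** -exponent for k in ks]
    k = rng.choices(ks, weights=weights, k=1)[0]
    # Pick k distinct positions and replace each with a different base.
    out = list(seq)
    for i in rng.sample(range(n), k):
        out[i] = rng.choice([b for b in "ACGU" if b != out[i]])
    return "".join(out)
```

A smaller exponent makes large jumps more frequent, while a larger exponent makes the scheme behave more like single-point mutation.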
Affiliation(s)
- Nono S. C. Merleau
  - Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, 04103 Leipzig, Germany
- Matteo Smerlak
  - Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, 04103 Leipzig, Germany
7
Pan Y, Li M, Huang J, Pan W, Shi T, Guo Q, Yang G, Nie X. Genome-Wide Identification and Characterization of RNA/DNA Differences Associated with Drought Response in Wheat. Int J Mol Sci 2022; 23:1405. [PMID: 35163325 PMCID: PMC8836135 DOI: 10.3390/ijms23031405] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2022] [Revised: 01/23/2022] [Accepted: 01/24/2022] [Indexed: 12/19/2022] [Open Access]
Abstract
RNA/DNA difference (RDD) is a post-transcriptional RNA modification that enriches genetic information and is widely involved in regulating diverse biological processes in eukaryotes. RDDs in the wheat nuclear genome, especially those associated with drought response or tolerance, have not been well studied to date. In this study, we investigated the RDDs related to drought response based on RNA-seq data from drought-stressed and control samples in wheat. In total, 21,782 unique RDDs were identified, of which 265 were found to be drought-induced, representing the first drought-responsive RDD landscape in the wheat nuclear genome. The drought-responsive RDDs were located in 69 genes, of which 35 were differentially expressed under drought stress. Furthermore, the effects of RNA/DNA differences were investigated, showing that they could result in changes in RNA secondary structure, miRNA-target binding, and conserved protein domains in the RDD-containing genes. In particular, the A-to-C mutation in TraesCS2A02G053100 (orthologous to OsRLCK) led to the loss of tae-miR9657b-5p targeting, indicating that RNA/DNA differences might mediate miRNA regulation of the drought-response process. This study reports the first drought-responsive RDDs in the wheat nuclear genome. It sheds light on the roles of RDDs in drought tolerance and may also contribute to wheat genetic improvement based on epi-transcriptome methods.
Affiliation(s)
- Yan Pan
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Mengqi Li
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Jiaqian Huang
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Wenqiu Pan
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Tingrui Shi
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Qifan Guo
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Guang Yang
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Xiaojun Nie
  - State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
  - ICARDA-NWSUAF Joint Research Center, Yangling 712100, China
8
Minuesa G, Alsina C, Garcia-Martin JA, Oliveros J, Dotu I. MoiRNAiFold: a novel tool for complex in silico RNA design. Nucleic Acids Res 2021; 49:4934-4943. [PMID: 33956139 PMCID: PMC8136780 DOI: 10.1093/nar/gkab331] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/12/2021] [Revised: 04/09/2021] [Accepted: 04/21/2021] [Indexed: 12/23/2022] [Open Access]
Abstract
Novel tools for in silico design of RNA constructs such as riboregulators are required in order to reduce time and cost to production for the development of diagnostic and therapeutic advances. Here, we present MoiRNAiFold, a versatile and user-friendly tool for de novo synthetic RNA design. MoiRNAiFold is based on Constraint Programming and it includes novel variable types, heuristics and restart strategies for Large Neighborhood Search. Moreover, this software can handle dozens of design constraints and quality measures and improves features for RNA regulation control of gene expression, such as Translation Efficiency calculation. We demonstrate that MoiRNAiFold outperforms any previous software in benchmarking structural RNA puzzles from EteRNA. Importantly, with regard to biologically relevant RNA designs, we focus on RNA riboregulators, demonstrating that the designed RNA sequences are functional both in vitro and in vivo. Overall, we have generated a powerful tool for de novo complex RNA design that we make freely available as a web server (https://moiraibiodesign.com/design/).
Affiliation(s)
- Gerard Minuesa
  - Moirai Biodesign, c/ Baldiri Reixach s/n, Parc Científic de Barcelona (PCB), 08028 Barcelona, Spain
- Cristina Alsina
  - Moirai Biodesign, c/ Baldiri Reixach s/n, Parc Científic de Barcelona (PCB), 08028 Barcelona, Spain
- Juan Antonio Garcia-Martin
  - Bioinformatics for Genomics and Proteomics, National Centre for Biotechnology (CNB-CSIC), c/ Darwin 3, 28049 Madrid, Spain
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Universidad Carlos III de Madrid, 28911 Madrid, Spain
- Juan Carlos Oliveros
  - Bioinformatics for Genomics and Proteomics, National Centre for Biotechnology (CNB-CSIC), c/ Darwin 3, 28049 Madrid, Spain
- Ivan Dotu
  - Moirai Biodesign, c/ Baldiri Reixach s/n, Parc Científic de Barcelona (PCB), 08028 Barcelona, Spain
9
Inverse RNA Folding Workflow to Design and Test Ribozymes that Include Pseudoknots. Methods Mol Biol 2021; 2167:113-143. [PMID: 32712918 DOI: 10.1007/978-1-0716-0716-9_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
Ribozymes are RNAs that catalyze reactions. They occur in nature, and can also be evolved in vitro to catalyze novel reactions. This chapter provides detailed protocols for using inverse folding software to design a ribozyme sequence that will fold to a known ribozyme secondary structure and for testing the catalytic activity of the sequence experimentally. This protocol is able to design sequences that include pseudoknots, which is important as all naturally occurring full-length ribozymes have pseudoknots. The starting point is the known pseudoknot-containing secondary structure of the ribozyme and knowledge of any nucleotides whose identity is required for function. The output of the protocol is a set of sequences that have been tested for function. Using this protocol, we were previously successful at designing highly active double-pseudoknotted HDV ribozymes.
10
Badu S, Melnik R, Singh S. Mathematical and computational models of RNA nanoclusters and their applications in data-driven environments. Molecular Simulation 2020. [DOI: 10.1080/08927022.2020.1804564] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 02/06/2023]
Affiliation(s)
- Shyam Badu
  - MS2Discovery Interdisciplinary Research Institute, Wilfrid Laurier University, Waterloo, Ontario, Canada
- Roderick Melnik
  - MS2Discovery Interdisciplinary Research Institute, Wilfrid Laurier University, Waterloo, Ontario, Canada
  - BCAM-Basque Center for Applied Mathematics, Bilbao, Spain
- Sundeep Singh
  - MS2Discovery Interdisciplinary Research Institute, Wilfrid Laurier University, Waterloo, Ontario, Canada
11
Abstract
A ribonucleic acid (RNA) sequence is a word over an alphabet of four elements {A, C, G, U} called bases. RNA sequences fold into secondary structures where some bases pair with one another, while others remain unpaired. The two fundamental problems in RNA algorithmics are to predict how sequences fold within some energy model and to design sequences of bases that will fold into targeted secondary structures. Predicting how a given RNA sequence folds into a pseudoknot-free secondary structure has been known to be solvable in cubic time since the 1980s, and in truly subcubic time by a recent result of Bringmann et al. (FOCS, 2016), whereas Lyngsø has shown that it is computationally hard if pseudoknots are allowed (ICALP, 2004). In stark contrast, it is unknown whether designing a given RNA secondary structure is a tractable task; this was raised as a challenging open question by Condon (ICALP, 2003). Because of its crucial importance in fields such as pharmaceutical research and biochemistry, there are dozens of heuristics and software libraries dedicated to RNA secondary structure design. It is therefore rather surprising that the computational complexity of this central problem in bioinformatics has been unsettled for decades. In this article, we show that in the simplest energy model, the Watson-Crick model, the design of secondary structures is computationally hard if one adds natural constraints of the form: index i of the sequence has to be labeled by base b. This negative result suggests that the same lower bound holds for more realistic energy models. It is noteworthy that the additional constraints are by no means artificial: they are provided by all RNA design software and they correspond to actual practice (e.g., the instances of the EteRNA project).
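The constrained design problem described here can be illustrated with a minimal compatibility check: every paired position in a dot-bracket target must hold a Watson-Crick pair, and every constraint of the form "index i is labeled by base b" must be honored. The function names and the 0-based `fixed` dictionary are assumptions for illustration. The check covers only a necessary condition. It does not test whether the target is actually the sequence's optimal fold, which is the part of the problem the article shows to be hard.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}

def parse_pairs(structure):
    """Return the list of (i, j) base pairs in a pseudoknot-free dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs

def is_compatible(seq, structure, fixed=None):
    """Check Watson-Crick pairing and fixed-base constraints (0-based indices)."""
    fixed = fixed or {}
    if len(seq) != len(structure):
        return False
    # Constraints of the form "index i must be base b".
    if any(seq[i] != b for i, b in fixed.items()):
        return False
    # Every paired position must form an A-U or C-G pair.
    return all((seq[i], seq[j]) in WATSON_CRICK for i, j in parse_pairs(structure))
```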
Affiliation(s)
- Édouard Bonnet
  - Univ Lyon, CNRS, ENS de Lyon, Université Claude Bernard Lyon 1, LIP UMR5668, Lyon, France
- Paweł Rzążewski
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, Institute of Informatics, University of Warsaw, Warsaw, Poland
- Florian Sikora
  - Université Paris-Dauphine, PSL University, CNRS, LAMSADE, Paris, France
12
Inverse folding with RNA-As-Graphs produces a large pool of candidate sequences with target topologies. J Struct Biol 2019; 209:107438. [PMID: 31874236 DOI: 10.1016/j.jsb.2019.107438] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/04/2019] [Revised: 12/18/2019] [Accepted: 12/19/2019] [Indexed: 02/07/2023]
Abstract
We present an RNA-As-Graphs (RAG) based inverse folding algorithm, RAG-IF, to design novel RNA sequences that fold onto target tree graph topologies. The algorithm can be used to enhance our recently reported computational design pipeline (Jain et al., NAR 2018). The RAG approach represents RNA secondary structures as tree and dual graphs, where RNA loops and helices are coarse-grained as vertices and edges, opening the usage of graph theory methods to study, predict, and design RNA structures. Our recently developed computational pipeline for design utilizes graph partitioning (RAG-3D) and atomic fragment assembly (F-RAG) to design sequences to fold onto RNA-like tree graph topologies; the atomic fragments are taken from existing RNA structures that correspond to tree subgraphs. Because F-RAG may not produce the target folds for all designs, automated mutations by RAG-IF algorithm enhance the candidate pool markedly. The crucial residues for mutation are identified by differences between the predicted and the target topology. A genetic algorithm then mutates the selected residues, and the successful sequences are optimized to retain only the minimal or essential mutations. Here we evaluate RAG-IF for 6 RNA-like topologies and generate a large pool of successful candidate sequences with a variety of minimal mutations. We find that RAG-IF adds robustness and efficiency to our RNA design pipeline, making inverse folding motivated by graph topology rather than secondary structure more productive.
13
Yamagami R, Kayedkhordeh M, Mathews DH, Bevilacqua PC. Design of highly active double-pseudoknotted ribozymes: a combined computational and experimental study. Nucleic Acids Res 2019; 47:29-42. [PMID: 30462314 PMCID: PMC6326823 DOI: 10.1093/nar/gky1118] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/26/2018] [Accepted: 10/24/2018] [Indexed: 01/02/2023] [Open Access]
Abstract
Design of RNA sequences that adopt functional folds establishes principles of RNA folding and applications in biotechnology. Inverse folding for RNAs, which allows computational design of sequences that adopt specific structures, can be utilized for unveiling RNA functions and developing genetic tools in synthetic biology. Although many algorithms for inverse RNA folding have been developed, the pseudoknot, which plays a key role in folding of ribozymes and riboswitches, is not addressed in most algorithms. For the few algorithms that attempt to predict pseudoknot-containing ribozymes, self-cleavage activity has not been tested. Herein, we design double-pseudoknot HDV ribozymes using an inverse RNA folding algorithm and test their kinetic mechanisms experimentally. More than 90% of the positively designed ribozymes possess self-cleaving activity, whereas more than 70% of negative control ribozymes, which are predicted to fold to the necessary structure but with low fidelity, do not possess it. Kinetic and mutation analyses reveal that these RNAs cleave site-specifically and with the same mechanism as the WT ribozyme. Most ribozymes react just 50- to 80-fold slower than the WT ribozyme, and this rate can be improved to near WT by modification of a junction. Thus, fast-cleaving functional ribozymes with multiple pseudoknots can be designed computationally.
Affiliation(s)
- Ryota Yamagami
  - Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA
  - Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- Mohammad Kayedkhordeh
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York, NY 14642, USA
- David H Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York, NY 14642, USA
  - Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, New York, NY 14642, USA
- Philip C Bevilacqua
  - Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA
  - Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
14
Wu MJ, Andreasson JOL, Kladwang W, Greenleaf W, Das R. Automated Design of Diverse Stand-Alone Riboswitches. ACS Synth Biol 2019; 8:1838-1846. [PMID: 31298841 PMCID: PMC6703183 DOI: 10.1021/acssynbio.9b00142] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 01/05/2023]
Abstract
Riboswitches that couple binding of ligands to conformational changes offer sensors and control elements for RNA synthetic biology and medical biotechnology. However, design of these riboswitches has required expert intuition or software specialized to transcription or translation outputs; design has been particularly challenging for applications in which the riboswitch output cannot be amplified by other molecular machinery. We present a fully automated design method called RiboLogic for such “stand-alone” riboswitches and test it via high-throughput experiments on 2875 molecules using RNA-MaP (RNA on a massively parallel array) technology. These molecules consistently modulate their affinity to the MS2 bacteriophage coat protein upon binding of flavin mononucleotide, tryptophan, theophylline, and microRNA miR-208a, achieving activation ratios of up to 20 and significantly better performance than control designs. By encompassing a wide diversity of stand-alone switches and highly quantitative data, the resulting ribologic-solves experimental data set provides a rich resource for further improvement of riboswitch models and design methods.
15
Su C, Weir JD, Zhang F, Yan H, Wu T. ENTRNA: a framework to predict RNA foldability. BMC Bioinformatics 2019; 20:373. [PMID: 31269893 PMCID: PMC6610807 DOI: 10.1186/s12859-019-2948-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/23/2018] [Accepted: 06/12/2019] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role. RESULTS In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. Other than free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated using 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs from the RNASTRAND database, respectively. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNAs, we collect 5 laboratory synthetic RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 RNAs. CONCLUSION In this article, we reformulate the RNA design problem as a foldability prediction problem which is to predict the likelihood of the co-existence of a sequence-structure pair. This new construct has the potential for both RNA structure prediction and the inverse folding problem. In addition, this new construct enables us to explore data-driven approaches in RNA research.
Affiliation(s)
- Congzhe Su
- School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
| | - Jeffery D. Weir
- Department of Operational Sciences, Graduate School of Engineering and Management, Air Force Institute of Technology, Wright-Patterson AFB, Dayton, OH 45433 USA
| | - Fei Zhang
- Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
| | - Hao Yan
- Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
| | - Teresa Wu
- School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
| |
|
16
|
Bui LM, Geraldi A, Nguyen TT, Lee JH, Lee JY, Cho BK, Kim SC. mRNA Engineering for the Efficient Chaperone-Mediated Co-Translational Folding of Recombinant Proteins in Escherichia coli. Int J Mol Sci 2019; 20:3163. [PMID: 31261687 PMCID: PMC6651523 DOI: 10.3390/ijms20133163] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/06/2019] [Revised: 06/18/2019] [Accepted: 06/21/2019] [Indexed: 12/22/2022] Open
Abstract
The production of soluble, functional recombinant proteins by engineered bacterial hosts is challenging. Natural molecular chaperone systems have been used to solubilize various recombinant proteins with limited success. Here, we attempted to facilitate chaperone-mediated folding by directing the molecular chaperones to their protein substrates before the co-translational folding process completed. To achieve this, we either anchored the bacterial chaperone DnaJ to the 3ʹ untranslated region of a target mRNA by fusing with an RNA-binding domain in the chaperone-recruiting mRNA scaffold (CRAS) system, or coupled the expression of DnaJ and a target recombinant protein using the overlapping stop-start codons 5ʹ-TAATG-3ʹ between the two genes in a chaperone-substrate co-localized expression (CLEX) system. By engineering the untranslated and intergenic sequences of the mRNA transcript, bacterial molecular chaperones are spatially constrained to the location of protein translation, expressing selected aggregation-prone proteins in their functionally active, soluble form. Our mRNA engineering methods surpassed the in-vivo solubilization efficiency of the simple DnaJ chaperone co-overexpression method, thus providing more effective tools for producing soluble therapeutic proteins and enzymes.
Affiliation(s)
- Le Minh Bui
- KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- NTT Hi-Tech Institute, Nguyen Tat Thanh University (NTTU), Ho Chi Minh City 700000, Vietnam
| | - Almando Geraldi
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Biology Department, Science and Technology Faculty, Universitas Airlangga Mulyorejo, Surabaya 60115, Indonesia
| | - Thi Thuy Nguyen
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
| | - Jun Hyoung Lee
- KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
| | - Ju Young Lee
- Center for Bio-based Chemistry, Korea Research Institute of Chemical Technology (KRICT), Ulsan 44429, Korea
| | - Byung-Kwan Cho
- KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Intelligent Synthetic Biology Center, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
| | - Sun Chang Kim
- KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Intelligent Synthetic Biology Center, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
| |
|
17
|
Koodli RV, Keep B, Coppess KR, Portela F, Das R. EternaBrain: Automated RNA design through move sets and strategies from an Internet-scale RNA videogame. PLoS Comput Biol 2019; 15:e1007059. [PMID: 31247029 PMCID: PMC6597038 DOI: 10.1371/journal.pcbi.1007059] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/18/2019] [Accepted: 04/30/2019] [Indexed: 11/18/2022] Open
Abstract
Emerging RNA-based approaches to disease detection and gene therapy require RNA sequences that fold into specific base-pairing patterns, but computational algorithms generally remain inadequate for these secondary structure design tasks. The Eterna project has crowdsourced RNA design to human video game players in the form of puzzles that reach extraordinary difficulty. Here, we demonstrate that Eterna participants' moves and strategies can be leveraged to improve automated computational RNA design. We present an eternamoves-large repository consisting of 1.8 million player moves on 12 of the most-played Eterna puzzles as well as an eternamoves-select repository of 30,477 moves from the top 72 players on a select set of more advanced puzzles. On eternamoves-select, we present a multilayer convolutional neural network (CNN) EternaBrain that achieves test accuracies of 51% and 34% in base prediction and location prediction, respectively, suggesting that top players' moves are partially stereotyped. Pipelining this CNN's move predictions with single-action-playout (SAP) of six strategies compiled by human players solves 61 out of 100 independent puzzles in the Eterna100 benchmark. EternaBrain-SAP outperforms previously published RNA design algorithms and achieves similar or better performance than a newer generation of deep learning methods, while being largely orthogonal to these other methods. Our study provides useful lessons for future efforts to achieve human-competitive performance with automated RNA design algorithms.
Affiliation(s)
- Rohan V. Koodli
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Benjamin Keep
- Department of Education, Stanford University, Stanford, CA, United States of America
| | - Katherine R. Coppess
- Department of Physics, Stanford University, Stanford, CA, United States of America
| | - Fernando Portela
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, United States of America
| | | | - Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, United States of America
- Department of Physics, Stanford University, Stanford, CA, United States of America
| |
|
18
|
Evolving methods for rational de novo design of functional RNA molecules. Methods 2019; 161:54-63. [PMID: 31059832 DOI: 10.1016/j.ymeth.2019.04.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/30/2019] [Revised: 04/26/2019] [Accepted: 04/29/2019] [Indexed: 12/16/2022] Open
Abstract
Artificial RNA molecules with novel functionality have many applications in synthetic biology, pharmacy and white biotechnology. The de novo design of such devices using computational methods and prediction tools is a resource-efficient alternative to experimental screening and selection pipelines. In this review, we describe methods common to many such computational approaches, thoroughly dissect these methods and highlight open questions for the individual steps. Initially, it is essential to investigate the biological target system, the regulatory mechanism that will be exploited, as well as the desired components in order to define design objectives. Subsequent computational design is needed to combine the selected components and to obtain novel functionality. This process can usually be split into constrained sequence sampling, the formulation of an optimization problem and an in silico analysis to narrow down the number of candidates with respect to secondary goals. Finally, experimental analysis is important to check whether the defined design objectives are indeed met in the target environment and detailed characterization experiments should be performed to improve the mechanistic models and detect missing design requirements.
|
19
|
Jain S, Saju S, Petingi L, Schlick T. An extended dual graph library and partitioning algorithm applicable to pseudoknotted RNA structures. Methods 2019; 162-163:74-84. [PMID: 30928508 DOI: 10.1016/j.ymeth.2019.03.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/14/2019] [Revised: 02/28/2019] [Accepted: 03/22/2019] [Indexed: 12/18/2022] Open
Abstract
Exploring novel RNA topologies is imperative for understanding RNA structure and pursuing its design. Our RNA-As-Graphs (RAG) approach exploits graph theory tools and uses coarse-grained tree and dual graphs to represent RNA helices and loops by vertices and edges. Only dual graphs represent pseudoknotted RNAs fully. Here we develop a dual graph enumeration algorithm to generate an expanded library of dual graph topologies for 2-9 vertices, and extend our dual graph partitioning algorithm to identify all possible RNA subgraphs. Our enumeration algorithm connects smaller-vertex graphs, using all possible edge combinations, to build larger-vertex graphs and retain all non-isomorphic graph topologies, thereby more than doubling the size of our prior library to a total of 110,667 dual graph topologies. We apply our dual graph partitioning algorithm, which keeps pseudoknots and junctions intact, to all existing RNA structures to identify all possible substructures up to 9 vertices. In addition, our expanded dual graph library assigns graph topologies to all RNA graphs and subgraphs, rectifying prior inconsistencies. We update our RAG-3Dual database of RNA atomic fragments with all newly identified substructures and their graph IDs, increasing its size by more than 50 times. The enlarged dual graph library and RAG-3Dual database provide a comprehensive repertoire of graph topologies and atomic fragments to study yet undiscovered RNA molecules and design RNA sequences with novel topologies, including a variety of pseudoknotted RNAs.
Affiliation(s)
- Swati Jain
- Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
| | - Sera Saju
- Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
| | - Louis Petingi
- Computer Science Department, College of Staten Island, City University of New York, Staten Island, New York, NY 10314, USA
| | - Tamar Schlick
- Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA; Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA; NYU-East China Normal University Center for Computational Chemistry at New York University Shanghai, Room 340, Geography Building, North Zhongshan Road, 3663 Shanghai, China.
| |
|
20
|
Bellaousov S, Kayedkhordeh M, Peterson RJ, Mathews DH. Accelerated RNA secondary structure design using preselected sequences for helices and loops. RNA 2018; 24:1555-1567. [PMID: 30097542 PMCID: PMC6191713 DOI: 10.1261/rna.066324.118] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/07/2018] [Accepted: 08/06/2018] [Indexed: 06/08/2023]
Abstract
Nucleic acids can be designed to be nano-machines, pharmaceuticals, or probes. RNA secondary structures can form the basis of self-assembling nanostructures. There are only four natural RNA bases, therefore it can be difficult to design sequences that fold to a single, specified structure because many other structures are often possible for a given sequence. One approach taken by state-of-the-art sequence design methods is to select sequences that fold to the specified structure using stochastic, iterative refinement. The goal of this work is to accelerate design. Many existing iterative methods select and refine sequences one base pair and one unpaired nucleotide at a time. Here, the hypothesis that sequences can be preselected in order to accelerate design was tested. To this aim, a database was built of helix sequences that demonstrate thermodynamic features found in natural sequences and that also have little tendency to cross-hybridize. Additionally, a database was assembled of RNA loop sequences with low helix-formation propensity and little tendency to cross-hybridize with either the helices or other loops. These databases of preselected sequences accelerate the selection of sequences that fold with minimal ensemble defect by replacing some of the trial and error of current refinement approaches. When using the database of preselected sequences as compared to randomly chosen sequences, sequences for natural structures are designed 36 times faster, and random structures are designed six times faster. The sequences selected with the aid of the database have similar ensemble defect as those sequences selected at random. The sequence database is part of RNAstructure package at http://rna.urmc.rochester.edu/RNAstructure.html.
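The ensemble defect mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the common definition (the expected number of nucleotides whose pairing state differs from the target) and takes a precomputed base-pair probability matrix as input; the names `pair_probs` and `target_partner` are hypothetical and are not taken from the RNAstructure package.

```python
from typing import Dict, Sequence


def ensemble_defect(pair_probs: Sequence[Sequence[float]],
                    target_partner: Dict[int, int]) -> float:
    """Expected number of positions that deviate from the target structure.

    pair_probs[i][j] is assumed to be the equilibrium probability that
    positions i and j are paired (symmetric, zero diagonal).
    target_partner maps every paired position to its partner (0-based);
    positions absent from the mapping are unpaired in the target.
    """
    defect = 0.0
    for i, row in enumerate(pair_probs):
        if i in target_partner:
            # Probability that i pairs with its intended partner.
            p_correct = row[target_partner[i]]
        else:
            # Probability that i stays unpaired.
            p_correct = 1.0 - sum(row)
        defect += 1.0 - p_correct
    return defect


# Four nucleotides, target pairs position 0 with position 3.
probs = [
    [0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
]
print(ensemble_defect(probs, {0: 3, 3: 0}))
```

In practice the probability matrix would come from a partition-function calculation, which is outside the scope of this sketch.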
Affiliation(s)
- Stanislav Bellaousov
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
| | - Mohammad Kayedkhordeh
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
| | | | - David H Mathews
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
| |
|
21
|
Jain S, Laederach A, Ramos SBV, Schlick T. A pipeline for computational design of novel RNA-like topologies. Nucleic Acids Res 2018; 46:7040-7051. [PMID: 30137633 PMCID: PMC6101589 DOI: 10.1093/nar/gky524] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 10/24/2017] [Revised: 05/22/2018] [Accepted: 05/24/2018] [Indexed: 12/11/2022] Open
Abstract
Designing novel RNA topologies is a challenge, with important therapeutic and industrial applications. We describe a computational pipeline for design of novel RNA topologies based on our coarse-grained RNA-As-Graphs (RAG) framework. RAG represents RNA structures as tree graphs and describes RNA secondary (2D) structure topologies (currently up to 13 vertices, ≈260 nucleotides). We have previously identified novel graph topologies that are RNA-like among these. Here we describe a systematic design pipeline and illustrate design for six broad design problems using recently developed tools for graph-partitioning and fragment assembly (F-RAG). Following partitioning of the target graph, corresponding atomic fragments from our RAG-3D database are combined using F-RAG, and the candidate atomic models are scored using a knowledge-based potential developed for 3D structure prediction. The sequences of the top scoring models are screened further using available tools for 2D structure prediction. The results indicate that our modular approach based on RNA-like topologies rather than specific 2D structures allows for greater flexibility in the design process, and generates a large number of candidate sequences quickly. Experimental structure probing using SHAPE-MaP for two sequences agrees with our predictions and suggests that our combined tools yield excellent candidates for further sequence and experimental screening.
Affiliation(s)
- Swati Jain
- Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA
| | - Alain Laederach
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Silvia B V Ramos
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Tamar Schlick
- Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA
- Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
- NYU-ECNU Center for Computational Chemistry at New York University Shanghai, Room 340, Geography Building, North Zhongshan Road, 3663 Shanghai, China
| |
|
22
|
Eastman P, Shi J, Ramsundar B, Pande VS. Solving the RNA design problem with reinforcement learning. PLoS Comput Biol 2018; 14:e1006176. [PMID: 29927936 PMCID: PMC6029810 DOI: 10.1371/journal.pcbi.1006176] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/30/2017] [Revised: 07/03/2018] [Accepted: 05/04/2018] [Indexed: 11/19/2022] Open
Abstract
We use reinforcement learning to train an agent for computational RNA design: given a target secondary structure, design a sequence that folds to that structure in silico. Our agent uses a novel graph convolutional architecture allowing a single model to be applied to arbitrary target structures of any length. After training it on randomly generated targets, we test it on the Eterna100 benchmark and find it outperforms all previous algorithms. Analysis of its solutions shows it has successfully learned some advanced strategies identified by players of the game Eterna, allowing it to solve some very difficult structures. On the other hand, it has failed to learn other strategies, possibly because they were not required for the targets in the training set. This suggests the possibility that future improvements to the training protocol may yield further gains in performance.
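The design task stated in this abstract (a target secondary structure in, a sequence out) can be made concrete with a small Python sketch. It only parses a pseudoknot-free dot-bracket target and checks that a candidate sequence places canonical or wobble pairs at the paired positions, which is a necessary but not sufficient condition for folding into the target. The function names are hypothetical, and no folding engine or learning agent is involved.

```python
from typing import Dict

# Watson-Crick and G-U wobble pairs are treated as allowed (an assumption).
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure: str) -> Dict[int, int]:
    """Map each paired position to its partner (0-based, pseudoknot-free)."""
    stack = []
    partner: Dict[int, int] = {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unbalanced '(' at position {stack[-1]}")
    return partner


def is_compatible(sequence: str, structure: str) -> bool:
    """True if every target base pair is occupied by an allowed pair."""
    if len(sequence) != len(structure):
        return False
    partner = parse_dot_bracket(structure)
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in partner.items())


print(is_compatible("GCAAGC", "((..))"))
print(is_compatible("GAAAGC", "((..))"))
```

A design method would still need to fold each compatible candidate and compare the predicted structure with the target; that step is omitted here.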
Affiliation(s)
- Peter Eastman
- Department of Bioengineering, Stanford University, Stanford, CA, United States of America
| | - Jade Shi
- Department of Chemistry, Stanford University, Stanford, CA, United States of America
| | - Bharath Ramsundar
- Department of Computer Science, Stanford University, Stanford, CA, United States of America
| | - Vijay S. Pande
- Department of Bioengineering, Stanford University, Stanford, CA, United States of America
| |
|
23
|
Lotfi M, Zare-Mirakabad F, Montaseri S. RNA design using simulated SHAPE data. Genes Genet Syst 2018; 92:257-265. [PMID: 28757510 DOI: 10.1266/ggs.16-00067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
It has long been established that in addition to being involved in protein translation, RNA plays essential roles in numerous other cellular processes, including gene regulation and DNA replication. Such roles are known to be dictated by higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that can fold to acquire a particular function that is desirable for use in pharmaceuticals and basic research. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are several algorithms to solve this problem, they mainly consider hard constraints, such as minimum free energy, to evaluate the predicted sequences. Recently, SHAPE data has emerged as a new soft constraint for RNA secondary structure prediction. To take advantage of this new experimental constraint, we report here a new method for accurate design of RNA sequences based on their secondary structures using SHAPE data as pseudo-free energy. We then compare our algorithm with four others: INFO-RNA, ERD, MODENA and RNAifold 2.0. Our algorithm precisely predicts 26 out of 29 new sequences for the structures extracted from the Rfam dataset, while the other four algorithms predict no more than 22 out of 29. The proposed algorithm is comparable to the above algorithms on RNA-SSD datasets, where they can predict up to 33 appropriate sequences for RNA secondary structures out of 34.
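As an illustration of how SHAPE reactivities are commonly turned into a pseudo-free energy, the sketch below uses the widely cited log-linear form ΔG_SHAPE(i) = m·ln(reactivity_i + 1) + b with the defaults m = 2.6 and b = −0.8 kcal/mol. Whether this paper uses exactly this form or these parameters is an assumption, and the function name and the handling of missing values are hypothetical.

```python
import math
from typing import Optional


def shape_pseudo_energy(reactivity: Optional[float],
                        m: float = 2.6,
                        b: float = -0.8) -> float:
    """Per-nucleotide pseudo-free energy term in kcal/mol.

    Missing values (None) and negative reactivities are treated as
    "no data" and contribute nothing; this handling is an assumption.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


# Low reactivity rewards pairing (negative term); high reactivity penalizes it.
for value in (0.0, 0.3, 1.0, None):
    print(value, shape_pseudo_energy(value))
```

With these defaults the term changes sign at a reactivity of about 0.36, so nucleotides below that value are nudged toward being paired.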
Affiliation(s)
- Mohadeseh Lotfi
- Faculty of Mathematics and Computer Science, Amirkabir University of Technology
| | | | - Soheila Montaseri
- School of Mathematics, Statistics and Computer Science, College of Science, Enghelab Avenue, University of Tehran
| |
|
24
|
Guo S, Piao X, Li H, Guo P. Methods for construction and characterization of simple or special multifunctional RNA nanoparticles based on the 3WJ of phi29 DNA packaging motor. Methods 2018. [PMID: 29530505 DOI: 10.1016/j.ymeth.2018.02.025] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/13/2022] Open
Abstract
The field of RNA nanotechnology has developed rapidly over the last decade, as more elaborate RNA nanoarchitectures and therapeutic RNA nanoparticles have been constructed, and their applications have been extensively explored. Now it is time to offer different levels of RNA construction methods for both the beginners and the experienced researchers or enterprisers. The first and second parts of this article will provide instructions on basic and simple methods for the assembly and characterization of RNA nanoparticles, mainly based on the pRNA three-way junction (pRNA-3WJ) of phi29 DNA packaging motor. The third part of this article will focus on specific methods for the construction of more sophisticated multivalent RNA nanoparticles for therapeutic applications. In these parts, some simple protocols are provided to facilitate the initiation of the RNA nanoparticle construction in labs new to the field of RNA nanotechnology. This article is intended to serve as a general reference aimed at both apprentices and senior scientists for their future design, construction and characterization of RNA nanoparticles based on the pRNA-3WJ of phi29 DNA packaging motor.
Affiliation(s)
- Sijin Guo
- Center for RNA Nanobiotechnology and Nanomedicine, The Ohio State University, Columbus, OH 43210, USA; College of Pharmacy, Division of Pharmaceutics and Pharmaceutical Chemistry, The Ohio State University, Columbus, OH 43210, USA
| | - Xijun Piao
- Center for RNA Nanobiotechnology and Nanomedicine, The Ohio State University, Columbus, OH 43210, USA; College of Pharmacy, Division of Pharmaceutics and Pharmaceutical Chemistry, The Ohio State University, Columbus, OH 43210, USA
| | - Hui Li
- Center for RNA Nanobiotechnology and Nanomedicine, The Ohio State University, Columbus, OH 43210, USA; College of Pharmacy, Division of Pharmaceutics and Pharmaceutical Chemistry, The Ohio State University, Columbus, OH 43210, USA
| | - Peixuan Guo
- Center for RNA Nanobiotechnology and Nanomedicine, The Ohio State University, Columbus, OH 43210, USA; College of Pharmacy, Division of Pharmaceutics and Pharmaceutical Chemistry, The Ohio State University, Columbus, OH 43210, USA; College of Medicine, Dorothy M. Davis Heart and Lung Research Institute, The Ohio State University, Columbus, OH 43210, USA; James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA.
| |
|
25
|
Churkin A, Retwitzer MD, Reinharz V, Ponty Y, Waldispühl J, Barash D. Design of RNAs: comparing programs for inverse RNA folding. Brief Bioinform 2018; 19:350-358. [PMID: 28049135 PMCID: PMC6018860 DOI: 10.1093/bib/bbw120] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/26/2022] Open
Abstract
Computational programs for predicting RNA sequences with desired folding properties have been extensively developed and expanded in the past several years. Given a secondary structure, these programs aim to predict sequences that fold into a target minimum free energy secondary structure, while considering various constraints. This procedure is called inverse RNA folding. Inverse RNA folding has been traditionally used to design optimized RNAs with favorable properties, an application that is expected to grow considerably in the future in light of advances in the expanding new fields of synthetic biology and RNA nanostructures. Moreover, it was recently demonstrated that inverse RNA folding can successfully be used as a valuable preprocessing step in computational detection of novel noncoding RNAs. This review describes the most popular freeware programs that have been developed for such purposes, starting from RNAinverse that was devised when formulating the inverse RNA folding problem. The most recently published ones that consider RNA secondary structure as input are antaRNA, RNAiFold and incaRNAfbinv, each having different features that could be beneficial to specific biological problems in practice. The various programs also use distinct approaches, ranging from ant colony optimization to constraint programming, in addition to adaptive walk, simulated annealing and Boltzmann sampling. This review compares between the various programs and provides a simple description of the various possibilities that would benefit practitioners in selecting the most suitable program. It is geared for specific tasks requiring RNA design based on input secondary structure, with an outlook toward the future of RNA design programs.
Affiliation(s)
- Alexander Churkin
- Shamoon College of Engineering and Physics Department at Ben-Gurion University, Beer-Sheva, Israel
| | | | - Vladimir Reinharz
- Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
- School of Computer Science, McGill University, Montréal QC, Canada
| | - Yann Ponty
- Laboratoire d’informatique, École Polytechnique, Palaiseau, France
| | | | - Danny Barash
- Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
| |
|
26
|
Yang X, Yoshizoe K, Taneda A, Tsuda K. RNA inverse folding using Monte Carlo tree search. BMC Bioinformatics 2017; 18:468. [PMID: 29110632 PMCID: PMC5674771 DOI: 10.1186/s12859-017-1882-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/08/2017] [Accepted: 10/26/2017] [Indexed: 11/10/2022] Open
Abstract
Background Artificially synthesized RNA molecules provide important ways for creating a variety of novel functional molecules. State-of-the-art RNA inverse folding algorithms can design simple and short RNA sequences of specific GC content, that fold into the target RNA structure. However, their performance is not satisfactory in complicated cases. Result We present a new inverse folding algorithm called MCTS-RNA, which uses Monte Carlo tree search (MCTS), a technique that has shown exceptional performance in Computer Go recently, to represent and discover the essential part of the sequence space. To obtain high accuracy, initial sequences generated by MCTS are further improved by a series of local updates. Our algorithm has an ability to control the GC content precisely and can deal with pseudoknot structures. Using common benchmark datasets for evaluation, MCTS-RNA showed a lot of promise as a standard method of RNA inverse folding. Conclusion MCTS-RNA is available at https://github.com/tsudalab/MCTS-RNA.
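The GC-content constraint mentioned in this abstract can be sketched in a few lines of Python. The sketch only measures the GC fraction of a candidate sequence and its deviation from a target value; the function names are hypothetical, and how MCTS-RNA itself enforces the constraint is not described here.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C nucleotides in a non-empty sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    upper = sequence.upper()
    return sum(base in "GC" for base in upper) / len(upper)


def gc_error(sequence: str, target_gc: float) -> float:
    """Absolute deviation of the sequence's GC fraction from the target."""
    return abs(gc_content(sequence) - target_gc)


candidate = "GGGAAAUCCC"
print(gc_content(candidate))
print(gc_error(candidate, 0.5))
```

A design loop could reject or penalize candidates whose `gc_error` exceeds a chosen tolerance.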
Affiliation(s)
- Xiufeng Yang
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, 277-8561, Japan
| | - Kazuki Yoshizoe
- RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihombashi Chuo-ku, Tokyo, 103-0027, Japan
| | - Akito Taneda
- Graduate School of Science and Technology, Hirosaki University, 3 Bunkyo-cho, Hirosaki, 036-8561, Japan
| | - Koji Tsuda
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, 277-8561, Japan; Center for Materials Research by Information Integration, National Institute for Materials Science, 1-2-1 Sengen, Tsukuba, 305-0047, Japan; RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihombashi Chuo-ku, Tokyo, 103-0027, Japan
| |
|
27
|
Balke D, Hieronymus R, Müller S. Challenges and Perspectives in Nucleic Acid Enzyme Engineering. Adv Biochem Eng Biotechnol 2017; 170:21-35. [DOI: 10.1007/10_2017_21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/19/2023]
|
28
|
Jasinski D, Haque F, Binzel DW, Guo P. Advancement of the Emerging Field of RNA Nanotechnology. ACS NANO 2017; 11:1142-1164. [PMID: 28045501 PMCID: PMC5333189 DOI: 10.1021/acsnano.6b05737] [Citation(s) in RCA: 232] [Impact Index Per Article: 29.0] [Received: 08/25/2016] [Accepted: 01/03/2017] [Indexed: 05/14/2023]
Abstract
The field of RNA nanotechnology has advanced rapidly during the past decade. A variety of programmable RNA nanoparticles with defined shape, size, and stoichiometry have been developed for diverse applications in nanobiotechnology. The rising popularity of RNA nanoparticles is due to a number of factors: (1) removing the concern of RNA degradation in vitro and in vivo by introducing chemical modification into nucleotides without significant alteration of the RNA property in folding and self-assembly; (2) confirming the concept that RNA displays very high thermodynamic stability and is suitable for in vivo trafficking and other applications; (3) obtaining the knowledge to tune the immunogenic properties of synthetic RNA constructs for in vivo applications; (4) increased understanding of the 4D structure and intermolecular interaction of RNA molecules; (5) developing methods to control shape, size, and stoichiometry of RNA nanoparticles; (6) increasing knowledge of regulation and processing functions of RNA in cells; (7) decreasing cost of RNA production by biological and chemical synthesis; and (8) proving the concept that RNA is a safe and specific therapeutic modality for cancer and other diseases with little or no accumulation in vital organs. Other applications of RNA nanotechnology, such as adapting them to construct 2D, 3D, and 4D structures for use in tissue engineering, biosensing, resistive biomemory, and potential computer logic gate modules, have stimulated the interest of the scientific community. This review aims to outline the current state of the art of RNA nanoparticles as programmable smart complexes and offers perspectives on the promising avenues of research in this fast-growing field.
Affiliation(s)
| | | | - Daniel W Binzel
- College of Pharmacy, Division of Pharmaceutics and Pharmaceutical Chemistry; College of Medicine, Department of Physiology & Cell Biology; and Dorothy M. Davis Heart and Lung Research Institute, The Ohio State University, Columbus, Ohio 43210, United States
| | - Peixuan Guo
- College of Pharmacy, Division of Pharmaceutics and Pharmaceutical Chemistry; College of Medicine, Department of Physiology & Cell Biology; and Dorothy M. Davis Heart and Lung Research Institute, The Ohio State University, Columbus, Ohio 43210, United States
| |
|
29
|
Wolfe BR, Porubsky NJ, Zadeh JN, Dirks RM, Pierce NA. Constrained Multistate Sequence Design for Nucleic Acid Reaction Pathway Engineering. J Am Chem Soc 2017; 139:3134-3144. [DOI: 10.1021/jacs.6b12693] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Indexed: 01/14/2023]
Affiliation(s)
- Brian R. Wolfe
- Division of Biology & Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Nicholas J. Porubsky
- Division of Chemistry & Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Joseph N. Zadeh
- Division of Biology & Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Robert M. Dirks
- Division of Biology & Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Niles A. Pierce
- Division of Biology & Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Division of Engineering & Applied Science, California Institute of Technology, Pasadena, California 91125, United States
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, United Kingdom
| |
|
30
|
Condon A, Kirkpatrick B, Maňuch J. Design of nucleic acid strands with long low-barrier folding pathways. NATURAL COMPUTING 2017; 16:261-284. [PMID: 28690474 PMCID: PMC5480305 DOI: 10.1007/s11047-016-9587-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
A major goal of natural computing is to design biomolecules, such as nucleic acid sequences, that can be used to perform computations. We design sequences of nucleic acids that are "guaranteed" to have long folding pathways relative to their length. These sequences follow, with high probability, low-barrier folding pathways that visit a large number of distinct structures. Long folding pathways are interesting because they demonstrate that natural computing can potentially support long and complex computations. Formally, we provide the first scalable designs of molecules whose low-barrier folding pathways, with respect to a simple, stacked pair energy model, grow superlinearly with the molecule length, but for which all significantly shorter alternative folding pathways have an energy barrier that is [Formula: see text] times that of the low-barrier pathway for any [Formula: see text] and a sufficiently long sequence.
Affiliation(s)
- Anne Condon
- Department of Computer Science, University of British Columbia, Vancouver, Canada
| | | | - Ján Maňuch
- Department of Computer Science, University of British Columbia, Vancouver, Canada
| |
|
31
|
Zandi K, Butler G, Kharma N. An Adaptive Defect Weighted Sampling Algorithm to Design Pseudoknotted RNA Secondary Structures. Front Genet 2016; 7:129. [PMID: 27499762 PMCID: PMC4956659 DOI: 10.3389/fgene.2016.00129] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/06/2016] [Accepted: 07/06/2016] [Indexed: 01/18/2023] Open
Abstract
Computational design of RNA sequences that fold into targeted secondary structures has many applications in biomedicine, nanotechnology and synthetic biology. An RNA molecule is made of different types of secondary structure elements, and an important element, the pseudoknot, plays a key role in stabilizing the functional form of the molecule. However, due to the computational complexities associated with characterizing pseudoknotted RNA structures, most existing RNA sequence design algorithms ignore this important structural element, which limits their applications. In this paper we present a new algorithm to design RNA sequences for pseudoknotted secondary structures. We use NUPACK as the folding algorithm to compute the equilibrium characteristics of the pseudoknotted RNAs, and describe a new adaptive defect weighted sampling algorithm named Enzymer to design low ensemble defect RNA sequences for targeted secondary structures including pseudoknots. We used a biological data set of 201 pseudoknotted structures from the Pseudobase library to benchmark the performance of our algorithm. We compared the quality characteristics of the RNA sequences designed by Enzymer with the results obtained from the state-of-the-art MODENA and antaRNA. Our results show that our method succeeds more frequently than MODENA and antaRNA do, and generates sequences that have lower ensemble defect, lower probability defect and higher thermostability. Finally, by using Enzymer and by constraining the design to a naturally occurring and highly conserved Hammerhead motif, we designed 8 sequences for a pseudoknotted cis-acting Hammerhead ribozyme. Enzymer is available for download at https://bitbucket.org/casraz/enzymer.
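The abstract above optimizes the ensemble defect without defining it. One common definition, for a single strand, is the expected number of nucleotides whose pairing state differs from the target at equilibrium. The sketch below illustrates that definition only; it is not Enzymer's or NUPACK's code. The `pair_prob` dictionary is a hypothetical input that a folding tool would have to supply, and the parser handles only pseudoknot-free dot-bracket strings, which is a simplification relative to the pseudoknotted targets discussed in the abstract.

```python
from typing import Dict, List, Tuple


def partners(structure: str) -> List[int]:
    """Partner index for each position of a pseudoknot-free dot-bracket string (-1 = unpaired)."""
    table = [-1] * len(structure)
    stack: List[int] = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            table[i], table[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return table


def ensemble_defect(target: str, pair_prob: Dict[Tuple[int, int], float]) -> float:
    """Expected number of positions whose pairing state differs from the target.

    pair_prob maps (i, j) with i < j to the equilibrium probability of that pair
    (hypothetical input; missing keys are treated as probability 0).
    """
    table = partners(target)
    correct = 0.0
    for i, j in enumerate(table):
        if j >= 0:
            # Probability that position i forms exactly the target pair.
            correct += pair_prob.get((min(i, j), max(i, j)), 0.0)
        else:
            # Probability that position i is unpaired.
            paired = sum(p for (a, b), p in pair_prob.items() if i in (a, b))
            correct += 1.0 - paired
    return len(table) - correct


if __name__ == "__main__":
    print(ensemble_defect("(...)", {(0, 4): 0.9}))
```

Dividing the result by the sequence length gives a normalized defect between 0 and 1.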
Affiliation(s)
- Kasra Zandi
- Computer Science Department, Concordia University, Montreal, QC, Canada
| | - Gregory Butler
- Computer Science Department, Concordia University, Montreal, QC, Canada
- Centre for Structural and Functional Genomics, Concordia University, Montreal, QC, Canada
| | - Nawwaf Kharma
- Centre for Structural and Functional Genomics, Concordia University, Montreal, QC, Canada
- Electrical and Computer Engineering Department, Concordia University, Montreal, QC, Canada
| |
|
32
|
Anderson-Lee J, Fisker E, Kosaraju V, Wu M, Kong J, Lee J, Lee M, Zada M, Treuille A, Das R. Principles for Predicting RNA Secondary Structure Design Difficulty. J Mol Biol 2016; 428:748-757. [PMID: 26902426 PMCID: PMC4833017 DOI: 10.1016/j.jmb.2015.11.013] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 08/16/2015] [Revised: 11/04/2015] [Accepted: 11/10/2015] [Indexed: 11/27/2022]
Abstract
Designing RNAs that form specific secondary structures is enabling better understanding and control of living systems through RNA-guided silencing, genome editing and protein organization. Little is known, however, about which RNA secondary structures might be tractable for downstream sequence design, increasing the time and expense of design efforts due to inefficient secondary structure choices. Here, we present insights into specific structural features that increase the difficulty of finding sequences that fold into a target RNA secondary structure, summarizing the design efforts of tens of thousands of human participants and three automated algorithms (RNAInverse, INFO-RNA and RNA-SSD) in the Eterna massive open laboratory. Subsequent tests through three independent RNA design algorithms (NUPACK, DSS-Opt and MODENA) confirmed the hypothesized importance of several features in determining design difficulty, including sequence length, mean stem length, symmetry and specific difficult-to-design motifs such as zigzags. Based on these results, we have compiled an Eterna100 benchmark of 100 secondary structure design challenges that span a large range in design difficulty to help test future efforts. Our in silico results suggest new routes for improving computational RNA design methods and for extending these insights to assess "designability" of single RNA structures, as well as of switches for in vitro and in vivo applications.
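Two of the difficulty features named in the abstract above, sequence length and mean stem length, can be read directly from a dot-bracket structure. The sketch below is an illustration under stated assumptions: a stem is taken to be a maximal run of directly stacked base pairs, which may differ from the exact definition used in the Eterna study, and the input is assumed to be a balanced, pseudoknot-free dot-bracket string. The function names are hypothetical.

```python
from typing import List


def stem_lengths(structure: str) -> List[int]:
    """Lengths of all stems (maximal runs of stacked pairs) in a dot-bracket string."""
    table = [-1] * len(structure)
    stack: List[int] = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()  # assumes a balanced structure
            table[i], table[j] = j, i
    lengths = []
    for i, j in enumerate(table):
        # A stem starts at pair (i, j) when (i - 1, j + 1) is not also a pair.
        if j > i and (i == 0 or table[i - 1] != j + 1):
            length = 0
            a, b = i, j
            while a < b and table[a] == b:
                length += 1
                a, b = a + 1, b - 1
            lengths.append(length)
    return lengths


def mean_stem_length(structure: str) -> float:
    """Average stem length, or 0.0 for a structure without base pairs."""
    lengths = stem_lengths(structure)
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    target = "((..))..(((...)))"
    print(len(target), stem_lengths(target), mean_stem_length(target))
```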
Affiliation(s)
| | | | - Vineet Kosaraju
- Eterna Massive Open Laboratory; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Michelle Wu
- Eterna Massive Open Laboratory; Program in Biomedical Informatics, Stanford University, Stanford, CA 94305, USA
| | - Justin Kong
- Eterna Massive Open Laboratory; Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Jeehyung Lee
- Eterna Massive Open Laboratory; Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Minjae Lee
- Eterna Massive Open Laboratory; Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | | | - Adrien Treuille
- Eterna Massive Open Laboratory; Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Rhiju Das
- Eterna Massive Open Laboratory; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA; Department of Physics, Stanford University, Stanford, CA 94305, USA.
| |
|
33
|
Wolfe BR, Pierce NA. Sequence Design for a Test Tube of Interacting Nucleic Acid Strands. ACS Synth Biol 2015; 4:1086-100. [PMID: 25329866 DOI: 10.1021/sb5002196] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 01/22/2023]
Abstract
We describe an algorithm for designing the equilibrium base-pairing properties of a test tube of interacting nucleic acid strands. A target test tube is specified as a set of desired "on-target" complexes, each with a target secondary structure and target concentration, and a set of undesired "off-target" complexes, each with vanishing target concentration. Sequence design is performed by optimizing the test tube ensemble defect, corresponding to the concentration of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of the test tube. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, the structural ensemble of each on-target complex is hierarchically decomposed into a tree of conditional subensembles, yielding a forest of decomposition trees. Candidate sequences are evaluated efficiently at the leaf level of the decomposition forest by estimating the test tube ensemble defect from conditional physical properties calculated over the leaf subensembles. As optimized subsequences are merged toward the root level of the forest, any emergent defects are eliminated via ensemble redecomposition and sequence reoptimization. After successfully merging subsequences to the root level, the exact test tube ensemble defect is calculated for the first time, explicitly checking for the effect of the previously neglected off-target complexes. Any off-target complexes that form at appreciable concentration are hierarchically decomposed, added to the decomposition forest, and actively destabilized during subsequent forest reoptimization. For target test tubes representative of design challenges in the molecular programming and synthetic biology communities, our test tube design algorithm typically succeeds in achieving a normalized test tube ensemble defect ≤1% at a design cost within an order of magnitude of the cost of test tube analysis.
Affiliation(s)
- Brian R. Wolfe
- Division of Biology and Biological Engineering and Division of Engineering and Applied Science, California Institute of Technology, Pasadena, California 91125, United States
| | - Niles A. Pierce
- Division of Biology and Biological Engineering and Division of Engineering and Applied Science, California Institute of Technology, Pasadena, California 91125, United States
| |
|
34
|
Jabbari H, Aminpour M, Montemagno C. Computational Approaches to Nucleic Acid Origami. ACS COMBINATORIAL SCIENCE 2015; 17:535-47. [PMID: 26348196 DOI: 10.1021/acscombsci.5b00079] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 12/28/2022]
Abstract
Recent advances in experimental DNA origami have dramatically expanded the horizon of DNA nanotechnology. Complex 3D suprastructures have been designed and developed using DNA origami with applications in biomaterial science, nanomedicine, nanorobotics, and molecular computation. Ribonucleic acid (RNA) origami has recently been realized as a new approach. Similar to DNA, RNA molecules can be designed to form complex 3D structures through complementary base pairings. RNA origami structures are, however, more compact and more thermodynamically stable due to RNA's non-canonical base pairing and tertiary interactions. Despite these advantages, the development of RNA origami lags far behind that of DNA origami. Furthermore, although computational methods have proven to be effective in designing DNA and RNA origami structures and in their evaluation, advances in computational nucleic acid origami are even more limited. In this paper, we review major milestones in experimental and computational DNA and RNA origami and present current challenges in these fields. We believe collaboration between experimental nanotechnologists and computer scientists is critical for advancing these new research paradigms.
Affiliation(s)
- Hosna Jabbari
- Ingenuity Lab, 11421 Saskatchewan Drive, Edmonton, Alberta T6G 2M9, Canada
- Department of Chemical and Materials Engineering, University of Alberta, Edmonton T6G 2V4, Canada
| | - Maral Aminpour
- Ingenuity Lab, 11421 Saskatchewan Drive, Edmonton, Alberta T6G 2M9, Canada
- Department of Chemical and Materials Engineering, University of Alberta, Edmonton T6G 2V4, Canada
| | - Carlo Montemagno
- Ingenuity Lab, 11421 Saskatchewan Drive, Edmonton, Alberta T6G 2M9, Canada
- Department of Chemical and Materials Engineering, University of Alberta, Edmonton T6G 2V4, Canada
| |
|
35
|
Taneda A. Multi-objective optimization for RNA design with multiple target secondary structures. BMC Bioinformatics 2015; 16:280. [PMID: 26335276 PMCID: PMC4559319 DOI: 10.1186/s12859-015-0706-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 11/23/2014] [Accepted: 08/17/2015] [Indexed: 12/24/2022] Open
Abstract
Background RNAs are attractive molecules as biological parts for synthetic biology. In particular, the capacity for conformational change, which can be encoded in designer RNAs, enables us to create multistable molecular switches that function in biological circuits. Although various algorithms for designing such RNA switches have been proposed, the previous algorithms optimize the RNA sequences against a weighted sum of objective functions, where empirical weights among the objective functions are used. In addition, an RNA design algorithm for multiple pseudoknot targets is currently not available. Results We developed a novel computational tool for automatically designing RNA sequences which fold into multiple target secondary structures. Our algorithm designs RNA sequences based on a multi-objective genetic algorithm, by which we can explore the RNA sequences having good objective function values without empirical weight parameters among the objective functions. Our algorithm has great flexibility by virtue of this weight-free nature. We benchmarked our multi-target RNA design algorithm with the datasets of two, three, and four target structures and found that our algorithm shows better or comparable design performance compared with the previous algorithms, RNAdesign and Frnakenstein. In addition to the benchmarks with pseudoknot-free datasets, we benchmarked MODENA with two-target pseudoknot datasets and found that MODENA can design RNAs which have the target pseudoknotted secondary structures and whose free energies are close to the lowest free energy. Moreover, we applied our algorithm to a ribozyme-based ON-switch which takes a ribozyme-inactive secondary structure when the theophylline aptamer structure is assumed. Conclusions Currently, MODENA is the only RNA design software which can be applied to multiple pseudoknot targets. Successful design results for the multiple targets and an RNA device indicate the usefulness of our multi-objective RNA design algorithm. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0706-x) contains supplementary material, which is available to authorized users.
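The abstract above contrasts weighted-sum optimization with a weight-free multi-objective search. The usual building block of weight-free approaches is Pareto dominance: a candidate is kept if no other candidate is at least as good in every objective and strictly better in one. The sketch below illustrates that concept only and is not MODENA's code; the objective tuples are hypothetical and all objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical scores: (distance to target structure 1, distance to target structure 2).
    candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
    print(pareto_front(candidates))
```

No weights appear anywhere: the trade-off between objectives is left to whoever picks a sequence from the returned front.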
Affiliation(s)
- Akito Taneda
- Graduate School of Science and Technology, Hirosaki University, 3 Bunkyo-cho, Hirosaki, Aomori, Japan.
| |
|
36
|
Kleinkauf R, Mann M, Backofen R. antaRNA: ant colony-based RNA sequence design. Bioinformatics 2015; 31:3114-21. [PMID: 26023105 PMCID: PMC4576691 DOI: 10.1093/bioinformatics/btv319] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 01/29/2015] [Accepted: 05/18/2015] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION RNA sequence design has been studied for at least as long as the classical folding problem. Although for the latter the functional fold of an RNA molecule is to be found, inverse folding tries to identify RNA sequences that fold into a function-specific target structure. In combination with RNA-based biotechnology and synthetic biology, reliable RNA sequence design becomes a crucial step to generate novel biochemical components. RESULTS In this article, the computational tool antaRNA is presented. It is capable of compiling RNA sequences for a given structure that comply in addition with an adjustable full range objective GC-content distribution, specific sequence constraints and additional fuzzy structure constraints. antaRNA applies ant colony optimization meta-heuristics and its superior performance is shown on biological datasets. AVAILABILITY AND IMPLEMENTATION http://www.bioinf.uni-freiburg.de/Software/antaRNA CONTACT: backofen@informatik.uni-freiburg.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Robert Kleinkauf
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
| | - Martin Mann
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Center for Biological Signaling Studies (BIOSS), University of Freiburg, Germany, Center for Biological Systems Analysis (ZBSA), University of Freiburg, Germany and Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
| |
|
37
|
Garcia-Martin JA, Dotu I, Clote P. RNAiFold 2.0: a web server and software to design custom and Rfam-based RNA molecules. Nucleic Acids Res 2015; 43:W513-21. [PMID: 26019176 PMCID: PMC4489274 DOI: 10.1093/nar/gkv460] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/08/2015] [Accepted: 04/27/2015] [Indexed: 12/25/2022] Open
Abstract
Several algorithms for RNA inverse folding have been used to design synthetic riboswitches, ribozymes and thermoswitches, whose activity has been experimentally validated. The RNAiFold software is unique among approaches for inverse folding in that (exhaustive) constraint programming is used instead of heuristic methods. For that reason, RNAiFold can generate all sequences that fold into the target structure or determine that there is no solution. RNAiFold 2.0 is a complete overhaul of RNAiFold 1.0, rewritten from the now defunct COMET language to C++. The new code properly extends the capabilities of its predecessor by providing a user-friendly pipeline to design synthetic constructs having the functionality of given Rfam families. In addition, the new software supports amino acid constraints, even for proteins translated in different reading frames from overlapping coding sequences; moreover, structure compatibility/incompatibility constraints have been expanded. With these features, RNAiFold 2.0 allows the user to design single RNA molecules as well as hybridization complexes of two RNA molecules. Availability: the web server, source code and linux binaries are publicly accessible at http://bioinformatics.bc.edu/clotelab/RNAiFold2.0.
Affiliation(s)
| | - Ivan Dotu
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute); C/Dr. Aiguader 88, Barcelona E-08003, Spain
| | - Peter Clote
- Biology Department, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
| |
|
38
|
Esmaili-Taheri A, Ganjtabesh M. ERD: a fast and reliable tool for RNA design including constraints. BMC Bioinformatics 2015; 16:20. [PMID: 25626878 PMCID: PMC4384295 DOI: 10.1186/s12859-014-0444-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/28/2014] [Accepted: 11/19/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The function of an RNA in cellular processes is directly related to its structure. The free energy of an RNA structure is another important key to its function, as only some structures with a specific level of free energy can take part in cellular reactions. Therefore, to perform a specific function, a particular RNA structure with a specific level of free energy is required. For a given RNA structure, the goal of the RNA design problem is to design an RNA sequence that folds into the given structure. To mimic the biological features of RNA sequences and structures, some sequence and energy constraints should be considered in designing RNA. Although the level of free energy is important, it is not considered in the available approaches to the RNA design problem. RESULTS In this paper, we present a new version of our evolutionary algorithm for the RNA design problem, entitled ERD, and extend it to handle some sequence and energy constraints. In the sequence constraints, one can restrict sequence positions to a fixed nucleotide or to a subset of nucleotides. As for the energy constraint, one can specify an interval for the free energy of the designed sequences. We compare our algorithm with the INFO-RNA, MODENA, NUPACK, and RNAiFold approaches for some artificial and natural RNA secondary structures and constraints. CONCLUSIONS The results indicate that our algorithm outperforms the other mentioned approaches in terms of accuracy, speed, diversity, nucleotide distribution, and similarity to natural RNA sequences. In particular, the RNA sequences designed by our method are much more reliable and more similar to their natural counterparts. The generated sequences are more diverse and their nucleotide distribution is closer to the natural one. The ERD tool and web server are freely available at http://mostafa.ut.ac.ir/corna/erd-cons/.
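The two constraint types described in the abstract above, restricting a position to a fixed nucleotide or a subset of nucleotides and requiring the free energy to lie in an interval, can be checked with a few lines of code. The sketch below is not ERD's implementation: it assumes, for illustration, that the sequence constraint is written with IUPAC nucleotide codes, and the free energy is a number supplied by an external folding tool. All function names are hypothetical.

```python
# IUPAC nucleotide codes mapped to the RNA bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def satisfies_sequence_constraint(seq: str, constraint: str) -> bool:
    """True if every base of seq is allowed by the IUPAC symbol at the same position."""
    if len(seq) != len(constraint):
        return False
    for base, symbol in zip(seq.upper(), constraint.upper()):
        if symbol not in IUPAC:
            raise ValueError(f"unknown constraint symbol: {symbol}")
        if base not in IUPAC[symbol]:
            return False
    return True


def energy_in_interval(energy: float, low: float, high: float) -> bool:
    """True if a free energy (e.g. in kcal/mol, computed elsewhere) lies in [low, high]."""
    return low <= energy <= high


if __name__ == "__main__":
    print(satisfies_sequence_constraint("GCAU", "NNRU"))
    print(energy_in_interval(-12.3, low=-15.0, high=-10.0))
```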
Affiliation(s)
- Ali Esmaili-Taheri
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran.
| | - Mohammad Ganjtabesh
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran; Laboratoire d'Informatique (LIX), Ecole Polytechnique, Palaiseau CEDEX, 91128, France.
| |
|
39
|
Zalatan JG, Lee ME, Almeida R, Gilbert LA, Whitehead EH, La Russa M, Tsai JC, Weissman JS, Dueber JE, Qi LS, Lim WA. Engineering complex synthetic transcriptional programs with CRISPR RNA scaffolds. Cell 2015; 160:339-50. [PMID: 25533786 PMCID: PMC4297522 DOI: 10.1016/j.cell.2014.11.052] [Citation(s) in RCA: 678] [Impact Index Per Article: 67.8] [Received: 07/16/2014] [Revised: 10/27/2014] [Accepted: 11/19/2014] [Indexed: 12/28/2022]
Abstract
Eukaryotic cells execute complex transcriptional programs in which specific loci throughout the genome are regulated in distinct ways by targeted regulatory assemblies. We have applied this principle to generate synthetic CRISPR-based transcriptional programs in yeast and human cells. By extending guide RNAs to include effector protein recruitment sites, we construct modular scaffold RNAs that encode both target locus and regulatory action. Sets of scaffold RNAs can be used to generate synthetic multigene transcriptional programs in which some genes are activated and others are repressed. We apply this approach to flexibly redirect flux through a complex branched metabolic pathway in yeast. Moreover, these programs can be executed by inducing expression of the dCas9 protein, which acts as a single master regulatory control point. CRISPR-associated RNA scaffolds provide a powerful way to construct synthetic gene expression programs for a wide range of applications, including rewiring cell fates or engineering metabolic pathways.
Affiliation(s)
- Jesse G Zalatan
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Michael E Lee
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94720, USA; Energy Biosciences Institute, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Ricardo Almeida
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Luke A Gilbert
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA 94158, USA; Center for RNA Systems Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Evan H Whitehead
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; UCSF Center for Systems and Synthetic Biology, University of California San Francisco, San Francisco, CA 94158, USA
| | - Marie La Russa
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; UCSF Center for Systems and Synthetic Biology, University of California San Francisco, San Francisco, CA 94158, USA; Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA 94158, USA
| | - Jordan C Tsai
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Jonathan S Weissman
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA 94158, USA; Center for RNA Systems Biology, University of California, Berkeley, Berkeley, CA 94720, USA; California Institute for Quantitative Biomedical Research, San Francisco, CA 94158, USA
| | - John E Dueber
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94720, USA; Energy Biosciences Institute, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Lei S Qi
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; UCSF Center for Systems and Synthetic Biology, University of California San Francisco, San Francisco, CA 94158, USA; California Institute for Quantitative Biomedical Research, San Francisco, CA 94158, USA.
| | - Wendell A Lim
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA 94158, USA; UCSF Center for Systems and Synthetic Biology, University of California San Francisco, San Francisco, CA 94158, USA; California Institute for Quantitative Biomedical Research, San Francisco, CA 94158, USA.
| |
|
40
|
Haque F, Guo P. Overview of methods in RNA nanotechnology: synthesis, purification, and characterization of RNA nanoparticles. Methods Mol Biol 2015; 1297:1-19. [PMID: 25895992 DOI: 10.1007/978-1-4939-2562-9_1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 06/04/2023]
Abstract
RNA nanotechnology encompasses the use of RNA as a construction material to build homogeneous nanostructures by bottom-up self-assembly with defined size, structure, and stoichiometry; this pioneering concept demonstrated in 1998 (Guo et al., Molecular Cell 2:149-155, 1998; featured in Cell) has emerged as a new field that also involves materials engineering and synthetic structural biology (Guo, Nature Nanotechnology 5:833-842, 2010). The field of RNA nanotechnology has skyrocketed over the last few years, as evidenced by the burst of publications in prominent journals on RNA nanostructures and their applications in nanomedicine and nanotechnology. Rapid advances in RNA chemistry, RNA biophysics, and RNA biology have created new opportunities for translating basic science into clinical practice. RNA nanotechnology holds considerable promise in this regard. Increasing evidence also suggests that a substantial part of the 98.5% of the human genome (Lander et al. Nature 409:860-921, 2001) that used to be called "junk DNA" actually codes for noncoding RNA. As we understand more about how RNA structures are related to function, we can fabricate synthetic RNA nanoparticles for the diagnosis and treatment of diseases. This chapter provides a brief overview of the field regarding the design, construction, purification, and characterization of RNA nanoparticles for diverse applications in nanotechnology and nanomedicine.
Affiliation(s)
- Farzin Haque
- Nanobiotechnology Center, Markey Cancer Center, Department of Pharmaceutical Sciences, University of Kentucky, 789 S Limestone Ave, 576 Biopharm Complex, Lexington, KY, 40536, USA
|
41
|
Abstract
In this chapter, we review both computational and experimental aspects of de novo RNA sequence design. We give an overview of currently available design software and their limitations, and discuss the necessary setup to experimentally validate proper function in vitro and in vivo. We focus on transcription-regulating riboswitches, a task that has just recently led to the first successful designs of such RNA elements.
Affiliation(s)
- Sven Findeiß
- Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria; Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
| | - Manja Wachsmuth
- Institute for Biochemistry, University of Leipzig, Leipzig, Germany
| | - Mario Mörl
- Institute for Biochemistry, University of Leipzig, Leipzig, Germany.
| | - Peter F Stadler
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria; Bioinformatics Group, Department of Computer Science and the Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany; Center for RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, Germany; Santa Fe Institute, Santa Fe, New Mexico, USA
|
42
|
Kelwick R, MacDonald JT, Webb AJ, Freemont P. Developments in the tools and methodologies of synthetic biology. Front Bioeng Biotechnol 2014; 2:60. [PMID: 25505788 PMCID: PMC4244866 DOI: 10.3389/fbioe.2014.00060] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 09/09/2014] [Accepted: 11/12/2014] [Indexed: 11/27/2022] Open
Abstract
Synthetic biology is principally concerned with the rational design and engineering of biologically based parts, devices, or systems. However, biological systems are generally complex and unpredictable, and are therefore intrinsically difficult to engineer. In order to address these fundamental challenges, synthetic biology is aiming to unify a “body of knowledge” from several foundational scientific fields, within the context of a set of engineering principles. This shift in perspective is enabling synthetic biologists to address complexity, such that robust biological systems can be designed, assembled, and tested as part of a biological design cycle. The design cycle takes a forward-design approach in which a biological system is specified, modeled, analyzed, and assembled, and its functionality is tested. At each stage of the design cycle, an expanding repertoire of tools is being developed. In this review, we highlight several of these tools in terms of their applications and benefits to the synthetic biology community.
Affiliation(s)
- Richard Kelwick
- Centre for Synthetic Biology and Innovation, Imperial College London, London, UK; Department of Medicine, Imperial College London, London, UK
| | - James T MacDonald
- Centre for Synthetic Biology and Innovation, Imperial College London, London, UK; Department of Medicine, Imperial College London, London, UK
| | - Alexander J Webb
- Centre for Synthetic Biology and Innovation, Imperial College London, London, UK; Department of Medicine, Imperial College London, London, UK
| | - Paul Freemont
- Centre for Synthetic Biology and Innovation, Imperial College London, London, UK; Department of Medicine, Imperial College London, London, UK
|
43
|
Dotu I, Garcia-Martin JA, Slinger BL, Mechery V, Meyer MM, Clote P. Complete RNA inverse folding: computational design of functional hammerhead ribozymes. Nucleic Acids Res 2014; 42:11752-62. [PMID: 25209235 PMCID: PMC4191386 DOI: 10.1093/nar/gku740] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022] Open
Abstract
Nanotechnology and synthetic biology currently constitute one of the most innovative, interdisciplinary fields of research, poised to radically transform society in the 21st century. This paper concerns the synthetic design of ribonucleic acid molecules, using our recent algorithm, RNAiFold, which can determine all RNA sequences whose minimum free energy secondary structure is a user-specified target structure. Using RNAiFold, we design ten cis-cleaving hammerhead ribozymes, all of which are shown to be functional by a cleavage assay. We additionally use RNAiFold to design a functional cis-cleaving hammerhead as a modular unit of a synthetic larger RNA. Analysis of kinetics on this small set of hammerheads suggests that cleavage rate of computationally designed ribozymes may be correlated with positional entropy, ensemble defect, structural flexibility/rigidity and related measures. Artificial ribozymes have been designed in the past either manually or by SELEX (Systematic Evolution of Ligands by Exponential Enrichment); however, this appears to be the first purely computational design and experimental validation of novel functional ribozymes. RNAiFold is available at http://bioinformatics.bc.edu/clotelab/RNAiFold/.
Affiliation(s)
- Ivan Dotu
- Biology Department, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
| | | | - Betty L Slinger
- Biology Department, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
| | - Vinodh Mechery
- Hofstra North Shore-LIJ School of Medicine, Hempstead, NY 11549, USA
| | - Michelle M Meyer
- Biology Department, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
| | - Peter Clote
- Biology Department, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
|
44
|
Afonin K, Kasprzak WK, Bindewald E, Kireeva M, Viard M, Kashlev M, Shapiro BA. In silico design and enzymatic synthesis of functional RNA nanoparticles. Acc Chem Res 2014; 47:1731-41. [PMID: 24758371 PMCID: PMC4066900 DOI: 10.1021/ar400329z] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 12/31/2013] [Indexed: 12/25/2022]
Abstract
CONSPECTUS: The use of RNAs as scaffolds for biomedical applications has several advantages compared with other existing nanomaterials. These include (i) programmability, (ii) precise control over folding and self-assembly, (iii) natural functionalities as exemplified by ribozymes, riboswitches, RNAi, editing, splicing, and inherent translation and transcription control mechanisms, (iv) biocompatibility, (v) relatively low immune response, and (vi) relatively low cost and ease of production. We have tapped into several of these properties and functionalities to construct RNA-based functional nanoparticles (RNA NPs). In several cases, the structural core and the functional components of the NPs are inherent in the same construct. This permits control over the spatial disposition of the components, intracellular availability, and precise stoichiometry. To enable the generation of RNA NPs, a pipeline is being developed. On one end, it encompasses the rational design and various computational schemes that promote design of the RNA-based nanoconstructs, ultimately producing a set of sequences consisting of RNA or RNA-DNA hybrids, which can assemble into the designed construct. On the other end of the pipeline is an experimental component, which takes the produced sequences and uses them to initialize and characterize their proper assembly and then test the resulting RNA NPs for their function and delivery in cell culture and animal models. An important aspect of this pipeline is the feedback that constantly occurs between the computational and the experimental parts, which synergizes the refinement of both the algorithmic methodologies and the experimental protocols. The utility of this approach is depicted by the several examples described in this Account (nanocubes, nanorings, and RNA-DNA hybrids). 
Of particular interest, from the computational viewpoint, is that in most cases, first a three-dimensional representation of the assembly is produced, and only then are algorithms applied to generate the sequences that will assemble into the designated three-dimensional construct. This is opposite to the usual practice of predicting RNA structures from a given sequence, that is, the RNA folding problem. To be considered is the generation of sequences that upon assembly have the proper intra- or interstrand interactions (or both). Of particular interest from the experimental point of view is the determination and characterization of the proper thermodynamic, kinetic, functionality, and delivery protocols. Assembly of RNA NPs from individual single-stranded RNAs can be accomplished by one-pot techniques under the proper thermal and buffer conditions or, potentially more interestingly, by the use of various RNA polymerases that can promote the formation of RNA NPs cotranscriptionally from specifically designed DNA templates. Also of importance is the delivery of the RNA NPs to the cells of interest in vitro or in vivo. Nonmodified RNAs rapidly degrade in blood serum and have difficulties crossing biological membranes due to their negative charge. These problems can be overcome by using, for example, polycationic lipid-based carriers. Our work involves the use of bolaamphiphiles, which are amphipathic compounds with positively charged hydrophilic head groups at each end connected by a hydrophobic chain. We have correlated results from molecular dynamics computations with various experiments to understand the characteristics of such delivery agents.
Affiliation(s)
- Kirill A. Afonin
- Basic Research Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, United States
| | - Wojciech K. Kasprzak
- Basic Science Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
| | - Eckart Bindewald
- Basic Science Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
| | - Maria Kireeva
- Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, United States
| | - Mathias Viard
- Basic Research Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, United States
- Basic Science Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
| | - Mikhail Kashlev
- Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, United States
| | - Bruce A. Shapiro
- Basic Research Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, United States
|
45
|
Shu Y, Pi F, Sharma A, Rajabi M, Haque F, Shu D, Leggas M, Evers BM, Guo P. Stable RNA nanoparticles as potential new generation drugs for cancer therapy. Adv Drug Deliv Rev 2014; 66:74-89. [PMID: 24270010 DOI: 10.1016/j.addr.2013.11.006] [Citation(s) in RCA: 181] [Impact Index Per Article: 16.5] [Received: 05/16/2013] [Revised: 10/11/2013] [Accepted: 11/13/2013] [Indexed: 12/13/2022]
Abstract
Human genome sequencing revealed that only ~1.5% of the DNA sequence coded for proteins. More and more evidence has uncovered that a substantial part of the 98.5% so-called "junk" DNAs actually code for noncoding RNAs. Two milestones, chemical drugs and protein drugs, have already appeared in the history of drug development, and it is expected that the third milestone in drug development will be RNA drugs or drugs that target RNA. This review focuses on the development of RNA therapeutics for potential cancer treatment by applying RNA nanotechnology. A therapeutic RNA nanoparticle is unique in that its scaffold, ligand, and therapeutic component can all be composed of RNA. The special physicochemical properties lend themselves to the delivery of siRNA, miRNA, ribozymes, or riboswitches; imaging using fluorogenic RNA; and targeting using RNA aptamers. With recent advances in solving the chemical, enzymatic, and thermodynamic stability issues, RNA nanoparticles have been found to be advantageous for in vivo applications due to their uniform nano-scale size, precise stoichiometry, polyvalent nature, low immunogenicity, low toxicity, and target specificity. In vivo animal studies have revealed that RNA nanoparticles can specifically target tumors with favorable pharmacokinetic and pharmacodynamic parameters without unwanted accumulation in normal organs. This review summarizes the key studies that have led to the detailed understanding of RNA nanoparticle formation as well as chemical and thermodynamic stability issues. The methods for RNA nanoparticle construction, and the current challenges in the clinical application of RNA nanotechnology, such as endosome trapping and production costs, are also discussed.
Affiliation(s)
- Yi Shu
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - Fengmei Pi
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - Ashwani Sharma
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - Mehdi Rajabi
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - Farzin Haque
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - Dan Shu
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - Markos Leggas
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA
| | - B Mark Evers
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA
| | - Peixuan Guo
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA; Department of Pharmaceutical Sciences, University of Kentucky, Lexington, KY 40536, USA.
|
46
|
Esmaili-Taheri A, Ganjtabesh M, Mohammad-Noori M. Evolutionary solution for the RNA design problem. Bioinformatics 2014; 30:1250-8. [PMID: 24407223 DOI: 10.1093/bioinformatics/btu001] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION RNAs play fundamental roles in cellular processes. The function of an RNA is highly dependent on its 3D conformation, which is referred to as the RNA tertiary structure. Because the prediction or experimental determination of these structures is difficult, many works focus on the problems associated with the RNA secondary structure. Here, we consider the RNA inverse folding problem, in which an RNA secondary structure is given as a target structure and the goal is to design an RNA sequence that folds into the target structure. In this article, we introduce a new evolutionary algorithm for the RNA inverse folding problem. Our algorithm, entitled Evolutionary RNA Design, generates a sequence whose minimum free energy structure is the same as the target structure. RESULTS We compare our algorithm with the INFO-RNA, MODENA, RNAiFold and NUPACK approaches for some biological test sets. The results presented in this article indicate that for longer structures, our algorithm performs better than the other mentioned algorithms in terms of the energy range, accuracy, speedup and nucleotide distribution. In particular, the RNA sequences generated by our method are much more reliable and similar to natural RNA sequences.
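The inverse folding goal stated above (a sequence whose predicted structure equals the target) can be illustrated with a small sketch. The code below is not the Evolutionary RNA Design algorithm and does not fold anything; it only assumes that structures are given in dot-bracket notation without pseudoknots and compares a predicted structure against a target using the base-pair distance, a common measure that is zero exactly when the two structures share the same pairs. The function names are hypothetical.

```python
def pair_set(structure: str) -> set[tuple[int, int]]:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack: list[int] = []
    pairs: set[tuple[int, int]] = set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {j}")
            pairs.add((stack.pop(), j))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {j}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def base_pair_distance(predicted: str, target: str) -> int:
    """Count the base pairs present in exactly one of the two structures."""
    if len(predicted) != len(target):
        raise ValueError("structures must have the same length")
    return len(pair_set(predicted) ^ pair_set(target))


def solves_target(predicted: str, target: str) -> bool:
    """A design counts as solved when predicted and target pairs coincide."""
    return base_pair_distance(predicted, target) == 0
```

In practice the predicted structure would come from a folding program; that step is left out here because it depends on the tool used.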
Affiliation(s)
- Ali Esmaili-Taheri
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, P. O. Box: 14155-6455, Tehran, Iran, Laboratoire d'Informatique (LIX), Ecole Polytechnique, 91128 Palaiseau CEDEX, France and School of Biological Science, Institute for Research in Fundamental Sciences (IPM), P.O. Box: 19395-5746 Tehran, Iran
|
47
|
Combinatorial Insights into RNA Secondary Structure. Discrete and Topological Models in Molecular Biology 2014. [DOI: 10.1007/978-3-642-40193-0_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
|
48
|
Dotu I, Lozano G, Clote P, Martinez-Salas E. Using RNA inverse folding to identify IRES-like structural subdomains. RNA Biol 2013; 10:1842-52. [PMID: 24253111 DOI: 10.4161/rna.26994] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022] Open
Abstract
Internal ribosome entry site (IRES) elements govern protein synthesis of mRNAs that bypass cap-dependent translation inhibition under stress conditions. Picornavirus IRES are cis-acting elements, organized in modular domains that recruit the ribosome to internal mRNA sites. The aim of this study was to retrieve short RNA sequences with the capacity to adopt RNA folding patterns conserved with IRES structural subdomains, likely corresponding to RNA modules. We have applied a new program, RNAiFold, an inverse folding algorithm that determines all sequences whose minimum free energy structure is identical to that of the structural domains of interest. Sequences differing by more than 1 nt were clustered. Then, BLASTing one randomly chosen sequence from each cluster of the RNAiFold output, we retrieved viral and cellular sequences among output hits. As a proof of principle, we present the data corresponding to a coding region of Drosophila melanogaster TAF6, a transcription factor-associated protein that contains a structural motif within its coding region potentially folding into an IRES-like subdomain. This RNA region shows a biased codon usage, as predicted from structural constraints at the RNA level, it harbors conserved IRES structural motifs in loops, and interestingly, it has the capacity to confer internal initiation of translation in tissue culture cells.
Affiliation(s)
- Ivan Dotu
- Biology Department; Boston College; Chestnut Hill, MA USA
| | - Gloria Lozano
- Centro de Biologia Molecular Severo Ochoa; Consejo Superior de Investigaciones Cientificas-Universidad Autonoma de Madrid; Madrid, Spain
| | - Peter Clote
- Biology Department; Boston College; Chestnut Hill, MA USA
| | - Encarnacion Martinez-Salas
- Centro de Biologia Molecular Severo Ochoa; Consejo Superior de Investigaciones Cientificas-Universidad Autonoma de Madrid; Madrid, Spain
|
49
|
Shu Y, Shu D, Haque F, Guo P. Fabrication of pRNA nanoparticles to deliver therapeutic RNAs and bioactive compounds into tumor cells. Nat Protoc 2013; 8:1635-59. [PMID: 23928498 DOI: 10.1038/nprot.2013.097] [Citation(s) in RCA: 88] [Impact Index Per Article: 7.3] [Indexed: 02/06/2023]
Abstract
RNA nanotechnology is a term that refers to the design, fabrication and use of nanoparticles that are mainly composed of RNAs via bottom-up self-assembly. The packaging RNA (pRNA) of the bacteriophage phi29 DNA packaging motor has been developed into a nanodelivery platform. This protocol describes the synthesis, assembly and functionalization of pRNA nanoparticles on the basis of three 'toolkits' derived from pRNA structural features: interlocking loops for hand-in-hand interactions, palindrome sequences for foot-to-foot interactions and an RNA three-way junction for branch extension. siRNAs, ribozymes, aptamers, chemical ligands, fluorophores and other functionalities can also be fused to the pRNA before the assembly of the nanoparticles, so as to ensure the production of homogeneous nanoparticles and the retention of appropriate folding and function of the incorporated modules. The resulting self-assembled multivalent pRNA nanoparticles are thermodynamically and chemically stable, and they remain intact at ultralow concentrations. Gene-silencing effects are progressively enhanced with increasing numbers of siRNAs in each pRNA nanoparticle. Systemic injection of the pRNA nanoparticles into xenograft-bearing mice has revealed strong binding to tumors without accumulation in vital organs or tissues. The pRNA-based nanodelivery scaffold paves a new way for nanotechnological application of pRNA-based nanoparticles for disease detection and treatment. The time required for completing one round of this protocol is 3-4 weeks when including in vitro functional assays, or 2-3 months when including in vivo studies.
Affiliation(s)
- Yi Shu
- Nanobiotechnology Center, Markey Cancer Center, Lexington, Kentucky, USA
|
50
|
Garcia-Martin JA, Clote P, Dotu I. RNAiFold: a web server for RNA inverse folding and molecular design. Nucleic Acids Res 2013; 41:W465-70. [PMID: 23700314 PMCID: PMC3692061 DOI: 10.1093/nar/gkt280] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/26/2022] Open
Abstract
Synthetic biology and nanotechnology are poised to make revolutionary contributions to the 21st century. In this article, we describe a new web server to support in silico RNA molecular design. Given an input target RNA secondary structure, together with optional constraints, such as requiring GC-content to lie within a certain range, or requiring the number of strong (GC), weak (AU) and wobble (GU) base pairs to lie in a certain range, the RNAiFold web server determines one or more RNA sequences whose minimum free-energy secondary structure is the target structure. RNAiFold provides access to two servers: RNA-CPdesign, which applies constraint programming, and RNA-LNSdesign, which applies the large neighborhood search heuristic and is hence suitable for larger input structures. Both servers can also solve the RNA inverse hybridization problem, i.e. given a representation of the desired hybridization structure, RNAiFold returns two sequences whose minimum free-energy hybridization is the input target structure. The web server is publicly accessible at http://bioinformatics.bc.edu/clotelab/RNAiFold, which provides access to two specialized servers: RNA-CPdesign and RNA-LNSdesign. Source code for the underlying algorithms, implemented in COMET and supported on Linux, can be downloaded at the server website.
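The kinds of constraints listed in this abstract (a GC-content range and ranges for the number of GC, AU and GU base pairs) can be checked for any candidate sequence against a target structure. The following is a minimal sketch under stated assumptions: it is not RNAiFold's implementation or interface, the function names and the `ranges` dictionary layout are hypothetical, and the target is assumed to be a pseudoknot-free dot-bracket string.

```python
from collections import Counter

# Canonical pair types, keyed by the unordered pair of nucleotides.
PAIR_KINDS = {
    frozenset("GC"): "GC",  # strong
    frozenset("AU"): "AU",  # weak
    frozenset("GU"): "GU",  # wobble
}


def design_stats(sequence: str, structure: str) -> dict[str, float]:
    """Compute GC-content and GC/AU/GU pair counts for a sequence
    threaded onto a dot-bracket structure."""
    seq = sequence.upper()
    if not seq or len(seq) != len(structure):
        raise ValueError("sequence and structure must be non-empty and equally long")
    stack: list[int] = []
    counts: Counter[str] = Counter()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {j}")
            i = stack.pop()
            kind = PAIR_KINDS.get(frozenset((seq[i], seq[j])))
            if kind is None:
                raise ValueError(f"non-canonical pair {seq[i]}-{seq[j]} at ({i}, {j})")
            counts[kind] += 1
    if stack:
        raise ValueError("unbalanced '(' in structure")
    gc_content = (seq.count("G") + seq.count("C")) / len(seq)
    return {"gc_content": gc_content, "GC": counts["GC"],
            "AU": counts["AU"], "GU": counts["GU"]}


def satisfies(stats: dict[str, float], ranges: dict[str, tuple[float, float]]) -> bool:
    """Return True if every constrained statistic lies in its inclusive range."""
    return all(lo <= stats[key] <= hi for key, (lo, hi) in ranges.items())
```

For example, `satisfies(design_stats("GGGAAAUCC", "(((...)))"), {"gc_content": (0.4, 0.6), "GU": (0, 1)})` evaluates the two constraints on a nine-nucleotide hairpin; a check like this only filters candidates and says nothing about whether the sequence actually folds into the target.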
|