1
Jin L, Zhou Y, Zhang S, Chen SJ. mRNA vaccine sequence and structure design and optimization: Advances and challenges. J Biol Chem 2024; 301:108015. [PMID: 39608721 DOI: 10.1016/j.jbc.2024.108015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 11/13/2024] [Accepted: 11/16/2024] [Indexed: 11/30/2024] Open
Abstract
Messenger RNA (mRNA) vaccines have emerged as a powerful tool against communicable diseases and cancers, as demonstrated by their huge success during the coronavirus disease 2019 (COVID-19) pandemic. Despite the outstanding achievements, mRNA vaccines still face challenges such as stringent storage requirements, insufficient antigen expression, and unexpected immune responses. Since the intrinsic properties of mRNA molecules significantly impact vaccine performance, optimizing mRNA design is crucial in preclinical development. In this review, we outline four key principles for optimal mRNA sequence design: enhancing ribosome loading and translation efficiency through untranslated region (UTR) optimization, improving translation efficiency via codon optimization, increasing structural stability by refining global RNA sequence and extending in-cell lifetime and expression fidelity by adjusting local RNA structures. We also explore recent advancements in computational models for designing and optimizing mRNA vaccine sequences following these principles. By integrating current mRNA knowledge, addressing challenges, and examining advanced computational methods, this review aims to promote the application of computational approaches in mRNA vaccine development and inspire novel solutions to existing obstacles.
Affiliation(s)
- Lei Jin
  - Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Sicheng Zhang
  - Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA; Department of Biochemistry, MU Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
2
Mao FY, Tu MJ, Traber GM, Yu AM. Comparison of Three Computational Tools for the Prediction of RNA Tertiary Structures. Noncoding RNA 2024; 10:55. [PMID: 39585047 PMCID: PMC11587127 DOI: 10.3390/ncrna10060055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2024] [Revised: 10/27/2024] [Accepted: 11/05/2024] [Indexed: 11/26/2024] Open
Abstract
Understanding the structures of noncoding RNAs (ncRNAs) is important for the development of RNA-based therapeutics. There are inherent challenges in employing current experimental techniques to determine the tertiary (3D) structures of RNAs with high complexity and flexibility in folding, which makes computational methods indispensable. In this study, we compared the utilities of three advanced computational tools, namely RNAComposer, Rosetta FARFAR2, and the latest AlphaFold 3, to predict the 3D structures of various forms of RNAs, including the small interfering RNA drug, nedosiran, and the novel bioengineered RNA (BioRNA) molecule showing therapeutic potential. Our results showed that, while RNAComposer offered a malachite green aptamer 3D structure closer to its crystal structure, the performances of RNAComposer and Rosetta FARFAR2 largely depend upon the secondary structures supplied as input, and Rosetta FARFAR2 predictions might not even recapitulate the typical, inverted "L" shape tRNA 3D structure. Overall, AlphaFold 3, integrating molecular dynamics principles into its deep learning framework, directly predicted RNA 3D structures from RNA primary sequence inputs, even accepting several common post-transcriptional modifications, which closely aligned with the experimentally determined structures. However, there were significant discrepancies among the three computational tools in predicting the distal loop of human pre-microRNA and larger BioRNA (tRNA fused pre-miRNA) molecules whose 3D structures have not been characterized experimentally. While computational predictions show considerable promise, their notable strengths and limitations underscore the need for experimental validation of predictions, as well as for characterization of more RNA 3D structures.
Affiliation(s)
- Ai-Ming Yu
  - Department of Biochemistry and Molecular Medicine, School of Medicine, University of California Davis, 2700 Stockton Blvd, Sacramento, CA 95817, USA
3
Li J, Walter NG, Chen SJ. smFRET-assisted RNA structure prediction. Communications in Information and Systems 2024; 24:163-179. [PMID: 39524454 PMCID: PMC11545564 DOI: 10.4310/cis.241021213225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
Single-molecule Förster Resonance Energy Transfer (smFRET) is a powerful biophysical technique that utilizes the distance-dependent energy transfer between donor and acceptor dyes linked to individual molecules, providing insights into molecular conformational changes and interactions at the single-molecule level. Prior investigations leveraged smFRET to study the conformational dynamics of single truncated Ubc4 pre-mRNA molecules during splicing, yet these efforts did not prioritize structural modeling. In this study, we develop an smFRET-assisted RNA prediction method to predict the 2D and 3D structures of this pre-mRNA. To achieve this, we initiate the process by generating RNA structural ensembles through coarse-grained molecular dynamics (MD) simulations. Subsequently, inter-dye distances are calculated for these RNA structural ensembles by performing all-atom MD simulations of the dye groups. The ultimate determination of the 2D and 3D structures for the pre-mRNA is achieved by comparing the calculated inter-dye distances with experimental counterparts. Notably, our computational results demonstrate a significant alignment with experimental findings, which involve a conformational change at the 2D level.
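As a rough illustration of the comparison step described in this abstract (not the authors' code), the sketch below uses the standard Förster relation, E = 1 / (1 + (r/R0)^6), to convert a measured FRET efficiency into an apparent inter-dye distance and then picks the candidate structure whose simulated inter-dye distances lie closest to it. The Förster radius of 54 Å, the candidate names, the distance values, and the mean-distance scoring rule are all assumptions made for this example.

```python
from statistics import mean


def fret_efficiency(distance, r0):
    """Forster relation: E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def distance_from_efficiency(efficiency, r0):
    """Invert the Forster relation to get an apparent inter-dye distance."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def best_matching_structure(candidates, measured_efficiency, r0):
    """Return the candidate whose mean simulated distance is closest to experiment.

    `candidates` maps a structure label to a list of inter-dye distances
    (same length unit as r0) sampled from a simulation of that structure.
    """
    target = distance_from_efficiency(measured_efficiency, r0)
    return min(candidates, key=lambda name: abs(mean(candidates[name]) - target))


# Hypothetical numbers, for illustration only.
R0 = 54.0  # assumed Forster radius in angstroms
candidates = {
    "fold_A": [38.0, 41.0, 40.0],
    "fold_B": [61.0, 59.0, 63.0],
}
best = best_matching_structure(candidates, measured_efficiency=0.8, r0=R0)
```

With these made-up inputs, a measured efficiency of 0.8 corresponds to an apparent distance of roughly 43 Å, so the more compact candidate is selected.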
Affiliation(s)
- Jun Li
  - Department of Physics, University of Missouri, Columbia, MO, USA
- Nils G Walter
  - Single Molecule Analysis Group and Center for RNA Biomedicine, Department of Chemistry, University of Michigan, Ann Arbor, MI, USA
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
4
Jiang H, Xu Y, Tong Y, Zhang D, Zhou R. IsRNAcirc: 3D structure prediction of circular RNAs based on coarse-grained molecular dynamics simulation. PLoS Comput Biol 2024; 20:e1012293. [PMID: 39466881 PMCID: PMC11542809 DOI: 10.1371/journal.pcbi.1012293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 11/07/2024] [Accepted: 10/12/2024] [Indexed: 10/30/2024] Open
Abstract
As an emerging class of RNA molecules, circular RNAs play pivotal roles in various biological processes, so determining their three-dimensional (3D) structures is crucial for a deep understanding of their biological significance. As with linear RNAs, developing computational methods for circular RNA 3D structure prediction is challenging, especially considering the inherent flexibility and potentially long length of circular RNAs. Here, we introduce an extension of our previous IsRNA2 model, named IsRNAcirc, to enable circular RNA 3D structure predictions through coarse-grained molecular dynamics simulations. The workflow of IsRNAcirc consists of four main steps: input preparation, end closure, structure prediction, and model refinement. Our results demonstrate that IsRNAcirc can provide reasonable 3D structure predictions for circular RNAs, which significantly reduce the locally irrational elements contained in the initial input. Moreover, for a validation test set comprising 34 circular RNAs, IsRNAcirc can generate 3D models with better scores than the template-based 3dRNA method. These findings demonstrate that IsRNAcirc is a promising tool for exploring the structural details and intricate interactions of circular RNAs.
Affiliation(s)
- Haolin Jiang
  - College of Life Sciences and Institute of Quantitative Biology, Zhejiang University, Hangzhou, Zhejiang, China
- Yulian Xu
  - College of Life Sciences, China Jiliang University, Hangzhou, China
  - China Jiliang University—Aoming (Hangzhou) Biomedical Co., Ltd. Joint Laboratory, Hangzhou, China
- Yunguang Tong
  - College of Life Sciences, China Jiliang University, Hangzhou, China
  - Aoming (Hangzhou) Biomedical Co., Ltd., Hangzhou, China
- Dong Zhang
  - College of Life Sciences and Institute of Quantitative Biology, Zhejiang University, Hangzhou, Zhejiang, China
- Ruhong Zhou
  - College of Life Sciences and Institute of Quantitative Biology, Zhejiang University, Hangzhou, Zhejiang, China
  - The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
5
Fallah A, Havaei SA, Sedighian H, Kachuei R, Fooladi AAI. Prediction of aptamer affinity using an artificial intelligence approach. J Mater Chem B 2024; 12:8825-8842. [PMID: 39158322 DOI: 10.1039/d4tb00909f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/20/2024]
Abstract
Aptamers are oligonucleotide sequences that can bind to particular target molecules, similar to monoclonal antibodies. They can be selected by systematic evolution of ligands by exponential enrichment (SELEX), and they can be modified and synthesized. Although the SELEX approach has been improved considerably, identifying aptamers experimentally is frequently challenging and time-consuming. Structure-based methods are the most widely used in computer-aided design and development of aptamers. To this end, numerous web-based platforms have been proposed for predicting the secondary structures and 3D configurations of RNAs and DNAs. Molecular docking and molecular dynamics (MD), which are commonly used in structure-based compound selection for proteins, are also suitable for aptamer selection. In addition, artificial intelligence (AI) may be able to quickly identify possible aptamer candidates from a large number of sequences. Sophisticated machine learning and deep learning (DL) models have demonstrated efficacy in predicting the binding properties between ligands and targets during drug discovery; as such, they may provide a reliable and precise method for predicting the binding of aptamers to targets. This work examines advancements in AI pipelines and strategies for predicting aptamer binding ability, such as machine and deep learning, as well as structure-based approaches, including molecular dynamics and molecular docking simulations.
Affiliation(s)
- Arezoo Fallah
  - Department of Bacteriology and Virology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Seyed Asghar Havaei
  - Department of Microbiology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Hamid Sedighian
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Reza Kachuei
  - Molecular Biology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Abbas Ali Imani Fooladi
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
6
Nithin C, Kmiecik S, Błaszczyk R, Nowicka J, Tuszyńska I. Comparative analysis of RNA 3D structure prediction methods: towards enhanced modeling of RNA-ligand interactions. Nucleic Acids Res 2024; 52:7465-7486. [PMID: 38917327 PMCID: PMC11260495 DOI: 10.1093/nar/gkae541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 05/23/2024] [Accepted: 06/16/2024] [Indexed: 06/27/2024] Open
Abstract
Accurate RNA structure models are crucial for designing small molecule ligands that modulate their functions. This study assesses six standalone RNA 3D structure prediction methods (DeepFoldRNA, RhoFold, BRiQ, FARFAR2, SimRNA and Vfold2), excluding web-based tools due to intellectual property concerns. We focus on reproducing the RNA structure existing in RNA-small molecule complexes, particularly on the ability to model ligand binding sites. Using a comprehensive set of RNA structures from the PDB, which includes diverse structural elements, we found that machine learning (ML)-based methods effectively predict global RNA folds but are less accurate with local interactions. Conversely, non-ML-based methods demonstrate higher precision in modeling intramolecular interactions, particularly with secondary structure restraints. Importantly, ligand-binding site accuracy can remain sufficiently high for practical use, even if the overall model quality is not optimal. With the recent release of AlphaFold 3, we included this advanced method in our tests. Benchmark subsets containing new structures, not used in the training of the tested ML methods, show that AlphaFold 3's performance was comparable to other ML-based methods, albeit with some challenges in accurately modeling ligand binding sites. This study underscores the importance of enhancing binding site prediction accuracy and the challenges in modeling RNA-ligand interactions accurately.
Affiliation(s)
- Chandran Nithin
  - Molecure SA, 02-089 Warsaw, Poland
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, 02-089 Warsaw, Poland
- Sebastian Kmiecik
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, 02-089 Warsaw, Poland
7
Moafinejad SN, de Aquino BRH, Boniecki M, Pandaranadar Jeyeram IN, Nikolaev G, Magnus M, Farsani M, Badepally N, Wirecki T, Stefaniak F, Bujnicki J. SimRNAweb v2.0: a web server for RNA folding simulations and 3D structure modeling, with optional restraints and enhanced analysis of folding trajectories. Nucleic Acids Res 2024; 52:W368-W373. [PMID: 38738621 PMCID: PMC11223799 DOI: 10.1093/nar/gkae356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/07/2024] [Accepted: 04/29/2024] [Indexed: 05/14/2024] Open
Abstract
Research on ribonucleic acid (RNA) structures and functions benefits from easy-to-use tools for computational prediction and analyses of RNA three-dimensional (3D) structure. The SimRNAweb server version 2.0 offers an enhanced, user-friendly platform for RNA 3D structure prediction and analysis of RNA folding trajectories based on the SimRNA method. SimRNA employs a coarse-grained model, Monte Carlo sampling and statistical potentials to explore RNA conformational space, optionally guided by spatial restraints. Recognized for its accuracy in RNA 3D structure prediction in RNA-Puzzles and CASP competitions, SimRNA is particularly useful for incorporating restraints based on experimental data. The new server version introduces performance optimizations and extends user control over simulations and the processing of results. It allows the application of various hard and soft restraints, accommodating alternative structures involving canonical and noncanonical base pairs and unpaired residues, while also integrating data from chemical probing methods. Enhanced features include an improved analysis of folding trajectories, offering advanced clustering options and multiple analyses of the generated trajectories. These updates provide comprehensive tools for detailed RNA structure analysis. SimRNAweb v2.0 significantly broadens the scope of RNA modeling, emphasizing flexibility and user-defined parameter control. The web server is available at https://genesilico.pl/SimRNAweb.
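To make the phrase "Monte Carlo sampling" in this abstract concrete, here is a minimal sketch of a generic Metropolis sampler. It is not SimRNA's implementation: the one-dimensional toy energy function, the step size, the temperature, and the function names are all assumptions chosen so the example runs on its own, whereas SimRNA works on a coarse-grained 3D representation with a statistical potential.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept downhill moves, and accept
    uphill moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def sample(energy, x0, steps, step_size, temperature, seed=0):
    """Random-walk Monte Carlo over one coordinate; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)  # propose a small random move
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, temperature, rng):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def toy_energy(x):
    """Hypothetical energy landscape with a single minimum at x = 2."""
    return (x - 2.0) ** 2


best_x, best_e = sample(toy_energy, x0=5.0, steps=5000, step_size=0.5, temperature=0.1)
```

In a restraint-guided setting, the experimental information would enter as extra penalty terms added to the energy function; that part is left out of this sketch.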
Affiliation(s)
- S Naeim Moafinejad
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Belisa R H de Aquino
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Michał J Boniecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Iswarya P N Pandaranadar Jeyeram
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Grigory Nikolaev
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Marcin Magnus
  - Department of Molecular and Cellular Biology, Harvard University, 52 Oxford St, Cambridge, MA 02138, USA
- Masoud Amiri Farsani
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Nagendar Goud Badepally
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Tomasz K Wirecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Filip Stefaniak
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
8
Bernard C, Postic G, Ghannay S, Tahi F. State-of-the-RNArt: benchmarking current methods for RNA 3D structure prediction. NAR Genom Bioinform 2024; 6:lqae048. [PMID: 38745991 PMCID: PMC11091930 DOI: 10.1093/nargab/lqae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/05/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024] Open
Abstract
RNAs are essential molecules involved in numerous biological functions. Understanding RNA functions requires knowledge of their 3D structures. Computational methods have been developed for over two decades to predict 3D conformations from RNA sequences. These computational methods have been widely used and are usually categorised as either ab initio or template-based. Their performance remains to be improved. Recently, the rise of deep learning has changed the outlook for novel approaches. Deep learning methods are promising, but their adaptation to RNA 3D structure prediction remains difficult. In this paper, we give a brief review of the ab initio, template-based and novel deep learning approaches. We highlight the different available tools and provide a benchmark on nine methods using the RNA-Puzzles dataset. We provide an online dashboard that shows the predictions made by benchmarked methods, freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/state_of_the_rnart/.
Affiliation(s)
- Clément Bernard
  - Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
  - LISN - CNRS/Université Paris-Saclay, 91400 Orsay, France
- Guillaume Postic
  - Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
- Sahar Ghannay
  - LISN - CNRS/Université Paris-Saclay, 91400 Orsay, France
- Fariza Tahi
  - Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
9
Niemyska W, Mukherjee S, Gren BA, Niewieczerzal S, Bujnicki JM, Sulkowska JI. Discovery of a trefoil knot in the RydC RNA: Challenging previous notions of RNA topology. J Mol Biol 2024; 436:168455. [PMID: 38272438 DOI: 10.1016/j.jmb.2024.168455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 01/27/2024]
Abstract
Knots are very common in polymers, including DNA and protein molecules. Yet, no genuine knot has been identified in natural RNA molecules to date. Upon re-examining experimentally determined RNA 3D structures, we discovered a trefoil knot 3₁, the most basic non-trivial knot, in the RydC RNA. This knotted RNA is a member of a small family of short bacterial RNAs, whose secondary structure is characterized by an H-type pseudoknot. Molecular dynamics simulations suggest a folding pathway of the RydC RNA that starts with a native twisted loop. Based on sequence analyses and computational RNA 3D structure predictions, we postulate that this trefoil knot is a conserved feature of all RydC-related RNAs. The first discovery of a knot in a natural RNA molecule introduces a novel perspective on RNA 3D structure formation and on fundamental research on the relationship between function and spatial structure of biopolymers.
Affiliation(s)
- Wanda Niemyska
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
- Sunandan Mukherjee
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Trojdena 4, 02-109 Warsaw, Poland
- Bartosz A Gren
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
- Szymon Niewieczerzal
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Trojdena 4, 02-109 Warsaw, Poland
- Joanna I Sulkowska
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
10
Li J, Zhang S, Chen SJ. Advancing RNA 3D structure prediction: Exploring hierarchical and hybrid approaches in CASP15. Proteins 2023; 91:1779-1789. [PMID: 37615235 PMCID: PMC10841231 DOI: 10.1002/prot.26583] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/15/2023] [Revised: 06/19/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023]
Abstract
In CASP15, we used an integrated hierarchical and hybrid approach to predict RNA structures. The approach involves three steps. First, with the use of physics-based methods, Vfold2D-MC and VfoldMCPX, we predict the 2D structures from the sequence. Second, we employ template-based methods, Vfold3D and VfoldLA, to build 3D scaffolds for the predicted 2D structures. Third, using the 3D scaffolds as initial structures and the predicted 2D structures as constraints, we predict the 3D structure from coarse-grained molecular dynamics simulations, IsRNA and RNAJP. Our approach was evaluated on 12 RNA targets in CASP15 and ranked second among all the 34 participating teams. The result demonstrated the reliability of our method in predicting RNA 2D structures with high accuracy and RNA 3D structures with moderate accuracy. Further improvements in RNA structure prediction for the next round of CASP may come from the incorporation of the physics-based method with machine learning techniques.
Affiliation(s)
- Jun Li
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Sicheng Zhang
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
11
Wang W, Feng C, Han R, Wang Z, Ye L, Du Z, Wei H, Zhang F, Peng Z, Yang J. trRosettaRNA: automated prediction of RNA 3D structure with transformer network. Nat Commun 2023; 14:7266. [PMID: 37945552 PMCID: PMC10636060 DOI: 10.1038/s41467-023-42528-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 06/08/2023] [Accepted: 10/13/2023] [Indexed: 11/12/2023] Open
Abstract
RNA 3D structure prediction is a long-standing challenge. Inspired by the recent breakthrough in protein structure prediction, we developed trRosettaRNA, an automated deep learning-based approach to RNA 3D structure prediction. The trRosettaRNA pipeline comprises two major steps: 1D and 2D geometries prediction by a transformer network; and 3D structure folding by energy minimization. Benchmark tests suggest that trRosettaRNA outperforms traditional automated methods. In the blind tests of the 15th Critical Assessment of Structure Prediction (CASP15) and the RNA-Puzzles experiments, the automated trRosettaRNA predictions for the natural RNAs are competitive with the top human predictions. trRosettaRNA also outperforms other deep learning-based methods in CASP15 when measured by the Z-score of the Root-Mean-Square Deviation. Nevertheless, it remains challenging to predict accurate structures for synthetic RNAs with an automated approach. We hope this work could be a good start toward solving the hard problem of RNA structure prediction with deep learning.
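The abstract ranks methods by the Z-score of the Root-Mean-Square Deviation. As a hedged illustration of what such a score looks like, the sketch below computes a plain standard score over a set of per-group RMSD values, with the sign flipped so that a lower RMSD yields a higher Z-score. The group names and RMSD values are invented, and the sign convention and use of the population standard deviation are assumptions; the official CASP procedure may differ in its details.

```python
from statistics import mean, pstdev


def rmsd_z_scores(rmsd_by_group):
    """Standard scores of RMSD values, negated so that lower RMSD ranks higher.

    z = (mean - rmsd) / standard deviation, computed over all groups.
    """
    values = list(rmsd_by_group.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over the groups
    if sigma == 0:
        # All groups tie, so no group stands out.
        return {group: 0.0 for group in rmsd_by_group}
    return {group: (mu - value) / sigma for group, value in rmsd_by_group.items()}


# Hypothetical RMSD values (angstroms) for one target, for illustration only.
example_rmsd = {"group_A": 5.0, "group_B": 10.0, "group_C": 15.0}
z = rmsd_z_scores(example_rmsd)
```

Summing such per-target scores across all targets gives one simple way to rank groups, which is why a single very poor model can weigh heavily on the total.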
Affiliation(s)
- Wenkai Wang
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Chenjie Feng
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
  - School of Science, Ningxia Medical University, Yinchuan, 750004, China
- Renmin Han
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Ziyi Wang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Lisha Ye
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Zongyang Du
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Hong Wei
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Fa Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Zhenling Peng
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Jianyi Yang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
12
Wang X, Yu S, Lou E, Tan YL, Tan ZJ. RNA 3D Structure Prediction: Progress and Perspective. Molecules 2023; 28:5532. [PMID: 37513407 PMCID: PMC10386116 DOI: 10.3390/molecules28145532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/05/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Ribonucleic acid (RNA) molecules play vital roles in numerous important biological functions such as catalysis and gene regulation. The functions of RNAs are strongly coupled to their structures or proper structure changes, and RNA structure prediction has received much attention in the last two decades. Some computational models have been developed to predict RNA three-dimensional (3D) structures in silico, and these models generally consist of three stages: predicting an RNA 3D structure ensemble, identifying near-native structures within that ensemble, and refining the identified structures. In this review, we provide a comprehensive overview of the recent advances in RNA 3D structure modeling, including structure ensemble prediction, evaluation, and refinement. Finally, we highlight some insights and perspectives in modeling RNA 3D structures.
Affiliation(s)
- Xunxun Wang
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Shixiong Yu
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- En Lou
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Ya-Lan Tan
  - School of Bioengineering and Health, Wuhan Textile University, Wuhan 430200, China
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, China
- Zhi-Jie Tan
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
13
Vallat B, Tauriello G, Bienert S, Haas J, Webb BM, Žídek A, Zheng W, Peisach E, Piehl DW, Anischanka I, Sillitoe I, Tolchard J, Varadi M, Baker D, Orengo C, Zhang Y, Hoch JC, Kurisu G, Patwardhan A, Velankar S, Burley SK, Sali A, Schwede T, Berman HM, Westbrook JD. ModelCIF: An Extension of PDBx/mmCIF Data Representation for Computed Structure Models. J Mol Biol 2023; 435:168021. [PMID: 36828268 PMCID: PMC10293049 DOI: 10.1016/j.jmb.2023.168021] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/29/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 02/24/2023]
Abstract
ModelCIF (github.com/ihmwg/ModelCIF) is a data information framework developed for and by computational structural biologists to enable delivery of Findable, Accessible, Interoperable, and Reusable (FAIR) data to users worldwide. ModelCIF describes the specific set of attributes and metadata associated with macromolecular structures modeled by solely computational methods and provides an extensible data representation for deposition, archiving, and public dissemination of predicted three-dimensional (3D) models of macromolecules. It is an extension of the Protein Data Bank Exchange / macromolecular Crystallographic Information Framework (PDBx/mmCIF), which is the global data standard for representing experimentally-determined 3D structures of macromolecules and associated metadata. The PDBx/mmCIF framework and its extensions (e.g., ModelCIF) are managed by the Worldwide Protein Data Bank partnership (wwPDB, wwpdb.org) in collaboration with relevant community stakeholders such as the wwPDB ModelCIF Working Group (wwpdb.org/task/modelcif). This semantically rich and extensible data framework for representing computed structure models (CSMs) accelerates the pace of scientific discovery. Herein, we describe the architecture, contents, and governance of ModelCIF, and tools and processes for maintaining and extending the data standard. Community tools and software libraries that support ModelCIF are also described.
Affiliation(s)
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Juergen Haas
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Benjamin M Webb
- Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94157, USA
| | | | - Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ivan Anischanka
- Department of Biochemistry, and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - Ian Sillitoe
- Department of Structural and Molecular Biology, UCL, London, UK
| | - James Tolchard
- AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - Mihaly Varadi
- AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - David Baker
- Department of Biochemistry, and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | | | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, University of Connecticut, Farmington, CT 06030, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
| | - Ardan Patwardhan
- Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Sameer Velankar
- AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94157, USA. https://twitter.com/salilab_ucsf
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| |
|
14
|
Mu ZC, Tan YL, Liu J, Zhang BG, Shi YZ. Computational Modeling of DNA 3D Structures: From Dynamics and Mechanics to Folding. Molecules 2023; 28:4833. [PMID: 37375388 DOI: 10.3390/molecules28124833] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/09/2023] [Revised: 06/11/2023] [Accepted: 06/14/2023] [Indexed: 06/29/2023] Open
Abstract
DNA carries the genetic information required for the synthesis of RNA and proteins and plays an important role in many processes of biological development. Understanding the three-dimensional (3D) structures and dynamics of DNA is crucial for understanding its biological functions and guiding the development of novel materials. In this review, we discuss recent advances in computational methods for studying DNA 3D structures. These include molecular dynamics simulations to analyze DNA dynamics, flexibility, and ion binding. We also explore various coarse-grained models used for DNA structure prediction or folding, along with fragment assembly methods for constructing DNA 3D structures. Finally, we discuss the advantages and disadvantages of these methods and highlight their differences.
Affiliation(s)
- Zi-Chun Mu
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
- School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan 430073, China
| | - Ya-Lan Tan
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
| | - Jie Liu
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
| | - Ben-Gong Zhang
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
| | - Ya-Zhou Shi
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
| |
|
15
|
Justyna M, Antczak M, Szachniuk M. Machine learning for RNA 2D structure prediction benchmarked on experimental data. Brief Bioinform 2023; 24:7140288. [PMID: 37096592 DOI: 10.1093/bib/bbad153] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 01/13/2023] [Revised: 03/15/2023] [Accepted: 03/29/2023] [Indexed: 04/26/2023] Open
Abstract
Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL) and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.
Affiliation(s)
- Marek Justyna
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
| | - Maciej Antczak
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
| | - Marta Szachniuk
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
| |
|