1
Bu F, Adam Y, Adamiak RW, Antczak M, de Aquino BRH, Badepally NG, Batey RT, Baulin EF, Boinski P, Boniecki MJ, Bujnicki JM, Carpenter KA, Chacon J, Chen SJ, Chiu W, Cordero P, Das NK, Das R, Dawson WK, DiMaio F, Ding F, Dock-Bregeon AC, Dokholyan NV, Dror RO, Dunin-Horkawicz S, Eismann S, Ennifar E, Esmaeeli R, Farsani MA, Ferré-D'Amaré AR, Geniesse C, Ghanim GE, Guzman HV, Hood IV, Huang L, Jain DS, Jaryani F, Jin L, Joshi A, Karelina M, Kieft JS, Kladwang W, Kmiecik S, Koirala D, Kollmann M, Kretsch RC, Kurciński M, Li J, Li S, Magnus M, Masquida B, Moafinejad SN, Mondal A, Mukherjee S, Nguyen THD, Nikolaev G, Nithin C, Nye G, Pandaranadar Jeyeram IPN, Perez A, Pham P, Piccirilli JA, Pilla SP, Pluta R, Poblete S, Ponce-Salvatierra A, Popenda M, Popenda L, Pucci F, Rangan R, Ray A, Ren A, Sarzynska J, Sha CM, Stefaniak F, Su Z, Suddala KC, Szachniuk M, Townshend R, Trachman RJ, Wang J, Wang W, Watkins A, Wirecki TK, Xiao Y, Xiong P, Xiong Y, Yang J, Yesselman JD, Zhang J, Zhang Y, Zhang Z, Zhou Y, Zok T, Zhang D, Zhang S, Żyła A, Westhof E, Miao Z. RNA-Puzzles Round V: blind predictions of 23 RNA structures. Nat Methods 2025; 22:399-411. [PMID: 39623050 PMCID: PMC11810798 DOI: 10.1038/s41592-024-02543-9] [Received: 02/15/2024] [Accepted: 10/29/2024] [Indexed: 01/16/2025]
Abstract
RNA-Puzzles is a collective endeavor dedicated to the advancement and improvement of RNA three-dimensional structure prediction. With agreement from structural biologists, RNA structures are predicted by modeling groups before publication of the experimental structures. We report a large-scale set of predictions by 18 groups for 23 RNA-Puzzles: 4 RNA elements, 2 Aptamers, 4 Viral elements, 5 Ribozymes and 8 Riboswitches. We describe automatic assessment protocols for comparisons between prediction and experiment. Our analyses reveal some critical steps to be overcome to achieve good accuracy in modeling RNA structures: identification of helix-forming pairs and of non-Watson-Crick modules, correct coaxial stacking between helices and avoidance of entanglements. Three of the top four modeling groups in this round also ranked among the top four in the CASP15 contest.
Grants
- T32 GM066706 NIGMS NIH HHS
- NSFC T2225007 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM134919 NIGMS NIH HHS
- R35GM145409 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R35 GM145409 NIGMS NIH HHS
- 32270707 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM122579 NIGMS NIH HHS
- R35 GM134864 NIGMS NIH HHS
- T32 grant GM066706 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- P20GM121342 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R21 CA219847 NCI NIH HHS
- 32171191 National Natural Science Foundation of China (National Science Foundation of China)
- P20 GM121342 NIGMS NIH HHS
- R35 GM152029 NIGMS NIH HHS
- R01 GM073850 NIGMS NIH HHS
- F32 GM112294 NIGMS NIH HHS
- ZIA DK075136 Intramural NIH HHS
- Z.M. is supported by Major Projects of Guangzhou National Laboratory (Grant Nos. GZNL2023A01006, GZNL2024A01002, SRPG22-003, SRPG22-006, SRPG22-007, HWYQ23-003, YW-YFYJ0102) and the National Key R&D Programs of China (2023YFF1204700, 2023YFF1204701, 2021YFF1200900, 2021YFF1200903). This work is part of the ITI 2021-2028 program and is supported by IdEx Unistra (ANR-10-IDEX-0002 to E.W.), the SFRI-STRAT’US project (ANR-20-SFRI-0012) and EUR IMCBio (IMCBio ANR-17-EURE-0023 to E.W.) under the framework of the French Investments for the Future Program.
- E.W. also acknowledges support from Wenzhou Institute, University of Chinese Academy of Sciences (WIUCASQD2024002).
- E.F.B. was additionally supported by European Molecular Biology Organization (EMBO) fellowship (ALTF 525-2022).
- Boniecki’s research was supported by the Polish National Science Centre (NCN) (grant 2016/23/B/ST6/03433 to Michal J. Boniecki). Predictions were performed using computational resources of the Interdisciplinary Centre for Mathematical and Computational Modelling of the University of Warsaw (ICM) (grant G66-9).
- J.M.B. is supported by the National Science Centre in Poland (NCN grants: 2017/26/A/NZ1/01083 to J.M.B., 2021/43/D/NZ1/03360 to S.M., 2020/39/B/NZ2/03127 to F.S., 2020/39/D/NZ2/02837 to T.K.W.). J.M.B. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: ACK Cyfronet AGH, PCSS, CI TASK, WCSS) for providing computer facilities and support within the computational grant PLG/2023/016080.
- S.J.C. is supported by the National Institutes of Health under Grant R35-GM134919.
- R.D. is supported by Stanford Bio-X (to R.D., R.O.D., R.C.K., and S.E.); Stanford Gerald J. Lieberman Fellowship (to R.R.); the National Institutes of Health (R21 CA219847 and R35 GM122579 to R.D.), the Howard Hughes Medical Institute (HHMI, to R.D.); Consejo Nacional de Ciencia y Tecnología CONACyT Fellowship 312765 (P.C.); the Ruth L. Kirschstein National Research Service Award Postdoctoral Fellowships GM112294 (to J.D.Y.); National Science Foundation Graduate Research Fellowships (R.J.L.T. and R.R.); the National Library of Medicine T15 Training Grant (NLM T15007033 to K.A.C.); the U.S. Department of Energy, Office of Science Graduate Student Research program (R.J.L.T.).
- The National Institutes of Health grants 1R35 GM134864 and the Passan Foundation.
- R.O.D. is supported by the U.S. Department of Energy, Office of Science, Scientific Discovery through Advanced Computing (SciDAC) program (R.O.D.); Intel (R.O.D.).
- A.F.D. is supported, in part, by the intramural program of the National Heart, Lung and Blood Institute, National Institutes of Health, USA.
- Guangdong Science and Technology Department (2022A1515010328, 2023B1212060013, 2020B1212030004), Fundamental Research Funds for the Central Universities, Sun Yat-sen University (23ptpy41).
- D.K. is supported by the NSF CAREER award MCB-2236996, and start-up, SURFF, and START awards from the University of Maryland Baltimore County to D.K.
- B.M. is supported by the Interdisciplinary Thematic Institute IMCBio, as part of the ITI 2021-2028 program at the University of Strasbourg, CNRS and Inserm, by IdEx Unistra (ANR-10-IDEX-0002), and EUR (IMCBio ANR-17-EUR-0023), under the framework of the French Investments Program for the Future.
- T.H.D.N. is supported by UKRI-Medical Research Council grant MC_UP_1201/19.
- C.N. and M.K. acknowledge funding from the National Science Centre, Poland [OPUS 2019/33/B/NZ2/02100]; S.P.P. acknowledges funding from the National Science Centre, Poland [OPUS 2020/39/B/NZ2/01301]; S.K. acknowledges funding from the National Science Centre, Poland [Sheng 2021/40/Q/NZ2/00078]; C.N. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: PCSS, ACK Cyfronet AGH, CI TASK, WCSS) for providing computer facilities and support within the computational grants PLG/2022/016043, PLG/2022/015327 and PLG/2020/013424.
- A.P. is supported by an NSF CAREER award (CHE-2235785).
- A.R. is supported by grants from the Natural Science Foundation of China (32325029, 32022039, 91940302, and 91640104) and the National Key Research and Development Project of China (2021YFC2300300 and 2023YFC2604300).
- Marta Szachniuk is supported by the National Science Centre, Poland (2019/35/B/ST6/03074 to M.S.), the statutory funds of IBCH PAS and Poznan University of Technology.
- J.W. is supported by the Penn State College of Medicine’s Artificial Intelligence and Biomedical Informatics Program.
- J.Z. is supported by the Intramural Research Program of the NIH, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (ZIADK075136 to J.Z.), and an NIH Deputy Director for Intramural Research (DDIR) Challenge Award to J.Z.
Affiliation(s)
- Fan Bu
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Yagoub Adam
  - Inter-institutional Graduate Program on Bioinformatics, Department of Computer Science and Mathematics, FFCLRP, University of São Paulo, Ribeirão Preto, Brazil
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Nigeria
- Ryszard W Adamiak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Maciej Antczak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Belisa Rebeca H de Aquino
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Nagendar Goud Badepally
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Robert T Batey
  - Department of Biochemistry, University of Colorado at Boulder, Boulder, CO, USA
- Eugene F Baulin
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Pawel Boinski
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Michal J Boniecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Kristy A Carpenter
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Jose Chacon
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Department of Cell and Developmental Biology, University of California San Diego, San Diego, CA, USA
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Wah Chiu
  - Department of Bioengineering and James H. Clark Center, Stanford University, Stanford, CA, USA
- Pablo Cordero
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Stripe, South San Francisco, CA, USA
- Naba Krishna Das
  - Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
  - Biophysics program, Stanford University, Stanford, CA, USA
- Wayne K Dawson
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Frank DiMaio
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Feng Ding
  - Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
- Anne-Catherine Dock-Bregeon
  - Laboratory of Integrative Biology of Marine Models (LBI2M), Sorbonne University-CNRS UMR8227, Roscoff, France
- Nikolay V Dokholyan
  - Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Ron O Dror
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Department of Structural Biology, Stanford University, Stanford, CA, USA
  - Department of Molecular and Cellular Physiology, Stanford University, Stanford, CA, USA
  - Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
- Stanisław Dunin-Horkawicz
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Stephan Eismann
  - Department of Applied Physics, Stanford University, Stanford, CA, USA
  - Atomic AI, South San Francisco, CA, USA
- Eric Ennifar
  - Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
- Reza Esmaeeli
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Masoud Amiri Farsani
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Adrian R Ferré-D'Amaré
  - Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
- Caleb Geniesse
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- George E Ghanim
  - Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
- Horacio V Guzman
  - Instituto de Ciencia de Materials de Barcelona, ICMAB-CSIC, Bellaterra E-08193, Spain & Departamento de Física Teórica de la Materia Condensada, Universidad Autónoma de Madrid, Madrid, Spain
- Iris V Hood
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Lin Huang
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Dharm Skandh Jain
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Farhang Jaryani
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Lei Jin
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Astha Joshi
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Masha Karelina
  - Biophysics program, Stanford University, Stanford, CA, USA
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Jeffrey S Kieft
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
  - New York Structural Biology Center, New York, NY, USA
- Wipapat Kladwang
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Sebastian Kmiecik
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Deepak Koirala
  - Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
- Markus Kollmann
  - Department of Computer Science, Heinrich Heine University of Düsseldorf, Düsseldorf, Germany
- Mateusz Kurciński
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Jun Li
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Shuang Li
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Marcin Magnus
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Benoît Masquida
  - UMR 7156, CNRS - Université de Strasbourg, IPCB, Strasbourg, France
- S Naeim Moafinejad
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Arup Mondal
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Sunandan Mukherjee
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Grigory Nikolaev
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Chandran Nithin
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Grace Nye
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Iswarya P N Pandaranadar Jeyeram
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Phillip Pham
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Joseph A Piccirilli
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA
  - Department of Chemistry, The University of Chicago, Chicago, IL, USA
- Smita Priyadarshini Pilla
  - Laboratory of Computational Biology, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
- Radosław Pluta
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Simón Poblete
  - Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
  - Centro BASAL Ciencia & Vida, Universidad San Sebastián, Santiago, Chile
- Almudena Ponce-Salvatierra
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Mariusz Popenda
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Lukasz Popenda
  - NanoBioMedical Centre, Adam Mickiewicz University, Poznan, Poland
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- Ramya Rangan
  - Biophysics program, Stanford University, Stanford, CA, USA
  - Atomic AI, South San Francisco, CA, USA
- Angana Ray
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Aiming Ren
  - Life Sciences Institute, Zhejiang University, Hangzhou, China
- Joanna Sarzynska
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Congzhou Mike Sha
  - Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Filip Stefaniak
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Zhaoming Su
  - The State Key Laboratory of Biotherapy, West China Hospital, Chengdu, China
- Krishna C Suddala
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Marta Szachniuk
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Raphael Townshend
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Atomic AI, South San Francisco, CA, USA
- Robert J Trachman
  - Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
- Jian Wang
  - Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Wenkai Wang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
- Andrew Watkins
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Prescient Design, Genentech Research and Early Development, South San Francisco, CA, USA
- Tomasz K Wirecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Yi Xiao
  - School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
- Peng Xiong
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Department of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
- Yiduo Xiong
  - School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
- Jianyi Yang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
- Joseph David Yesselman
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
  - Department of Chemistry, University of Nebraska, Lincoln, NE, USA
- Jinwei Zhang
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Yi Zhang
  - School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
- Zhenzhen Zhang
  - Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
- Yuanzhe Zhou
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Tomasz Zok
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Dong Zhang
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Sicheng Zhang
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Adriana Żyła
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Eric Westhof
  - Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
  - Engineering Research Center of Clinical Functional Materials and Diagnosis & Treatment Devices of Zhejiang Province, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, China
- Zhichao Miao
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
  - Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, UK
2
Bernard C, Postic G, Ghannay S, Tahi F. Has AlphaFold3 achieved success for RNA? Acta Crystallogr D Struct Biol 2025; 81:49-62. [PMID: 39868559 PMCID: PMC11804252 DOI: 10.1107/s2059798325000592] [Received: 10/18/2024] [Accepted: 01/21/2025] [Indexed: 01/28/2025] Open
Abstract
Predicting the 3D structure of RNA is a significant challenge despite ongoing advancements in the field. Although AlphaFold has successfully addressed this problem for proteins, RNA structure prediction raises difficulties due to the fundamental differences between proteins and RNA, which hinder its direct adaptation. The latest release of AlphaFold, AlphaFold3, has broadened its scope to include multiple different molecules such as DNA, ligands and RNA. While the AlphaFold3 article discussed the results for the last CASP-RNA data set, the scope of its performance and the limitations for RNA are unclear. In this article, we provide a comprehensive analysis of the performance of AlphaFold3 in the prediction of 3D structures of RNA. Through an extensive benchmark over five different test sets, we discuss the performance and limitations of AlphaFold3. We also compare its performance with ten existing state-of-the-art ab initio, template-based and deep-learning approaches. Our results are freely available on the EvryRNA platform at https://evryrna.ibisc.univ-evry.fr/evryrna/alphafold3/.
Affiliation(s)
- Clément Bernard
  - Université Paris-Saclay, Université Evry, IBISC, 91020 Evry-Courcouronnes, France
  - LISN – CNRS/Université Paris-Saclay, 91400 Orsay, France
- Guillaume Postic
  - Université Paris-Saclay, Université Evry, IBISC, 91020 Evry-Courcouronnes, France
- Sahar Ghannay
  - LISN – CNRS/Université Paris-Saclay, 91400 Orsay, France
- Fariza Tahi
  - Université Paris-Saclay, Université Evry, IBISC, 91020 Evry-Courcouronnes, France
3
Kagaya Y, Zhang Z, Ibtehaz N, Wang X, Nakamura T, Punuru PD, Kihara D. NuFold: end-to-end approach for RNA tertiary structure prediction with flexible nucleobase center representation. Nat Commun 2025; 16:881. [PMID: 39837861 PMCID: PMC11751094 DOI: 10.1038/s41467-025-56261-7] [Received: 05/22/2024] [Accepted: 01/13/2025] [Indexed: 01/23/2025] Open
Abstract
RNA plays a crucial role not only in information transfer as messenger RNA during gene expression but also in various biological functions as non-coding RNAs. Understanding the mechanisms of function requires tertiary structure information; however, experimental determination of three-dimensional RNA structures is costly and time-consuming, leading to a substantial gap between RNA sequence and structural data. To address this challenge, we developed NuFold, a novel computational approach that leverages a state-of-the-art deep learning architecture to accurately predict RNA tertiary structures. NuFold is a deep neural network trained end-to-end to produce the output structure from the input sequence. NuFold incorporates a nucleobase center representation, which enables flexible conformation of ribose rings. A benchmark study showed that NuFold clearly outperformed energy-based methods and achieved results comparable to existing state-of-the-art deep-learning-based methods. NuFold exhibited a particular advantage in building correct local geometries of RNA. Analyses of individual components in the NuFold pipeline indicated that performance improved when metagenome sequences were used for multiple sequence alignment and when the number of recycling steps was increased. NuFold is also capable of predicting multimer complex structures of RNA by linking the input sequences.
Affiliation(s)
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, 47907, Indiana, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, 47907, Indiana, USA
- Nabil Ibtehaz
  - Department of Computer Science, Purdue University, West Lafayette, 47907, Indiana, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, 47907, Indiana, USA
- Tsukasa Nakamura
  - Department of Biological Sciences, Purdue University, West Lafayette, 47907, Indiana, USA
- Pranav Deep Punuru
  - Department of Biological Sciences, Purdue University, West Lafayette, 47907, Indiana, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, 47907, Indiana, USA
  - Department of Computer Science, Purdue University, West Lafayette, 47907, Indiana, USA
4
Bernard C, Postic G, Ghannay S, Tahi F. RNA-TorsionBERT: leveraging language models for RNA 3D torsion angles prediction. Bioinformatics 2024; 41:btaf004. [PMID: 39775709 PMCID: PMC11758789 DOI: 10.1093/bioinformatics/btaf004] [Received: 07/08/2024] [Revised: 12/11/2024] [Accepted: 01/07/2025] [Indexed: 01/11/2025] Open
Abstract
MOTIVATION: Predicting the 3D structure of RNA is an ongoing challenge that has yet to be completely addressed despite continuous advancements. RNA 3D structures rely on distances between residues and base interactions but also backbone torsional angles. Knowing the torsional angles for each residue could help reconstruct its global folding, which is what we tackle in this work. This paper presents a novel approach for directly predicting RNA torsional angles from raw sequence data. Our method draws inspiration from the successful application of language models in various domains and adapts them to RNA.
RESULTS: We have developed a language-based model, RNA-TorsionBERT, incorporating better sequential interactions for predicting RNA torsional and pseudo-torsional angles from the sequence only. Through extensive benchmarking, we demonstrate that our method improves the prediction of torsional angles compared to state-of-the-art methods. In addition, by using our predictive model, we have inferred a torsion angle-dependent scoring function, called TB-MCQ, that replaces the true reference angles by our model prediction. We show that it accurately evaluates the quality of near-native predicted structures, in terms of RNA backbone torsion angle values. Our work demonstrates promising results, suggesting the potential utility of language models in advancing RNA 3D structure prediction.
AVAILABILITY AND IMPLEMENTATION: Source code is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/RNA-TorsionBERT.
Affiliation(s)
- Clément Bernard
  - Université Paris Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France
  - LISN—CNRS/Université Paris-Saclay, Orsay 91400, France
- Guillaume Postic
  - Université Paris Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France
- Sahar Ghannay
  - LISN—CNRS/Université Paris-Saclay, Orsay 91400, France
- Fariza Tahi
  - Université Paris Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France
5
Shen T, Hu Z, Sun S, Liu D, Wong F, Wang J, Chen J, Wang Y, Hong L, Xiao J, Zheng L, Krishnamoorthi T, King I, Wang S, Yin P, Collins JJ, Li Y. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nat Methods 2024; 21:2287-2298. [PMID: 39572716 PMCID: PMC11621015 DOI: 10.1038/s41592-024-02487-0] [Received: 01/31/2024] [Accepted: 09/25/2024] [Indexed: 12/07/2024]
Abstract
Accurate prediction of RNA three-dimensional (3D) structures remains an unsolved challenge. Determining RNA 3D structures is crucial for understanding their functions and informing RNA-targeting drug development and synthetic biology design. The structural flexibility of RNA, which leads to the scarcity of experimentally determined data, complicates computational prediction efforts. Here we present RhoFold+, an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences. By integrating an RNA language model pretrained on ~23.7 million RNA sequences and leveraging techniques to address data scarcity, RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction. Retrospective evaluations on RNA-Puzzles and CASP15 natural RNA targets demonstrate the superiority of RhoFold+ over existing methods, including human expert groups. Its efficacy and generalizability are further validated through cross-family and cross-type assessments, as well as time-censored benchmarks. Additionally, RhoFold+ predicts RNA secondary structures and interhelical angles, providing empirically verifiable features that broaden its applicability to RNA structure and function studies.
Affiliation(s)
- Tao Shen
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
  - Shanghai Zelixir Biotech Company Ltd, Shanghai, China
  - Shenzhen Institute of Advanced Technology, Shenzhen, China
- Zhihang Hu
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Siqi Sun
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai, China
- Di Liu
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Center for Molecular Design and Biomimetics at the Biodesign Institute, Arizona State University, Tempe, AZ, USA
  - School of Molecular Sciences, Arizona State University, Tempe, AZ, USA
- Felix Wong
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Integrated Biosciences, Redwood City, CA, USA
- Jiuming Wang
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
  - OneAIM Ltd, Hong Kong SAR, China
- Jiayang Chen
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Yixuan Wang
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Liang Hong
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Jin Xiao
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Liangzhen Zheng
  - Shanghai Zelixir Biotech Company Ltd, Shanghai, China
  - Shenzhen Institute of Advanced Technology, Shenzhen, China
- Tejas Krishnamoorthi
  - School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
- Irwin King
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Sheng Wang
  - Shanghai Zelixir Biotech Company Ltd, Shanghai, China
  - Shenzhen Institute of Advanced Technology, Shenzhen, China
- Peng Yin
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- James J Collins
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yu Li
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - The CUHK Shenzhen Research Institute, Shenzhen, China
6. Bahai A, Kwoh CK, Mu Y, Li Y. Systematic benchmarking of deep-learning methods for tertiary RNA structure prediction. PLoS Comput Biol 2024; 20:e1012715. [PMID: 39775239 PMCID: PMC11723642 DOI: 10.1371/journal.pcbi.1012715] [Received: 06/11/2024] [Revised: 01/10/2025] [Accepted: 12/10/2024] [Indexed: 01/11/2025] Open
Abstract
The 3D structure of RNA critically influences its functionality, and understanding this structure is vital for deciphering RNA biology. Experimental methods for determining RNA structures are labour-intensive, expensive, and time-consuming. Computational approaches have emerged as valuable tools, leveraging physics-based principles and machine learning to predict RNA structures rapidly. Despite advancements, the accuracy of computational methods remains modest, especially when compared to protein structure prediction. Deep learning methods, while successful in protein structure prediction, have shown some promise for RNA structure prediction as well, but face unique challenges. This study systematically benchmarks state-of-the-art deep learning methods for RNA structure prediction across diverse datasets. Our aim is to identify factors influencing performance variation, such as RNA family diversity, sequence length, RNA type, multiple sequence alignment (MSA) quality, and deep learning model architecture. We show that ML-based methods generally perform much better than non-ML methods on most RNA targets, although the performance difference is not substantial for unseen novel or synthetic RNAs. The quality of the MSA and of the secondary structure prediction both play an important role, and most methods are unable to predict non-Watson-Crick pairs in the RNAs. Overall, among the automated 3D RNA structure prediction methods, DeepFoldRNA has the best prediction results, followed by DRFold as the second-best method. Finally, we also suggest possible mitigations to improve the quality of the prediction for future method development.
Affiliation(s)
- Akash Bahai
  - School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Yuguang Mu
  - School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore
- Yinghui Li
  - School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore
7. Mukherjee S, Moafinejad SN, Badepally NG, Merdas K, Bujnicki JM. Advances in the field of RNA 3D structure prediction and modeling, with purely theoretical approaches, and with the use of experimental data. Structure 2024; 32:1860-1876. [PMID: 39321802 DOI: 10.1016/j.str.2024.08.015] [Received: 07/14/2024] [Revised: 08/08/2024] [Accepted: 08/22/2024] [Indexed: 09/27/2024]
Abstract
Recent advancements in RNA three-dimensional (3D) structure prediction have provided significant insights into RNA biology, highlighting the essential role of RNA in cellular functions and its therapeutic potential. This review summarizes the latest developments in computational methods, particularly the incorporation of artificial intelligence and machine learning, which have improved the efficiency and accuracy of RNA structure predictions. We also discuss the integration of new experimental data types, including cryoelectron microscopy (cryo-EM) techniques and high-throughput sequencing, which have transformed RNA structure modeling. The combination of experimental advances with computational methods represents a significant leap in RNA structure determination. We review the outcomes of RNA-Puzzles and critical assessment of structure prediction (CASP) challenges, which assess the state of the field and limitations of existing methods. Future perspectives are discussed, focusing on the impact of RNA 3D structure prediction on understanding RNA mechanisms and its implications for drug discovery and RNA-targeted therapies, opening new avenues in molecular biology.
Affiliation(s)
- Sunandan Mukherjee
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- S Naeim Moafinejad
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Nagendar Goud Badepally
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Katarzyna Merdas
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
8. Mackowiak M, Adamczyk B, Szachniuk M, Zok T. RNAtango: Analysing and comparing RNA 3D structures via torsional angles. PLoS Comput Biol 2024; 20:e1012500. [PMID: 39374268 PMCID: PMC11486365 DOI: 10.1371/journal.pcbi.1012500] [Received: 07/24/2024] [Revised: 10/17/2024] [Accepted: 09/18/2024] [Indexed: 10/09/2024] Open
Abstract
RNA molecules, essential for viruses and living organisms, derive their pivotal functions from intricate 3D structures. To understand these structures, one can analyze torsion and pseudo-torsion angles, which describe rotations around bonds, whether real or virtual, thus capturing the RNA conformational flexibility. Such an analysis has been made possible by RNAtango, a web server introduced in this paper, that provides a trigonometric perspective on RNA 3D structures, giving insights into the variability of examined models and their alignment with reference targets. RNAtango offers comprehensive tools for calculating torsion and pseudo-torsion angles, generating angle statistics, comparing RNA structures based on backbone torsions, and assessing local and global structural similarities using trigonometric functions and angle measures. The system operates in three scenarios: single model analysis, model-versus-target comparison, and model-versus-model comparison, with results output in text and graphical formats. Compatible with all modern web browsers, RNAtango is accessible freely along with the source code. It supports researchers in accurately assessing structural similarities, which contributes to the precision and efficiency of RNA modeling.
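To make the notion of a torsion or pseudo-torsion angle concrete, the sketch below computes the signed dihedral angle defined by four consecutive points, which is the quantity such analyses rest on. It is a generic geometric formula, not RNAtango's implementation; the function name `torsion_deg` and the example coordinates are hypothetical. The same function applies whether the four points are bonded atoms (a real torsion) or atoms joined by virtual bonds (a pseudo-torsion).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in [-180, 180]) about the p1-p2 axis."""
    u1 = _sub(p1, p0)
    u2 = _sub(p2, p1)
    u3 = _sub(p3, p2)
    n1 = _cross(u1, u2)  # normal of the plane through p0, p1, p2
    n2 = _cross(u2, u3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the central bond lies along the z axis.
angle = torsion_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                    (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

Using `atan2` rather than `acos` keeps the sign of the angle and stays numerically stable near 0 and 180 degrees.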
Affiliation(s)
- Marta Mackowiak
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Bartosz Adamczyk
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tomasz Zok
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
9. Fallah A, Havaei SA, Sedighian H, Kachuei R, Fooladi AAI. Prediction of aptamer affinity using an artificial intelligence approach. J Mater Chem B 2024; 12:8825-8842. [PMID: 39158322 DOI: 10.1039/d4tb00909f] [Indexed: 08/20/2024]
Abstract
Aptamers are oligonucleotide sequences that bind particular target molecules, similar to monoclonal antibodies. They can be selected by systematic evolution of ligands by exponential enrichment (SELEX), can be modified, and can be synthesized. Although the SELEX approach has improved considerably, identifying aptamers experimentally is frequently challenging and time-consuming. Structure-based methods are the most widely used in computer-aided design and development of aptamers, and numerous web-based platforms have been proposed for predicting the secondary structure and 3D configurations of RNAs and DNAs. Molecular docking and molecular dynamics (MD), which are commonly used in structure-based selection of compounds for proteins, are also suitable for aptamer selection. Artificial intelligence (AI), on the other hand, may be able to quickly discover possible aptamer candidates from a large number of sequences. Sophisticated machine learning and deep learning (DL) models have demonstrated efficacy in predicting the binding properties between ligands and targets during drug discovery; as such, they may provide a reliable and precise method for predicting the binding of aptamers to targets. This work examines advancements in AI pipelines and strategies for predicting aptamer binding ability, such as machine and deep learning, as well as structure-based approaches, molecular dynamics and molecular docking simulation methods.
Affiliation(s)
- Arezoo Fallah
  - Department of Bacteriology and Virology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Seyed Asghar Havaei
  - Department of Microbiology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Hamid Sedighian
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Reza Kachuei
  - Molecular Biology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Abbas Ali Imani Fooladi
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
10. Sha CM, Wang J, Dokholyan NV. Predicting 3D RNA structure from the nucleotide sequence using Euclidean neural networks. Biophys J 2024; 123:2671-2681. [PMID: 37838833 PMCID: PMC11393712 DOI: 10.1016/j.bpj.2023.10.011] [Received: 08/07/2023] [Revised: 09/19/2023] [Accepted: 10/12/2023] [Indexed: 10/16/2023] Open
Abstract
Fast and accurate 3D RNA structure prediction remains a major challenge in structural biology, mostly due to the size and flexibility of RNA molecules, as well as the lack of diverse experimentally determined structures of RNA molecules. Unlike DNA structure, RNA structure is far less constrained by basepair hydrogen bonding, resulting in an explosion of potential stable states. Here, we propose a convolutional neural network that predicts all pairwise distances between residues in an RNA, using a recently described smooth parametrization of Euclidean distance matrices. We achieve high-accuracy predictions on RNAs up to 100 nt in length in fractions of a second, a factor of 10^7 faster than existing molecular dynamics-based methods. We also convert our coarse-grained machine learning output into an all-atom model using discrete molecular dynamics with constraints. Our proposed computational pipeline predicts all-atom RNA models solely from the nucleotide sequence. However, this method suffers from the same limitation as nucleic acid molecular dynamics: the scarcity of available RNA crystal structures for training.
Affiliation(s)
- Congzhou M Sha
  - Department of Engineering Science and Mechanics, Penn State University, State College, Pennsylvania
  - Department of Pharmacology, Penn State College of Medicine, Hershey, Pennsylvania
- Jian Wang
  - Department of Pharmacology, Penn State College of Medicine, Hershey, Pennsylvania
- Nikolay V Dokholyan
  - Department of Engineering Science and Mechanics, Penn State University, State College, Pennsylvania
  - Department of Pharmacology, Penn State College of Medicine, Hershey, Pennsylvania
  - Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, Pennsylvania
  - Department of Chemistry, Penn State University, State College, Pennsylvania
  - Department of Biomedical Engineering, Penn State University, State College, Pennsylvania
11. Zhang S, Li J, Chen SJ. Machine learning in RNA structure prediction: Advances and challenges. Biophys J 2024; 123:2647-2657. [PMID: 38297836 PMCID: PMC11393687 DOI: 10.1016/j.bpj.2024.01.026] [Received: 11/29/2023] [Revised: 01/08/2024] [Accepted: 01/24/2024] [Indexed: 02/02/2024] Open
Abstract
RNA molecules play a crucial role in various biological processes, with their functionality closely tied to their structures. The remarkable advancements in machine learning techniques for protein structure prediction have shown promise in the field of RNA structure prediction. In this perspective, we discuss the advances and challenges encountered in constructing machine learning-based models for RNA structure prediction. We explore topics including model building strategies, specific challenges involved in predicting RNA secondary (2D) and tertiary (3D) structures, and approaches to these challenges. In addition, we highlight the advantages and challenges of constructing RNA language models. Given the rapid advances of machine learning techniques, we anticipate that machine learning-based models will serve as important tools for predicting RNA structures, thereby enriching our understanding of RNA structures and their corresponding functions.
Affiliation(s)
- Sicheng Zhang
  - Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
- Jun Li
  - Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
- Shi-Jie Chen
  - Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
  - Department of Biochemistry, University of Missouri, Columbia, Missouri
12. Nithin C, Kmiecik S, Błaszczyk R, Nowicka J, Tuszyńska I. Comparative analysis of RNA 3D structure prediction methods: towards enhanced modeling of RNA-ligand interactions. Nucleic Acids Res 2024; 52:7465-7486. [PMID: 38917327 PMCID: PMC11260495 DOI: 10.1093/nar/gkae541] [Received: 04/04/2024] [Revised: 05/23/2024] [Accepted: 06/16/2024] [Indexed: 06/27/2024] Open
Abstract
Accurate RNA structure models are crucial for designing small molecule ligands that modulate their functions. This study assesses six standalone RNA 3D structure prediction methods (DeepFoldRNA, RhoFold, BRiQ, FARFAR2, SimRNA and Vfold2), excluding web-based tools due to intellectual property concerns. We focus on reproducing the RNA structure existing in RNA-small molecule complexes, particularly on the ability to model ligand binding sites. Using a comprehensive set of RNA structures from the PDB, which includes diverse structural elements, we found that machine learning (ML)-based methods effectively predict global RNA folds but are less accurate with local interactions. Conversely, non-ML-based methods demonstrate higher precision in modeling intramolecular interactions, particularly with secondary structure restraints. Importantly, ligand-binding site accuracy can remain sufficiently high for practical use, even if the overall model quality is not optimal. With the recent release of AlphaFold 3, we included this advanced method in our tests. Benchmark subsets containing new structures, not used in the training of the tested ML methods, show that AlphaFold 3's performance was comparable to other ML-based methods, albeit with some challenges in accurately modeling ligand binding sites. This study underscores the importance of enhancing binding site prediction accuracy and the challenges in modeling RNA-ligand interactions accurately.
Affiliation(s)
- Chandran Nithin
  - Molecure SA, 02-089 Warsaw, Poland
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, 02-089 Warsaw, Poland
- Sebastian Kmiecik
  - Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, 02-089 Warsaw, Poland
13. Steffen FD, Cunha RA, Sigel RKO, Börner R. FRET-guided modeling of nucleic acids. Nucleic Acids Res 2024; 52:e59. [PMID: 38869063 PMCID: PMC11260485 DOI: 10.1093/nar/gkae496] [Received: 06/27/2023] [Accepted: 05/29/2024] [Indexed: 06/14/2024] Open
Abstract
The functional diversity of RNAs is encoded in their innate conformational heterogeneity. The combination of single-molecule spectroscopy and computational modeling offers new attractive opportunities to map structural transitions within nucleic acid ensembles. Here, we describe a framework to harmonize single-molecule Förster resonance energy transfer (FRET) measurements with molecular dynamics simulations and de novo structure prediction. Using either all-atom or implicit fluorophore modeling, we recreate FRET experiments in silico, visualize the underlying structural dynamics and quantify the reaction coordinates. Using multiple accessible-contact volumes as a post hoc scoring method for fragment assembly in Rosetta, we demonstrate that FRET can be used to filter a de novo RNA structure prediction ensemble by refuting models that are not compatible with in vitro FRET measurement. We benchmark our FRET-assisted modeling approach on double-labeled DNA strands and validate it against an intrinsically dynamic manganese(II)-binding riboswitch. We show that a FRET coordinate describing the assembly of a four-way junction allows our pipeline to recapitulate the global fold of the riboswitch displayed by the crystal structure. We conclude that computational fluorescence spectroscopy facilitates the interpretability of dynamic structural ensembles and improves the mechanistic understanding of nucleic acid interactions.
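The filtering idea in this abstract can be illustrated with the standard Förster relation, E = 1 / (1 + (R/R0)^6). The sketch below is a deliberate simplification and not the authors' pipeline: it treats each dye pair as a single inter-dye distance per model, whereas the abstract describes accessible-contact volumes. The function names, model labels, Förster radius and tolerance are all hypothetical placeholders.

```python
def fret_efficiency(distance, r0):
    """Transfer efficiency for a donor-acceptor distance and Förster radius r0
    (both in the same length unit), using E = 1 / (1 + (R / R0)^6)."""
    return 1.0 / (1.0 + (distance / r0) ** 6)

def filter_models(model_distances, measured_efficiency, r0, tolerance):
    """Keep the models whose predicted efficiency lies within `tolerance`
    of the measured value; returns the surviving model names in input order."""
    return [
        name
        for name, distance in model_distances.items()
        if abs(fret_efficiency(distance, r0) - measured_efficiency) <= tolerance
    ]

# Hypothetical ensemble: one inter-dye distance (in angstroms) per predicted model.
ensemble = {"model_a": 54.0, "model_b": 80.0, "model_c": 30.0}
compatible = filter_models(ensemble, measured_efficiency=0.5, r0=54.0, tolerance=0.1)
```

Because efficiency falls off with the sixth power of distance, the filter is most discriminating for distances near R0 and nearly blind far from it.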
Affiliation(s)
- Fabio D Steffen
  - Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Richard A Cunha
  - Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Roland K O Sigel
  - Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Richard Börner
  - Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
14. Moafinejad SN, de Aquino BRH, Boniecki M, Pandaranadar Jeyeram IN, Nikolaev G, Magnus M, Farsani M, Badepally N, Wirecki T, Stefaniak F, Bujnicki J. SimRNAweb v2.0: a web server for RNA folding simulations and 3D structure modeling, with optional restraints and enhanced analysis of folding trajectories. Nucleic Acids Res 2024; 52:W368-W373. [PMID: 38738621 PMCID: PMC11223799 DOI: 10.1093/nar/gkae356] [Received: 02/28/2024] [Revised: 04/07/2024] [Accepted: 04/29/2024] [Indexed: 05/14/2024] Open
Abstract
Research on ribonucleic acid (RNA) structures and functions benefits from easy-to-use tools for computational prediction and analyses of RNA three-dimensional (3D) structure. The SimRNAweb server version 2.0 offers an enhanced, user-friendly platform for RNA 3D structure prediction and analysis of RNA folding trajectories based on the SimRNA method. SimRNA employs a coarse-grained model, Monte Carlo sampling and statistical potentials to explore RNA conformational space, optionally guided by spatial restraints. Recognized for its accuracy in RNA 3D structure prediction in RNA-Puzzles and CASP competitions, SimRNA is particularly useful for incorporating restraints based on experimental data. The new server version introduces performance optimizations and extends user control over simulations and the processing of results. It allows the application of various hard and soft restraints, accommodating alternative structures involving canonical and noncanonical base pairs and unpaired residues, while also integrating data from chemical probing methods. Enhanced features include an improved analysis of folding trajectories, offering advanced clustering options and multiple analyses of the generated trajectories. These updates provide comprehensive tools for detailed RNA structure analysis. SimRNAweb v2.0 significantly broadens the scope of RNA modeling, emphasizing flexibility and user-defined parameter control. The web server is available at https://genesilico.pl/SimRNAweb.
Affiliation(s)
- S Naeim Moafinejad
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Belisa R H de Aquino
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Michał J Boniecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Iswarya P N Pandaranadar Jeyeram
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Grigory Nikolaev
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Marcin Magnus
  - Department of Molecular and Cellular Biology, Harvard University, 52 Oxford St, Cambridge, MA 02138, USA
- Masoud Amiri Farsani
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Nagendar Goud Badepally
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Tomasz K Wirecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Filip Stefaniak
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
15. He S, Huang R, Townley J, Kretsch RC, Karagianes TG, Cox DBT, Blair H, Penzar D, Vyaltsev V, Aristova E, Zinkevich A, Bakulin A, Sohn H, Krstevski D, Fukui T, Tatematsu F, Uchida Y, Jang D, Lee JS, Shieh R, Ma T, Martynov E, Shugaev MV, Bukhari HST, Fujikawa K, Onodera K, Henkel C, Ron S, Romano J, Nicol JJ, Nye GP, Wu Y, Choe C, Reade W, Das R. Ribonanza: deep learning of RNA structure through dual crowdsourcing. bioRxiv 2024:2024.02.24.581671. [PMID: 38464325 PMCID: PMC10925082 DOI: 10.1101/2024.02.24.581671] [Indexed: 03/12/2024]
Abstract
Prediction of RNA structure from sequence remains an unsolved problem, and progress has been slowed by a paucity of experimental data. Here, we present Ribonanza, a dataset of chemical mapping measurements on two million diverse RNA sequences collected through Eterna and other crowdsourced initiatives. Ribonanza measurements enabled solicitation, training, and prospective evaluation of diverse deep neural networks through a Kaggle challenge, followed by distillation into a single, self-contained model called RibonanzaNet. When fine tuned on auxiliary datasets, RibonanzaNet achieves state-of-the-art performance in modeling experimental sequence dropout, RNA hydrolytic degradation, and RNA secondary structure, with implications for modeling RNA tertiary structure.
Affiliation(s)
- Shujun He
  - Department of Chemical Engineering, Texas A&M University, TX, USA
- Rui Huang
  - Department of Biochemistry, Stanford CA, USA
- David B T Cox
  - Department of Biochemistry, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Dmitry Penzar
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Valeriy Vyaltsev
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Elizaveta Aristova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Arsenii Zinkevich
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Artemy Bakulin
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Hoyeol Sohn
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Daniel Krstevski
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Donghoon Jang
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- Roger Shieh
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
| | - Tom Ma
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
| | - Eduard Martynov
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
| | - Maxim V Shugaev
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
| | | | | | | | | | - Shlomo Ron
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
| | - Jonathan Romano
- Eterna Massive Open Laboratory
- Howard Hughes Medical Institute
| | | | - Grace P Nye
- Department of Biochemistry, Stanford CA, USA
| | - Yuan Wu
- Department of Biochemistry, Stanford CA, USA
- Howard Hughes Medical Institute
| | | | | | - Rhiju Das
- Department of Biochemistry, Stanford CA, USA
- Biophysics Program, Stanford CA, USA
- Howard Hughes Medical Institute
| |
Collapse
|
16
|
Bernard C, Postic G, Ghannay S, Tahi F. State-of-the-RNArt: benchmarking current methods for RNA 3D structure prediction. NAR Genom Bioinform 2024; 6:lqae048. [PMID: 38745991 PMCID: PMC11091930 DOI: 10.1093/nargab/lqae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/05/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024]
Abstract
RNAs are essential molecules involved in numerous biological functions. Understanding RNA functions requires the knowledge of their 3D structures. Computational methods have been developed for over two decades to predict the 3D conformations from RNA sequences. These computational methods have been widely used and are usually categorised as either ab initio or template-based. The performances remain to be improved. Recently, the rise of deep learning has changed the sight of novel approaches. Deep learning methods are promising, but their adaptation to RNA 3D structure prediction remains difficult. In this paper, we give a brief review of the ab initio, template-based and novel deep learning approaches. We highlight the different available tools and provide a benchmark on nine methods using the RNA-Puzzles dataset. We provide an online dashboard that shows the predictions made by benchmarked methods, freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/state_of_the_rnart/.
Affiliation(s)
- Clément Bernard
  - Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
  - LISN - CNRS/Université Paris-Saclay, 91400 Orsay, France
- Guillaume Postic
  - Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
- Sahar Ghannay
  - LISN - CNRS/Université Paris-Saclay, 91400 Orsay, France
- Fariza Tahi
  - Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
|
17
|
Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. Proteins 2023; 91:1747-1770. [PMID: 37876231 PMCID: PMC10841292 DOI: 10.1002/prot.26602] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 04/25/2023] [Revised: 08/21/2023] [Accepted: 09/07/2023] [Indexed: 10/26/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and x-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as noncanonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
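The Z-score ranking mentioned in this abstract can be illustrated with a minimal sketch. This is not the CASP15 assessors' protocol: the function names, the toy scores, and the choice to floor negative Z-scores at zero before summing are assumptions made purely for illustration, and the metric is assumed to be one where higher is better (as with GDT or lDDT).

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return Z-scores of one target's metric values across groups."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All groups scored the same, so no group stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_groups(scores_by_target, floor=0.0):
    """Rank groups by summed per-target Z-scores.

    scores_by_target: {target: {group: metric}}, higher metric = better.
    floor: assumed lower bound applied to each Z-score so that a single
    poor model does not dominate a group's total.
    """
    totals = {}
    for group_scores in scores_by_target.values():
        groups = list(group_scores)
        zs = z_scores([group_scores[g] for g in groups])
        for g, z in zip(groups, zs):
            totals[g] = totals.get(g, 0.0) + max(z, floor)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores for three groups on two targets.
example = {
    "T1": {"A": 0.80, "B": 0.60, "C": 0.40},
    "T2": {"A": 0.70, "B": 0.75, "C": 0.30},
}
ranking = rank_groups(example)
for group, total in ranking:
    print(f"{group}: {total:.2f}")
```

Normalizing per target before summing keeps an easy target, where every group scores well, from outweighing a hard one in the final ranking.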
Affiliation(s)
- Rhiju Das
  - Department of Biochemistry, Stanford University School of Medicine, CA USA
  - Biophysics Program, Stanford University School of Medicine, CA USA
  - Howard Hughes Medical Institute, Stanford University, CA USA
- Adam J. Simpkin
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Thomas Mulvaney
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
  - University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Phillip Pham
  - Department of Biochemistry, Stanford University School of Medicine, CA USA
- Ramya Rangan
  - Biophysics Program, Stanford University School of Medicine, CA USA
- Fan Bu
  - Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
  - Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
- Ronan M. Keegan
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
  - Life Science, Diamond Light Source, Harwell Science, UK
- Maya Topf
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
  - University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Daniel J. Rigden
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Zhichao Miao
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
  - Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai 200434, China
- Eric Westhof
  - Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France
|
18
|
Li J, Zhang S, Chen SJ. Advancing RNA 3D structure prediction: Exploring hierarchical and hybrid approaches in CASP15. Proteins 2023; 91:1779-1789. [PMID: 37615235 PMCID: PMC10841231 DOI: 10.1002/prot.26583] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/15/2023] [Revised: 06/19/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023]
Abstract
In CASP15, we used an integrated hierarchical and hybrid approach to predict RNA structures. The approach involves three steps. First, with the use of physics-based methods, Vfold2D-MC and VfoldMCPX, we predict the 2D structures from the sequence. Second, we employ template-based methods, Vfold3D and VfoldLA, to build 3D scaffolds for the predicted 2D structures. Third, using the 3D scaffolds as initial structures and the predicted 2D structures as constraints, we predict the 3D structure from coarse-grained molecular dynamics simulations, IsRNA and RNAJP. Our approach was evaluated on 12 RNA targets in CASP15 and ranked second among all the 34 participating teams. The result demonstrated the reliability of our method in predicting RNA 2D structures with high accuracy and RNA 3D structures with moderate accuracy. Further improvements in RNA structure prediction for the next round of CASP may come from the incorporation of the physics-based method with machine learning techniques.
Affiliation(s)
- Jun Li
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Sicheng Zhang
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
|
19
|
Sarzynska J, Popenda M, Antczak M, Szachniuk M. RNA tertiary structure prediction using RNAComposer in CASP15. Proteins 2023; 91:1790-1799. [PMID: 37615316 DOI: 10.1002/prot.26578] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/30/2023] [Revised: 06/14/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023]
Abstract
As CASP15 participants, in the new category of 3D RNA structure prediction, we applied expert modeling with the support of our proprietary system RNAComposer. Although RNAComposer is primarily known as an automated web server, its features allow it to be used interactively, for example, for homology-based modeling or assembling models from user-provided structural elements. In the paper, we present various scenarios of applying the system to predict the 3D RNA structures that we employed. Their combination with expert input, comparative analysis of models, and routines to select representative resultant structures form a ready-for-reuse workflow. With selected examples, we demonstrate its application for the in silico modeling of natural and synthetic RNA molecules targeted in CASP15.
Affiliation(s)
- Joanna Sarzynska
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Mariusz Popenda
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Maciej Antczak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
|
20
|
Kretsch RC, Andersen ES, Bujnicki JM, Chiu W, Das R, Luo B, Masquida B, McRae EK, Schroeder GM, Su Z, Wedekind JE, Xu L, Zhang K, Zheludev IN, Moult J, Kryshtafovych A. RNA target highlights in CASP15: Evaluation of predicted models by structure providers. Proteins 2023; 91:1600-1615. [PMID: 37466021 PMCID: PMC10792523 DOI: 10.1002/prot.26550] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 04/27/2023] [Revised: 06/16/2023] [Accepted: 06/26/2023] [Indexed: 07/20/2023]
Abstract
The first RNA category of the Critical Assessment of Techniques for Structure Prediction competition was only made possible because of the scientists who provided experimental structures to challenge the predictors. In this article, these scientists offer a unique and valuable analysis of both the successes and areas for improvement in the predicted models. All 10 RNA-only targets yielded predictions topologically similar to experimentally determined structures. For one target, experimentalists were able to phase their x-ray diffraction data by molecular replacement, showing a potential application of structure predictions for RNA structural biologists. Recommended areas for improvement include: enhancing the accuracy in local interaction predictions and increased consideration of the experimental conditions such as multimerization, structure determination method, and time along folding pathways. The prediction of RNA-protein complexes remains the most significant challenge. Finally, given the intrinsic flexibility of many RNAs, we propose the consideration of ensemble models.
Affiliation(s)
- Rachael C. Kretsch
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
- Ebbe S. Andersen
  - Interdisciplinary Nanoscience Center and Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Janusz M. Bujnicki
  - International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Wah Chiu
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Bioengineering and James H. Clark Center, Stanford University, Stanford, CA, USA
  - Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Rhiju Das
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford, CA, USA
- Bingnan Luo
  - The State Key Laboratory of Biotherapy, Frontiers Medical Center of Tianfu Jincheng Laboratory, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610044, Sichuan, China
- Benoît Masquida
  - UMR 7156, CNRS – Universite de Strasbourg, Strasbourg, France
- Ewan K.S. McRae
  - Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX 77030, USA
- Griffin M. Schroeder
  - Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
  - Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
- Zhaoming Su
  - The State Key Laboratory of Biotherapy, Frontiers Medical Center of Tianfu Jincheng Laboratory, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610044, Sichuan, China
- Joseph E. Wedekind
  - Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
  - Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
- Lily Xu
  - Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Kaiming Zhang
  - Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230027, China
- Ivan N. Zheludev
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- John Moult
  - Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
|
21
|
Baulin EF, Mukherjee S, Moafinejad SN, Wirecki TK, Badepally NG, Jaryani F, Stefaniak F, Amiri Farsani M, Ray A, Rocha de Moura T, Bujnicki JM. RNA tertiary structure prediction in CASP15 by the GeneSilico group: Folding simulations based on statistical potentials and spatial restraints. Proteins 2023; 91:1800-1810. [PMID: 37622458 DOI: 10.1002/prot.26575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 07/06/2023] [Accepted: 07/31/2023] [Indexed: 08/26/2023]
Abstract
Ribonucleic acid (RNA) molecules serve as master regulators of cells by encoding their biological function in the ribonucleotide sequence, particularly their ability to interact with other molecules. To understand how RNA molecules perform their biological tasks and to design new sequences with specific functions, it is of great benefit to be able to computationally predict how RNA folds and interacts in the cellular environment. Our workflow for computational modeling of the 3D structures of RNA and its interactions with other molecules uses a set of methods developed in our laboratory, including MeSSPredRNA for predicting canonical and non-canonical base pairs, PARNASSUS for detecting remote homology based on comparisons of sequences and secondary structures, ModeRNA for comparative modeling, the SimRNA family of programs for modeling RNA 3D structure and its complexes with other molecules, and QRNAS for model refinement. In this study, we present the results of testing this workflow in predicting RNA 3D structures in the CASP15 experiment. The overall high score of the computational models predicted by our group demonstrates the robustness of our workflow and its individual components in terms of predicting RNA 3D structures of acceptable quality that are close to the target structures. However, the variance in prediction quality is still quite high, and the results are still too far from the level of protein 3D structure predictions. This exercise led us to consider several improvements, especially to better predict and enforce stacking interactions and non-canonical base pairs.
Affiliation(s)
- Eugene F Baulin
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Sunandan Mukherjee
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- S Naeim Moafinejad
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Tomasz K Wirecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Nagendar Goud Badepally
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Farhang Jaryani
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Filip Stefaniak
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Masoud Amiri Farsani
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Angana Ray
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Tales Rocha de Moura
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
|
22
|
Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.25.538330. [PMID: 37162955 PMCID: PMC10168427 DOI: 10.1101/2023.04.25.538330] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 05/11/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and X-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as non-canonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
Affiliation(s)
- Rhiju Das
  - Department of Biochemistry, Stanford University School of Medicine, CA USA
  - Biophysics Program, Stanford University School of Medicine, CA USA
  - Howard Hughes Medical Institute, Stanford University, CA USA
- Adam J. Simpkin
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Thomas Mulvaney
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV)
  - University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Phillip Pham
  - Department of Biochemistry, Stanford University School of Medicine, CA USA
- Ramya Rangan
  - Biophysics Program, Stanford University School of Medicine, CA USA
- Fan Bu
  - Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
  - Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
- Ronan M. Keegan
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
  - Life Science, Diamond Light Source, Harwell Science, UK
- Maya Topf
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV)
  - University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Daniel J. Rigden
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Zhichao Miao
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
  - Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People’s Hospital, School of Medicine, Tongji University, Shanghai 200434, China
- Eric Westhof
  - Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France
|
23
|
Kagaya Y, Zhang Z, Ibtehaz N, Wang X, Nakamura T, Huang D, Kihara D. NuFold: A Novel Tertiary RNA Structure Prediction Method Using Deep Learning with Flexible Nucleobase Center Representation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.20.558715. [PMID: 37790488 PMCID: PMC10542152 DOI: 10.1101/2023.09.20.558715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
RNA is not only playing a core role in the central dogma as mRNA between DNA and protein, but also many non-coding RNAs have been discovered to have unique and diverse biological functions. As genome sequences become increasingly available and our knowledge of RNA sequences grows, the study of RNA's structure and function has become more demanding. However, experimental determination of three-dimensional RNA structures is both costly and time-consuming, resulting in a substantial disparity between RNA sequence data and structural insights. In response to this challenge, we propose a novel computational approach that harnesses state-of-the-art deep learning architecture NuFold to accurately predict RNA tertiary structures. This approach aims to offer a cost-effective and efficient means of bridging the gap between RNA sequence information and structural comprehension. NuFold implements a nucleobase center representation, which allows it to reproduce all possible nucleotide conformations accurately.
Affiliation(s)
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Nabil Ibtehaz
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Tsukasa Nakamura
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- David Huang
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
|
24
|
Lazzeri G, Micheletti C, Pasquali S, Faccioli P. RNA folding pathways from all-atom simulations with a variationally improved history-dependent bias. Biophys J 2023; 122:3089-3098. [PMID: 37355771 PMCID: PMC10432211 DOI: 10.1016/j.bpj.2023.06.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/07/2023] [Revised: 05/03/2023] [Accepted: 06/15/2023] [Indexed: 06/26/2023]
Abstract
Atomically detailed simulations of RNA folding have proven very challenging in view of the difficulties of developing realistic force fields and the intrinsic computational complexity of sampling rare conformational transitions. As a step forward in tackling these issues, we extend to RNA an enhanced path-sampling method previously successfully applied to proteins. In this scheme, the information about the RNA's native structure is harnessed by a soft history-dependent biasing force promoting the generation of productive folding trajectories in an all-atom force field with explicit solvent. A rigorous variational principle is then applied to minimize the effect of the bias. Here, we report on an application of this method to RNA molecules from 20 to 47 nucleotides long and of increasing topological complexity. By comparison with analogous simulations performed on small proteins with similar size and architecture, we show that the RNA folding landscape is significantly more frustrated, even for relatively small chains with a simple topology. The predicted RNA folding mechanisms are found to be consistent with the available experiments and some of the existing coarse-grained models. Due to its computational performance, this scheme provides a promising platform to efficiently gather atomistic RNA folding trajectories, thus retaining the information about the chemical composition of the sequence.
Affiliation(s)
- Gianmarco Lazzeri
- Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany; Physics Department of Trento University, Povo (Trento), Italy
- Samuela Pasquali
- Laboratoire Cibles Thérapeutiques et Conception de Médicaments, Université Paris Cité, Paris, France; Laboratoire Biologie Fonctionnelle et Adaptative, Université Paris Cité, Paris, France.
- Pietro Faccioli
- Physics Department of Trento University, Povo (Trento), Italy; INFN-TIFPA, Povo (Trento), Italy.
25
Wang X, Yu S, Lou E, Tan YL, Tan ZJ. RNA 3D Structure Prediction: Progress and Perspective. Molecules 2023; 28:5532. [PMID: 37513407 PMCID: PMC10386116 DOI: 10.3390/molecules28145532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/05/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023]
Abstract
Ribonucleic acid (RNA) molecules play vital roles in numerous important biological functions such as catalysis and gene regulation. The functions of RNAs are strongly coupled to their structures or proper structure changes, and RNA structure prediction has received much attention in the last two decades. Some computational models have been developed to predict RNA three-dimensional (3D) structures in silico, and these models generally consist of predicting an RNA 3D structure ensemble, evaluating near-native RNAs from the structure ensemble, and refining the identified RNAs. In this review, we give a comprehensive overview of the recent advances in RNA 3D structure modeling, including structure ensemble prediction, evaluation, and refinement. Finally, we emphasize some insights and perspectives in modeling RNA 3D structures.
Affiliation(s)
- Xunxun Wang
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Shixiong Yu
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- En Lou
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Ya-Lan Tan
- School of Bioengineering and Health, Wuhan Textile University, Wuhan 430200, China
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, China
- Zhi-Jie Tan
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
26
Gao W, Yang A, Rivas E. Thirteen dubious ways to detect conserved structural RNAs. IUBMB Life 2023; 75:471-492. [PMID: 36495545 PMCID: PMC11234323 DOI: 10.1002/iub.2694] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Accepted: 10/24/2022] [Indexed: 12/14/2022]
Abstract
Covariation induced by compensatory base substitutions in RNA alignments is a great way to deduce conserved RNA structure, in principle. In practice, success depends on many factors, importantly the quality and depth of the alignment and the choice of covariation statistic. Measuring covariation between pairs of aligned positions is easy. However, using covariation to infer evolutionarily conserved RNA structure is complicated by other extraneous sources of covariation such as that resulting from homologous sequences having evolved from a common ancestor. In order to provide evidence of evolutionarily conserved RNA structure, a method to distinguish covariation due to sources other than RNA structure is necessary. Moreover, there are several sorts of artifactually generated covariation signals that can further confound the analysis. Additionally, some covariation signal is difficult to detect due to incomplete comparative data. Here, we investigate and critically discuss the practice of inferring conserved RNA structure by comparative sequence analysis. We provide new methods on how to approach and decide which of the numerous long non-coding RNAs (lncRNAs) have biologically relevant structures.
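To illustrate why measuring pairwise covariation is the easy part, the sketch below computes mutual information between two alignment columns. Mutual information is used here only as one common covariation statistic; that choice is an assumption of this example and not necessarily the statistic the authors use, and the toy alignment and function name are hypothetical. The sketch applies no correction for shared ancestry, which is the confounder the abstract warns about.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Sequences with a gap ('-') in either column are skipped.
    """
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    freq_i = Counter(a for a, _ in pairs)   # single-column counts
    freq_j = Counter(b for _, b in pairs)
    freq_ij = Counter(pairs)                # joint counts
    return sum(
        (c / n) * log2((c / n) / ((freq_i[a] / n) * (freq_j[b] / n)))
        for (a, b), c in freq_ij.items()
    )


# Hypothetical toy alignment: positions 0 and 3 always form a
# Watson-Crick pair, positions 1 and 2 are fully conserved.
alignment = ["GCAC", "ACAU", "CCAG", "UCAA"]
columns = list(zip(*alignment))

mi_paired = mutual_information(columns[0], columns[3])
mi_conserved = mutual_information(columns[0], columns[1])
print(mi_paired, mi_conserved)
```

A high score for the paired columns and a zero score for the conserved column is what one would expect from this toy input; on real alignments the raw score would still need to be tested against a null model that accounts for phylogeny.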
Affiliation(s)
- William Gao
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ann Yang
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
27
Wang J, Sha CM, Dokholyan NV. Combining Experimental Restraints and RNA 3D Structure Prediction in RNA Nanotechnology. Methods Mol Biol 2023; 2709:51-64. [PMID: 37572272 PMCID: PMC10680996 DOI: 10.1007/978-1-0716-3417-2_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/14/2023]
Abstract
Precise RNA tertiary structure prediction can aid in the design of RNA nanoparticles. However, most existing RNA tertiary structure prediction methods are limited to small RNAs with relatively simple secondary structures. Large RNA molecules usually have complex secondary structures, including multibranched loops and pseudoknots, allowing for highly flexible RNA geometries and multiple stable states. Various experiments and bioinformatics analyses can often provide information about the distance between atoms (or residues) in RNA, which can be used to guide the prediction of RNA tertiary structure. In this chapter, we introduce a platform, iFoldNMR, that can incorporate non-exchangeable imino proton resonance data from NMR as restraints for RNA 3D structure prediction. We also introduce an algorithm, DVASS, which optimizes distance restraints for better RNA 3D structure prediction.
Affiliation(s)
- Jian Wang
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Congzhou M Sha
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Department of Engineering Science and Mechanics, Penn State University, State College, PA, USA
- Nikolay V Dokholyan
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA.
- Department of Engineering Science and Mechanics, Penn State University, State College, PA, USA.
- Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, PA, USA.
- Department of Chemistry, Penn State University, State College, PA, USA.
- Department of Biomedical Engineering, Penn State University, State College, PA, USA.
28
Paloncýová M, Pykal M, Kührová P, Banáš P, Šponer J, Otyepka M. Computer Aided Development of Nucleic Acid Applications in Nanotechnologies. Small (Weinheim an der Bergstrasse, Germany) 2022; 18:e2204408. [PMID: 36216589 DOI: 10.1002/smll.202204408] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/23/2022] [Revised: 09/12/2022] [Indexed: 06/16/2023]
Abstract
Utilization of nucleic acids (NAs) in nanotechnologies and nanotechnology-related applications is a growing field with broad application potential, ranging from biosensing up to targeted cell delivery. Computer simulations are useful techniques that can aid design and speed up development in this field. This review focuses on computer simulations of hybrid nanomaterials composed of NAs and other components. Current state-of-the-art molecular dynamics simulations, empirical force fields (FFs), and coarse-grained approaches for the description of deoxyribonucleic acid and ribonucleic acid are critically discussed. Challenges in combining biomacromolecular and nanomaterial FFs are emphasized. Recent applications of simulations for modeling NAs and their interactions with nano- and biomaterials are overviewed in the fields of sensing applications, targeted delivery, and NA templated materials. Future perspectives of development are also highlighted.
Affiliation(s)
- Markéta Paloncýová
- Regional Center of Advanced Technologies and Materials, The Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, Olomouc, 779 00, Czech Republic
- Martin Pykal
- Regional Center of Advanced Technologies and Materials, The Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, Olomouc, 779 00, Czech Republic
- Petra Kührová
- Regional Center of Advanced Technologies and Materials, The Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, Olomouc, 779 00, Czech Republic
- Pavel Banáš
- Regional Center of Advanced Technologies and Materials, The Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, Olomouc, 779 00, Czech Republic
- Jiří Šponer
- Regional Center of Advanced Technologies and Materials, The Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, Olomouc, 779 00, Czech Republic
- Institute of Biophysics of the Czech Academy of Sciences, v. v. i., Královopolská 135, Brno, 612 65, Czech Republic
- Michal Otyepka
- Regional Center of Advanced Technologies and Materials, The Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, Olomouc, 779 00, Czech Republic
- IT4Innovations, VŠB - Technical University of Ostrava, 17. listopadu 2172/15, Ostrava-Poruba, 708 00, Czech Republic
29
Zhang H, Li S, Zhang L, Mathews D, Huang L. LazySampling and LinearSampling: fast stochastic sampling of RNA secondary structure with applications to SARS-CoV-2. Nucleic Acids Res 2022; 51:e7. [PMID: 36401871 PMCID: PMC9881153 DOI: 10.1093/nar/gkac1029] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/27/2022] [Revised: 09/22/2022] [Accepted: 10/21/2022] [Indexed: 11/21/2022]
Abstract
Many RNAs fold into multiple structures at equilibrium, and there is a need to sample these structures according to their probabilities in the ensemble. The conventional sampling algorithm suffers from two limitations: (i) the sampling phase is slow due to many repeated calculations; and (ii) the end-to-end runtime scales cubically with the sequence length. These issues make it difficult to apply to long RNAs, such as the full genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To address these problems, we devise a new sampling algorithm, LazySampling, which eliminates redundant work via on-demand caching. Based on LazySampling, we further derive LinearSampling, an end-to-end linear time sampling algorithm. Benchmarking on nine diverse RNA families, the sampled structures from LinearSampling correlate better with the well-established secondary structures than Vienna RNAsubopt and RNAplfold. More importantly, LinearSampling is orders of magnitude faster than standard tools, being 428× faster (72 s versus 8.6 h) than RNAsubopt on the full genome of SARS-CoV-2 (29 903 nt). The resulting sample landscape correlates well with the experimentally guided secondary structure models, and is closer to the alternative conformations revealed by experimentally driven analysis. Finally, LinearSampling finds 23 regions of 15 nt with high accessibilities in the SARS-CoV-2 genome, which are potential targets for COVID-19 diagnostics and therapeutics.
Affiliation(s)
- He Zhang
- Baidu Research, Sunnyvale, CA, USA; School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
- Sizhen Li
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
- Liang Zhang
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
30
Yan S, Ilgu M, Nilsen-Hamilton M, Lamm MH. Computational Modeling of RNA Aptamers: Structure Prediction of the Apo State. J Phys Chem B 2022; 126:7114-7125. [PMID: 36097649 PMCID: PMC9512008 DOI: 10.1021/acs.jpcb.2c04649] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2022] [Revised: 08/23/2022] [Indexed: 11/28/2022]
Abstract
RNA aptamers are single-stranded oligonucleotides that bind to specific molecular targets with high affinity and specificity. To design aptamers for new applications, it is critical to understand the ligand binding mechanism in terms of the structure and dynamics of the ligand-bound and apo states. The problem is that most of the NMR or X-ray crystal structures available for RNA aptamers are for ligand-bound states. Available apo state structures, mostly characterized by crystallization under nonphysiological conditions or probed by low resolution techniques, might fail to represent the diverse structural variations of the apo state in solution. Here, we develop an approach to obtain a representative ensemble of apo structures based on in silico RNA 3D structure prediction and in vitro experiments that characterize base stacking. Using the neomycin-B aptamer as a case study, an ensemble of structures for the aptamer in the apo (unbound) state is validated and then used to investigate the ligand-binding mechanism for the aptamer in complex with neomycin-B.
Affiliation(s)
- Shuting Yan
- Iowa State University, Ames, Iowa 50011, United States
- Muslum Ilgu
- Iowa State University, Ames, Iowa 50011, United States
- Ames National Laboratory, Ames, Iowa 50011, United States
- Aptalogic Inc., Ames, Iowa 50014, United States
- Marit Nilsen-Hamilton
- Iowa State University, Ames, Iowa 50011, United States
- Ames National Laboratory, Ames, Iowa 50011, United States
- Aptalogic Inc., Ames, Iowa 50014, United States
31
Matarrese MAG, Loppini A, Nicoletti M, Filippi S, Chiodo L. Assessment of tools for RNA secondary structure prediction and extraction: a final-user perspective. J Biomol Struct Dyn 2022:1-20. [DOI: 10.1080/07391102.2022.2116110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Margherita A. G. Matarrese
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Jane and John Justin Neurosciences Center, Cook Children’s Health Care System, TX, USA
- Department of Bioengineering, The University of Texas at Arlington, Arlington, TX, USA
- Alessandro Loppini
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Martina Nicoletti
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Simonetta Filippi
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
- Letizia Chiodo
- Engineering Department, Campus Bio-Medico University of Rome, Rome, Italy
32
Kallert E, Fischer TR, Schneider S, Grimm M, Helm M, Kersten C. Protein-Based Virtual Screening Tools Applied for RNA-Ligand Docking Identify New Binders of the preQ1-Riboswitch. J Chem Inf Model 2022; 62:4134-4148. [PMID: 35994617 DOI: 10.1021/acs.jcim.2c00751] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/28/2022]
Abstract
Targeting RNA with small molecules is an emerging field. While several ligands for different RNA targets are reported, structure-based virtual screenings (VSs) against RNAs are still rare. Here, we elucidated the general capabilities of protein-based docking programs to reproduce native binding modes of small-molecule RNA ligands and to discriminate known binders from decoys by the scoring function. The programs were found to perform similarly to the RNA-based docking tool rDOCK, and the challenges faced during docking, namely, protomer and tautomer selection, target dynamics, and explicit solvent, do not largely differ from challenges in conventional protein-ligand docking. A prospective VS with the Bacillus subtilis preQ1-riboswitch aptamer domain performed with FRED, HYBRID, and FlexX followed by microscale thermophoresis assays identified six active compounds out of 23 tested VS hits with potencies between 29.5 nM and 11.0 μM. The hits were selected not solely based on their docking score but for resembling key interactions of the native ligand. Therefore, this study demonstrates the general feasibility of performing structure-based VSs against RNA targets, while at the same time it highlights pitfalls and their potential solutions when executing RNA-ligand docking.
Affiliation(s)
- Elisabeth Kallert
- Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, Mainz 55128, Germany
- Tim R Fischer
- Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, Mainz 55128, Germany
- Simon Schneider
- Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, Mainz 55128, Germany
- Maike Grimm
- Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, Mainz 55128, Germany
- Mark Helm
- Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, Mainz 55128, Germany
- Christian Kersten
- Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, Mainz 55128, Germany
33
Nishima W, Girodat D, Holm M, Rundlet EJ, Alejo JL, Fischer K, Blanchard SC, Sanbonmatsu KY. Hyper-swivel head domain motions are required for complete mRNA-tRNA translocation and ribosome resetting. Nucleic Acids Res 2022; 50:8302-8320. [PMID: 35808938 DOI: 10.1093/nar/gkac597] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/12/2021] [Revised: 06/15/2022] [Accepted: 07/05/2022] [Indexed: 11/14/2022]
Abstract
Translocation of messenger RNA (mRNA) and transfer RNA (tRNA) substrates through the ribosome during protein synthesis, an exemplar of directional molecular movement in biology, entails a complex interplay of conformational, compositional, and chemical changes. The molecular determinants of early translocation steps have been investigated rigorously. However, the elements enabling the ribosome to complete translocation and reset for subsequent protein synthesis reactions remain poorly understood. Here, we have combined molecular simulations with single-molecule fluorescence resonance energy transfer imaging to gain insights into the rate-limiting events of the translocation mechanism. We find that diffusive motions of the ribosomal small subunit head domain to hyper-swivelled positions, governed by universally conserved rRNA, can maneuver the mRNA and tRNAs to their fully translocated positions. Subsequent engagement of peptidyl-tRNA and disengagement of deacyl-tRNA from mRNA, within their respective small subunit binding sites, facilitate the ribosome resetting mechanism after translocation has occurred to enable protein synthesis to resume.
Affiliation(s)
- Wataru Nishima
- Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- New Mexico Consortium, Los Alamos, NM 87544, USA
- Dylan Girodat
- Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- New Mexico Consortium, Los Alamos, NM 87544, USA
- Mikael Holm
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Emily J Rundlet
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Tri-Institutional PhD Program in Chemical Biology, Weill Cornell Medicine, New York, NY 10021, USA
- Jose L Alejo
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Kara Fischer
- New Mexico Consortium, Los Alamos, NM 87544, USA
- Scott C Blanchard
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Karissa Y Sanbonmatsu
- Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- New Mexico Consortium, Los Alamos, NM 87544, USA
34
Singh J, Paliwal K, Litfin T, Singh J, Zhou Y. Predicting RNA distance-based contact maps by integrated deep learning on physics-inferred secondary structure and evolutionary-derived mutational coupling. Bioinformatics 2022; 38:3900-3910. [PMID: 35751593 PMCID: PMC9364379 DOI: 10.1093/bioinformatics/btac421] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/24/2021] [Revised: 04/30/2022] [Accepted: 06/28/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Recently, AlphaFold2 achieved high experimental accuracy for the majority of proteins in Critical Assessment of Structure Prediction (CASP 14). This raises the hope that one day, we may achieve the same feat for RNA structure prediction for those structured RNAs, which is as fundamentally and practically important as protein structure prediction. One major factor in the recent advancement of protein structure prediction is the highly accurate prediction of distance-based contact maps of proteins. RESULTS: Here, we showed that by integrating deep learning with physics-inferred secondary structures, co-evolutionary information and multiple sequence-alignment sampling, we can achieve RNA contact-map prediction at a level of accuracy similar to that in protein contact-map prediction. More importantly, highly accurate prediction for top L long-range contacts can be assured for those RNAs with a high effective number of homologous sequences (Neff > 50). The initial use of the predicted contact map as distance-based restraints confirmed its usefulness in 3D structure prediction. AVAILABILITY AND IMPLEMENTATION: SPOT-RNA-2D is available as a web server at https://sparks-lab.org/server/spot-rna-2d/ and as a standalone program at https://github.com/jaswindersingh2/SPOT-RNA-2D. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
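For readers unfamiliar with the term, a distance-based contact map marks residue pairs whose representative atoms lie within a cutoff distance. The sketch below builds one from 3D coordinates. The 5 Å cutoff, the minimum sequence separation, the toy coordinates, and the function name are all assumptions chosen for illustration; none of them is taken from SPOT-RNA-2D.

```python
from math import dist


def contact_map(coords, cutoff=5.0, min_separation=4):
    """Boolean contact map from one representative atom per residue.

    A pair (i, j) is a contact when the residues are at least
    `min_separation` apart in sequence and within `cutoff` in space.
    Both thresholds are illustrative assumptions.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True  # keep the map symmetric
    return cmap


# Hypothetical hairpin-like trace: six residues on a U-shaped path,
# so that the first and last residues end up 4 Å apart.
coords = [
    (0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0),
    (8.0, 4.0, 0.0), (4.0, 4.0, 0.0), (0.0, 4.0, 0.0),
]
cmap = contact_map(coords)
contacts = [(i, j) for i in range(6) for j in range(i + 1, 6) if cmap[i][j]]
print(contacts)
```

A predicted map of this kind can then be turned into distance restraints for 3D modeling, which is the use the abstract reports.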
Affiliation(s)
- Thomas Litfin
- Institute for Glycomics, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Jaspreet Singh
- Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Yaoqi Zhou
- To whom correspondence should be addressed.
35
Magnus M. rna-tools.online: a Swiss army knife for RNA 3D structure modeling workflow. Nucleic Acids Res 2022; 50:W657-W662. [PMID: 35580057 PMCID: PMC9252763 DOI: 10.1093/nar/gkac372] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/30/2022] [Revised: 04/20/2022] [Accepted: 05/02/2022] [Indexed: 11/15/2022]
Abstract
Significant improvements have been made in the efficiency and accuracy of RNA 3D structure prediction methods in recent years; however, many tools developed in the field stay exclusive to only a few bioinformatic groups. To perform a complete RNA 3D structure modeling analysis as proposed by the RNA-Puzzles community, researchers must familiarize themselves with a quite complex set of tools. In order to facilitate the processing of RNA sequences and structures, we previously developed the rna-tools package. However, using rna-tools requires the installation of a mixture of libraries and tools, and basic knowledge of the command line and the Python programming language. To provide an opportunity for the broader community of biologists to take advantage of the new developments in RNA structural biology, we developed rna-tools.online. The web server provides a user-friendly platform to perform many standard analyses required for the typical modeling workflow: 3D structure manipulation and editing, structure minimization, structure analysis, quality assessment, and comparison. rna-tools.online helps biologists start benefiting from the maturing field of RNA 3D structural bioinformatics and can be used for educational purposes. The web server is available at https://rna-tools.online.
Affiliation(s)
- Marcin Magnus
- ReMedy International Research Agenda Unit, IMol Polish Academy of Sciences, Warsaw, Poland
36
Carrascoza F, Antczak M, Miao Z, Westhof E, Szachniuk M. Evaluation of the stereochemical quality of predicted RNA 3D models in the RNA-Puzzles submissions. RNA (New York, N.Y.) 2022; 28:250-262. [PMID: 34819324 PMCID: PMC8906551 DOI: 10.1261/rna.078685.121] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 01/16/2021] [Accepted: 11/05/2021] [Indexed: 06/13/2023]
Abstract
In silico prediction is a well-established approach to derive a general shape of an RNA molecule based on its sequence or secondary structure. This paper reports an analysis of the stereochemical quality of the RNA three-dimensional models predicted using dedicated computer programs. The stereochemistry of 1052 RNA 3D structures, including 1030 models predicted by fully automated and human-guided approaches within 22 RNA-Puzzles challenges and reference structures, is analyzed. The evaluation is based on standards of RNA stereochemistry that the Protein Data Bank requires from deposited experimental structures. Deviations from standard bond lengths and angles, planarity, or chirality are quantified. A reduction in the number of such deviations should help in the improvement of RNA 3D structure modeling approaches.
Affiliation(s)
- Francisco Carrascoza
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Maciej Antczak
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Zhichao Miao
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Translational Research Institute of Brain and Brain-Like Intelligence, Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai 200081, China
- Eric Westhof
- Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire CNRS, Architecture et Réactivité de l'ARN, 67084 Strasbourg, France
- Marta Szachniuk
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
37
Liu Z, Yang Y, Li D, Lv X, Chen X, Dai Q. Prediction of the RNA Tertiary Structure Based on a Random Sampling Strategy and Parallel Mechanism. Front Genet 2022; 12:813604. [PMID: 35069706 PMCID: PMC8769045 DOI: 10.3389/fgene.2021.813604] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/12/2021] [Accepted: 11/19/2021] [Indexed: 12/14/2022]
Abstract
Background: Macromolecule structure prediction remains a fundamental challenge of bioinformatics. Over the past several decades, the Rosetta framework has provided solutions to diverse challenges in computational biology. However, it is challenging to model RNA tertiary structures effectively when the de novo modeling of RNA involves solving a well-defined small puzzle. Methods: In this study, we introduce a stepwise Monte Carlo parallelization (SMCP) algorithm for RNA tertiary structure prediction. Millions of conformations were randomly searched using the Monte Carlo algorithm and stepwise ansatz hypothesis, and SMCP uses a parallel mechanism for efficient sampling. Moreover, to achieve better prediction accuracy and completeness, we judged and processed the modeling results. Results: A benchmark of nine single-stranded RNA loops drawn from riboswitches establishes the general ability of the algorithm to model RNA with high accuracy and integrity, including six motifs that cannot be solved by knowledge mining-based modeling algorithms. Experimental results show that the modeling accuracy of the SMCP algorithm is up to 0.14 Å, and the modeling integrity on this benchmark is extremely high. Conclusion: SMCP is an ab initio modeling algorithm that substantially outperforms previous algorithms in the Rosetta framework, especially in improving the accuracy and completeness of the model. It is expected that the work will provide new research ideas for macromolecular structure prediction in the future. In addition, this work will provide a theoretical basis for the development of the biomedical field.
Affiliation(s)
- Zhendong Liu
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Yurong Yang
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Dongyan Li
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Xinrong Lv
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Xi Chen
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Qionghai Dai
  - Department of Automation, Tsinghua University, Beijing, China
|
38
|
Guo ZH, Yuan L, Tan YL, Zhang BG, Shi YZ. RNAStat: An Integrated Tool for Statistical Analysis of RNA 3D Structures. Front Bioinform 2022; 1:809082. [PMID: 36303785 PMCID: PMC9580920 DOI: 10.3389/fbinf.2021.809082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Accepted: 12/17/2021] [Indexed: 11/13/2022]
Abstract
The 3D architectures of RNAs are essential for understanding their cellular functions. While an accurate scoring function based on the statistics of known RNA structures is a key component for successful RNA structure prediction or evaluation, there are few tools or web servers that can be directly used to make comprehensive statistical analysis for RNA 3D structures. In this work, we developed RNAStat, an integrated tool for making statistics on RNA 3D structures. For given RNA structures, RNAStat automatically calculates RNA structural properties such as size and shape, and shows their distributions. Based on the RNA structure annotation from DSSR, RNAStat provides statistical information of RNA secondary structure motifs including canonical/non-canonical base pairs, stems, and various loops. In particular, the geometry of base-pairing/stacking can be calculated in RNAStat by constructing a local coordinate system for each base. In addition, RNAStat also supplies the distribution of distance between any atoms to the users to help build distance-based RNA statistical potentials. To test the usability of the tool, we established a non-redundant RNA 3D structure dataset, and based on the dataset, we made a comprehensive statistical analysis on RNA structures, which could have the guiding significance for RNA structure modeling. The python code of RNAStat, the dataset used in this work, and corresponding statistical data files are freely available at GitHub (https://github.com/RNA-folding-lab/RNAStat).
Affiliation(s)
- Zhi-Hao Guo
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, China
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Li Yuan
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, China
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Ya-Lan Tan
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, China
- Ben-Gong Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, China
- Ya-Zhou Shi
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, China
  - *Correspondence: Ya-Zhou Shi,
|
39
|
Zerihun MB, Pucci F, Schug A. CoCoNet-boosting RNA contact prediction by convolutional neural networks. Nucleic Acids Res 2021; 49:12661-12672. [PMID: 34871451 PMCID: PMC8682773 DOI: 10.1093/nar/gkab1144] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/23/2020] [Revised: 10/27/2021] [Accepted: 11/05/2021] [Indexed: 11/24/2022]
Abstract
Co-evolutionary models such as direct coupling analysis (DCA) in combination with machine learning (ML) techniques based on deep neural networks are able to predict accurate protein contact or distance maps. Such information can be used as constraints in structure prediction and massively increase prediction accuracy. Unfortunately, the same ML methods cannot readily be applied to RNA as they rely on large structural datasets only available for proteins. Here, we demonstrate how the available smaller data for RNA can be used to improve prediction of RNA contact maps. We introduce an algorithm called CoCoNet that is based on a combination of a Coevolutionary model and a shallow Convolutional Neural Network. Despite its simplicity and the small number of trained parameters, the method boosts the positive predictive value (PPV) of predicted contacts by about 70% with respect to DCA as tested by cross-validation of about eighty RNA structures. However, the direct inclusion of the CoCoNet contacts in 3D modeling tools does not result in a proportional increase of the 3D RNA structure prediction accuracy. Therefore, we suggest that the field develops, in addition to contact PPV, metrics which estimate the expected impact for 3D structure modeling tools better. CoCoNet is freely available and can be found at https://github.com/KIT-MBS/coconet.
Affiliation(s)
- Mehari B Zerihun
  - John von Neumann Institute for Computing, Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Steinbuch Centre for Computing, Karlsruhe Institute of Technology, 76344 Eggenstein-Leopoldshafen, Germany
- Fabrizio Pucci
  - John von Neumann Institute for Computing, Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles 1050, Brussels, Belgium
- Alexander Schug
  - John von Neumann Institute for Computing, Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Faculty of Biology, University of Duisburg-Essen, 45117 Essen, Germany
|
40
|
Zafferani M, Haddad C, Luo L, Davila-Calderon J, Chiu LY, Mugisha CS, Monaghan AG, Kennedy AA, Yesselman JD, Gifford RJ, Tai AW, Kutluay SB, Li ML, Brewer G, Tolbert BS, Hargrove AE. Amilorides inhibit SARS-CoV-2 replication in vitro by targeting RNA structures. Sci Adv 2021; 7:eabl6096. [PMID: 34826236 PMCID: PMC8626076 DOI: 10.1126/sciadv.abl6096] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/26/2021] [Accepted: 10/06/2021] [Indexed: 05/15/2023]
Abstract
The SARS-CoV-2 pandemic, and the likelihood of future coronavirus pandemics, emphasized the urgent need for development of novel antivirals. Small-molecule chemical probes offer both to reveal aspects of virus replication and to serve as leads for antiviral therapeutic development. Here, we report on the identification of amiloride-based small molecules that potently inhibit OC43 and SARS-CoV-2 replication through targeting of conserved structured elements within the viral 5′-end. Nuclear magnetic resonance–based structural studies revealed specific amiloride interactions with stem loops containing bulge like structures and were predicted to be strongly bound by the lead amilorides in retrospective docking studies. Amilorides represent the first antiviral small molecules that target RNA structures within the 5′ untranslated regions and proximal region of the CoV genomes. These molecules will serve as chemical probes to further understand CoV RNA biology and can pave the way for the development of specific CoV RNA–targeted antivirals.
Affiliation(s)
- Martina Zafferani
  - Chemistry Department, Duke University, 124 Science Drive, Durham, NC 27705, USA
- Christina Haddad
  - Department of Chemistry, Case Western Reserve University, Cleveland, OH 441106, USA
- Le Luo
  - Department of Chemistry, Case Western Reserve University, Cleveland, OH 441106, USA
- Liang-Yuan Chiu
  - Department of Chemistry, Case Western Reserve University, Cleveland, OH 441106, USA
- Christian Shema Mugisha
  - Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Adeline G. Monaghan
  - Chemistry Department, Duke University, 124 Science Drive, Durham, NC 27705, USA
- Andrew A. Kennedy
  - Department of Internal Medicine and Department of Microbiology and Immunology, University of Michigan, 1150 W Medical Center Dr., Ann Arbor, MI 48109, USA
- Joseph D. Yesselman
  - Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Robert J. Gifford
  - MRC-University of Glasgow Centre for Virus Research, 464 Bearsden Rd., Bearsden, Glasgow G61 1QH, UK
- Andrew W. Tai
  - Department of Internal Medicine and Department of Microbiology and Immunology, University of Michigan, 1150 W Medical Center Dr., Ann Arbor, MI 48109, USA
- Sebla B. Kutluay
  - Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Mei-Ling Li
  - Department of Biochemistry and Molecular Biology, Rutgers Robert Wood Johnson Medical School, 675 Hoes Lane West, Piscataway, NJ 08854, USA
- Gary Brewer
  - Department of Biochemistry and Molecular Biology, Rutgers Robert Wood Johnson Medical School, 675 Hoes Lane West, Piscataway, NJ 08854, USA
- Blanton S. Tolbert
  - Department of Chemistry, Case Western Reserve University, Cleveland, OH 441106, USA
- Amanda E. Hargrove
  - Chemistry Department, Duke University, 124 Science Drive, Durham, NC 27705, USA
|
41
|
De Bisschop G, Allouche D, Frezza E, Masquida B, Ponty Y, Will S, Sargueil B. Progress toward SHAPE Constrained Computational Prediction of Tertiary Interactions in RNA Structure. Noncoding RNA 2021; 7:71. [PMID: 34842779 PMCID: PMC8628965 DOI: 10.3390/ncrna7040071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/29/2021] [Revised: 10/29/2021] [Accepted: 11/02/2021] [Indexed: 01/04/2023]
Abstract
As more sequencing data accumulate and novel puzzling genetic regulations are discovered, the need for accurate automated modeling of RNA structure increases. RNA structure modeling from chemical probing experiments has made tremendous progress, however accurately predicting large RNA structures is still challenging for several reasons: RNA are inherently flexible and often adopt many energetically similar structures, which are not reliably distinguished by the available, incomplete thermodynamic model. Moreover, computationally, the problem is aggravated by the relevance of pseudoknots and non-canonical base pairs, which are hardly predicted efficiently. To identify nucleotides involved in pseudoknots and non-canonical interactions, we scrutinized the SHAPE reactivity of each nucleotide of the 188 nt long lariat-capping ribozyme under multiple conditions. Reactivities analyzed in the light of the X-ray structure were shown to report accurately the nucleotide status. Those that seemed paradoxical were rationalized by the nucleotide behavior along molecular dynamic simulations. We show that valuable information on intricate interactions can be deduced from probing with different reagents, and in the presence or absence of Mg2+. Furthermore, probing at increasing temperature was remarkably efficient at pointing to non-canonical interactions and pseudoknot pairings. The possibilities of following such strategies to inform structure modeling software are discussed.
Affiliation(s)
- Grégoire De Bisschop
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
  - Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada
- Delphine Allouche
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
  - Institut Necker-Enfants Malades (INEM), Inserm U1151, 156 rue de Vaugirard, CEDEX 15, 75015 Paris, France
- Elisa Frezza
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
- Benoît Masquida
  - Université de Strasbourg, CNRS UMR7156 GMGM, 67084 Strasbourg, France;
- Yann Ponty
  - Ecole Polytechnique, CNRS UMR 7161, LIX, 91120 Palaiseau, France; (Y.P.); (S.W.)
- Sebastian Will
  - Ecole Polytechnique, CNRS UMR 7161, LIX, 91120 Palaiseau, France; (Y.P.); (S.W.)
- Bruno Sargueil
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
|
42
|
Zhang D, Chen SJ, Zhou R. Modeling Noncanonical RNA Base Pairs by a Coarse-Grained IsRNA2 Model. J Phys Chem B 2021; 125:11907-11915. [PMID: 34694128 DOI: 10.1021/acs.jpcb.1c07288] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
Abstract
Noncanonical base pairs contribute crucially to the three-dimensional architecture of large RNA molecules; however, how to accurately model them remains an open challenge in RNA 3D structure prediction. Here, we report a promising coarse-grained (CG) IsRNA2 model to predict noncanonical base pairs in large RNAs through molecular dynamics simulations. By introducing a five-bead per nucleotide CG representation to reserve the three interacting edges of nucleobases, IsRNA2 accurately models various base-pairing interactions, including both canonical and noncanonical base pairs. A benchmark test indicated that IsRNA2 achieves a comparable performance to the atomic model in de novo modeling of noncanonical RNA structures. In addition, IsRNA2 was able to refine the 3D structure predictions for large RNAs in RNA-puzzle challenges. Finally, the graphics processing unit acceleration was introduced to speed up the sampling efficiency in IsRNA2 for very large RNA molecules. Therefore, the CG IsRNA2 model reported here offers a reliable approach to predict the structures and dynamics of large RNAs.
Affiliation(s)
- Dong Zhang
  - College of Life Sciences and Institute of Quantitative Biology, Zhejiang University, Hangzhou 310058, China
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry, and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Ruhong Zhou
  - College of Life Sciences and Institute of Quantitative Biology, Zhejiang University, Hangzhou 310058, China
|
43
|
Manigrasso J, Marcia M, De Vivo M. Computer-aided design of RNA-targeted small molecules: A growing need in drug discovery. Chem 2021. [DOI: 10.1016/j.chempr.2021.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/15/2022]
|
44
|
Popenda M, Zok T, Sarzynska J, Korpeta A, Adamiak R, Antczak M, Szachniuk M. Entanglements of structure elements revealed in RNA 3D models. Nucleic Acids Res 2021; 49:9625-9632. [PMID: 34432024 PMCID: PMC8464073 DOI: 10.1093/nar/gkab716] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/17/2021] [Revised: 08/02/2021] [Accepted: 08/06/2021] [Indexed: 01/14/2023]
Abstract
Computational methods to predict RNA 3D structure have more and more practical applications in molecular biology and medicine. Therefore, it is crucial to intensify efforts to improve the accuracy and quality of predicted three-dimensional structures. A significant role in this is played by the RNA-Puzzles initiative that collects, evaluates, and shares RNAs built computationally within currently nearly 30 challenges. RNA-Puzzles datasets, subjected to multi-criteria analysis, allow revealing the strengths and weaknesses of computer prediction methods. Here, we study the issue of entangled RNA fragments in the predicted RNA 3D structure models. By entanglement, we mean an arrangement of two structural elements such that one of them passes through the other. We propose the classification of entanglements driven by their topology and components. It distinguishes two general classes, interlaces and lassos, and subclasses characterized by element types-loops, dinucleotide steps, open single-stranded fragments-and puncture multiplicity. Our computational pipeline for entanglement detection, applied for 1,017 non-redundant models from RNA-Puzzles, has shown the frequency of different entanglements and allowed identifying 138 structures with intersected assemblies.
Affiliation(s)
- Mariusz Popenda
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Tomasz Zok
  - Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Joanna Sarzynska
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Agnieszka Korpeta
  - Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Ryszard W Adamiak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
  - Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Maciej Antczak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
  - Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Marta Szachniuk
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
  - Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
|
45
|
Hanumanthappa AK, Singh J, Paliwal K, Singh J, Zhou Y. Single-sequence and profile-based prediction of RNA solvent accessibility using dilated convolutional neural network. Bioinformatics 2021; 36:5169-5176. [PMID: 33106872 DOI: 10.1093/bioinformatics/btaa652] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/14/2020] [Revised: 06/30/2020] [Accepted: 07/14/2020] [Indexed: 12/11/2022]
Abstract
Motivation: RNA solvent accessibility, similar to protein solvent accessibility, reflects the structural regions that are accessible to solvents or other functional biomolecules, and plays an important role for structural and functional characterization. Unlike protein solvent accessibility, only a few tools are available for predicting RNA solvent accessibility despite the fact that millions of RNA transcripts have unknown structures and functions. Also, these tools have limited accuracy. Here, we have developed RNAsnap2 that uses a dilated convolutional neural network with a new feature, based on predicted base-pairing probabilities from LinearPartition. Results: Using the same training set from the recent predictor RNAsol, RNAsnap2 provides an 11% improvement in median Pearson Correlation Coefficient (PCC) and 9% improvement in mean absolute errors for the same test set of 45 RNA chains. A larger improvement (22% in median PCC) is observed for 31 newly deposited RNA chains that are non-redundant and independent from the training and the test sets. A single-sequence version of RNAsnap2 (i.e. without using sequence profiles generated from homology search by Infernal) has achieved comparable performance to the profile-based RNAsol. In addition, RNAsnap2 has achieved comparable performance for protein-bound and protein-free RNAs. Both RNAsnap2 and RNAsnap2 (SingleSeq) are expected to be useful for searching structural signatures and locating functional regions of non-coding RNAs. Availability and implementation: Standalone versions of RNAsnap2 and RNAsnap2 (SingleSeq) are available at https://github.com/jaswindersingh2/RNAsnap2. Direct prediction can also be made at https://sparks-lab.org/server/rnasnap2. The datasets used in this research can also be downloaded from GitHub and the webserver mentioned above. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anil Kumar Hanumanthappa
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Jaswinder Singh
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Kuldip Paliwal
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Jaspreet Singh
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
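The RNAsnap2 abstract above summarizes accuracy as a median Pearson correlation coefficient (PCC) taken over RNA chains. A minimal sketch of how such a per-chain summary might be computed is given here, assuming each chain supplies paired sequences of predicted and reference accessibility values. The function names `pearson` and `median_pcc` and the input layout are illustrative assumptions, not part of RNAsnap2.

```python
import numpy as np

def pearson(pred, ref):
    """Pearson correlation between two equal-length 1D sequences."""
    x = np.asarray(pred, dtype=float)
    y = np.asarray(ref, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def median_pcc(chains):
    """Median of per-chain PCC values.

    `chains` is an iterable of (predicted, reference) pairs, one per RNA chain
    (hypothetical layout chosen for this sketch).
    """
    return float(np.median([pearson(p, r) for p, r in chains]))

chains = [
    ([1, 2, 3], [2, 4, 6]),        # perfectly correlated
    ([1, 2, 3], [3, 2, 1]),        # perfectly anti-correlated
    ([1, 2, 3, 4], [1, 3, 2, 4]),  # partially correlated
]
summary = median_pcc(chains)
```

Taking the median rather than the mean keeps a few very poorly predicted chains from dominating the summary.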
|
46
|
Yi D, Bayer T, Badenhorst CPS, Wu S, Doerr M, Höhne M, Bornscheuer UT. Recent trends in biocatalysis. Chem Soc Rev 2021; 50:8003-8049. [PMID: 34142684 PMCID: PMC8288269 DOI: 10.1039/d0cs01575j] [Citation(s) in RCA: 156] [Impact Index Per Article: 39.0] [Received: 12/19/2020] [Indexed: 12/13/2022]
Abstract
Biocatalysis has undergone revolutionary progress in the past century. Benefited by the integration of multidisciplinary technologies, natural enzymatic reactions are constantly being explored. Protein engineering gives birth to robust biocatalysts that are widely used in industrial production. These research achievements have gradually constructed a network containing natural enzymatic synthesis pathways and artificially designed enzymatic cascades. Nowadays, the development of artificial intelligence, automation, and ultra-high-throughput technology provides infinite possibilities for the discovery of novel enzymes, enzymatic mechanisms and enzymatic cascades, and gradually complements the lack of remaining key steps in the pathway design of enzymatic total synthesis. Therefore, the research of biocatalysis is gradually moving towards the era of novel technology integration, intelligent manufacturing and enzymatic total synthesis.
Affiliation(s)
- Dong Yi
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Thomas Bayer
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Christoffel P. S. Badenhorst
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Shuke Wu
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Mark Doerr
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Matthias Höhne
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Uwe T. Bornscheuer
  - Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
|
47
|
Christy TW, Giannetti CA, Houlihan G, Smola MJ, Rice GM, Wang J, Dokholyan NV, Laederach A, Holliger P, Weeks KM. Direct Mapping of Higher-Order RNA Interactions by SHAPE-JuMP. Biochemistry 2021; 60:1971-1982. [PMID: 34121404 PMCID: PMC8256721 DOI: 10.1021/acs.biochem.1c00270] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 12/20/2022]
Abstract
Higher-order structure governs function for many RNAs. However, discerning this structure for large RNA molecules in solution is an unresolved challenge. Here, we present SHAPE-JuMP (selective 2'-hydroxyl acylation analyzed by primer extension and juxtaposed merged pairs) to interrogate through-space RNA tertiary interactions. A bifunctional small molecule is used to chemically link proximal nucleotides in an RNA structure. The RNA cross-link site is then encoded into complementary DNA (cDNA) in a single, direct step using an engineered reverse transcriptase that "jumps" across cross-linked nucleotides. The resulting cDNAs contain a deletion relative to the native RNA sequence, which can be detected by sequencing, that indicates the sites of cross-linked nucleotides. SHAPE-JuMP measures RNA tertiary structure proximity concisely across large RNA molecules at nanometer resolution. SHAPE-JuMP is especially effective at measuring interactions in multihelix junctions and loop-to-helix packing, enables modeling of the global fold for RNAs up to several hundred nucleotides in length, facilitates ranking of structural models by consistency with through-space restraints, and is poised to enable solution-phase structural interrogation and modeling of complex RNAs.
Affiliation(s)
- Thomas W. Christy
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, North Carolina 27599
- Catherine A. Giannetti
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290
- Gillian Houlihan
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK
- Matthew J. Smola
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290
- Greggory M. Rice
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290
- Jian Wang
  - Departments of Pharmacology, and Biochemistry and Molecular Biology, Penn State University College of Medicine, Hershey, PA 17033, USA
- Nikolay V. Dokholyan
  - Departments of Pharmacology, and Biochemistry and Molecular Biology, Penn State University College of Medicine, Hershey, PA 17033, USA
  - Departments of Chemistry, and Biomedical Engineering, Pennsylvania State University, University Park, PA 16802
- Alain Laederach
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599
- Philipp Holliger
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK
- Kevin M. Weeks
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290
|
48
|
Sun S, Wang W, Peng Z, Yang J. RNA inter-nucleotide 3D closeness prediction by deep residual neural networks. Bioinformatics 2021; 37:1093-1098. [PMID: 33135062 PMCID: PMC8150135 DOI: 10.1093/bioinformatics/btaa932] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/25/2019] [Revised: 10/01/2020] [Accepted: 10/22/2020] [Indexed: 11/12/2022]
Abstract
Motivation: Recent years have witnessed that the inter-residue contact/distance in proteins could be accurately predicted by deep neural networks, which significantly improve the accuracy of predicted protein structure models. In contrast, fewer studies have been done for the prediction of RNA inter-nucleotide 3D closeness. Results: We proposed a new algorithm named RNAcontact for the prediction of RNA inter-nucleotide 3D closeness. RNAcontact was built based on the deep residual neural networks. The covariance information from multiple sequence alignments and the predicted secondary structure were used as the input features of the networks. Experiments show that RNAcontact achieves the respective precisions of 0.8 and 0.6 for the top L/10 and L (where L is the length of an RNA) predictions on an independent test set, significantly higher than other evolutionary coupling methods. Analysis shows that about 1/3 of the correctly predicted 3D closenesses are not base pairings of secondary structure, which are critical to the determination of RNA structure. In addition, we demonstrated that the predicted 3D closeness could be used as distance restraints to guide RNA structure folding by the 3dRNA package. More accurate models could be built by using the predicted 3D closeness than the models without using 3D closeness. Availability and implementation: The webserver and a standalone package are available at: http://yanglab.nankai.edu.cn/RNAcontact/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Saisai Sun
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Wenkai Wang
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Zhenling Peng
  - Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
- Jianyi Yang
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
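The RNAcontact abstract above quotes precision for the top L/10 and top L predictions, where L is the RNA length. As a hedged illustration of that style of metric, the sketch below ranks nucleotide pairs by predicted score and reports the fraction of the top k that are true contacts. The function name `top_k_precision`, the `min_sep` parameter, and the toy matrices are assumptions made for this example; the abstract does not state how closeness is thresholded or which sequence separations are counted.

```python
import numpy as np

def top_k_precision(scores, native, k, min_sep=1):
    """Fraction of the k highest-scoring pairs (i < j) that are native contacts.

    scores : (L, L) symmetric array of predicted contact scores
    native : (L, L) symmetric boolean array of reference contacts
    min_sep: minimum sequence separation j - i to consider (assumed parameter)
    """
    scores = np.asarray(scores, dtype=float)
    native = np.asarray(native, dtype=bool)
    n = scores.shape[0]
    i, j = np.triu_indices(n, k=min_sep)   # upper-triangle pairs only
    k = max(1, min(k, i.size))             # guard against k == 0 or k too large
    order = np.argsort(-scores[i, j], kind="stable")[:k]
    return float(native[i[order], j[order]].mean())

# Toy 4-nucleotide example (values invented for illustration).
scores = np.array([[0.0, 0.9, 0.2, 0.8],
                   [0.9, 0.0, 0.7, 0.1],
                   [0.2, 0.7, 0.0, 0.6],
                   [0.8, 0.1, 0.6, 0.0]])
native = np.zeros((4, 4), dtype=bool)
for a, b in [(0, 1), (0, 3), (2, 3)]:
    native[a, b] = native[b, a] = True

L = scores.shape[0]
p_top2 = top_k_precision(scores, native, k=2)   # two best-scoring pairs
p_topL = top_k_precision(scores, native, k=L)   # top L pairs
```

For a real RNA one would call the function with `k = L // 10` and `k = L`; the tiny example uses `k = 2` only because L/10 rounds to zero for four nucleotides.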
|
49
|
Singh J, Paliwal K, Singh J, Zhou Y. RNA Backbone Torsion and Pseudotorsion Angle Prediction Using Dilated Convolutional Neural Networks. J Chem Inf Model 2021; 61:2610-2622. [PMID: 34037398 DOI: 10.1021/acs.jcim.1c00153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
RNA three-dimensional structure prediction has been relied on using a predicted or experimentally determined secondary structure as a restraint to reduce the conformational sampling space. However, the secondary-structure restraints are limited to paired bases, and the conformational space of the ribose-phosphate backbone is still too large to be sampled efficiently. Here, we employed the dilated convolutional neural network to predict backbone torsion and pseudotorsion angles using a single RNA sequence as input. The method called SPOT-RNA-1D was trained on a high-resolution training data set and tested on three independent, nonredundant, and high-resolution test sets. The proposed method yields substantially smaller mean absolute errors than the baseline predictors based on random predictions and based on helix conformations according to actual angle distributions. The mean absolute errors for three test sets range from 14°-44° for different angles, compared to 17°-62° by random prediction and 14°-58° by helix prediction. The method also accurately recovers the overall patterns of single or pairwise angle distributions. In general, torsion angles further away from the bases and associated with unpaired bases and paired bases involved in tertiary interactions are more difficult to predict. Compared to the best models in RNA-puzzles experiments, SPOT-RNA-1D yielded more accurate dihedral angles and, thus, are potentially useful as model quality indicators and restraints for RNA structure prediction as in protein structure prediction.
Affiliation(s)
- Jaswinder Singh
  - Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Kuldip Paliwal
  - Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Jaspreet Singh
  - Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland 4222, Australia; Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China; Peking University Shenzhen Graduate School, Shenzhen 518055, P.R. China
| |
|
50
|
Zhang T, Singh J, Litfin T, Zhan J, Paliwal K, Zhou Y. RNAcmap: A Fully Automatic Pipeline for Predicting Contact Maps of RNAs by Evolutionary Coupling Analysis. Bioinformatics 2021; 37:3494-3500. [PMID: 34021744 DOI: 10.1093/bioinformatics/btab391] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 02/17/2020] [Revised: 03/27/2021] [Accepted: 05/18/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: The accuracy of RNA secondary and tertiary structure prediction can be significantly improved by using structural restraints derived from evolutionary coupling or direct coupling analysis. Currently, these coupling analyses rely on manually curated multiple sequence alignments collected in the Rfam database, which contains 3016 families. By comparison, millions of non-coding RNA sequences are known. Here, we established RNAcmap, a fully automatic pipeline that enables evolutionary coupling analysis for any RNA sequence. The homology search is based on a covariance model built by INFERNAL according to one of two secondary structure predictors: the folding-based algorithm RNAfold and the deep-learning method SPOT-RNA. RESULTS: We showed that the performance of RNAcmap depends less on the specific evolutionary coupling tool than on the accuracy of the secondary structure predictor, with the best performance given by RNAcmap (SPOT-RNA). The performance of RNAcmap (SPOT-RNA) is comparable to that based on Rfam-supplied alignments and is consistent for sequences that are not in Rfam collections. Further improvement can be made with a simple meta predictor, RNAcmap (SPOT-RNA/RNAfold), which selects whichever secondary structure predictor finds more homologous sequences. Reliable base-pairing information generated by RNAcmap, particularly for RNAs with a high number of effective homologous sequences, will be useful for aiding RNA structure prediction. AVAILABILITY: RNAcmap is available as a web server at https://sparks-lab.org/server/rnacmap/ and as a standalone application along with the datasets at https://github.com/sparks-lab-org/RNAcmap_standalone. A platform-independent and fully configured docker image of RNAcmap is also provided at https://hub.docker.com/r/jaswindersingh2/rnacmap.
Affiliation(s)
- Tongchuan Zhang
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Jaswinder Singh
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Thomas Litfin
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Jian Zhan
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Kuldip Paliwal
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr. Southport, QLD 4222, Australia; Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
| |
|