1
Bu F, Adam Y, Adamiak RW, Antczak M, de Aquino BRH, Badepally NG, Batey RT, Baulin EF, Boinski P, Boniecki MJ, Bujnicki JM, Carpenter KA, Chacon J, Chen SJ, Chiu W, Cordero P, Das NK, Das R, Dawson WK, DiMaio F, Ding F, Dock-Bregeon AC, Dokholyan NV, Dror RO, Dunin-Horkawicz S, Eismann S, Ennifar E, Esmaeeli R, Farsani MA, Ferré-D'Amaré AR, Geniesse C, Ghanim GE, Guzman HV, Hood IV, Huang L, Jain DS, Jaryani F, Jin L, Joshi A, Karelina M, Kieft JS, Kladwang W, Kmiecik S, Koirala D, Kollmann M, Kretsch RC, Kurciński M, Li J, Li S, Magnus M, Masquida B, Moafinejad SN, Mondal A, Mukherjee S, Nguyen THD, Nikolaev G, Nithin C, Nye G, Pandaranadar Jeyeram IPN, Perez A, Pham P, Piccirilli JA, Pilla SP, Pluta R, Poblete S, Ponce-Salvatierra A, Popenda M, Popenda L, Pucci F, Rangan R, Ray A, Ren A, Sarzynska J, Sha CM, Stefaniak F, Su Z, Suddala KC, Szachniuk M, Townshend R, Trachman RJ, Wang J, Wang W, Watkins A, Wirecki TK, Xiao Y, Xiong P, Xiong Y, Yang J, Yesselman JD, Zhang J, Zhang Y, Zhang Z, Zhou Y, Zok T, Zhang D, Zhang S, Żyła A, Westhof E, Miao Z. RNA-Puzzles Round V: blind predictions of 23 RNA structures. Nat Methods 2025; 22:399-411. [PMID: 39623050 PMCID: PMC11810798 DOI: 10.1038/s41592-024-02543-9] [Received: 02/15/2024] [Accepted: 10/29/2024] [Indexed: 01/16/2025]
Abstract
RNA-Puzzles is a collective endeavor dedicated to the advancement and improvement of RNA three-dimensional structure prediction. With agreement from structural biologists, RNA structures are predicted by modeling groups before publication of the experimental structures. We report a large-scale set of predictions by 18 groups for 23 RNA-Puzzles: 4 RNA elements, 2 aptamers, 4 viral elements, 5 ribozymes and 8 riboswitches. We describe automatic assessment protocols for comparisons between prediction and experiment. Our analyses reveal some critical steps to be overcome to achieve good accuracy in modeling RNA structures: identification of helix-forming pairs and of non-Watson-Crick modules, correct coaxial stacking between helices and avoidance of entanglements. Three of the top four modeling groups in this round also ranked among the top four in the CASP15 contest.
Grants
- T32 GM066706 NIGMS NIH HHS
- NSFC T2225007 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM134919 NIGMS NIH HHS
- R35GM145409 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R35 GM145409 NIGMS NIH HHS
- 32270707 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM122579 NIGMS NIH HHS
- R35 GM134864 NIGMS NIH HHS
- T32 grant GM066706 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- P20GM121342 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R21 CA219847 NCI NIH HHS
- 32171191 National Natural Science Foundation of China (National Science Foundation of China)
- P20 GM121342 NIGMS NIH HHS
- R35 GM152029 NIGMS NIH HHS
- R01 GM073850 NIGMS NIH HHS
- F32 GM112294 NIGMS NIH HHS
- ZIA DK075136 Intramural NIH HHS
- Z.M. is supported by Major Projects of Guangzhou National Laboratory, (Grant No. GZNL2023A01006, GZNL2024A01002, SRPG22-003, SRPG22-006, SRPG22-007, HWYQ23-003, YW-YFYJ0102), the National Key R&D Programs of China (2023YFF1204700, 2023YFF1204701, 2021YFF1200900, 2021YFF1200903). This work is part of the ITI 2021-2028 program and supported by IdEx Unistra (ANR-10-IDEX-0002 to E.W.), SFRI-STRAT’US project (ANR-20-SFRI-0012) and EUR IMCBio (IMCBio ANR-17-EURE-0023 to E.W.) under the framework of the French Investments for the Future Program.
- E.W. acknowledges also support from Wenzhou Institute, University of Chinese Academy of Sciences (WIUCASQD2024002).
- E.F.B. was additionally supported by European Molecular Biology Organization (EMBO) fellowship (ALTF 525-2022).
- Boniecki’s research was supported by the Polish National Science Center (NCN) (grant 2016/23/B/ST6/03433 to Michal J. Boniecki). Predictions were performed using computational resources of the Interdisciplinary Centre for Mathematical and Computational Modelling of the University of Warsaw (ICM) (grant G66-9).
- J.M.B. is supported by the National Science Centre in Poland (NCN grants: 2017/26/A/NZ1/01083 to J.M.B., 2021/43/D/NZ1/03360 to S.M., 2020/39/B/NZ2/03127 to F.S., 2020/39/D/NZ2/02837 to T.K.W.). J.M.B. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: ACK Cyfronet AGH, PCSS, CI TASK, WCSS) for providing computer facilities and support within the computational grant PLG/2023/016080.
- S.J.C. is supported by the National Institutes of Health under Grant R35-GM134919.
- R.D. is supported by Stanford Bio-X (to R.D., R.O.D., R.C.K., and S.E.); Stanford Gerald J. Lieberman Fellowship (to R.R.); the National Institutes of Health (R21 CA219847 and R35 GM122579 to R.D.), the Howard Hughes Medical Institute (HHMI, to R.D.); Consejo Nacional de Ciencia y Tecnología CONACyT Fellowship 312765 (P.C.); the Ruth L. Kirschstein National Research Service Award Postdoctoral Fellowships GM112294 (to J.D.Y.); National Science Foundation Graduate Research Fellowships (R.J.L.T. and R.R.); the National Library of Medicine T15 Training Grant (NLM T15007033 to K.A.C.); the U.S. Department of Energy, Office of Science Graduate Student Research program (R.J.L.T.).
- The National Institutes of Health grants 1R35 GM134864 and the Passan Foundation.
- R.O.D. is supported by the U.S. Department of Energy, Office of Science, Scientific Discovery through Advanced Computing (SciDAC) program (R.O.D.); Intel (R.O.D.).
- A.F.D. is supported, in part, by the intramural program of the National Heart, Lung and Blood Institute, National Institutes of Health, USA.
- Guangdong Science and Technology Department (2022A1515010328, 2023B1212060013, 2020B1212030004), Fundamental Research Funds for the Central Universities, Sun Yat-sen University (23ptpy41).
- D.K. is supported by the NSF CAREER award MCB-2236996, and start-up, SURFF, and START awards from the University of Maryland Baltimore County to D.K.
- BM is supported by the Interdisciplinary Thematic Institute IMCBio, as part of the ITI 2021-2028 program at the University of Strasbourg, CNRS and Inserm, by IdEx Unistra (ANR-10-IDEX-0002), and EUR (IMCBio ANR-17-EUR-0023), under the framework of the French Investments Program for the Future.
- T.H.D.N. is supported by UKRI-Medical Research Council grant MC_UP_1201/19.
- C.N. and M.K. acknowledge funding from the National Science Centre, Poland [OPUS 2019/33/B/NZ2/02100]; S.P.P. acknowledges funding from the National Science Centre, Poland [OPUS 2020/39/B/NZ2/01301]; S.K. acknowledges funding from the National Science Centre, Poland [Sheng 2021/40/Q/NZ2/00078]; C.N. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: PCSS, ACK Cyfronet AGH, CI TASK, WCSS) for providing computer facilities and support within the computational grants PLG/2022/016043, PLG/2022/015327 and PLG/2020/013424.
- AP is supported by an NSF-CAREER award CHE-2235785.
- A.R. is supported by grants from the Natural Science Foundation of China (32325029, 32022039, 91940302, and 91640104), the National Key Research and Development Project of China (2021YFC2300300 and 2023YFC2604300).
- Marta Szachniuk is supported by the National Science Centre, Poland (2019/35/B/ST6/03074 to M.S.), the statutory funds of IBCH PAS and Poznan University of Technology.
- J.W. is supported by the Penn State College of Medicine’s Artificial Intelligence and Biomedical Informatics Program.
- J.Z. is supported by the Intramural Research Program of the NIH, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (ZIADK075136 to J.Z.), and an NIH Deputy Director for Intramural Research (DDIR) Challenge Award to J.Z.
Affiliation(s)
- Fan Bu
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Yagoub Adam
- Inter-institutional Graduate Program on Bioinformatics, Department of Computer Science and Mathematics, FFCLRP, University of São Paulo, Ribeirão Preto, Brazil
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Nigeria
- Ryszard W Adamiak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Maciej Antczak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Belisa Rebeca H de Aquino
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Nagendar Goud Badepally
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Robert T Batey
- Department of Biochemistry, University of Colorado at Boulder, Boulder, CO, USA
- Eugene F Baulin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Pawel Boinski
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Michal J Boniecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Kristy A Carpenter
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Jose Chacon
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Department of Cell and Developmental Biology, University of California San Diego, San Diego, CA, USA
- Shi-Jie Chen
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Wah Chiu
- Department of Bioengineering and James H. Clark Center, Stanford University, Stanford, CA, USA
- Pablo Cordero
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Stripe, South San Francisco, CA, USA
- Naba Krishna Das
- Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
- Rhiju Das
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Biophysics program, Stanford University, Stanford, CA, USA
- Wayne K Dawson
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Feng Ding
- Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
- Anne-Catherine Dock-Bregeon
- Laboratory of Integrative Biology of Marine Models (LBI2M), Sorbonne University-CNRS UMR8227, Roscoff, France
- Nikolay V Dokholyan
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Ron O Dror
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Structural Biology, Stanford University, Stanford, CA, USA
- Department of Molecular and Cellular Physiology, Stanford University, Stanford, CA, USA
- Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
- Stanisław Dunin-Horkawicz
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Stephan Eismann
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
- Eric Ennifar
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
- Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Masoud Amiri Farsani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Adrian R Ferré-D'Amaré
- Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
- Caleb Geniesse
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- George E Ghanim
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
- Horacio V Guzman
- Instituto de Ciencia de Materials de Barcelona, ICMAB-CSIC, Bellaterra E-08193, Spain & Departamento de Física Teórica de la Materia Condensada, Universidad Autónoma de Madrid, Madrid, Spain
- Iris V Hood
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Lin Huang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University Guangzhou, Guangdong, China
- Dharm Skandh Jain
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Lei Jin
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Astha Joshi
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Masha Karelina
- Biophysics program, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Jeffrey S Kieft
- Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
- New York Structural Biology Center, New York, NY, USA
- Wipapat Kladwang
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Sebastian Kmiecik
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Deepak Koirala
- Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
- Markus Kollmann
- Department of Computer Science, Heinrich Heine University of Düsseldorf, Düsseldorf, Germany
- Mateusz Kurciński
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Jun Li
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Shuang Li
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Marcin Magnus
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Benoît Masquida
- UMR 7156, CNRS - Université de Strasbourg, IPCB, Strasbourg, France
- S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Sunandan Mukherjee
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Grigory Nikolaev
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Chandran Nithin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Grace Nye
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Iswarya P N Pandaranadar Jeyeram
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Phillip Pham
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Joseph A Piccirilli
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA
- Department of Chemistry, The University of Chicago, Chicago, IL, USA
- Smita Priyadarshini Pilla
- Laboratory of Computational Biology, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
- Radosław Pluta
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Simón Poblete
- Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
- Centro BASAL Ciencia & Vida, Universidad San Sebastián, Santiago, Chile
- Almudena Ponce-Salvatierra
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Lukasz Popenda
- NanoBioMedical Centre, Adam Mickiewicz University, Poznan, Poland
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- Ramya Rangan
- Biophysics program, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
- Angana Ray
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Aiming Ren
- Life Sciences Institute, Zhejiang University, Hangzhou, China
- Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Congzhou Mike Sha
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Zhaoming Su
- The State Key Laboratory of Biotherapy, West China Hospital, Chengdu, China
- Krishna C Suddala
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Raphael Townshend
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
- Robert J Trachman
- Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
- Jian Wang
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Wenkai Wang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
- Andrew Watkins
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Prescient Design, Genentech Research and Early Development, South San Francisco, CA, USA
- Tomasz K Wirecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Yi Xiao
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
- Peng Xiong
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Department of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
- Yiduo Xiong
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
- Jianyi Yang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
- Joseph David Yesselman
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Department of Chemistry, University of Nebraska, Lincoln, NE, USA
- Jinwei Zhang
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Yi Zhang
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
- Zhenzhen Zhang
- Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
- Yuanzhe Zhou
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Dong Zhang
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Sicheng Zhang
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Adriana Żyła
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Eric Westhof
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France.
- Engineering Research Center of Clinical Functional Materials and Diagnosis & Treatment Devices of Zhejiang Province, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, China.
- Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China.
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China.
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, UK.
2
Wang Y, Xiao K, Tao T, Zhang R, Shu H, Sun X. Evaluating the Performance of Peak Calling Algorithms Available for Intracellular G-Quadruplex Sequencing. Int J Mol Sci 2025; 26:1268. [PMID: 39941033 PMCID: PMC11818603 DOI: 10.3390/ijms26031268] [Received: 12/18/2024] [Revised: 01/27/2025] [Accepted: 01/27/2025] [Indexed: 02/16/2025]
Abstract
DNA G-quadruplexes (G4) are non-canonical DNA structures that play key roles in various biological processes. Antibody-dependent sequencing is an important tool for identifying intracellularly formed DNA G4s, and peak calling is a crucial step in processing the sequencing data. As the applicability of existing peak calling algorithms to intracellular G4 data has not been previously assessed, we systematically compared and evaluated these algorithms to determine those best suited for G4 detection. We selected seven representative candidates from 43 published peak calling algorithms for detailed evaluation. The performance of each candidate on six published intracellular G4 sequencing datasets (GSE107690, GSE145090, GSE133379, GSE178668ChIP-seq, GSE178668CUT&Tag, GSE221437) was assessed by precision and recall against customized benchmarks integrating results from multiple algorithms, as well as consistency with known G4 information (pG4 predicted by pqsfinder, oG4 from GSE63874, and multi-cell-line conserved G4s) and epigenetic signals. We identified MACS2, PeakRanger, and GoPeaks as the most effective algorithms for analyzing intracellular G4 sequencing data, and attributed their superior performance partially to the distribution model of sequencing reads/fragments used in the hypothesis testing step of the peak calling procedures. These findings provide guidance and rationale for selecting peak callers appropriate for intracellular G4 data.
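The precision and recall assessment mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a called peak counts as a true positive when it overlaps any benchmark interval on the same chromosome; the abstract does not state the matching rule actually used, and the function names and toy intervals below are hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed convention.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def precision_recall(called: List[Interval], benchmark: List[Interval]) -> Tuple[float, float]:
    """Overlap-based precision and recall (an assumed matching rule, not the paper's)."""
    # Precision: fraction of called peaks that hit at least one benchmark interval.
    hit_called = sum(any(overlaps(c, b) for b in benchmark) for c in called)
    # Recall: fraction of benchmark intervals recovered by at least one called peak.
    hit_bench = sum(any(overlaps(b, c) for c in called) for b in benchmark)
    precision = hit_called / len(called) if called else 0.0
    recall = hit_bench / len(benchmark) if benchmark else 0.0
    return precision, recall


# Toy data, invented for illustration only.
called = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
benchmark = [("chr1", 150, 250), ("chr2", 60, 90), ("chr3", 10, 20), ("chr1", 900, 950)]

precision, recall = precision_recall(called, benchmark)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

The quadratic scan is fine for toy inputs; for genome-scale peak sets, a sorted sweep or an interval tree would be the usual choice.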
Affiliation(s)
- Xiao Sun
- State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing 211189, China; (Y.W.); (K.X.); (T.T.); (R.Z.); (H.S.)
3
D'Anna L, Froux A, Rainot A, Spinello A, Perricone U, Barbault F, Grandemange S, Barone G, Terenzi A, Monari A. Resolving the Structure of a Guanine Quadruplex in TMPRSS2 Messenger RNA by Circular Dichroism and Molecular Modeling. Chemistry 2024; 30:e202403572. [PMID: 39365977 DOI: 10.1002/chem.202403572] [Received: 09/25/2024] [Revised: 10/04/2024] [Accepted: 10/04/2024] [Indexed: 10/06/2024]
Abstract
The presence of a guanine quadruplex in the open reading frame of the messenger RNA coding for the transmembrane serine protease 2 (TMPRSS2) may pave the way to original anticancer and host-oriented antiviral strategies. Indeed, TMPRSS2, in addition to being overexpressed in different cancer types, is also related to infection by respiratory viruses, including SARS-CoV-2, by promoting cellular and viral membrane fusion through its proteolytic activity. The design of selective ligands targeting TMPRSS2 messenger RNA requires detailed knowledge of its structure at the atomic level. Therefore, we have used an original experimental-computational protocol to predict the first resolved structure of the parallel guanine quadruplex secondary structure in the RNA of TMPRSS2, which shows a rigid core flanked by a flexible loop. This represents the first atomic-scale structure of the guanine quadruplex present in TMPRSS2 messenger RNA.
Affiliation(s)
- Luisa D'Anna
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università di Palermo, Viale delle Scienze, Edificio 17, Palermo, 90128, Italy
- Aurane Froux
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università di Palermo, Viale delle Scienze, Edificio 17, Palermo, 90128, Italy
- Université de Lorraine and CNRS, UMR 7039 CRAN, Nancy, F-54000, France
- Aurianne Rainot
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università di Palermo, Viale delle Scienze, Edificio 17, Palermo, 90128, Italy
- Université Paris Cité, CNRS, ITODYS, Paris, F-75013, France
- Angelo Spinello
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università di Palermo, Viale delle Scienze, Edificio 17, Palermo, 90128, Italy
- Ugo Perricone
- Fondazione Ri.MED, Via Filippo Marini 14, Palermo, 90128, Italy
- Giampaolo Barone
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università di Palermo, Viale delle Scienze, Edificio 17, Palermo, 90128, Italy
- Alessio Terenzi
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università di Palermo, Viale delle Scienze, Edificio 17, Palermo, 90128, Italy
- Antonio Monari
- Université Paris Cité, CNRS, ITODYS, Paris, F-75013, France
4
Michael Sabo T, Trent JO, Chaires JB, Monsen RC. Strategy for modeling higher-order G-quadruplex structures recalcitrant to NMR determination. Methods 2024; 230:9-20. [PMID: 39032720 DOI: 10.1016/j.ymeth.2024.07.004] [Received: 04/05/2024] [Revised: 06/22/2024] [Accepted: 07/17/2024] [Indexed: 07/23/2024]
Abstract
Guanine-rich nucleic acids can form intramolecularly folded four-stranded structures known as G-quadruplexes (G4s). Traditionally, G4 research has focused on short, highly modified DNA or RNA sequences that form well-defined homogeneous compact structures. However, the existence of longer sequences with multiple G4 repeats, from proto-oncogene promoters to telomeres, suggests the potential for more complex higher-order structures with multiple G4 units that might offer selective drug-targeting sites for therapeutic development. These larger structures present significant challenges for structural characterization by traditional high-resolution methods like multi-dimensional NMR and X-ray crystallography due to their molecular complexity. To address this challenge, we have developed an integrated structural biology (ISB) platform, combining experimental and computational methods to determine self-consistent molecular models of higher-order G4s (xG4s). Here we outline our ISB method using two recent examples from our lab, an extended c-Myc promoter and long human telomere G4 repeats, that highlight the utility and generality of our approach to characterizing biologically relevant xG4s.
Affiliation(s)
- T Michael Sabo
- UofL Health Brown Cancer Center, University of Louisville, Louisville, KY, United States
- John O Trent
- UofL Health Brown Cancer Center, University of Louisville, Louisville, KY, United States
- Jonathan B Chaires
- UofL Health Brown Cancer Center, University of Louisville, Louisville, KY, United States
- Robert C Monsen
- UofL Health Brown Cancer Center, University of Louisville, Louisville, KY, United States.
5
Sundaresan S, Uttamrao PP, Kovuri P, Rathinavelan T. Entangled World of DNA Quadruplex Folds. ACS Omega 2024; 9:38696-38709. [PMID: 39310165 PMCID: PMC11411666 DOI: 10.1021/acsomega.4c04579] [Received: 05/14/2024] [Revised: 07/28/2024] [Accepted: 08/21/2024] [Indexed: 09/25/2024]
Abstract
DNA quadruplexes participate in many biological functions. They take up a variety of folds depending on the sequence and environment. Here, a meticulous analysis of 437 experimentally determined quadruplex structures (433 PDBs) deposited in the PDB is carried out. The analysis reveals the modular representation of the quadruplex folds. Forty-eight unique quadruplex motifs (whose diversity arises out of the propeller, bulge, diagonal, and lateral loops that connect the quartets) are identified, leading to simple to complex inter/intramolecular quadruplex folds. The two-layered structural motifs are further classified into 33 continuous and 15 discontinuous motifs. While the continuous motifs can be extended directly to a quadruplex fold, the discontinuous motifs require an additional loop(s) to complete a fold, as illustrated here with examples. Similarly, higher-order quadruplex folds can also be represented by continuous or discontinuous motifs or their combinations. Such a modular representation of the quadruplex folds may assist in custom engineering of quadruplexes, designing motif-based drugs, and the prediction of the quadruplex structure. Furthermore, it could facilitate understanding of the role of quadruplexes in biological functions and diseases.
Affiliation(s)
- Sruthi Sundaresan
- Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana 502284, India
- Patil Pranita Uttamrao
- Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana 502284, India
- Purnima Kovuri
- Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana 502284, India
6
Geng Y, Liu C, Xu N, Suen MC, Miao H, Xie Y, Zhang B, Chen X, Song Y, Wang Z, Cai Q, Zhu G. Crystal structure of a tetrameric RNA G-quadruplex formed by hexanucleotide repeat expansions of C9orf72 in ALS/FTD. Nucleic Acids Res 2024; 52:7961-7970. [PMID: 38860430 PMCID: PMC11260476 DOI: 10.1093/nar/gkae473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 05/16/2024] [Accepted: 05/29/2024] [Indexed: 06/12/2024]
Abstract
The abnormal GGGGCC hexanucleotide repeat expansions (HREs) in C9orf72 cause fatal neurodegenerative diseases, including amyotrophic lateral sclerosis and frontotemporal dementia. The transcribed RNA HREs, abbreviated r(G4C2)n, can form toxic RNA foci that sequester RNA-binding proteins and impair RNA processing, ultimately leading to neurodegeneration. Here, we determined the crystal structure of r(G4C2)2, which folds into a parallel tetrameric G-quadruplex composed of two four-layer dimeric G-quadruplexes via 5'-to-5' stacking in coordination with a K+ ion. Notably, the two C bases located at the 3'-end stack on the outer G-tetrad with the assistance of two additional K+ ions. The high-resolution structure reported here lays a foundation for understanding the mechanism of neurological toxicity of RNA HREs. Furthermore, the atomic details provide a structural basis for the development of potential therapeutic agents against the fatal neurodegenerative diseases ALS/FTD.
Affiliation(s)
- Yanyan Geng
- Clinical Research Institute of the First Affiliated Hospital of Xiamen University, Fujian Key Laboratory of Brain Tumors Diagnosis and Precision Treatment, Xiamen Key Laboratory of Brain Center, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Institute for Advanced Study and State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
- Department of Neurosurgery and Department of Neuroscience, Fujian Key Laboratory of Brain Tumors Diagnosis and Precision Treatment, Xiamen Key Laboratory of Brain Center, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Changdong Liu
- Institute for Advanced Study and State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
- HKUST Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen, Guangdong, China
- Naining Xu
- Institute for Advanced Study and State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
- HKUST Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen, Guangdong, China
- Monica Ching Suen
- Institute for Advanced Study and State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
- HKUST Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen, Guangdong, China
- Haitao Miao
- Institute for Advanced Study and State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
- Yuanyuan Xie
- Department of Neurosurgery and Department of Neuroscience, Fujian Key Laboratory of Brain Tumors Diagnosis and Precision Treatment, Xiamen Key Laboratory of Brain Center, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Bingchang Zhang
- Department of Neurosurgery and Department of Neuroscience, Fujian Key Laboratory of Brain Tumors Diagnosis and Precision Treatment, Xiamen Key Laboratory of Brain Center, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Xueqin Chen
- Clinical Research Institute of the First Affiliated Hospital of Xiamen University, Fujian Key Laboratory of Brain Tumors Diagnosis and Precision Treatment, Xiamen Key Laboratory of Brain Center, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Yuanjian Song
- Jiangsu Key Laboratory of Brain Disease Bioinformation, Department of Genetics, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Zhanxiang Wang
- Department of Neurosurgery and Department of Neuroscience, Fujian Key Laboratory of Brain Tumors Diagnosis and Precision Treatment, Xiamen Key Laboratory of Brain Center, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Qixu Cai
- State Key Laboratory of Vaccines for Infectious Diseases, School of Public Health, Xiamen University, Xiamen, Fujian, China
- Guang Zhu
- Institute for Advanced Study and State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
- HKUST Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen, Guangdong, China
7
Farag M, Mouawad L. Comprehensive analysis of intramolecular G-quadruplex structures: furthering the understanding of their formalism. Nucleic Acids Res 2024; 52:3522-3546. [PMID: 38512075 DOI: 10.1093/nar/gkae182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 02/16/2024] [Accepted: 03/01/2024] [Indexed: 03/22/2024]
Abstract
G-quadruplexes (G4s) are helical structures found in guanine-rich DNA or RNA sequences. Generally, their formalism is based on a few dozen structures, which can produce some inconsistencies or incompleteness. Using the website ASC-G4, we analyzed the structures of 333 intramolecular G4s, of all types, which allowed us to clarify some key concepts and present new information. Each of the eight distinguishable topologies has a corresponding groove-width signature and a predominant glycosidic configuration (gc) pattern governed by the directions of the strands. The relative orientations of the stacking guanines within the strands, which we quantified and related to their vertical gc successions, determine the twist and tilt of the helices. The latter impact the minimum groove widths, which represent the space available for lateral ligand binding. The four helices of a G4 have similar twists, even when these twists are irregular, meaning that they have various angles along the strands. Despite its importance, the vertical gc succession has no strict one-to-one relationship with the topology, which explains the discrepancy between some topologies and their corresponding circular dichroism spectra. This study allowed us to introduce the new concept of platypus G4s, which are structures with properties corresponding to several topologies.
Affiliation(s)
- Marc Farag
- Chemistry and Modeling for the Biology of Cancer, CNRS UMR9187, INSERM U1196, Institut Curie, PSL Research University, Université Paris-Saclay, CS 90030, 91401 Orsay Cedex, France
- Liliane Mouawad
- Chemistry and Modeling for the Biology of Cancer, CNRS UMR9187, INSERM U1196, Institut Curie, PSL Research University, Université Paris-Saclay, CS 90030, 91401 Orsay Cedex, France
8
Qian SH, Shi MW, Xiong YL, Zhang Y, Zhang ZH, Song XM, Deng XY, Chen ZX. EndoQuad: a comprehensive genome-wide experimentally validated endogenous G-quadruplex database. Nucleic Acids Res 2024; 52:D72-D80. [PMID: 37904589 PMCID: PMC10767823 DOI: 10.1093/nar/gkad966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/22/2023] [Accepted: 10/14/2023] [Indexed: 11/01/2023]
Abstract
G-quadruplexes (G4s) are non-canonical four-stranded structures and are emerging as novel genetic regulatory elements. However, a comprehensive genomic annotation of endogenous G4s (eG4s) and systematic characterization of their regulatory network are still lacking, posing major challenges for eG4 research. Here, we present EndoQuad (https://EndoQuad.chenzxlab.cn/) to address these pressing issues by integrating high-throughput experimental data. First, based on high-quality genome-wide eG4s mapping datasets (human: 1181; mouse: 24; chicken: 2) generated by G4 ChIP-seq/CUT&Tag, we generate a reference set of genome-wide eG4s. Our multi-omics analyses show that most eG4s are identified in one or a few cell types. The eG4s with higher occurrences across samples are more structurally stable, evolutionarily conserved, enriched in promoter regions, mark highly expressed genes and associate with complex regulatory programs, demonstrating higher confidence level for further experiments. Finally, we integrate millions of functional genomic variants and prioritize eG4s with regulatory functions in disease and cancer contexts. These efforts have culminated in the comprehensive and interactive database of experimentally validated DNA eG4s. As such, EndoQuad enables users to easily access, download and repurpose these data for their own research. EndoQuad will become a one-stop resource for eG4 research and lay the foundation for future functional studies.
Affiliation(s)
- Sheng Hu Qian
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Meng-Wei Shi
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Yu-Li Xiong
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Yuan Zhang
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Ze-Hao Zhang
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Xue-Mei Song
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Xin-Yin Deng
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Zhen-Xia Chen
- Hubei Hongshan Laboratory, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, PR China
- Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Shenzhen 518000, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518000, China
9
Lawson CL, Berman H, Chen L, Vallat B, Zirbel C. The Nucleic Acid Knowledgebase: a new portal for 3D structural information about nucleic acids. Nucleic Acids Res 2024; 52:D245-D254. [PMID: 37953312 PMCID: PMC10767938 DOI: 10.1093/nar/gkad957] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/10/2023] [Revised: 10/02/2023] [Accepted: 10/16/2023] [Indexed: 11/14/2023]
Abstract
The Nucleic Acid Knowledgebase (nakb.org) is a new data resource, updated weekly, for experimentally determined 3D structures containing DNA and/or RNA nucleic acid polymers and their biological assemblies. NAKB indexes nucleic acid-containing structures derived from all major structure determination methods (X-ray, NMR and EM), including all held by the Protein Data Bank (PDB). As the planned successor to the Nucleic Acid Database (NDB), NAKB's design preserves all functionality of the NDB and provides novel nucleic acid-centric content, including structural and functional annotations, as well as annotations from and links to external resources. A variety of custom interactive tools have been developed to enable rapid exploration and drill-down of NAKB's content.
Affiliation(s)
- Catherine L Lawson
- Institute for Quantitative Biomedicine, Rutgers, State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Li Chen
- Institute for Quantitative Biomedicine, Rutgers, State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Brinda Vallat
- Institute for Quantitative Biomedicine, Rutgers, State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Craig L Zirbel
- Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA
10
Lu K, Wang HC, Tu YC, Lou PJ, Chang TC, Lin JJ. EGFR suppression contributes to growth inhibitory activity of G-quadruplex ligands in non-small cell lung cancers. Biochem Pharmacol 2023; 216:115788. [PMID: 37683841 DOI: 10.1016/j.bcp.2023.115788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 09/01/2023] [Accepted: 09/05/2023] [Indexed: 09/10/2023]
Abstract
Non-small cell lung carcinomas (NSCLCs) commonly harbor activating mutations in the epidermal growth factor receptor (EGFR). Drugs targeting the tyrosine kinase activity of EGFR have shown effectiveness in inhibiting the growth of cancer cells with EGFR mutations. However, the development of additional mutations in cancer cells often leads to the persistence of the disease, necessitating alternative strategies to overcome this challenge. We explored the efficacy of stabilizing the G-quadruplex structure formed in the promoter region of EGFR as a means to suppress its expression and impede the growth of cancer cells with EGFR mutations. We revealed that the carbazole derivative BMVC-8C3O effectively suppressed EGFR expression and demonstrated significant growth inhibition in EGFR-mutated NSCLC cells, both in cell culture and mouse xenograft models. Importantly, the observed repression of EGFR expression and growth inhibition were not exclusive to carbazole derivatives, as several other G-quadruplex ligands exhibited similar effects. The growth-inhibitory activity of BMVC-8C3O is attributed, at least in part, to the repression of EGFR, although it is possible that additional cellular targets are also affected. Remarkably, the growth-inhibitory effect was observed even in osimertinib-resistant cells, indicating that BMVC-8C3O holds promise for treating drug-resistant NSCLC. Our findings present a promising and innovative approach for inhibiting the growth of NSCLC cells with EGFR mutations by effectively suppressing EGFR expression. The demonstrated efficacy of G-quadruplex ligands in this study highlights their potential as candidates for further development in NSCLC therapy.
Affiliation(s)
- Kai Lu
- Institute of Biochemistry and Molecular Biology, National Taiwan University College of Medicine, Taipei, Taiwan
- Hsin-Chiao Wang
- Institute of Biochemistry and Molecular Biology, National Taiwan University College of Medicine, Taipei, Taiwan
- Yi-Chen Tu
- Institute of Biochemistry and Molecular Biology, National Taiwan University College of Medicine, Taipei, Taiwan
- Pei-Jen Lou
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei 110, Taiwan
- Ta-Chau Chang
- Institute of Atomic and Molecular Sciences, Academia Sinica, P.O. Box 23-166, Taipei, 106, Taiwan
- Jing-Jer Lin
- Institute of Biochemistry and Molecular Biology, National Taiwan University College of Medicine, Taipei, Taiwan
11
Zhong HS, Dong MJ, Gao F. G4Bank: A database of experimentally identified DNA G-quadruplex sequences. Interdiscip Sci 2023; 15:515-523. [PMID: 37389723 DOI: 10.1007/s12539-023-00577-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/13/2023] [Revised: 06/18/2023] [Accepted: 06/19/2023] [Indexed: 07/01/2023]
Abstract
G-quadruplex (G4), a non-canonical nucleic acid structure, has been suggested to play a key role in important cellular processes including transcription, replication and cancer development. Recently, high-throughput sequencing approaches for G4 detection have provided a large amount of experimentally identified G4 data that reveal genome-wide G4 landscapes and enable the development of new methods for predicting potential G4s from sequences. Although several existing databases provide G4 experimental data and relevant biological information from different perspectives, there is no dedicated database to collect and analyze DNA G4 experimental data on a genome-wide scale. Here, we constructed G4Bank, a database of experimentally identified DNA G-quadruplex sequences. A total of 6,915,983 DNA G4s were collected from 13 organisms, and state-of-the-art prediction methods were applied to filter and analyze the G4 data. G4Bank will therefore help users access comprehensive G4 experimental data and enable sequence feature analysis of G4s for further investigation. The database of the experimentally identified DNA G-quadruplex sequences can be accessed at http://tubic.tju.edu.cn/g4bank/.
Affiliation(s)
- Hong-Sheng Zhong
- Department of Physics, School of Science, Tianjin University, Tianjin, 300072, China
- Mei-Jing Dong
- Department of Physics, School of Science, Tianjin University, Tianjin, 300072, China
- Feng Gao
- Department of Physics, School of Science, Tianjin University, Tianjin, 300072, China
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, 300072, China
12
Mu ZC, Tan YL, Liu J, Zhang BG, Shi YZ. Computational Modeling of DNA 3D Structures: From Dynamics and Mechanics to Folding. Molecules 2023; 28:4833. [PMID: 37375388 DOI: 10.3390/molecules28124833] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/09/2023] [Revised: 06/11/2023] [Accepted: 06/14/2023] [Indexed: 06/29/2023]
Abstract
DNA carries the genetic information required for the synthesis of RNA and proteins and plays an important role in many processes of biological development. Understanding the three-dimensional (3D) structures and dynamics of DNA is crucial for understanding its biological functions and guiding the development of novel materials. In this review, we discuss recent advances in computational methods for studying DNA 3D structures. These include molecular dynamics simulations to analyze DNA dynamics, flexibility, and ion binding. We also explore various coarse-grained models used for DNA structure prediction or folding, along with fragment assembly methods for constructing DNA 3D structures. Finally, we discuss the advantages and disadvantages of these methods and highlight their differences.
Affiliation(s)
- Zi-Chun Mu
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
- School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan 430073, China
- Ya-Lan Tan
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
- Jie Liu
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
- Ben-Gong Zhang
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
- Ya-Zhou Shi
- Research Center of Nonlinear Science, School of Mathematical & Physical Sciences, Wuhan Textile University, Wuhan 430073, China
13
Abstract
G-quadruplexes (G4s) are distinctive four-stranded DNA or RNA structures found within cells that are thought to play functional roles in gene regulation and transcription, translation, recombination, and DNA damage/repair. While G4 structures can be uni-, bi-, or tetramolecular with respect to strands, folded unimolecular conformations are most significant in vivo. Unimolecular G4 can potentially form in sequences with runs of guanines interspersed with what will become loops in the folded structure: 5'GxLyGxLyGxLyGx, where x is typically 2-4 and y is highly variable. Such sequences are highly conserved and specifically located in genomes. In the folded structure, guanines from each run combine to form planar tetrads with four hydrogen-bonded guanine bases; these tetrads stack on one another to produce four strand segments aligned in specific parallel or antiparallel orientations, connected by the loop sequences. Three types of loops (lateral, diagonal, or "propeller") have been identified. The stacked tetrads form a central cavity that features strong coordination sites for monovalent cations that stabilize the G4 structure, with potassium or sodium preferred. A single monomeric G4 typically forms from a sequence containing roughly 20-30 nucleotides. Such short sequences have been the primary focus of X-ray crystallographic or NMR studies that have produced high-resolution structures of a variety of monomeric G4 conformations. These structures are often used as the basis for drug design efforts to modulate G4 function. We believe that the focus on monomeric G4 structures formed by such short sequences is perhaps myopic. Such short sequences for structural studies are often arbitrarily selected and removed from their native genomic sequence context, and then are often changed from their native sequences by base substitutions or deletions intended to optimize the formation of a homogeneous G4 conformation.
We believe instead that G-quadruplexes prefer company and that in a longer natural sequence context multiple adjacent G4 units can form to combine into more complex multimeric G4 structures with richer topographies than simple monomeric forms. Bioinformatic searches of the human genome show that longer sequences with the potential for forming multiple G4 units are common. Telomeric DNA, for example, has a single-stranded overhang of hundreds of nucleotides with the requisite repetitive sequence with the potential for formation of multiple G4s. Numerous extended promoter sequences have similar potentials for multimeric G4 formation. X-ray crystallography and NMR methods are challenged by these longer sequences (>30 nt), so other tools are needed to explore the possible multimeric G4 landscape. We have implemented an integrated structural biology approach to address this challenge. This approach integrates experimental biophysical results with atomic-level molecular modeling and molecular dynamics simulations that provide quantitatively testable model structures. In every long sequence we have studied so far, we found that multimeric G4 structures readily form, with a surprising diversity of structures dependent on the exact native sequence used. In some cases, stable hairpin duplexes form along with G4 units to provide an even richer landscape. This Account provides an overview of our approach and recent progress and provides a new perspective on the G-quadruplex folding landscape.
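The sequence pattern quoted in this abstract, 5'GxLyGxLyGxLyGx, can be illustrated with a small regular-expression scan. The following is only a minimal sketch and is not taken from the cited work: the function name `find_putative_g4` is hypothetical, the loop-length bounds of 1 to 7 nucleotides are an assumed search window (the abstract only says y is highly variable), and the four guanine runs are allowed to differ in length, which is a simplification of the Gx notation.

```python
import re


def find_putative_g4(seq, min_g=2, max_g=4, min_loop=1, max_loop=7):
    """Return (start, matched_sequence) for non-overlapping putative G4 motifs.

    Implements the pattern G_x L_y G_x L_y G_x L_y G_x with x in
    [min_g, max_g] and y in [min_loop, max_loop]. The loop bounds are an
    assumption made for this illustration, not a value from the abstract.
    """
    run = f"G{{{min_g},{max_g}}}"                   # one guanine run, e.g. G{2,4}
    loop = f"[ACGTU]{{{min_loop},{max_loop}}}?"     # one loop, shortest first
    # One run followed by three (loop + run) units gives four runs in total.
    pattern = re.compile(f"{run}(?:{loop}{run}){{3}}")
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]
```

For example, calling `find_putative_g4("AGGGTTAGGGTTAGGGTTAGGG")` on a stretch of the human telomeric repeat reports a single candidate starting at index 1. A sequence match of this kind only flags a potential G4; whether a quadruplex (or a multimeric arrangement of several units) actually forms has to be established experimentally.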
Affiliation(s)
- Robert C Monsen
- UofL Health Brown Cancer Center, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
- John O Trent
- UofL Health Brown Cancer Center, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
- Department of Medicine, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
- Department of Biochemistry and Molecular Genetics, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
- Jonathan B Chaires
- UofL Health Brown Cancer Center, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
- Department of Medicine, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
- Department of Biochemistry and Molecular Genetics, University of Louisville, 505 S. Hancock St., Louisville, Kentucky 40202, United States
14
Criscuolo A, Napolitano E, Riccardi C, Musumeci D, Platella C, Montesarchio D. Insights into the Small Molecule Targeting of Biologically Relevant G-Quadruplexes: An Overview of NMR and Crystal Structures. Pharmaceutics 2022; 14:2361. [PMID: 36365179 PMCID: PMC9696056 DOI: 10.3390/pharmaceutics14112361] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/21/2022] [Revised: 10/23/2022] [Accepted: 10/28/2022] [Indexed: 11/06/2022]
Abstract
G-quadruplexes have emerged as important targets for the development of novel targeted anticancer/antiviral therapies. More than 3000 G-quadruplex small-molecule ligands have been described, with most of them exerting anticancer/antiviral activity by inducing telomeric damage and/or altering oncogene or viral gene expression in cancer cells and viruses, respectively. For some ligands, in-depth NMR and/or crystallographic studies were performed, providing detailed knowledge on their interactions with diverse G-quadruplex targets. Here, the PDB-deposited NMR and crystal structures of the complexes between telomeric, oncogenic or viral G-quadruplexes and small-molecule ligands, of both organic and metal-organic nature, have been summarized and described based on the G-quadruplex target, from telomeric DNA and RNA G-quadruplexes to DNA oncogenic G-quadruplexes, and finally to RNA viral G-quadruplexes. An overview of the structural details of these complexes is provided here to guide the design of novel ligands that target cancer- and virus-related G-quadruplex structures more efficiently and selectively.
Affiliation(s)
- Andrea Criscuolo
- Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Naples, Italy
- Ettore Napolitano
- Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Naples, Italy
- Claudia Riccardi
- Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Naples, Italy
- Domenica Musumeci
- Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Naples, Italy
- Institute of Biostructures and Bioimages, CNR, 80134 Naples, Italy
- Chiara Platella
- Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Naples, Italy
- Correspondence:
- Daniela Montesarchio
- Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Naples, Italy
15
Bourdon S, Herviou P, Dumas L, Destefanis E, Zen A, Cammas A, Millevoi S, Dassi E. QUADRatlas: the RNA G-quadruplex and RG4-binding proteins database. Nucleic Acids Res 2022; 51:D240-D247. [PMID: 36124670 PMCID: PMC9825518 DOI: 10.1093/nar/gkac782] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/04/2022] [Revised: 08/12/2022] [Accepted: 09/01/2022] [Indexed: 01/29/2023]
Abstract
RNA G-quadruplexes (RG4s) are non-canonical, disease-associated post-transcriptional regulators of gene expression whose functions are driven by RNA-binding proteins (RBPs). Being able to explore transcriptome-wide RG4 formation and interaction with RBPs is thus paramount to understanding how they are regulated and exploiting them as potential therapeutic targets. Towards this goal, we present QUADRatlas (https://rg4db.cibio.unitn.it), a database of experimentally-derived and computationally predicted RG4s in the human transcriptome, enriched with biological function and disease associations. As RBPs are key to their function, we mined known interactions of RG4s with such proteins, complemented with an extensive RBP binding sites dataset. Users can thus intersect RG4s with their potential regulators and effectors, enabling the formulation of novel hypotheses on RG4 regulation, function and pathogenicity. To support this capability, we provide analysis tools for predicting whether an RBP can bind RG4s, RG4 enrichment in a gene set, and de novo RG4 prediction. Genome-browser and table views allow exploring, filtering, and downloading the data quickly for individual genes and in batch. QUADRatlas is a significant step forward in our ability to understand the biology of RG4s, offering unmatched data content and enabling the integrated analysis of RG4s and their interactions with RBPs.
Affiliation(s)
- Eliana Destefanis
- Laboratory of RNA Regulatory Networks, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
- Laboratory of Translational Genomics, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
- Andrea Zen
- Laboratory of RNA Regulatory Networks, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
- Anne Cammas
- Cancer Research Centre of Toulouse, INSERM UMR 1037, 31037 Toulouse, France
- Université Toulouse III – Paul Sabatier, 31330 Toulouse, France
- Erik Dassi
- To whom correspondence should be addressed. Tel: +39 0461 282792;
|
16
|
Wiedemann J, Kaczor J, Milostan M, Zok T, Blazewicz J, Szachniuk M, Antczak M. RNAloops: a database of RNA multiloops. Bioinformatics 2022; 38:4200-4205. [PMID: 35809063 PMCID: PMC9438955 DOI: 10.1093/bioinformatics/btac484] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/05/2022] [Revised: 06/26/2022] [Accepted: 07/06/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Knowledge of the 3D structure of RNA supports discovering its functions and is crucial for designing drugs and modern therapeutic solutions. Thus, much attention is devoted to experimental determination and computational prediction targeting the global fold of RNA and its local substructures. The latter include multi-branched loops, functionally significant elements that strongly affect the spatial shape of the entire molecule. Unfortunately, their computational modeling constitutes a weak point of structural bioinformatics. A remedy lies in collecting these motifs and analyzing their features. RESULTS: RNAloops is a self-updating database that stores multi-branched loops identified in the PDB-deposited RNA structures. A description of each loop includes angular data (planar and Euler angles computed between pairs of adjacent helices) to allow studying their mutual arrangement in space. The system enables search and analysis of multiloops, presents their structure details numerically and visually, and computes data statistics. AVAILABILITY AND IMPLEMENTATION: RNAloops is freely accessible at https://rnaloops.cs.put.poznan.pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jakub Wiedemann
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland
- Jacek Kaczor
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland
- Maciej Milostan
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland; Poznan Supercomputing and Networking Center, 61-131 Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland; Poznan Supercomputing and Networking Center, 61-131 Poznan, Poland
- Jacek Blazewicz
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
|
17
|
Zurkowski M, Zok T, Szachniuk M. DrawTetrado to create layer diagrams of G4 structures. Bioinformatics 2022; 38:3835-3836. [PMID: 35703937 PMCID: PMC9344840 DOI: 10.1093/bioinformatics/btac394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Revised: 05/13/2022] [Accepted: 06/13/2022] [Indexed: 11/14/2022] Open
Abstract
Motivation: Quadruplexes are specific 3D structures found in nucleic acids. Due to the exceptional properties of these motifs, their exploration with general-purpose bioinformatics methods can be problematic or insufficient. The same applies to visualizing their structure. A hand-drawn layer diagram is the most common way to represent the quadruplex anatomy. No molecular visualization software generates such a structural model based on atomic coordinates. Results: DrawTetrado is an open-source Python program for automated visualization targeting the structures of quadruplexes and G4-helices. It generates static layer diagrams that represent structural data in a pseudo-3D perspective. The possibility to set color schemes, nucleotide labels, inter-element distances or angle of view allows for easy customization of the output drawing. Availability and implementation: The program is available under the MIT license at https://github.com/RNApolis/drawtetrado.
Affiliation(s)
- Michal Zurkowski
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland
- Marta Szachniuk
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
|