1
Burley SK, Wu-Wu A, Dutta S, Ganesan S, Zheng SXF. Impact of structural biology and the Protein Data Bank on US FDA new drug approvals of low molecular weight antineoplastic agents 2019-2023. Oncogene 2024; 43:2229-2243. [PMID: 38886570 PMCID: PMC11245395 DOI: 10.1038/s41388-024-03077-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 06/04/2024] [Accepted: 06/05/2024] [Indexed: 06/20/2024]
Abstract
Open access to three-dimensional atomic-level biostructure information from the Protein Data Bank (PDB) facilitated discovery/development of 100% of the 34 new low molecular weight, protein-targeted, antineoplastic agents approved by the US FDA 2019-2023. Analyses of PDB holdings, the scientific literature, and related documents for each drug-target combination revealed that the impact of structural biologists and public-domain 3D biostructure data was broad and substantial, ranging from understanding target biology (100% of all drug targets), to identifying a given target as likely druggable (100% of all targets), to structure-guided drug discovery (>80% of all new small-molecule drugs, made up of 50% confirmed and >30% probable cases). In addition to aggregate impact assessments, illustrative case studies are presented for six first-in-class small-molecule anti-cancer drugs, including a selective inhibitor of nuclear export targeting Exportin 1 (selinexor, Xpovio), an ATP-competitive CSF-1R receptor tyrosine kinase inhibitor (pexidartinib, Turalio), a non-ATP-competitive inhibitor of the BCR-Abl fusion protein targeting the myristoyl binding pocket within the kinase catalytic domain of Abl (asciminib, Scemblix), a covalently-acting G12C KRAS inhibitor (sotorasib, Lumakras or Lumykras), an EZH2 methyltransferase inhibitor (tazemetostat, Tazverik), and an agent targeting the basic-Helix-Loop-Helix transcription factor HIF-2α (belzutifan, Welireg).
Affiliation(s)
- Stephen K Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, 92093, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Amy Wu-Wu
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Shuchismita Dutta
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
- Shridar Ganesan
  - Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
- Steven X F Zheng
  - Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
2
Choudhary P, Feng Z, Berrisford J, Chao H, Ikegawa Y, Peisach E, Piehl DW, Smith J, Tanweer A, Varadi M, Westbrook JD, Young JY, Patwardhan A, Morris KL, Hoch JC, Kurisu G, Velankar S, Burley SK. PDB NextGen Archive: centralizing access to integrated annotations and enriched structural information by the Worldwide Protein Data Bank. Database (Oxford) 2024; 2024:baae041. [PMID: 38803272 PMCID: PMC11130521 DOI: 10.1093/database/baae041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/29/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
The Protein Data Bank (PDB) is the global repository for public-domain experimentally determined 3D biomolecular structural information. The archival nature of the PDB presents certain challenges pertaining to updating or adding associated annotations from trusted external biodata resources. While each Worldwide PDB (wwPDB) partner has made best efforts to provide up-to-date external annotations, accessing and integrating information from disparate wwPDB data centers can be an involved process. To address this issue, the wwPDB has established the PDB Next Generation (or NextGen) Archive, developed to centralize and streamline access to enriched structural annotations from wwPDB partners and trusted external sources. At present, the NextGen Archive provides mappings between experimentally determined 3D structures of proteins and UniProt amino acid sequences, domain annotations from Pfam, SCOP2 and CATH databases and intra-molecular connectivity information. Since launch, the PDB NextGen Archive has seen substantial user engagement with over 3.5 million data file downloads, ensuring researchers have access to accurate, up-to-date and easily accessible structural annotations. Database URL: http://www.wwpdb.org/ftp/pdb-nextgen-archive-site.
Affiliation(s)
- Preeti Choudhary
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Zukang Feng
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- John Berrisford
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Henry Chao
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Yasuyo Ikegawa
  - Protein Data Bank Japan, Protein Research Foundation, 3-2, Yamadaoka, Minoh, Osaka 562-8686, Japan
- Ezra Peisach
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Dennis W Piehl
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- James Smith
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Ahsan Tanweer
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Mihaly Varadi
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- John D Westbrook
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Jasmine Y Young
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Ardan Patwardhan
  - The Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Kyle L Morris
  - The Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Jeffrey C Hoch
  - Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, 263 Farmington Avenue, Farmington, CT 06030-3305, USA
- Genji Kurisu
  - Protein Data Bank Japan, Protein Research Foundation, 3-2, Yamadaoka, Minoh, Osaka 562-8686, Japan
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Sameer Velankar
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Stephen K Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
  - Rutgers Cancer Institute of New Jersey, 195 Little Albany St., New Brunswick, NJ 08901, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 123 Bevier Rd., Piscataway, NJ 08854, USA
3
Burley SK, Piehl DW, Vallat B, Zardecki C. RCSB Protein Data Bank: supporting research and education worldwide through explorations of experimentally determined and computationally predicted atomic level 3D biostructures. IUCrJ 2024; 11:279-286. [PMID: 38597878 PMCID: PMC11067742 DOI: 10.1107/s2052252524002604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 03/19/2024] [Indexed: 04/11/2024]
Abstract
The Protein Data Bank (PDB) was established as the first open-access digital data resource in biology and medicine in 1971 with seven X-ray crystal structures of proteins. Today, the PDB houses >210 000 experimentally determined, atomic level, 3D structures of proteins and nucleic acids as well as their complexes with one another and small molecules (e.g. approved drugs, enzyme cofactors). These data provide insights into fundamental biology, biomedicine, bioenergy and biotechnology. They proved particularly important for understanding the SARS-CoV-2 global pandemic. The US-funded Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) and other members of the Worldwide Protein Data Bank (wwPDB) partnership jointly manage the PDB archive and support >60 000 'data depositors' (structural biologists) around the world. wwPDB ensures the quality and integrity of the data in the ever-expanding PDB archive and supports global open access without limitations on data usage. The RCSB PDB research-focused web portal at https://www.rcsb.org/ (RCSB.org) supports millions of users worldwide, representing a broad range of expertise and interests. In addition to retrieving 3D structure data, PDB 'data consumers' access comparative data and external annotations, such as information about disease-causing point mutations and genetic variations. RCSB.org also provides access to >1 000 000 computed structure models (CSMs) generated using artificial intelligence/machine-learning methods. To avoid doubt, the provenance and reliability of experimentally determined PDB structures and CSMs are identified. Related training materials are available to support users in their RCSB.org explorations.
Affiliation(s)
- Stephen K. Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Dennis W. Piehl
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Brinda Vallat
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Christine Zardecki
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
4
Turner J, Abbott S, Fonseca N, Pye R, Carrijo L, Duraisamy AK, Salih O, Wang Z, Kleywegt GJ, Morris KL, Patwardhan A, Burley SK, Crichlow G, Feng Z, Flatt JW, Ghosh S, Hudson BP, Lawson CL, Liang Y, Peisach E, Persikova I, Sekharan M, Shao C, Young J, Velankar S, Armstrong D, Bage M, Bueno WM, Evans G, Gaborova R, Ganguly S, Gupta D, Harrus D, Tanweer A, Bansal M, Rangannan V, Kurisu G, Cho H, Ikegawa Y, Kengaku Y, Kim JY, Niwa S, Sato J, Takuwa A, Yu J, Hoch JC, Baskaran K, Xu W, Zhang W, Ma X. EMDB-the Electron Microscopy Data Bank. Nucleic Acids Res 2024; 52:D456-D465. [PMID: 37994703 PMCID: PMC10767987 DOI: 10.1093/nar/gkad1019] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 09/28/2023] [Revised: 10/18/2023] [Accepted: 10/20/2023] [Indexed: 11/24/2023]
Abstract
The Electron Microscopy Data Bank (EMDB) is the global public archive of three-dimensional electron microscopy (3DEM) maps of biological specimens derived from transmission electron microscopy experiments. As of 2021, EMDB is managed by the Worldwide Protein Data Bank consortium (wwPDB; wwpdb.org) as a wwPDB Core Archive, and the EMDB team is a core member of the consortium. Today, EMDB houses over 30 000 entries with maps containing macromolecules, complexes, viruses, organelles and cells. Herein, we provide an overview of the rapidly growing EMDB archive, including its current holdings, recent updates, and future plans.
5
Wayment-Steele HK, Ojoawo A, Otten R, Apitz JM, Pitsawong W, Hömberger M, Ovchinnikov S, Colwell L, Kern D. Predicting multiple conformations via sequence clustering and AlphaFold2. Nature 2024; 625:832-839. [PMID: 37956700 PMCID: PMC10808063 DOI: 10.1038/s41586-023-06832-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 51.0] [Received: 07/07/2023] [Accepted: 11/03/2023] [Indexed: 11/15/2023]
Abstract
AlphaFold2 (ref. 1) has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein's biological function often depends on multiple conformational substates (ref. 2), and disease-causing point mutations often cause population changes within these substates (refs. 3, 4). We demonstrate that clustering a multiple-sequence alignment by sequence similarity enables AlphaFold2 to sample alternative states of known metamorphic proteins with high confidence. Using this method, named AF-Cluster, we investigated the evolutionary distribution of predicted structures for the metamorphic protein KaiB (ref. 5) and found that predictions of both conformations were distributed in clusters across the KaiB family. We used nuclear magnetic resonance spectroscopy to confirm an AF-Cluster prediction: a cyanobacteria KaiB variant is stabilized in the opposite state compared with the more widely studied variant. To test AF-Cluster's sensitivity to point mutations, we designed and experimentally verified a set of three mutations predicted to flip KaiB from Rhodobacter sphaeroides from the ground to the fold-switched state. Finally, screening for alternative states in protein families without known fold switching identified a putative alternative state for the oxidoreductase Mpt53 in Mycobacterium tuberculosis. Further development of such bioinformatic methods in tandem with experiments will probably have a considerable impact on predicting protein energy landscapes, essential for illuminating biological function.
Affiliation(s)
- Hannah K Wayment-Steele
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Adedolapo Ojoawo
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Renee Otten
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
  - Treeline Biosciences, Watertown, MA, USA
- Julia M Apitz
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Warintra Pitsawong
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
  - Biomolecular Discovery, Relay Therapeutics, Cambridge, MA, USA
- Marc Hömberger
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
  - Treeline Biosciences, Watertown, MA, USA
- Lucy Colwell
  - Google Research, Cambridge, MA, USA
  - Cambridge University, Cambridge, UK
- Dorothee Kern
  - Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA