1
Zhozhikov L, Vasilev F, Maksimova N. Protein-Variant-Phenotype Study of NBAS Using AlphaFold in the Aspect of SOPH Syndrome. Proteins 2024. [PMID: 39641476 DOI: 10.1002/prot.26764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2024] [Revised: 10/04/2024] [Accepted: 11/01/2024] [Indexed: 12/07/2024]
Abstract
NBAS gene variants cause phenotypically distinct and nonoverlapping conditions, SOPH syndrome and ILFS2. NBAS is a so-called "moonlighting" protein responsible for retrograde membrane trafficking and nonsense-mediated decay. However, its three-dimensional model and the nature of its possible interactions with other proteins have remained elusive. Here, we used AlphaFold to predict protein-protein interaction (PPI) sites and mapped them to NBAS pathogenic variants. We repeated in silico milestone studies of the NBAS protein to explain the multisystem phenotype of its variants, with particular emphasis on the SOPH variant (p.R1914H). We revealed the putative binding sites for the main interaction partners of NBAS and assessed the implications of these binding sites for the subdomain architecture of the NBAS protein. Using AlphaFold, we disclosed the far-reaching impact of NBAS variants on the development of each phenotypic trait in patients with NBAS-related pathologies.
Affiliation(s)
- Leonid Zhozhikov
- Research Laboratory of "Molecular Medicine and Human Genetics", Institute of Medicine, Ammosov North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia
- Filipp Vasilev
- Research Laboratory of "Molecular Medicine and Human Genetics", Institute of Medicine, Ammosov North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia
- Nadezhda Maksimova
- Research Laboratory of "Molecular Medicine and Human Genetics", Institute of Medicine, Ammosov North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia
2
Keegan RM, Simpkin AJ, Rigden DJ. The success rate of processed predicted models in molecular replacement: implications for experimental phasing in the AlphaFold era. Acta Crystallogr D Struct Biol 2024; 80:766-779. [PMID: 39360967 PMCID: PMC11544426 DOI: 10.1107/s2059798324009380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Accepted: 09/23/2024] [Indexed: 11/09/2024] Open
Abstract
The availability of highly accurate protein structure predictions from AlphaFold2 (AF2) and similar tools has hugely expanded the applicability of molecular replacement (MR) for crystal structure solution. Many structures can be solved routinely using raw models, structures processed to remove unreliable parts or models split into distinct structural units. There is therefore an open question around how many and which cases still require experimental phasing methods such as single-wavelength anomalous diffraction (SAD). Here, this question is addressed using a large set of PDB depositions that were solved by SAD. A large majority (87%) could be solved using unedited or minimally edited AF2 predictions. A further 18 (4%) yield straightforwardly to MR after splitting of the AF2 prediction using Slice'N'Dice, although different splitting methods succeeded on slightly different sets of cases. It is also found that further unique targets can be solved by alternative modelling approaches such as ESMFold (four cases), alternative MR approaches such as ARCIMBOLDO and AMPLE (two cases each), and multimeric model building with AlphaFold-Multimer or UniFold (three cases). Ultimately, only 12 cases, or 3% of the SAD-phased set, did not yield to any form of MR tested here, offering valuable hints as to the number and the characteristics of cases where experimental phasing remains essential for macromolecular structure solution.
Affiliation(s)
- Ronan M. Keegan
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
3
Abriata LA. The Nobel Prize in Chemistry: past, present, and future of AI in biology. Commun Biol 2024; 7:1409. [PMID: 39472680 PMCID: PMC11522274 DOI: 10.1038/s42003-024-07113-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Accepted: 10/21/2024] [Indexed: 11/02/2024] Open
Abstract
A Comment on the transformative progress of artificial intelligence for structural and protein biology, referencing the 2024 Nobel Prize in Chemistry.
Affiliation(s)
- Luciano A Abriata
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, CH-1015, Lausanne, Switzerland
4
El Omari K, Forsyth I, Duman R, Orr CM, Mykhaylyk V, Mancini EJ, Wagner A. Utilizing anomalous signals for element identification in macromolecular crystallography. Acta Crystallogr D Struct Biol 2024; 80:713-721. [PMID: 39291627 PMCID: PMC11448921 DOI: 10.1107/s2059798324008659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 09/03/2024] [Indexed: 09/19/2024] Open
Abstract
AlphaFold2 has revolutionized structural biology by offering unparalleled accuracy in predicting protein structures. Traditional methods for determining protein structures, such as X-ray crystallography and cryo-electron microscopy, are often time-consuming and resource-intensive. AlphaFold2 provides models that are valuable for molecular replacement, aiding in model building and docking into electron density or potential maps. However, despite its capabilities, models from AlphaFold2 do not consistently match the accuracy of experimentally determined structures, need to be validated experimentally and currently miss some crucial information, such as post-translational modifications, ligands and bound ions. In this paper, the advantages are explored of collecting X-ray anomalous data to identify chemical elements, such as metal ions, which are key to understanding certain structures and functions of proteins. This is achieved through methods such as calculating anomalous difference Fourier maps or refining the imaginary component of the anomalous scattering factor f''. Anomalous data can serve as a valuable complement to the information provided by AlphaFold2 models and this is particularly significant in elucidating the roles of metal ions.
Affiliation(s)
- Kamel El Omari
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Ismay Forsyth
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Ramona Duman
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Christian M Orr
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Vitaliy Mykhaylyk
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Erika J Mancini
- School of Life Sciences, University of Sussex, Falmer, Brighton BN1 9QG, United Kingdom
- Armin Wagner
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
5
Rosignoli S, Pacelli M, Manganiello F, Paiardini A. An outlook on structural biology after AlphaFold: tools, limits and perspectives. FEBS Open Bio 2024. [PMID: 39313455 DOI: 10.1002/2211-5463.13902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 08/19/2024] [Accepted: 09/13/2024] [Indexed: 09/25/2024] Open
Abstract
AlphaFold and similar groundbreaking AI-based tools have revolutionized the field of structural bioinformatics with their remarkable accuracy in ab initio protein structure prediction. This success has catalyzed the development of new software and pipelines aimed at incorporating AlphaFold's predictions, often focusing on addressing the algorithm's remaining challenges. Here, we present the current landscape of structural bioinformatics shaped by AlphaFold, and discuss how the field is dynamically responding to this revolution, with new software, methods, and pipelines. While the excitement around AI-based tools led to their widespread application, it is essential to acknowledge that their practical success hinges on their integration into established protocols within structural bioinformatics, often neglected in the context of AI-driven advancements. Indeed, user-driven intervention is still as pivotal in the structure prediction process as in complementing state-of-the-art algorithms with functional and biological knowledge.
Affiliation(s)
- Serena Rosignoli
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
- Maddalena Pacelli
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
- Francesca Manganiello
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
- Alessandro Paiardini
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
6
Heinzinger M, Rost B. Artificial Intelligence Learns Protein Prediction. Cold Spring Harb Perspect Biol 2024; 16:a041458. [PMID: 38858069 PMCID: PMC11368192 DOI: 10.1101/cshperspect.a041458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2024]
Abstract
From AlphaGO over StableDiffusion to ChatGPT, the recent decade of exponential advances in artificial intelligence (AI) has been altering life. In parallel, advances in computational biology are beginning to decode the language of life: AlphaFold2 leaped forward in protein structure prediction, and protein language models (pLMs) replaced expertise and evolutionary information from multiple sequence alignments with information learned from reoccurring patterns in databases of billions of proteins without experimental annotations other than the amino acid sequences. None of those tools could have been developed 10 years ago; all will increase the wealth of experimental data and speed up the cycle from idea to proof. AI is affecting molecular and medical biology at giant steps, and the most important might be the leap toward more powerful protein design.
Affiliation(s)
- Michael Heinzinger
- Technical University of Munich (TUM) School of Computation, Information and Technology (CIT), Bioinformatics and Computational Biology - i12, 85748 Garching/Munich, Germany
- Burkhard Rost
- Technical University of Munich (TUM) School of Computation, Information and Technology (CIT), Bioinformatics and Computational Biology - i12, 85748 Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), 85748 Garching/Munich, Germany
- TUM School of Life Sciences Weihenstephan (WZW), 85354 Freising, Germany
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
7
Kovalevskiy O, Mateos-Garcia J, Tunyasuvunakool K. AlphaFold two years on: Validation and impact. Proc Natl Acad Sci U S A 2024; 121:e2315002121. [PMID: 39133843 PMCID: PMC11348012 DOI: 10.1073/pnas.2315002121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Two years on from the initial release of AlphaFold, we have seen its widespread adoption as a structure prediction tool. Here, we discuss some of the latest work based on AlphaFold, with a particular focus on its use within the structural biology community. This encompasses use cases like speeding up structure determination itself, enabling new computational studies, and building new tools and workflows. We also look at the ongoing validation of AlphaFold, as its predictions continue to be compared against large numbers of experimental structures to further delineate the model's capabilities and limitations.
8
Bowman GR. AlphaFold and Protein Folding: Not Dead Yet! The Frontier Is Conformational Ensembles. Annu Rev Biomed Data Sci 2024; 7:51-57. [PMID: 38603560 DOI: 10.1146/annurev-biodatasci-102423-011435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2024]
Abstract
Like the black knight in the classic Monty Python movie, grand scientific challenges such as protein folding are hard to finish off. Notably, AlphaFold is revolutionizing structural biology by bringing highly accurate structure prediction to the masses and opening up innumerable new avenues of research. Despite this enormous success, calling structure prediction, much less protein folding and related problems, "solved" is dangerous, as doing so could stymie further progress. Imagine what the world would be like if we had declared flight solved after the first commercial airlines opened and stopped investing in further research and development. Likewise, there are still important limitations to structure prediction that we would benefit from addressing. Moreover, we are limited in our understanding of the enormous diversity of different structures a single protein can adopt (called a conformational ensemble) and the dynamics by which a protein explores this space. What is clear is that conformational ensembles are critical to protein function, and understanding this aspect of protein dynamics will advance our ability to design new proteins and drugs.
Affiliation(s)
- Gregory R Bowman
- Departments of Biochemistry and Biophysics and Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
9
Jiao Z, He Y, Fu X, Zhang X, Geng Z, Ding W. A predicted model-aided reconstruction algorithm for X-ray free-electron laser single-particle imaging. IUCrJ 2024; 11:602-619. [PMID: 38904548 PMCID: PMC11220885 DOI: 10.1107/s2052252524004858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Accepted: 05/23/2024] [Indexed: 06/22/2024]
Abstract
Ultra-intense, ultra-fast X-ray free-electron lasers (XFELs) enable the imaging of single protein molecules under ambient temperature and pressure. A crucial aspect of structure reconstruction involves determining the relative orientations of each diffraction pattern and recovering the missing phase information. In this paper, we introduce a predicted model-aided algorithm for orientation determination and phase retrieval, which has been tested on various simulated datasets and has shown significant improvements in the success rate, accuracy and efficiency of XFEL data reconstruction.
Affiliation(s)
- Zhichao Jiao
- Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
- Yao He
- Research Instrument Scientist, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Xingke Fu
- Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
- Xin Zhang
- The University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Zhi Geng
- Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
- Wei Ding
- Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
10
Versini R, Sritharan S, Aykac Fas B, Tubiana T, Aimeur SZ, Henri J, Erard M, Nüsse O, Andreani J, Baaden M, Fuchs P, Galochkina T, Chatzigoulas A, Cournia Z, Santuz H, Sacquin-Mora S, Taly A. A Perspective on the Prospective Use of AI in Protein Structure Prediction. J Chem Inf Model 2024; 64:26-41. [PMID: 38124369 DOI: 10.1021/acs.jcim.3c01361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
AlphaFold2 (AF2) and RoseTTaFold (RF) have revolutionized structural biology, serving as highly reliable and effective methods for predicting protein structures. This article explores their impact and limitations, focusing on their integration into experimental pipelines and their application in diverse protein classes, including membrane proteins, intrinsically disordered proteins (IDPs), and oligomers. In experimental pipelines, AF2 models help X-ray crystallography in resolving the phase problem, while complementarity with mass spectrometry and NMR data enhances structure determination and protein flexibility prediction. Predicting the structure of membrane proteins remains challenging for both AF2 and RF due to difficulties in capturing conformational ensembles and interactions with the membrane. Improvements in incorporating membrane-specific features and predicting the structural effect of mutations are crucial. For intrinsically disordered proteins, AF2's confidence score (pLDDT) serves as a competitive disorder predictor, but integrative approaches including molecular dynamics (MD) simulations or hydrophobic cluster analyses are advocated for accurate dynamics representation. AF2 and RF show promising results for oligomeric models, outperforming traditional docking methods, with AlphaFold-Multimer showing improved performance. However, some caveats remain in particular for membrane proteins. Real-life examples demonstrate AF2's predictive capabilities in unknown protein structures, but models should be evaluated for their agreement with experimental data. Furthermore, AF2 models can be used complementarily with MD simulations. In this Perspective, we propose a "wish list" for improving deep-learning-based protein folding prediction models, including using experimental data as constraints and modifying models with binding partners or post-translational modifications. 
Additionally, a meta-tool for ranking and suggesting composite models is suggested, driving future advancements in this rapidly evolving field.
Affiliation(s)
- Raphaelle Versini
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sujith Sritharan
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Burcu Aykac Fas
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Thibault Tubiana
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Sana Zineb Aimeur
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Julien Henri
- Sorbonne Université, CNRS, Laboratoire de Biologie, Computationnelle et Quantitative UMR 7238, Institut de Biologie Paris-Seine, 4 Place Jussieu, F-75005 Paris, France
- Marie Erard
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Oliver Nüsse
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Marc Baaden
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Patrick Fuchs
- Sorbonne Université, École Normale Supérieure, PSL University, CNRS, Laboratoire des Biomolécules, LBM, 75005 Paris, France
- Université de Paris, UFR Sciences du Vivant, 75013 Paris, France
- Tatiana Galochkina
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75014 Paris, France
- Alexios Chatzigoulas
- Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Zoe Cournia
- Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Hubert Santuz
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Antoine Taly
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
11
Matinyan S, Filipcik P, Abrahams JP. Deep learning applications in protein crystallography. Acta Crystallogr A Found Adv 2024; 80:1-17. [PMID: 38189437 PMCID: PMC10833361 DOI: 10.1107/s2053273323009300] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/29/2023] [Accepted: 10/24/2023] [Indexed: 01/09/2024] Open
Abstract
Deep learning techniques can recognize complex patterns in noisy, multidimensional data. In recent years, researchers have started to explore the potential of deep learning in the field of structural biology, including protein crystallography. This field has some significant challenges, in particular producing high-quality and well ordered protein crystals. Additionally, collecting diffraction data with high completeness and quality, and determining and refining protein structures can be problematic. Protein crystallographic data are often high-dimensional, noisy and incomplete. Deep learning algorithms can extract relevant features from these data and learn to recognize patterns, which can improve the success rate of crystallization and the quality of crystal structures. This paper reviews progress in this field.
Affiliation(s)
- Jan Pieter Abrahams
- Biozentrum, Basel University, Basel, Switzerland
- Paul Scherrer Institute, Villigen, Switzerland
12
Simpkin AJ, Mesdaghi S, Sánchez Rodríguez F, Elliott L, Murphy DL, Kryshtafovych A, Keegan RM, Rigden DJ. Tertiary structure assessment at CASP15. Proteins 2023; 91:1616-1635. [PMID: 37746927 PMCID: PMC10792517 DOI: 10.1002/prot.26593] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 05/24/2023] [Revised: 08/25/2023] [Accepted: 09/07/2023] [Indexed: 09/26/2023]
Abstract
The results of tertiary structure assessment at CASP15 are reported. For the first time, recognizing the outstanding performance of AlphaFold 2 (AF2) at CASP14, all single-chain predictions were assessed together, irrespective of whether a template was available. At CASP15, there was no single stand-out group, with most of the best-scoring groups, led by PEZYFoldings, UM-TBM, and Yang Server, employing AF2 in one way or another. Many top groups paid special attention to generating deep Multiple Sequence Alignments (MSAs) and testing variant MSAs, thereby allowing them to successfully address some of the hardest targets. Such difficult targets, as well as lacking templates, were typically proteins with few homologues. Local divergence between prediction and target correlated with localization at crystal lattice or chain interfaces, and with regions exhibiting high B-factors in crystal structure targets, and should not necessarily be considered as representing error in the prediction. However, analysis of exposed and buried side chain accuracy showed room for improvement even in the latter. Nevertheless, a majority of groups produced high-quality predictions for most targets, which are valuable for experimental structure determination, functional analysis, and many other tasks across biology. These include those applying methods similar to those used to generate major resources such as the AlphaFold Protein Structure Database and the ESM Metagenomic atlas: the confidence estimates of the former were also notably accurate.
Affiliation(s)
- Adam J. Simpkin
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Shahram Mesdaghi
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Computational Biology Facility, MerseyBio, University of Liverpool, Liverpool, UK
- Filomeno Sánchez Rodríguez
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Oxfordshire, UK
- Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
- Luc Elliott
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- David L. Murphy
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Ronan M. Keegan
- UKRI‐STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot, UK
- Daniel J. Rigden
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
13
Lee JW, Won JH, Jeon S, Choo Y, Yeon Y, Oh JS, Kim M, Kim S, Joung I, Jang C, Lee SJ, Kim TH, Jin KH, Song G, Kim ES, Yoo J, Paek E, Noh YK, Joo K. DeepFold: enhancing protein structure prediction through optimized loss functions, improved template features, and re-optimized energy function. Bioinformatics 2023; 39:btad712. [PMID: 37995286 PMCID: PMC10699847 DOI: 10.1093/bioinformatics/btad712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 11/17/2023] [Accepted: 11/22/2023] [Indexed: 11/25/2023] Open
Abstract
MOTIVATION: Predicting protein structures with high accuracy is a critical challenge for the broad community of life sciences and industry. Despite progress made by deep neural networks like AlphaFold2, there is a need for further improvements in the quality of detailed structures, such as side-chains, along with protein backbone structures.
RESULTS: Building upon the successes of AlphaFold2, the modifications we made include changing the losses of side-chain torsion angles and frame aligned point error, adding loss functions for side chain confidence and secondary structure prediction, and replacing template feature generation with a new alignment method based on conditional random fields. We also performed re-optimization by conformational space annealing using a molecular mechanics energy function which integrates the potential energies obtained from distogram and side-chain prediction. In the CASP15 blind test for single protein and domain modeling (109 domains), DeepFold ranked fourth among 132 groups with improvements in the details of the structure in terms of backbone, side-chain, and MolProbity. In terms of protein backbone accuracy, DeepFold achieved a median GDT-TS score of 88.64 compared with 85.88 of AlphaFold2. For TBM-easy/hard targets, DeepFold ranked at the top based on Z-scores for GDT-TS. This shows its practical value to the structural biology community, which demands highly accurate structures. In addition, a thorough analysis of 55 domains from 39 targets with publicly available structures indicates that DeepFold shows superior side-chain accuracy and MolProbity scores among the top-performing groups.
AVAILABILITY AND IMPLEMENTATION: DeepFold tools are open-source software available at https://github.com/newtonjoo/deepfold.
Collapse
Affiliation(s)
- Jae-Won Lee
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
| | - Jong-Hyun Won
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
| | - Seonggwang Jeon
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
| | - Yujin Choo
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
- Department of Artificial intelligence, Hanyang University, Seoul 04763, Korea
| | - Yubin Yeon
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
| | - Jin-Seon Oh
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
- Department of Artificial intelligence, Hanyang University, Seoul 04763, Korea
| | - Minsoo Kim
- Department of Physics, Sungkyunkwan University, Suwon 16419, Korea
| | - SeonHwa Kim
- School of Electrical Engineering, Korea University, Seoul 02841, Korea
| | | | - Cheongjae Jang
- Artificial Intelligence Institute, Hanyang University, Seoul 04763, Korea
| | - Sung Jong Lee
- Basic Science Research Institute, Changwon National University, Changwon 51140, Korea
| | - Tae Hyun Kim
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
| | - Kyong Hwan Jin
- School of Electrical Engineering, Korea University, Seoul 02841, Korea
| | - Giltae Song
- School of Computer Science and Engineering, Pusan National University, Busan 46241, Korea
| | - Eun-Sol Kim
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
| | - Jejoong Yoo
- Department of Physics, Sungkyunkwan University, Suwon 16419, Korea
| | - Eunok Paek
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
| | - Yung-Kyun Noh
- Department of Computer Science, Hanyang University, Seoul 04763, Korea
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea
| | - Keehyoung Joo
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 02455, Korea
|
14
|
Liu J, Guo Z, Wu T, Roy RS, Chen C, Cheng J. Improving AlphaFold2-based protein tertiary structure prediction with MULTICOM in CASP15. Commun Chem 2023; 6:188. [PMID: 37679431 PMCID: PMC10484931 DOI: 10.1038/s42004-023-00991-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/02/2023] [Accepted: 08/21/2023] [Indexed: 09/09/2023] Open
Abstract
Since the 14th Critical Assessment of Techniques for Protein Structure Prediction (CASP14), AlphaFold2 has become the standard method for protein tertiary structure prediction. One remaining challenge is to further improve its prediction accuracy. We developed a new version of the MULTICOM system to sample diverse multiple sequence alignments (MSAs) and structural templates to improve the input for AlphaFold2 to generate structural models. The models are then ranked by both the pairwise model similarity and the AlphaFold2 self-reported model quality score. The top-ranked models are refined by a novel structure alignment-based refinement method powered by Foldseek. Moreover, for a monomer target that is a subunit of a protein assembly (complex), MULTICOM integrates tertiary and quaternary structure predictions to account for tertiary structural changes induced by protein-protein interaction. The system participated in the tertiary structure prediction category of the 2022 CASP15 experiment. Our server predictor MULTICOM_refine ranked 3rd among 47 CASP15 server predictors and our human predictor MULTICOM ranked 7th among all 132 human and server predictors. The average GDT-TS score and TM-score of the first structural models that MULTICOM_refine predicted for 94 CASP15 domains are ~0.80 and ~0.92, which are 9.6% and 8.2% higher than the ~0.73 and ~0.85 of the standard AlphaFold2 predictor, respectively.
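The percentage gains quoted in this abstract can be reproduced from the rounded scores it reports. The following is a minimal sketch, assuming the percentages are relative improvements over the standard AlphaFold2 baseline; the function name `relative_gain_percent` is illustrative and does not come from the paper.

```python
def relative_gain_percent(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Rounded average scores quoted in the abstract (94 CASP15 domains).
gdt_ts_gain = relative_gain_percent(0.80, 0.73)
tm_score_gain = relative_gain_percent(0.92, 0.85)

print(f"GDT-TS gain:   {gdt_ts_gain:.1f}%")
print(f"TM-score gain: {tm_score_gain:.1f}%")
```

Because the inputs are themselves rounded, the result only confirms that the quoted percentages are consistent with a relative (not absolute) difference.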
Affiliation(s)
- Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
| | - Zhiye Guo
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
| | - Tianqi Wu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
| | - Chen Chen
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA.
|
15
|
Niazi SK. The Coming of Age of AI/ML in Drug Discovery, Development, Clinical Testing, and Manufacturing: The FDA Perspectives. Drug Des Devel Ther 2023; 17:2691-2725. [PMID: 37701048 PMCID: PMC10493153 DOI: 10.2147/dddt.s424991] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 06/28/2023] [Accepted: 08/24/2023] [Indexed: 09/14/2023] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) represent significant advancements in computing, building on technologies that humanity has developed over millions of years, from the abacus to quantum computers. These tools have reached a pivotal moment in their development. In 2021 alone, the U.S. Food and Drug Administration (FDA) received over 100 product registration submissions that heavily relied on AI/ML for applications such as monitoring and improving human performance in compiling dossiers. To ensure the safe and effective use of AI/ML in drug discovery and manufacturing, the FDA and numerous other U.S. federal agencies have issued continuously updated, stringent guidelines. Intriguingly, these guidelines are often generated or updated with the aid of AI/ML tools themselves. The overarching goal is to expedite drug discovery, enhance the safety profiles of existing drugs, introduce novel treatment modalities, and improve manufacturing compliance and robustness. Recent FDA publications offer an encouraging outlook on the potential of these tools, emphasizing the need for their careful deployment. This has expanded market opportunities for retraining personnel handling these technologies and enabled innovative applications in emerging therapies such as gene editing, CRISPR-Cas9, CAR-T cells, mRNA-based treatments, and personalized medicine. In summary, the maturation of AI/ML technologies is a testament to human ingenuity. Far from being autonomous entities, these are tools created by and for humans designed to solve complex problems now and in the future. This paper aims to present the status of these technologies, along with examples of their present and future applications.
|
16
|
Simpkin AJ, Caballero I, McNicholas S, Stevenson K, Jiménez E, Sánchez Rodríguez F, Fando M, Uski V, Ballard C, Chojnowski G, Lebedev A, Krissinel E, Usón I, Rigden DJ, Keegan RM. Predicted models and CCP4. Acta Crystallogr D Struct Biol 2023; 79:806-819. [PMID: 37594303 PMCID: PMC10478639 DOI: 10.1107/s2059798323006289] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/12/2023] [Accepted: 07/19/2023] [Indexed: 08/19/2023] Open
Abstract
In late 2020, the results of CASP14, the 14th event in a series of competitions to assess the latest developments in computational protein structure-prediction methodology, revealed the giant leap forward that had been made by Google's DeepMind in tackling the prediction problem. Their predictions marked the first time that a competitor achieved a global distance test score of better than 90 across all categories of difficulty. This achievement represents both a challenge and an opportunity for the field of experimental structural biology. For structure determination by macromolecular X-ray crystallography, access to highly accurate structure predictions is of great benefit, particularly when it comes to solving the phase problem. Here, details of new utilities and enhanced applications in the CCP4 suite, designed to allow users to exploit predicted models in determining macromolecular structures from X-ray diffraction data, are presented. The focus is mainly on applications that can be used to solve the phase problem through molecular replacement.
Affiliation(s)
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Iracema Caballero
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona, Spain
| | - Stuart McNicholas
- York Structural Biology Laboratory, Department of Chemistry, The University of York, York YO10 5DD, United Kingdom
| | - Kyle Stevenson
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Elisabet Jiménez
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona, Spain
| | - Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- York Structural Biology Laboratory, Department of Chemistry, The University of York, York YO10 5DD, United Kingdom
| | - Maria Fando
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Ville Uski
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Charles Ballard
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Grzegorz Chojnowski
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
| | - Andrey Lebedev
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Eugene Krissinel
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Isabel Usón
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona, Spain
- ICREA, Institució Catalana de Recerca i Estudis Avançats, Passeig Lluís Companys 23, 08003 Barcelona, Spain
| | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Ronan M. Keegan
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
|
17
|
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. Acta Crystallogr F Struct Biol Commun 2023; 79:166-168. [PMID: 37358500 PMCID: PMC10327576 DOI: 10.1107/s2053230x23004934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/27/2023] Open
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J. Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Edward N. Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Charles S. Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
| | - Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
| | - Mark J. van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
|
18
|
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. IUCrJ 2023; 10:377-379. [PMID: 37358477 PMCID: PMC10324484 DOI: 10.1107/s2052252523004943] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/27/2023]
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J. Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Edward N. Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Charles S. Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
| | - Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
| | - Mark J. van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
|
19
|
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. Acta Crystallogr D Struct Biol 2023; 79:556-558. [PMID: 37378959 DOI: 10.1107/s2059798323004928] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/29/2023] Open
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Edward N Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Charles S Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
| | - Elspeth F Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
| | - Mark J van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
|
20
|
Eppinger E, Stolz A, Ferraroni M. Crystal structure of the monocupin ring-cleaving dioxygenase 5-nitrosalicylate 1,2-dioxygenase from Bradyrhizobium sp. Acta Crystallogr D Struct Biol 2023; 79:632-640. [PMID: 37326584 PMCID: PMC10306065 DOI: 10.1107/s2059798323004199] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Accepted: 05/14/2023] [Indexed: 06/17/2023] Open
Abstract
5-Nitrosalicylate 1,2-dioxygenase (5NSDO) is an iron(II)-dependent dioxygenase involved in the aerobic degradation of 5-nitroanthranilic acid by the bacterium Bradyrhizobium sp. It catalyzes the opening of the 5-nitrosalicylate aromatic ring, a key step in the degradation pathway. Besides 5-nitrosalicylate, the enzyme is also active towards 5-chlorosalicylate. The X-ray crystallographic structure of the enzyme was solved at 2.1 Å resolution by molecular replacement using a model from the AI program AlphaFold. The enzyme crystallized in the monoclinic space group P2₁, with unit-cell parameters a = 50.42, b = 143.17, c = 60.07 Å, β = 107.3°. 5NSDO belongs to the third class of ring-cleaving dioxygenases. Members of this family convert para-diols or hydroxylated aromatic carboxylic acids and belong to the cupin superfamily, which is one of the most functionally diverse protein classes and is named on the basis of a conserved β-barrel fold. 5NSDO is a tetramer composed of four identical subunits, each folded as a monocupin domain. The iron(II) ion in the enzyme active site is coordinated by His96, His98 and His136 and three water molecules with a distorted octahedral geometry. The residues in the active site are poorly conserved compared with other dioxygenases of the third class, such as gentisate 1,2-dioxygenase and salicylate 1,2-dioxygenase. Comparison with these other representatives of the same class and docking of the substrate into the active site of 5NSDO allowed the identification of residues which are crucial for the catalytic mechanism and enzyme selectivity.
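For a monoclinic cell, where α = γ = 90°, the unit-cell volume follows from the quoted parameters as V = a·b·c·sin β. The sketch below applies this standard formula to the values in the abstract; the function name is illustrative and the volume is not a figure reported in the abstract itself.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume (Å^3) of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Unit-cell parameters quoted in the abstract (lengths in Å, angle in degrees).
volume = monoclinic_cell_volume(50.42, 143.17, 60.07, 107.3)
print(f"Unit-cell volume: {volume:.0f} Å^3")
```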
Affiliation(s)
- Erik Eppinger
- Institut für Mikrobiologie, Universität Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
| | - Andreas Stolz
- Institut für Mikrobiologie, Universität Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
| | - Marta Ferraroni
- Dipartimento di Chimica ‘Ugo Schiff’, Università di Firenze, Via della Lastruccia 3, 50019 Sesto Fiorentino (FI), Italy
|
21
|
Millán C, McCoy AJ, Terwilliger TC, Read RJ. Likelihood-based docking of models into cryo-EM maps. Acta Crystallogr D Struct Biol 2023; 79:281-289. [PMID: 36920336 PMCID: PMC10071562 DOI: 10.1107/s2059798323001602] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/20/2022] [Accepted: 02/22/2023] [Indexed: 03/16/2023] Open
Abstract
Optimized docking of models into cryo-EM maps requires exploiting an understanding of the signal expected in the data to minimize the calculation time while maintaining sufficient signal. The likelihood-based rotation function used in crystallography can be employed to establish plausible orientations in a docking search. A phased likelihood translation function yields scores for the placement and rigid-body refinement of oriented models. Optimized strategies for choices of the resolution of data from the cryo-EM maps to use in the calculations and the size of search volumes are based on expected log-likelihood-gain scores computed in advance of the search calculation. Tests demonstrate that the new procedure is fast, robust and effective at placing models into even challenging cryo-EM maps.
Affiliation(s)
- Claudia Millán
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Airlie J. McCoy
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Thomas C. Terwilliger
- New Mexico Consortium, Los Alamos National Laboratory, 100 Entrada Drive, Los Alamos, NM 87544, USA
| | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
|
22
|
Bordin N, Dallago C, Heinzinger M, Kim S, Littmann M, Rauer C, Steinegger M, Rost B, Orengo C. Novel machine learning approaches revolutionize protein knowledge. Trends Biochem Sci 2023; 48:345-359. [PMID: 36504138 PMCID: PMC10570143 DOI: 10.1016/j.tibs.2022.11.001] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 07/14/2022] [Revised: 10/24/2022] [Accepted: 11/17/2022] [Indexed: 12/10/2022]
Abstract
Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.
Affiliation(s)
- Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK
| | - Christian Dallago
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; VantAI, 151 W 42nd Street, New York, NY 10036, USA
| | - Michael Heinzinger
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748 Garching, Germany
| | - Stephanie Kim
- School of Biological Sciences, Seoul National University, Seoul, South Korea; Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
| | - Maria Littmann
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany
| | - Clemens Rauer
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK
| | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea; Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
| | - Burkhard Rost
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching/Munich, Germany; TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
| | - Christine Orengo
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK.
|
23
|
Terwilliger TC, Afonine PV, Liebschner D, Croll TI, McCoy AJ, Oeffner RD, Williams CJ, Poon BK, Richardson JS, Read RJ, Adams PD. Accelerating crystal structure determination with iterative AlphaFold prediction. Acta Crystallogr D Struct Biol 2023; 79:234-244. [PMID: 36876433 PMCID: PMC9986801 DOI: 10.1107/s205979832300102x] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 11/30/2022] [Accepted: 02/03/2023] [Indexed: 02/28/2023] Open
Abstract
Experimental structure determination can be accelerated with artificial intelligence (AI)-based structure-prediction methods such as AlphaFold. Here, an automatic procedure requiring only sequence information and crystallographic data is presented that uses AlphaFold predictions to produce an electron-density map and a structural model. Iterating through cycles of structure prediction is a key element of this procedure: a predicted model rebuilt in one cycle is used as a template for prediction in the next cycle. This procedure was applied to X-ray data for 215 structures released by the Protein Data Bank in a recent six-month period. In 87% of cases our procedure yielded a model with at least 50% of Cα atoms matching those in the deposited models within 2 Å. Predictions from the iterative template-guided prediction procedure were more accurate than those obtained without templates. It is concluded that AlphaFold predictions obtained based on sequence information alone are usually accurate enough to solve the crystallographic phase problem with molecular replacement, and a general strategy for macromolecular structure determination that includes AI-based prediction both as a starting point and as a method of model optimization is suggested.
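The success criterion in this abstract (at least 50% of Cα atoms within 2 Å of the deposited model) can be illustrated with a short sketch. It assumes the two models are already superimposed and that Cα atoms are paired by index; the function name, the toy coordinates, and the choice of an inclusive cutoff are all assumptions for illustration, not the authors' implementation.

```python
import math

Coord = tuple[float, float, float]


def fraction_within(model_ca: list[Coord], reference_ca: list[Coord],
                    cutoff: float = 2.0) -> float:
    """Fraction of paired C-alpha atoms whose distance is <= cutoff (Å)."""
    if len(model_ca) != len(reference_ca) or not model_ca:
        raise ValueError("need two non-empty, equally long coordinate lists")
    close = sum(1 for m, r in zip(model_ca, reference_ca)
                if math.dist(m, r) <= cutoff)
    return close / len(model_ca)


# Toy data: four paired C-alpha positions (Å), hypothetical values.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.5, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 3.0), (11.4, 5.0, 0.0)]

frac = fraction_within(model, reference)
passes = frac >= 0.5  # the 50% threshold quoted in the abstract
print(f"{frac:.0%} of C-alpha atoms within 2 Å; criterion met: {passes}")
```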
Affiliation(s)
- Thomas C. Terwilliger
- New Mexico Consortium, Los Alamos, NM 87544, USA
- Los Alamos National Laboratory, Los Alamos, NM 87545, USA
| | - Pavel V. Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Dorothee Liebschner
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Tristan I. Croll
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Airlie J. McCoy
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Robert D. Oeffner
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | | | - Billy K. Poon
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | | | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Paul D. Adams
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94720, USA
|
24
|
Oeffner RD, Croll TI, Millán C, Poon BK, Schlicksup CJ, Read RJ, Terwilliger TC. Putting AlphaFold models to work with phenix.process_predicted_model and ISOLDE. Acta Crystallogr D Struct Biol 2022; 78:1303-1314. [PMID: 36322415 PMCID: PMC9629492 DOI: 10.1107/s2059798322010026] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 07/22/2022] [Accepted: 10/13/2022] [Indexed: 11/23/2022] Open
Abstract
AlphaFold has recently become an important tool in providing models for experimental structure determination by X-ray crystallography and cryo-EM. Large parts of the predicted models typically approach the accuracy of experimentally determined structures, although there are frequently local errors and errors in the relative orientations of domains. Importantly, residues in the model of a protein predicted by AlphaFold are tagged with a predicted local distance difference test score, informing users about which regions of the structure are predicted with less confidence. AlphaFold also produces a predicted aligned error matrix indicating its confidence in the relative positions of each pair of residues in the predicted model. The phenix.process_predicted_model tool downweights or removes low-confidence residues and can break a model into confidently predicted domains in preparation for molecular replacement or cryo-EM docking. These confidence metrics are further used in ISOLDE to weight torsion and atom-atom distance restraints, allowing the complete AlphaFold model to be interactively rearranged to match the docked fragments and reducing the need for the rebuilding of connecting regions.
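The idea of downweighting or removing low-confidence residues can be sketched in a few lines. This is not the phenix.process_predicted_model code: the pLDDT cutoff of 70, the empirical error estimate 1.5·exp(4·(0.7 − pLDDT/100)), and both function names are assumptions for illustration and should be checked against the Phenix documentation. Only the relation B = (8π²/3)·rmsd² between a B-factor and an rms positional error is standard. The actual tool also uses the predicted aligned error matrix to split domains, which this sketch does not attempt.

```python
import math


def plddt_to_b_factor(plddt: float) -> float:
    """Convert a pLDDT score (0-100) to a pseudo B-factor (Å^2).

    The error estimate below is an assumed empirical relation; the
    coefficients are illustrative, not taken from the abstract.
    """
    rms_error = 1.5 * math.exp(4.0 * (0.7 - plddt / 100.0))  # estimated error in Å
    return (8.0 * math.pi ** 2 / 3.0) * rms_error ** 2


def confident_segments(plddts: list[float], cutoff: float = 70.0) -> list[tuple[int, int]]:
    """Return inclusive (start, end) index ranges of residues with pLDDT >= cutoff."""
    segments = []
    start = None
    for i, score in enumerate(plddts):
        if score >= cutoff:
            if start is None:
                start = i  # open a new confident segment
        elif start is not None:
            segments.append((start, i - 1))  # close the current segment
            start = None
    if start is not None:
        segments.append((start, len(plddts) - 1))
    return segments


# Hypothetical per-residue pLDDT values for a seven-residue stretch.
scores = [45.0, 82.0, 91.0, 88.0, 60.0, 75.0, 93.0]
segments = confident_segments(scores)
b_factors = [plddt_to_b_factor(s) for s in scores]
print(segments)
```

Residues outside the returned segments would be the ones trimmed before molecular replacement or docking, while the pseudo B-factors downweight the less confident residues that are kept.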
Affiliation(s)
- Robert D. Oeffner
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Tristan I. Croll
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Claudia Millán
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Billy K. Poon
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory (LBNL), Building 33R0349, Berkeley, CA 94720-8235, USA
| | - Christopher J. Schlicksup
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory (LBNL), Building 33R0349, Berkeley, CA 94720-8235, USA
| | - Randy J. Read
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Tom C. Terwilliger
- New Mexico Consortium, Los Alamos National Laboratory, 100 Entrada Drive, Los Alamos, NM 87544, USA
|
25
|
Terwilliger TC, Poon BK, Afonine PV, Schlicksup CJ, Croll TI, Millán C, Richardson JS, Read RJ, Adams PD. Improved AlphaFold modeling with implicit experimental information. Nat Methods 2022; 19:1376-1382. [PMID: 36266465 PMCID: PMC9636017 DOI: 10.1038/s41592-022-01645-6] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 02/02/2022] [Accepted: 09/09/2022] [Indexed: 12/02/2022]
Abstract
Machine-learning prediction algorithms such as AlphaFold and RoseTTAFold can create remarkably accurate protein models, but these models usually have some regions that are predicted with low confidence or poor accuracy. We hypothesized that by implicitly including new experimental information such as a density map, a greater portion of a model could be predicted accurately, and that this might synergistically improve parts of the model that were not fully addressed by either machine learning or experiment alone. An iterative procedure was developed in which AlphaFold models are automatically rebuilt on the basis of experimental density maps and the rebuilt models are used as templates in new AlphaFold predictions. We show that including experimental information improves prediction beyond the improvement obtained with simple rebuilding guided by the experimental data. This procedure for AlphaFold modeling with density has been incorporated into an automated procedure for interpretation of crystallographic and electron cryo-microscopy maps.
Affiliation(s)
- Thomas C Terwilliger
- New Mexico Consortium, Los Alamos, NM, USA.
- Los Alamos National Laboratory, Los Alamos, NM, USA.
| | - Billy K Poon
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Pavel V Afonine
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Christopher J Schlicksup
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Tristan I Croll
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | - Claudia Millán
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | | | - Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | - Paul D Adams
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Bioengineering, University of California, Berkeley, CA, USA
|
26
|
Medina A, Jiménez E, Caballero I, Castellví A, Triviño Valls J, Alcorlo M, Molina R, Hermoso JA, Sammito MD, Borges R, Usón I. Verification: model-free phasing with enhanced predicted models in ARCIMBOLDO_SHREDDER. Acta Crystallogr D Struct Biol 2022; 78:1283-1293. [PMID: 36322413 PMCID: PMC9629495 DOI: 10.1107/s2059798322009706] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/20/2022] [Accepted: 10/03/2022] [Indexed: 11/23/2022] Open
Abstract
Structure predictions have matched the accuracy of experimental structures from close homologues, providing suitable models for molecular replacement phasing. Even in predictions that present large differences due to the relative movement of domains or poorly predicted areas, very accurate regions tend to be present. These are suitable for successful fragment-based phasing as implemented in ARCIMBOLDO. The particularities of predicted models are inherently addressed in the new predicted_model mode, rendering preliminary treatment superfluous but also harmless. B-value conversion from predicted LDDT or error estimates, the removal of unstructured polypeptide, hierarchical decomposition of structural units from domains to local folds and systematically probing the model against the experimental data will ensure the optimal use of the model in phasing. Concomitantly, the exhaustive use of models and stereochemistry in phasing, refinement and validation raises the concern of crystallographic model bias and the need to critically establish the information contributed by the experiment. Therefore, in its predicted_model mode ARCIMBOLDO_SHREDDER will first determine whether the input model already constitutes a solution or provides a straightforward solution with Phaser. If not, extracted fragments will be located. If the landscape of solutions reveals numerous, clearly discriminated and consistent probes or if the input model already constitutes a solution, model-free verification will be activated. Expansions with SHELXE will omit the partial solution seeding phases and all traces outside their respective masks will be combined in ALIXE, as far as consistent. This procedure completely eliminates the molecular replacement search model in favour of the inferences derived from this model. In the case of fragments, an incorrect starting hypothesis impedes expansion. The predicted_model mode has been tested in different scenarios.
Affiliation(s)
- Ana Medina
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Elisabet Jiménez
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Iracema Caballero
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Albert Castellví
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Josep Triviño Valls
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Martin Alcorlo
  - Department of Crystallography and Structural Biology, Institute of Physical Chemistry ‘Rocasolano’, Spanish National Research Council (CSIC), Madrid, Spain
- Rafael Molina
  - Department of Crystallography and Structural Biology, Institute of Physical Chemistry ‘Rocasolano’, Spanish National Research Council (CSIC), Madrid, Spain
- Juan A. Hermoso
  - Department of Crystallography and Structural Biology, Institute of Physical Chemistry ‘Rocasolano’, Spanish National Research Council (CSIC), Madrid, Spain
- Massimo D. Sammito
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Rafael Borges
  - Department of Biophysics and Pharmacology, Biosciences Institute, São Paulo State University (UNESP), Botucatu, Sao Paulo 18618-689, Brazil
- Isabel Usón
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
  - ICREA, Institució Catalana de Recerca i Estudis Avançats, Passeig Lluís Companys 23, 08003 Barcelona, Spain
|
27
|
Pak MA, Ivankov DN. Best templates outperform homology models in predicting the impact of mutations on protein stability. Bioinformatics 2022; 38:4312-4320. [PMID: 35894930 DOI: 10.1093/bioinformatics/btac515] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/14/2021] [Revised: 05/31/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Prediction of protein stability change upon mutation (ΔΔG) is crucial for facilitating protein engineering and understanding of protein folding principles. Robust prediction of protein folding free energy change requires knowledge of the protein three-dimensional (3D) structure. If the protein 3D structure is not available, one can predict the structure from the protein sequence; however, the prospects of ΔΔG prediction for predicted protein structures are unknown. The accuracy of using 3D structures of the best templates for ΔΔG prediction is also unclear. RESULTS: To investigate these questions, we used a representative set of seven diverse and accurate publicly available tools (FoldX, Eris, Rosetta, DDGun, ACDC-NN, ThermoNet and DynaMut) for stability change prediction combined with AlphaFold or I-TASSER for protein 3D structure prediction. We found that best templates perform consistently better than (or similar to) homology models for all ΔΔG predictors. Our findings suggest using the best template structure for the prediction of protein stability change upon mutation if the protein 3D structure is not available. AVAILABILITY AND IMPLEMENTATION: The data are available at https://github.com/ivankovlab/template-vs-model. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marina A Pak
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
- Dmitry N Ivankov
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
|
28
|
Krissinel E, Lebedev AA, Uski V, Ballard CB, Keegan RM, Kovalevskiy O, Nicholls RA, Pannu NS, Skubák P, Berrisford J, Fando M, Lohkamp B, Wojdyr M, Simpkin AJ, Thomas JMH, Oliver C, Vonrhein C, Chojnowski G, Basle A, Purkiss A, Isupov MN, McNicholas S, Lowe E, Triviño J, Cowtan K, Agirre J, Rigden DJ, Uson I, Lamzin V, Tews I, Bricogne G, Leslie AGW, Brown DG. CCP4 Cloud for structure determination and project management in macromolecular crystallography. Acta Crystallogr D Struct Biol 2022; 78:1079-1089. [PMID: 36048148 PMCID: PMC9435598 DOI: 10.1107/s2059798322007987] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 05/18/2022] [Accepted: 08/08/2022] [Indexed: 11/10/2022] Open
Abstract
Nowadays, progress in the determination of three-dimensional macromolecular structures from diffraction images is achieved partly at the cost of increasing data volumes. This is due to the deployment of modern high-speed, high-resolution detectors, the increased complexity and variety of crystallographic software, the use of extensive databases and high-performance computing. This limits what can be accomplished with personal, offline, computing equipment in terms of both productivity and maintainability. There is also an issue of long-term data maintenance and availability of structure-solution projects as the links between experimental observations and the final results deposited in the PDB. In this article, CCP4 Cloud, a new front-end of the CCP4 software suite, is presented which mitigates these effects by providing an online, cloud-based environment for crystallographic computation. CCP4 Cloud was developed for the efficient delivery of computing power, database services and seamless integration with web resources. It provides a rich graphical user interface that allows project sharing and long-term storage for structure-solution projects, and can be linked to data-producing facilities. The system is distributed with the CCP4 software suite version 7.1 and higher, and an online publicly available instance of CCP4 Cloud is provided by CCP4.
Affiliation(s)
- Eugene Krissinel
  - Scientific Computing Department, Science and Technology Facilities Council UK, Didcot OX11 0FA, United Kingdom
- Andrey A. Lebedev
  - Scientific Computing Department, Science and Technology Facilities Council UK, Didcot OX11 0FA, United Kingdom
- Ville Uski
  - Scientific Computing Department, Science and Technology Facilities Council UK, Didcot OX11 0FA, United Kingdom
- Charles B. Ballard
  - Scientific Computing Department, Science and Technology Facilities Council UK, Didcot OX11 0FA, United Kingdom
- Ronan M. Keegan
  - Scientific Computing Department, Science and Technology Facilities Council UK, Didcot OX11 0FA, United Kingdom
- Oleg Kovalevskiy
  - Scientific Computing Department, Science and Technology Facilities Council UK, Didcot OX11 0FA, United Kingdom
- Robert A. Nicholls
  - Structural Studies Division, MRC Laboratory for Structural Biology, Cambridge CB2 0QH, United Kingdom
- Navraj S. Pannu
  - Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Pavol Skubák
  - Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- John Berrisford
  - European Bioinformatics Institute, Hinxton CB9 1SD, United Kingdom
- Maria Fando
  - Institute of Protein Research, Pushchino 142290, Russian Federation
  - Translational and Clinical Research Institute, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
  - Biological Sciences, Institute for Life Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Bernhard Lohkamp
  - Department of Medical Biochemistry and Biophysics, Karolinska Institutet, SE-171 77 Stockholm, Sweden
- Marcin Wojdyr
  - Global Phasing Limited, Sheraton House, Castle Park, Cambridge CB3 0AX, United Kingdom
- Adam J. Simpkin
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Jens M. H. Thomas
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Christopher Oliver
  - School of Physics and Astronomy, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Clemens Vonrhein
  - Global Phasing Limited, Sheraton House, Castle Park, Cambridge CB3 0AX, United Kingdom
- Grzegorz Chojnowski
  - European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
- Arnaud Basle
  - Newcastle University Biosciences Institute Medical School, Newcastle upon Tyne NE2 4AX, United Kingdom
- Andrew Purkiss
  - Structural Biology Science Technology Platform, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
- Michail N. Isupov
  - Biosciences, University of Exeter, Stocker Road, Exeter EX4 4QD, United Kingdom
- Stuart McNicholas
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Edward Lowe
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, United Kingdom
- Josep Triviño
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Kevin Cowtan
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Jon Agirre
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Daniel J. Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Isabel Uson
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
  - ICREA: Institució Catalana de Recerca i Estudis Avançats, Pg. Lluis Companys 23, 08010 Barcelona, Spain
- Victor Lamzin
  - European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
- Ivo Tews
  - Biological Sciences, Institute for Life Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Gerard Bricogne
  - Global Phasing Limited, Sheraton House, Castle Park, Cambridge CB3 0AX, United Kingdom
- Andrew G. W. Leslie
  - Structural Studies Division, MRC Laboratory for Structural Biology, Cambridge CB2 0QH, United Kingdom
- David G. Brown
  - Servier Research Institute in Croissy-sur-Seine, 125 Chemin de Ronde, 78290 Croissy, France
|
29
|
Structure Prediction, Evaluation, and Validation of GPR18 Lipid Receptor Using Free Programs. Int J Mol Sci 2022; 23:7917. [PMID: 35887268 PMCID: PMC9319093 DOI: 10.3390/ijms23147917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Revised: 07/04/2022] [Accepted: 07/08/2022] [Indexed: 11/30/2022] Open
Abstract
The GPR18 receptor, often referred to as the N-arachidonylglycine receptor, has been assigned (along with GPR55 and GPR119) to the new class A GPCR subfamily of lipid receptors, but officially it still has the status of a class A orphan GPCR. While its signaling pathways and biological significance have not yet been fully elucidated, increasing evidence points to the therapeutic potential of GPR18 in relation to immune, neurodegenerative, and cancer processes, to name a few. Therefore, it is necessary to understand the interactions of potential ligands with the receptor and the influence of particular structural elements on their activity. Thus, given the lack of an experimentally solved structure, the goal of the present study was to obtain a homology model of the GPR18 receptor in the inactive state, meeting all requirements in terms of protein structure quality and recognition of active ligands. To increase the reliability and precision of the predictions, different contemporary protein structure prediction methods and software were used and compared herein. To test the usability of the resulting models, we optimized and compared the selected structures and then assessed their ability to recognize known, active ligands. The stability of the predicted poses was then evaluated by means of molecular dynamics simulations. However, most of the best-ranking contemporary CADD software and platforms require rather expensive licenses for full usability. To overcome this practical obstacle, the overarching goal of these studies was to test whether it is possible to perform thorough CADD experiments with high scientific confidence while using only license-free or academic software and online platforms. The obtained results indicate that a wide range of freely available software and academic licenses allows meaningful molecular modelling and docking studies to be carried out.
|
30
|
Hryc CF, Baker ML. AlphaFold2 and CryoEM: Revisiting CryoEM modeling in near-atomic resolution density maps. iScience 2022; 25:104496. [PMID: 35733789 PMCID: PMC9207676 DOI: 10.1016/j.isci.2022.104496] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/04/2022] [Revised: 04/07/2022] [Accepted: 05/24/2022] [Indexed: 11/27/2022] Open
Abstract
With the advent of new artificial intelligence and machine learning algorithms, predictive modeling can, in some cases, produce structures on par with experimental methods. The combination of predictive modeling and experimental structure determination by electron cryomicroscopy (cryoEM) offers a tantalizing approach for producing robust atomic models of macromolecular assemblies. Here, we apply AlphaFold2 to a set of community standard data sets and compare the results with the corresponding reference maps and models. Moreover, we present three unique case studies from previously determined cryoEM density maps of viruses. Our results show that AlphaFold2 can not only produce reasonably accurate models for analysis and additional hypotheses testing, but can also potentially yield incorrect structures if not properly validated with experimental data. Whereas we outline numerous shortcomings and potential pitfalls of predictive modeling, the obvious synergy between predictive modeling and cryoEM will undoubtedly result in new computational modeling tools.
Affiliation(s)
- Corey F. Hryc
  - Department of Biochemistry and Molecular Biology, Structural Biology Imaging Center, McGovern Medical School at The University of Texas Health Science Center at Houston, 6431 Fannin Street, Houston, TX 77030, USA
- Matthew L. Baker
  - Department of Biochemistry and Molecular Biology, Structural Biology Imaging Center, McGovern Medical School at The University of Texas Health Science Center at Houston, 6431 Fannin Street, Houston, TX 77030, USA
|
31
|
Moi D, Nishio S, Li X, Valansi C, Langleib M, Brukman NG, Flyak K, Dessimoz C, de Sanctis D, Tunyasuvunakool K, Jumper J, Graña M, Romero H, Aguilar PS, Jovine L, Podbilewicz B. Discovery of archaeal fusexins homologous to eukaryotic HAP2/GCS1 gamete fusion proteins. Nat Commun 2022; 13:3880. [PMID: 35794124 PMCID: PMC9259645 DOI: 10.1038/s41467-022-31564-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/27/2022] [Accepted: 06/22/2022] [Indexed: 12/26/2022] Open
Abstract
Sexual reproduction consists of genome reduction by meiosis and subsequent gamete fusion. The presence of genes homologous to eukaryotic meiotic genes in archaea and bacteria suggests that DNA repair mechanisms evolved towards meiotic recombination. However, fusogenic proteins resembling those found in gamete fusion in eukaryotes have so far not been found in prokaryotes. Here, we identify archaeal proteins that are homologs of fusexins, a superfamily of fusogens that mediate eukaryotic gamete and somatic cell fusion, as well as virus entry. The crystal structure of a trimeric archaeal fusexin (Fusexin1 or Fsx1) reveals an archetypical fusexin architecture with unique features such as a six-helix bundle and an additional globular domain. Ectopically expressed Fusexin1 can fuse mammalian cells, and this process involves the additional globular domain and a conserved fusion loop. Furthermore, archaeal fusexin genes are found within integrated mobile elements, suggesting potential roles in cell-cell fusion and gene exchange in archaea, as well as different scenarios for the evolutionary history of fusexins.
Affiliation(s)
- David Moi
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE-CONICET), Buenos Aires, Argentina
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Shunsuke Nishio
  - Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Xiaohui Li
  - Department of Biology, Technion- Israel Institute of Technology, Haifa, Israel
- Clari Valansi
  - Department of Biology, Technion- Israel Institute of Technology, Haifa, Israel
- Mauricio Langleib
  - Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Unidad de Bioinformática, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Nicolas G Brukman
  - Department of Biology, Technion- Israel Institute of Technology, Haifa, Israel
- Kateryna Flyak
  - Department of Biology, Technion- Israel Institute of Technology, Haifa, Israel
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Genetics, Evolution and Environment, Centre for Life's Origins and Evolution, University College London, London, UK
  - Department of Computer Science, University College London, London, UK
- Martin Graña
  - Unidad de Bioinformática, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Héctor Romero
  - Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Centro Universitario Regional Este - CURE, Centro Interdisciplinario de Ciencia de Datos y Aprendizaje Automático - CICADA, Universidad de la República, Montevideo, Uruguay
- Pablo S Aguilar
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE-CONICET), Buenos Aires, Argentina
  - Instituto de Investigaciones Biotecnológicas Universidad Nacional de San Martín (IIB-CONICET), San Martín, Buenos Aires, Argentina
- Luca Jovine
  - Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
|
32
|
Laurents DV. AlphaFold 2 and NMR Spectroscopy: Partners to Understand Protein Structure, Dynamics and Function. Front Mol Biosci 2022; 9:906437. [PMID: 35655760 PMCID: PMC9152297 DOI: 10.3389/fmolb.2022.906437] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 03/28/2022] [Accepted: 04/25/2022] [Indexed: 11/29/2022] Open
Abstract
The artificial intelligence program AlphaFold 2 is revolutionizing the field of protein structure determination as it accurately predicts the 3D structure of two thirds of the human proteome. Its predictions can be used directly as structural models or indirectly as aids for experimental structure determination using X-ray crystallography, CryoEM or NMR spectroscopy. Nevertheless, AlphaFold 2 can neither afford insight into how proteins fold, nor can it determine protein stability or dynamics. Rare folds or minor alternative conformations are also not predicted by AlphaFold 2, and the program does not forecast the impact of post-translational modifications, mutations or ligand binding. The remaining third of the human proteome, which is poorly predicted, largely corresponds to intrinsically disordered regions of proteins. Key to regulation and signaling networks, these disordered regions often form biomolecular condensates or amyloids. Fortunately, the limitations of AlphaFold 2 are largely complemented by NMR spectroscopy. This experimental approach provides information on protein folding and dynamics as well as biomolecular condensates and amyloids and their modulation by experimental conditions, small molecules, post-translational modifications, mutations, flanking sequence, and interactions with other proteins, RNA and viruses. Together, NMR spectroscopy and AlphaFold 2 can collaborate to advance our comprehension of proteins.
|
33
|
Simpkin AJ, Thomas JMH, Keegan RM, Rigden DJ. MrParse: finding homologues in the PDB and the EBI AlphaFold database for molecular replacement and more. Acta Crystallogr D Struct Biol 2022; 78:553-559. [PMID: 35503204 PMCID: PMC9063843 DOI: 10.1107/s2059798322003576] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/31/2021] [Accepted: 03/29/2022] [Indexed: 11/10/2022] Open
Abstract
Crystallographers have an array of search-model options for structure solution by molecular replacement (MR). The well established options of homologous experimental structures and regular secondary-structure elements or motifs are increasingly supplemented by computational modelling. Such modelling may be carried out locally or may use pre-calculated predictions retrieved from databases such as the EBI AlphaFold database. MrParse is a new pipeline to help to streamline the decision process in MR by consolidating bioinformatic predictions in one place. When reflection data are provided, MrParse can rank any experimental homologues found using eLLG, which indicates the likelihood that a given search model will work in MR. Inbuilt displays of predicted secondary structure, coiled-coil and transmembrane regions further inform the choice of MR protocol. MrParse can also identify and rank homologues in the EBI AlphaFold database, a function that will also interest other structural biologists and bioinformaticians.
Affiliation(s)
- Adam J. Simpkin
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Jens M. H. Thomas
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Ronan M. Keegan
  - UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Daniel J. Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
|
34
|
Aderinwale T, Bharadwaj V, Christoffer C, Terashi G, Zhang Z, Jahandideh R, Kagaya Y, Kihara D. Real-time structure search and structure classification for AlphaFold protein models. Commun Biol 2022; 5:316. [PMID: 35383281 PMCID: PMC8983703 DOI: 10.1038/s42003-022-03261-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 11/09/2021] [Accepted: 03/11/2022] [Indexed: 11/17/2022] Open
Abstract
Last year saw a breakthrough in protein structure prediction, where the AlphaFold2 method showed a substantial improvement in modeling accuracy. Following the software release of AlphaFold2, structures predicted by AlphaFold2 for proteins in 21 species were made publicly available via the AlphaFold Database. Here, to facilitate structural analysis and application of AlphaFold2 models, we provide the infrastructure, 3D-AF-Surfer, which allows real-time structure-based search for the AlphaFold2 models. In 3D-AF-Surfer, structures are represented with 3D Zernike descriptors (3DZD), which are rotationally invariant, mathematical representations of 3D shapes. We developed a neural network that takes 3DZDs of proteins as input and retrieves proteins of the same fold more accurately than direct comparison of 3DZDs. Using 3D-AF-Surfer, we report structure classifications of AlphaFold2 models and discuss the correlation between confidence levels of AlphaFold2 models and intrinsically disordered regions.
Affiliation(s)
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Vijay Bharadwaj
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
|
35
|
van Breugel M, Rosa E Silva I, Andreeva A. Structural validation and assessment of AlphaFold2 predictions for centrosomal and centriolar proteins and their complexes. Commun Biol 2022; 5:312. [PMID: 35383272 PMCID: PMC8983713 DOI: 10.1038/s42003-022-03269-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 10/12/2021] [Accepted: 02/28/2022] [Indexed: 11/21/2022] Open
Abstract
Obtaining the high-resolution structures of proteins and their complexes is a crucial aspect of understanding the mechanisms of life. Experimental structure determination methods are time-consuming, expensive and cannot keep pace with the growing number of protein sequences available through genomic DNA sequencing. Thus, the ability to accurately predict the structure of proteins from their sequence is a holy grail of structural and computational biology that would remove a bottleneck in our efforts to understand as well as rationally engineer living systems. Recent advances in protein structure prediction, in particular the breakthrough with the AI-based tool AlphaFold2 (AF2), hold promise for achieving this goal, but the practical utility of AF2 remains to be explored. Focusing on proteins with essential roles in centrosome and centriole biogenesis, we demonstrate the quality and usability of the AF2 prediction models and we show that they can provide important insights into the modular organization of two key players in this process, CEP192 and CEP44. Furthermore, we used the AF2 algorithm to elucidate and then experimentally validate previously unknown prime features in the structure of TTBK2 bound to CEP164, as well as the Chibby1-FAM92A complex for which no structural information was available to date. These findings have important implications in understanding the regulation and function of these complexes. Finally, we also discuss some practical limitations of AF2 and anticipate the implications for future research approaches in the centriole/centrosome field.
Affiliation(s)
- Mark van Breugel
  - Queen Mary University of London, School of Biological and Behavioural Sciences, 4 Newark Street, London, E1 2AT, UK
  - Medical Research Council-Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Ivan Rosa E Silva
  - Queen Mary University of London, School of Biological and Behavioural Sciences, 4 Newark Street, London, E1 2AT, UK
  - Medical Research Council-Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
  - University of Campinas, Faculty of Pharmaceutical Sciences, Cândido Portinari Street, Campinas, 13083-871, Brazil
- Antonina Andreeva
  - Medical Research Council-Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
|
36
|
Barbarin-Bocahu I, Graille M. The X-ray crystallography phase problem solved thanks to AlphaFold and RoseTTAFold models: a case-study report. Acta Crystallogr D Struct Biol 2022; 78:517-531. [PMID: 35362474 DOI: 10.1107/s2059798322002157] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/14/2021] [Accepted: 02/23/2022] [Indexed: 11/10/2022]
Abstract
The breakthrough recently made in protein structure prediction by deep-learning programs such as AlphaFold and RoseTTAFold will certainly revolutionize biology over the coming decades. The scientific community is only starting to appreciate the various applications, benefits and limitations of these protein models. Yet, after the first thrills due to this revolution, it is important to evaluate the impact of the proposed models and their overall quality to avoid the misinterpretation or overinterpretation of these models by biologists. One of the first applications of these models is in solving the 'phase problem' encountered in X-ray crystallography in calculating electron-density maps from diffraction data. Indeed, the most frequently used technique to derive electron-density maps is molecular replacement. As this technique relies on knowledge of the structure of a protein that shares strong structural similarity with the studied protein, the availability of high-accuracy models is then definitely critical for successful structure solution. After the collection of a 2.45 Å resolution data set, we struggled for two years in trying to solve the crystal structure of a protein involved in the nonsense-mediated mRNA decay pathway, an mRNA quality-control pathway dedicated to the elimination of eukaryotic mRNAs harboring premature stop codons. We used different methods (isomorphous replacement, anomalous diffraction and molecular replacement) to determine this structure, but all failed until we straightforwardly succeeded thanks to both AlphaFold and RoseTTAFold models. Here, we describe how these new models helped us to solve this structure and conclude that in our case the AlphaFold model largely outcompetes the other models. We also discuss the importance of search-model generation for successful molecular replacement.
|
37
|
Hegedűs T, Geisler M, Lukács GL, Farkas B. Ins and outs of AlphaFold2 transmembrane protein structure predictions. Cell Mol Life Sci 2022; 79:73. [PMID: 35034173 PMCID: PMC8761152 DOI: 10.1007/s00018-021-04112-1] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 09/13/2021] [Revised: 11/25/2021] [Accepted: 12/20/2021] [Indexed: 12/20/2022]
Abstract
Transmembrane (TM) proteins are major drug targets, but their structure determination, a prerequisite for rational drug design, remains challenging. Recently, the DeepMind's AlphaFold2 machine learning method greatly expanded the structural coverage of sequences with high accuracy. Since the employed algorithm did not take specific properties of TM proteins into account, the reliability of the generated TM structures should be assessed. Therefore, we quantitatively investigated the quality of structures at genome scales, at the level of ABC protein superfamily folds and for specific membrane proteins (e.g. dimer modeling and stability in molecular dynamics simulations). We tested template-free structure prediction with a challenging TM CASP14 target and several TM protein structures published after AlphaFold2 training. Our results suggest that AlphaFold2 performs well in the case of TM proteins and its neural network is not overfitted. We conclude that cautious applications of AlphaFold2 structural models will advance TM protein-associated studies at an unexpected level.
Affiliation(s)
- Tamás Hegedűs
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary.
- TKI, Eötvös Loránd Research Network, Budapest, Hungary.
| | - Markus Geisler
- Department of Biology, University of Fribourg, Fribourg, Switzerland
| | | | - Bianka Farkas
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
| |
|
38
|
|
39
|
McCoy AJ, Sammito MD, Read RJ. Implications of AlphaFold2 for crystallographic phasing by molecular replacement. Acta Crystallogr D Struct Biol 2022; 78:1-13. [PMID: 34981757 PMCID: PMC8725160 DOI: 10.1107/s2059798321012122] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Received: 05/18/2021] [Accepted: 11/13/2021] [Indexed: 12/11/2022]
Abstract
The AlphaFold2 results in the 14th edition of Critical Assessment of Structure Prediction (CASP14) showed that accurate (low root-mean-square deviation) in silico models of protein structure domains are on the horizon, whether or not the protein is related to known structures through high-coverage sequence similarity. As highly accurate models become available, generated by harnessing the power of correlated mutations and deep learning, one of the aspects of structural biology to be impacted will be methods of phasing in crystallography. Here, the data from CASP14 are used to explore the prospects for changes in phasing methods, and in particular to explore the prospects for molecular-replacement phasing using in silico models.
Affiliation(s)
- Airlie J. McCoy
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Massimo D. Sammito
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| |
|
40
|
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIV. Proteins 2021; 89:1607-1617. [PMID: 34533838 PMCID: PMC8726744 DOI: 10.1002/prot.26237] [Citation(s) in RCA: 236] [Impact Index Per Article: 78.7] [Received: 07/23/2021] [Accepted: 07/28/2021] [Indexed: 01/14/2023]
Abstract
Critical assessment of structure prediction (CASP) is a community experiment to advance methods of computing three-dimensional protein structure from amino acid sequence. Core components are rigorous blind testing of methods and evaluation of the results by independent assessors. In the most recent experiment (CASP14), deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. In this sense, the results represent a solution to the classical protein-folding problem, at least for single proteins. The models have already been shown to be capable of providing solutions for problematic crystal structures, and there are broad implications for the rest of structural biology. Other research groups also substantially improved performance. Here, we describe these results and outline some of the many implications. Other related areas of CASP, including modeling of protein complexes, structure refinement, estimation of model accuracy, and prediction of inter-residue contacts and distances, are also described.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
| | - Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Maya Topf
- Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
| | - Krzysztof Fidelis
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
| | - John Moult
- Institute for Bioscience and Biotechnology Research, 9600 Gudelsky Drive, Rockville, MD 20850, USA, Department of Cell Biology and Molecular Genetics, University of Maryland
| |
|
41
|
Alexander LT, Lepore R, Kryshtafovych A, Adamopoulos A, Alahuhta M, Arvin AM, Bomble YJ, Böttcher B, Breyton C, Chiarini V, Chinnam NB, Chiu W, Fidelis K, Grinter R, Gupta GD, Hartmann MD, Hayes CS, Heidebrecht T, Ilari A, Joachimiak A, Kim Y, Linares R, Lovering AL, Lunin VV, Lupas AN, Makbul C, Michalska K, Moult J, Mukherjee PK, Nutt WS, Oliver SL, Perrakis A, Stols L, Tainer JA, Topf M, Tsutakawa SE, Valdivia‐Delgado M, Schwede T. Target highlights in CASP14: Analysis of models by structure providers. Proteins 2021; 89:1647-1672. [PMID: 34561912 PMCID: PMC8616854 DOI: 10.1002/prot.26247] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/08/2021] [Revised: 09/13/2021] [Accepted: 09/16/2021] [Indexed: 12/11/2022]
Abstract
The biological and functional significance of selected Critical Assessment of Techniques for Protein Structure Prediction 14 (CASP14) targets are described by the authors of the structures. The authors highlight the most relevant features of the target proteins and discuss how well these features were reproduced in the respective submitted predictions. The overall ability to predict three-dimensional structures of proteins has improved remarkably in CASP14, and many difficult targets were modeled with impressive accuracy. For the first time in the history of CASP, the experimentalists not only highlighted that computational models can accurately reproduce the most critical structural features observed in their targets, but also envisaged that models could serve as a guidance for further studies of biologically-relevant properties of proteins.
Affiliation(s)
- Leila T. Alexander
- Biozentrum, University of Basel, Basel, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | | | | | - Athanassios Adamopoulos
- Oncode Institute and Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Markus Alahuhta
- Bioscience Center, National Renewable Energy Laboratory, Golden, Colorado, USA
| | - Ann M. Arvin
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- Microbiology and Immunology, Stanford University School of Medicine, Stanford, California, USA
| | - Yannick J. Bomble
- Bioscience Center, National Renewable Energy Laboratory, Golden, Colorado, USA
| | - Bettina Böttcher
- Biocenter and Rudolf Virchow Center, Julius‐Maximilians Universität Würzburg, Würzburg, Germany
| | - Cécile Breyton
- Univ. Grenoble Alpes, CNRS, CEA, Institute for Structural Biology, Grenoble, France
| | - Valerio Chiarini
- Program in Structural Biology and Biophysics, Institute of Biotechnology, University of Helsinki, Helsinki, Finland
| | - Naga babu Chinnam
- Department of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
| | - Wah Chiu
- Microbiology and Immunology, Stanford University School of Medicine, Stanford, California, USA
- Bioengineering, Stanford University School of Medicine, Stanford, California, USA
- Division of Cryo‐EM and Bioimaging SSRL, SLAC National Accelerator Laboratory, Menlo Park, California, USA
| | | | - Rhys Grinter
- Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Clayton, Australia
| | - Gagan D. Gupta
- Radiation Biology & Health Sciences Division, Bhabha Atomic Research Centre, Mumbai, India
| | - Marcus D. Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Christopher S. Hayes
- Department of Molecular, Cellular and Developmental Biology, University of California, Santa Barbara, Santa Barbara, California, USA
- Biomolecular Science and Engineering Program, University of California, Santa Barbara, Santa Barbara, California, USA
| | - Tatjana Heidebrecht
- Oncode Institute and Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Andrea Ilari
- Institute of Molecular Biology and Pathology of the National Research Council of Italy (CNR), Rome, Italy
| | - Andrzej Joachimiak
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, USA
| | - Youngchang Kim
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
| | - Romain Linares
- Univ. Grenoble Alpes, CNRS, CEA, Institute for Structural Biology, Grenoble, France
| | | | - Vladimir V. Lunin
- Bioscience Center, National Renewable Energy Laboratory, Golden, Colorado, USA
| | - Andrei N. Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Cihan Makbul
- Biocenter and Rudolf Virchow Center, Julius‐Maximilians Universität Würzburg, Würzburg, Germany
| | - Karolina Michalska
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
| | - John Moult
- Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
| | - Prasun K. Mukherjee
- Nuclear Agriculture & Biotechnology Division, Bhabha Atomic Research Centre, Mumbai, India
| | - William (Sam) Nutt
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
| | - Stefan L. Oliver
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
| | - Anastassis Perrakis
- Oncode Institute and Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Lucy Stols
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
| | - John A. Tainer
- Department of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
- Department of Cancer Biology, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Maya Topf
- Institute of Structural and Molecular Biology, Birkbeck, University College London, London, UK
- Centre for Structural Systems Biology, Leibniz‐Institut für Experimentelle Virologie, Hamburg, Germany
| | - Susan E. Tsutakawa
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | | | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
|
42
|
Kryshtafovych A, Moult J, Albrecht R, Chang GA, Chao K, Fraser A, Greenfield J, Hartmann MD, Herzberg O, Josts I, Leiman PG, Linden SB, Lupas AN, Nelson DC, Rees SD, Shang X, Sokolova ML, Tidow H. Computational models in the service of X-ray and cryo-electron microscopy structure determination. Proteins 2021; 89:1633-1646. [PMID: 34449113 PMCID: PMC8616789 DOI: 10.1002/prot.26223] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/07/2021] [Revised: 08/11/2021] [Accepted: 08/17/2021] [Indexed: 01/20/2023]
Abstract
Critical assessment of structure prediction (CASP) conducts community experiments to determine the state of the art in computing protein structure from amino acid sequence. The process relies on the experimental community providing information about not yet public or about to be solved structures, for use as targets. For some targets, the experimental structure is not solved in time for use in CASP. Calculated structure accuracy improved dramatically in this round, implying that models should now be much more useful for resolving many sorts of experimental difficulties. To test this, selected models for seven unsolved targets were provided to the experimental groups. These models were from the AlphaFold2 group, who overall submitted the most accurate predictions in CASP14. Four targets were solved with the aid of the models, and, additionally, the structure of an already solved target was improved. An a posteriori analysis showed that, in some cases, models from other groups would also be effective. This paper provides accounts of the successful application of models to structure determination, including molecular replacement for X-ray crystallography, backbone tracing and sequence positioning in a cryo-electron microscopy structure, and correction of local features. The results suggest that, in future, there will be greatly increased synergy between computational and experimental approaches to structure determination.
Affiliation(s)
| | - John Moult
- Institute for Bioscience and Biotechnology Research, Department of Cell Biology and Molecular genetics, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - Reinhard Albrecht
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Geoffrey A. Chang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California-San Diego, La Jolla, CA, 92093, USA
- Department of Pharmacology, University of California-San Diego, La Jolla, CA, 92093, USA
| | - Kinlin Chao
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Alec Fraser
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics (SCSB), The University of Texas Medical Branch at Galveston, TX 77555, USA
| | - Julia Greenfield
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Marcus D. Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Osnat Herzberg
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
| | - Inokentijs Josts
- The Hamburg Advanced Research Center for Bioorganic Chemistry (HARBOR) & Department of Chemistry, Institute for Biochemistry and Molecular Biology, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
| | - Petr G. Leiman
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics (SCSB), The University of Texas Medical Branch at Galveston, TX 77555, USA
| | - Sara B. Linden
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Andrei N. Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Daniel C. Nelson
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Department of Veterinary Medicine, University of Maryland, College Park, MD 20742, USA
| | - Steven D. Rees
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California-San Diego, La Jolla, CA, 92093, USA
| | - Xiaoran Shang
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Maria L. Sokolova
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
| | - Henning Tidow
- The Hamburg Advanced Research Center for Bioorganic Chemistry (HARBOR) & Department of Chemistry, Institute for Biochemistry and Molecular Biology, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
| | | |
|
43
|
Simpkin AJ, Rodríguez FS, Mesdaghi S, Kryshtafovych A, Rigden DJ. Evaluation of model refinement in CASP14. Proteins 2021; 89:1852-1869. [PMID: 34288138 PMCID: PMC8616799 DOI: 10.1002/prot.26185] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/30/2021] [Revised: 06/19/2021] [Accepted: 07/11/2021] [Indexed: 12/15/2022]
Abstract
We report here an assessment of the model refinement category of the 14th round of Critical Assessment of Structure Prediction (CASP14). As before, predictors submitted up to five ranked refinements, along with associated residue-level error estimates, for targets that had a wide range of starting quality. The ability of groups to accurately rank their submissions and to predict coordinate error varied widely. Overall, only four groups out-performed a "naïve predictor" corresponding to the resubmission of the starting model. Among the top groups, there are interesting differences of approach and in the spread of improvements seen: some methods are more conservative, others more adventurous. Some targets were "double-barreled" for which predictors were offered a high-quality AlphaFold 2 (AF2)-derived prediction alongside another of lower quality. The AF2-derived models were largely unimprovable, many of their apparent errors being found to reside at domain and, especially, crystal lattice contacts. Refinement is shown to have a mixed impact overall on structure-based function annotation methods to predict nucleic acid binding, spot catalytic sites, and dock protein structures.
Affiliation(s)
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0DE, England
| | - Shahram Mesdaghi
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | | | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| |
|
44
|
Perrakis A, Sixma TK. AI revolutions in biology: The joys and perils of AlphaFold. EMBO Rep 2021; 22:e54046. [PMID: 34668287 PMCID: PMC8567224 DOI: 10.15252/embr.202154046] [Citation(s) in RCA: 82] [Impact Index Per Article: 27.3] [Received: 09/23/2021] [Accepted: 10/05/2021] [Indexed: 11/30/2022]
Affiliation(s)
- Anastassis Perrakis
- Oncode Institute and Division of Biochemistry, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Titia K Sixma
- Oncode Institute and Division of Biochemistry, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| |
|
45
|
David A, Islam S, Tankhilevich E, Sternberg MJE. The AlphaFold Database of Protein Structures: A Biologist's Guide. J Mol Biol 2021; 434:167336. [PMID: 34757056 PMCID: PMC8783046 DOI: 10.1016/j.jmb.2021.167336] [Citation(s) in RCA: 129] [Impact Index Per Article: 43.0] [Received: 09/20/2021] [Revised: 10/25/2021] [Accepted: 10/26/2021] [Indexed: 01/06/2023]
Abstract
AlphaFold, the deep learning algorithm developed by DeepMind, recently released the three-dimensional models of the whole human proteome to the scientific community. Here we discuss the advantages, limitations and the still unsolved challenges of the AlphaFold models from the perspective of a biologist, who may not be an expert in structural biology.
Affiliation(s)
- Alessia David
- Centre for Integrative System Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK.
| | - Suhail Islam
- Centre for Integrative System Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
| | - Evgeny Tankhilevich
- Centre for Integrative System Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
| | - Michael J E Sternberg
- Centre for Integrative System Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
| |
|
46
|
Jovine L. Using machine learning to study protein-protein interactions: From the uromodulin polymer to egg zona pellucida filaments. Mol Reprod Dev 2021; 88:686-693. [PMID: 34590381 DOI: 10.1002/mrd.23538] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/12/2021] [Accepted: 09/17/2021] [Indexed: 12/18/2022]
Abstract
Neural network-based models for protein structure prediction have recently reached near-experimental accuracy and are fast becoming a powerful tool in the arsenal of biologists. As suggested by initial studies using RoseTTAFold or the ColabFold implementation of AlphaFold2, a particularly interesting future development will be the optimization of these computational methods to also routinely yield high-confidence predictions of protein-protein interactions. Here I use AlphaFold2 and ColabFold to investigate the activation and polymerization of uromodulin (UMOD)/Tamm-Horsfall protein, a zona pellucida (ZP) module-containing protein whose precursor and filamentous structures have been previously determined experimentally by X-ray crystallography and cryo-EM, respectively. Despite having no knowledge of the UMOD polymer structure (coordinates for which were neither used for model training nor as template), AlphaFold2/ColabFold are able to recapitulate a crucial conformational change underlying UMOD polymerization, as well as the general organization of protein subunits within the resulting filament. This surprising result is achieved by simply deleting from the input sequence a stretch of residues that correspond to a polymerization-inhibiting C-terminal propeptide. By mimicking in silico the activating effect of propeptide dissociation triggered by site-specific proteolysis of the protein precursor, this example has implications for the assembly of egg coat proteins and the many other molecules that also contain a ZP module. Most importantly, it shows the potential of exploiting machine learning not only to accurately predict the structures of individual proteins or complexes, but also to carry out computational experiments replicating specific molecular events.
Affiliation(s)
- Luca Jovine
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
| |
|
47
|
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature 2021; 596:583-589. [PMID: 34265844 PMCID: PMC8371605 DOI: 10.1038/s41586-021-03819-2] [Citation(s) in RCA: 18073] [Impact Index Per Article: 6024.3] [Received: 05/11/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023]
Abstract
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort, the structures of around 100,000 unique proteins have been determined, but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence (the structure prediction component of the 'protein folding problem') has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
Affiliation(s)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
| | | | | | | | | | | | | | | | | | | |
|