1. King JE, Koes DR. Interpreting forces as deep learning gradients improves quality of predicted protein structures. Biophys J 2024; 123:2730-2739. [PMID: 38104241 PMCID: PMC11393680 DOI: 10.1016/j.bpj.2023.12.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 10/30/2023] [Accepted: 12/12/2023] [Indexed: 12/19/2023] Open
Abstract
Protein structure predictions from deep learning models like AlphaFold2, despite their remarkable accuracy, are likely insufficient for direct use in downstream tasks like molecular docking. The functionality of such models could be improved with a combination of increased accuracy and physical intuition. We propose a new method to train deep learning protein structure prediction models using molecular dynamics force fields to work toward these goals. Our custom PyTorch loss function, OpenMM-Loss, represents the potential energy of a predicted structure. OpenMM-Loss can be applied to any all-atom representation of a protein structure capable of mapping into our software package, SidechainNet. We demonstrate our method's efficacy by finetuning OpenFold. We show that subsequently predicted protein structures, both before and after a relaxation procedure, exhibit comparable accuracy while displaying lower potential energy and improved structural quality as assessed by MolProbity metrics.
Affiliation(s)
- Jonathan Edward King
- Joint PhD Program in Computational Biology, Carnegie Mellon University-University of Pittsburgh, Pittsburgh, Pennsylvania
- David Ryan Koes
- Computational & Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania.

2. McGuffin LJ, Alharbi SMA. ModFOLD9: A Web Server for Independent Estimates of 3D Protein Model Quality. J Mol Biol 2024; 436:168531. [PMID: 39237204 DOI: 10.1016/j.jmb.2024.168531] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/21/2024] [Revised: 02/19/2024] [Accepted: 03/06/2024] [Indexed: 09/07/2024]
Abstract
Accurate models of protein tertiary structures are now available from numerous advanced prediction methods, although the accuracy of each method often varies depending on the specific protein target. Additionally, many models may still contain significant local errors. Therefore, reliable, independent model quality estimates are essential both for identifying errors and selecting the very best models for further biological investigations. ModFOLD9 is a leading independent server for detecting the local errors in models produced by any method, and it can accurately discriminate between high-quality models from multiple alternative approaches. ModFOLD9 incorporates several new scores from deep learning-based approaches, leading to greatly improved prediction accuracy compared with earlier versions of the server. ModFOLD9 is continuously independently benchmarked, and it is shown to be highly competitive with other public servers. ModFOLD9 is freely available at https://www.reading.ac.uk/bioinf/ModFOLD/.

3. Chen Y, Xu Y, Liu D, Xing Y, Gong H. An end-to-end framework for the prediction of protein structure and fitness from single sequence. Nat Commun 2024; 15:7400. [PMID: 39191788 DOI: 10.1038/s41467-024-51776-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 08/19/2024] [Indexed: 08/29/2024] Open
Abstract
Significant research progress has been made in the field of protein structure and fitness prediction. Particularly, single-sequence-based structure prediction methods like ESMFold and OmegaFold achieve a balance between inference speed and prediction accuracy, showing promise for many downstream prediction tasks. Here, we propose SPIRED, a single-sequence-based structure prediction model that exhibits comparable performance to the state-of-the-art methods but with approximately 5-fold acceleration in inference and at least one order of magnitude reduction in training consumption. By integrating SPIRED with downstream neural networks, we compose an end-to-end framework named SPIRED-Fitness for the rapid prediction of both protein structure and fitness from single sequence with satisfactory accuracy. Moreover, SPIRED-Stab, the derivative of SPIRED-Fitness, achieves state-of-the-art performance in predicting the mutational effects on protein stability.
Affiliation(s)
- Yinghui Chen
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
- Yunxin Xu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
- Di Liu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
- Yaoguang Xing
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
- Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China.

4. Margelevičius M. GTalign: spatial index-driven protein structure alignment, superposition, and search. Nat Commun 2024; 15:7305. [PMID: 39181863 PMCID: PMC11344802 DOI: 10.1038/s41467-024-51669-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 08/14/2024] [Indexed: 08/27/2024] Open
Abstract
With protein databases growing rapidly due to advances in structural and computational biology, the ability to accurately align and rapidly search protein structures has become essential for biological research. In response to the challenge posed by vast protein structure repositories, GTalign offers an innovative solution to protein structure alignment and search: an algorithm that achieves optimal superposition at high speeds. Through the design and implementation of spatial structure indexing, GTalign parallelizes all stages of superposition search across residues and protein structure pairs, yielding rapid identification of optimal superpositions. Rigorous evaluation across diverse datasets reveals GTalign as the most accurate among structure aligners while presenting orders of magnitude in speedup at state-of-the-art accuracy. GTalign's high speed and accuracy make it useful for numerous applications, including functional inference, evolutionary analyses, protein design, and drug discovery, contributing to advancing understanding of protein structure and function.

5. Chen L, Li Q, Nasif KFA, Xie Y, Deng B, Niu S, Pouriyeh S, Dai Z, Chen J, Xie CY. AI-Driven Deep Learning Techniques in Protein Structure Prediction. Int J Mol Sci 2024; 25:8426. [PMID: 39125995 PMCID: PMC11313475 DOI: 10.3390/ijms25158426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 07/29/2024] [Accepted: 07/29/2024] [Indexed: 08/12/2024] Open
Abstract
Protein structure prediction is important for understanding protein function and behavior. This study presents a comprehensive review of the computational models used in predicting protein structure. It covers the progression from established protein modeling to state-of-the-art artificial intelligence (AI) frameworks. The paper starts with a brief introduction to protein structures, protein modeling, and AI. The section on established protein modeling discusses homology modeling, ab initio modeling, and threading. The next section covers deep learning-based models. It introduces some state-of-the-art AI models, such as AlphaFold (AlphaFold, AlphaFold2, AlphaFold3), RoseTTAFold, ProteinBERT, etc. This section also discusses how AI techniques have been integrated into established frameworks like Swiss-Model, Rosetta, and I-TASSER. Model performance is compared using the rankings of CASP14 (Critical Assessment of Structure Prediction) and CASP15. CASP16 is ongoing, and its results are not included in this review. Continuous Automated Model EvaluatiOn (CAMEO) complements the biennial CASP experiment. Template modeling score (TM-score), global distance test total score (GDT_TS), and Local Distance Difference Test (lDDT) score are also discussed. The paper then acknowledges the ongoing difficulties in predicting protein structure and emphasizes the need for further research into areas like dynamic protein behavior, conformational changes, and protein-protein interactions. In the application section, the paper introduces some applications in various fields like drug design, industry, education, and novel protein development. In summary, this paper provides a comprehensive overview of the latest advancements in established protein modeling and deep learning-based models for protein structure prediction. It emphasizes the significant advancements achieved by AI and identifies potential areas for further investigation.
Affiliation(s)
- Lingtao Chen
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Qiaomu Li
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Kazi Fahim Ahmad Nasif
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Ying Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Bobin Deng
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Shuteng Niu
- Department of Computer Science, Bowling Green State University, Bowling Green, OH 43403, USA
- Seyedamin Pouriyeh
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Zhiyu Dai
- Division of Pulmonary and Critical Care Medicine, John T. Milliken Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, USA
- Jiawei Chen
- College of Computing, Data Science and Society, University of California, Berkeley, CA 94720, USA
- Chloe Yixin Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA

6. Waterhouse AM, Studer G, Robin X, Bienert S, Tauriello G, Schwede T. The structure assessment web server: for proteins, complexes and more. Nucleic Acids Res 2024; 52:W318-W323. [PMID: 38634802 PMCID: PMC11223858 DOI: 10.1093/nar/gkae270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 03/21/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024] Open
Abstract
The 'structure assessment' web server is a one-stop shop for interactive evaluation and benchmarking of structural models of macromolecular complexes including proteins and nucleic acids. A user-friendly web dashboard links sequence with structure information and results from a variety of state-of-the-art tools, which facilitates the visual exploration and evaluation of structure models. The dashboard integrates stereochemistry information, secondary structure information, global and local model quality assessment of the tertiary structure of comparative protein models, as well as prediction of membrane location. In addition, a benchmarking mode is available where a model can be compared to a reference structure, providing easy access to scores that have been used in recent CASP experiments and CAMEO. The structure assessment web server is available at https://swissmodel.expasy.org/assess.
Affiliation(s)
- Andrew M Waterhouse
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Gabriel Studer
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Xavier Robin
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Stefan Bienert
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Gerardo Tauriello
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Torsten Schwede
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland

7. Han Y, Lu Y, Yan X, Cui H, Cheng S, Zheng J, Zhou Y, Wang S, Li Z. Atom-ProteinQA: Atom-level protein model quality assessment through fine-grained joint learning. Comput Methods Programs Biomed 2024; 249:108078. [PMID: 38537495 DOI: 10.1016/j.cmpb.2024.108078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 12/26/2023] [Accepted: 02/10/2024] [Indexed: 04/21/2024]
Abstract
MOTIVATION: Protein model quality assessment (ProteinQA) is a fundamental task that is essential for biologically relevant applications, e.g., protein structure refinement and protein design. Previous works aimed to conduct ProteinQA only on the global structure or per-residue level, ignoring potentially usable and precise cues from a fine-grained per-atom perspective. In this study, we propose an atom-level ProteinQA model, named Atom-ProteinQA, in which two innovative modules are designed to extract geometric and topological atom-level relationships respectively. Specifically, on the one hand, a geometric perception module exploits 3D sparse convolution to capture the geometric features of the input protein, generating fine-grained atom-level predictions. On the other hand, natural chemical bonds are utilized to construct an atom-level graph, then message passing from a topological perception module is applied to output residue-level predictions in parallel. Eventually, through a cross-model aggregation module, features from different modules mutually interact, enhancing performance on both the atom and residue levels.

RESULTS: Extensive experiments show that our proposed Atom-ProteinQA outperforms previous methods by a large margin, regardless of residue-level or atom-level assessment. Concretely, we achieved state-of-the-art performance on CATH-2084, Decoy-8000, the public benchmarks CASP13 and CASP14, and CAMEO.

AVAILABILITY: The repository of this project is released on: https://github.com/luyfcandy/Atom_ProteinQA.
Affiliation(s)
- Yatong Han
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yingfeng Lu
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Xu Yan
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Hannah Cui
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Jiayou Zheng
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yuzhe Zhou
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., Shanghai, 200030, China.
- Zhen Li
- Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China.

8. Yin S, Mi X, Shukla D. Leveraging machine learning models for peptide-protein interaction prediction. RSC Chem Biol 2024; 5:401-417. [PMID: 38725911 PMCID: PMC11078210 DOI: 10.1039/d3cb00208j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 02/07/2024] [Indexed: 05/12/2024] Open
Abstract
Peptides play a pivotal role in a wide range of biological activities by participating in up to 40% of protein-protein interactions in cellular processes. They also demonstrate remarkable specificity and efficacy, making them promising candidates for drug development. However, predicting peptide-protein complexes by traditional computational approaches, such as docking and molecular dynamics simulations, remains a challenge due to the high computational cost, the flexible nature of peptides, and the limited structural information on peptide-protein complexes. In recent years, the surge of available biological data has given rise to an increasing number of machine learning models for predicting peptide-protein interactions. These models offer efficient solutions to the challenges associated with traditional computational approaches. Furthermore, they offer enhanced accuracy, robustness, and interpretability in their predictive outcomes. This review presents a comprehensive overview of machine learning and deep learning models that have emerged in recent years for the prediction of peptide-protein interactions.
Affiliation(s)
- Song Yin
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Xuenan Mi
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Diwakar Shukla
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA

9. Sarvmeili J, Baghban Kohnehrouz B, Gholizadeh A, Shanehbandi D, Ofoghi H. Immunoinformatics design of a structural proteins driven multi-epitope candidate vaccine against different SARS-CoV-2 variants based on fynomer. Sci Rep 2024; 14:10297. [PMID: 38704475 PMCID: PMC11069592 DOI: 10.1038/s41598-024-61025-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 04/30/2024] [Indexed: 05/06/2024] Open
Abstract
The ideal vaccines for combating diseases that may emerge in the future require more than simply inactivating a few pathogenic strains. This study aims to provide a peptide-based multi-epitope vaccine effective against various severe acute respiratory syndrome coronavirus 2 strains. To design the vaccine, a library of peptides from the spike, nucleocapsid, membrane, and envelope structural proteins of various strains was prepared. Then, the final vaccine structure was optimized using the fully protected epitopes and the fynomer scaffold. Using bioinformatics tools, the antigenicity, allergenicity, toxicity, physicochemical properties, population coverage, and secondary and three-dimensional structures of the vaccine candidate were evaluated. The bioinformatic analyses confirmed the high quality of the vaccine. According to further investigations, this structure is similar to native protein and there is a stable and strong interaction between vaccine and receptors. Based on molecular dynamics simulation, structural compactness and stability in binding were also observed. In addition, the immune simulation showed that the vaccine can stimulate immune responses similar to real conditions. Finally, codon optimization and in silico cloning confirmed efficient expression in Escherichia coli. In conclusion, the fynomer-based vaccine can be considered as a new style in designing and updating vaccines to protect against coronavirus disease.
Affiliation(s)
- Javad Sarvmeili
- Department of Plant Breeding and Biotechnology, University of Tabriz, Tabriz, 51666, Iran
- Ashraf Gholizadeh
- Department of Animal Biology, University of Tabriz, Tabriz, 51666, Iran
- Dariush Shanehbandi
- Department of Immunology, Tabriz University of Medical Sciences, Tabriz, 51666, Iran
- Hamideh Ofoghi
- Department of Biotechnology, Iranian Research Organization for Science and Technology, Tehran, 33131, Iran

10. Morehead A, Liu J, Cheng J. Protein structure accuracy estimation using geometry-complete perceptron networks. Protein Sci 2024; 33:e4932. [PMID: 38380738 PMCID: PMC10880424 DOI: 10.1002/pro.4932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 01/05/2024] [Accepted: 02/01/2024] [Indexed: 02/22/2024]
Abstract
Estimating the accuracy of protein structural models is a critical task in protein bioinformatics. The need for robust methods in the estimation of protein model accuracy (EMA) is prevalent in the field of protein structure prediction, where computationally-predicted structures need to be screened rapidly for the reliability of the positions predicted for each of their amino acid residues and their overall quality. Current methods proposed for EMA are either coupled tightly to existing protein structure prediction methods or evaluate protein structures without sufficiently leveraging the rich, geometric information available in such structures to guide accuracy estimation. In this work, we propose a geometric message passing neural network referred to as the geometry-complete perceptron network for protein structure EMA (GCPNet-EMA), where we demonstrate through rigorous computational benchmarks that GCPNet-EMA's accuracy estimations are 47% faster and more than 10% (6%) more correlated with ground-truth measures of per-residue (per-target) structural accuracy compared to baseline state-of-the-art methods for tertiary (multimer) structure EMA including AlphaFold 2. The source code and data for GCPNet-EMA are available on GitHub, and a public web server implementation is freely available.
Affiliation(s)
- Alex Morehead
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA

11. Beton JG, Mulvaney T, Cragnolini T, Topf M. Cryo-EM structure and B-factor refinement with ensemble representation. Nat Commun 2024; 15:444. [PMID: 38200043 PMCID: PMC10781738 DOI: 10.1038/s41467-023-44593-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 12/20/2023] [Indexed: 01/12/2024] Open
Abstract
Cryo-EM experiments produce images of macromolecular assemblies that are combined to produce three-dimensional density maps. Typically, atomic models of the constituent molecules are fitted into these maps, followed by a density-guided refinement. We introduce TEMPy-ReFF, a method for atomic structure refinement in cryo-EM density maps. Our method represents atomic positions as components of a Gaussian mixture model, utilising their variances as B-factors, which are used to derive an ensemble description. Extensively tested on a substantial dataset of 229 cryo-EM maps from EMDB ranging in resolution from 2.1-4.9 Å with corresponding PDB and CERES atomic models, our results demonstrate that TEMPy-ReFF ensembles provide a superior representation of cryo-EM maps. On a single-model basis, it performs similarly to the CERES re-refinement protocol, although there are cases where it provides a better fit to the map. Furthermore, our method enables the creation of composite maps free of boundary artefacts. TEMPy-ReFF is useful for better interpretation of flexible structures, such as those involving RNA, DNA or ligands.
Affiliation(s)
- Joseph G Beton
- Leibniz Institute of Virology (LIV) and Universitätsklinikum Hamburg Eppendorf (UKE), Centre for Structural Systems Biology (CSSB), 22607, Hamburg, Germany
- Thomas Mulvaney
- Leibniz Institute of Virology (LIV) and Universitätsklinikum Hamburg Eppendorf (UKE), Centre for Structural Systems Biology (CSSB), 22607, Hamburg, Germany
- Tristan Cragnolini
- Leibniz Institute of Virology (LIV) and Universitätsklinikum Hamburg Eppendorf (UKE), Centre for Structural Systems Biology (CSSB), 22607, Hamburg, Germany
- Institute of Structural and Molecular Biology, Birkbeck, University of London, London, UK
- Maya Topf
- Leibniz Institute of Virology (LIV) and Universitätsklinikum Hamburg Eppendorf (UKE), Centre for Structural Systems Biology (CSSB), 22607, Hamburg, Germany.

12. Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. Proteins 2023; 91:1747-1770. [PMID: 37876231 PMCID: PMC10841292 DOI: 10.1002/prot.26602] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 04/25/2023] [Revised: 08/21/2023] [Accepted: 09/07/2023] [Indexed: 10/26/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and x-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as noncanonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
Affiliation(s)
- Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Biophysics Program, Stanford University School of Medicine, CA USA
- Howard Hughes Medical Institute, Stanford University, CA USA
- Adam J. Simpkin
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Phillip Pham
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Ramya Rangan
- Biophysics Program, Stanford University School of Medicine, CA USA
- Fan Bu
- Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
- Ronan M. Keegan
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science, UK
- Maya Topf
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Daniel J. Rigden
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai 200434, China
- Eric Westhof
- Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France

13. Leemann M, Sagasta A, Eberhardt J, Schwede T, Robin X, Durairaj J. Automated benchmarking of combined protein structure and ligand conformation prediction. Proteins 2023; 91:1912-1924. [PMID: 37885318 DOI: 10.1002/prot.26605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 09/15/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023]
Abstract
The prediction of protein-ligand complexes (PLC), using both experimental and predicted structures, is an active and important area of research, underscored by the inclusion of the Protein-Ligand Interaction category in the latest round of the Critical Assessment of Protein Structure Prediction experiment CASP15. The prediction task in CASP15 consisted of predicting both the three-dimensional structure of the receptor protein and the position and conformation of the ligand. This paper addresses the challenges and proposed solutions for devising automated benchmarking techniques for PLC prediction. The reliability of experimentally solved PLC as ground truth reference structures is assessed using various validation criteria. Similarity of PLC to previously released complexes is employed to judge PLC diversity and the difficulty of a PLC as a prediction target. We show that the commonly used PDBBind time-split test-set is inappropriate for comprehensive PLC evaluation, with state-of-the-art tools showing conflicting results on a more representative and high-quality dataset constructed for benchmarking purposes. We also show that redocking on crystal structures is a much simpler task than docking into predicted protein models, demonstrated by the two PLC-prediction-specific scoring metrics created. Finally, we introduce a fully automated pipeline that predicts PLC and evaluates the accuracy of the protein structure, ligand pose, and protein-ligand interactions.
Affiliation(s)
- Michèle Leemann
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Ander Sagasta
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Jerome Eberhardt
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Xavier Robin
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Janani Durairaj
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
|
14
|
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XV. Proteins 2023; 91:1539-1549. [PMID: 37920879 PMCID: PMC10843301 DOI: 10.1002/prot.26617] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023]
Abstract
Computing protein structure from amino acid sequence information has been a long-standing grand challenge. Critical assessment of structure prediction (CASP) conducts community experiments aimed at advancing solutions to this and related problems. Experiments are conducted every 2 years. The 2020 experiment (CASP14) saw major progress, with the second generation of deep learning methods delivering accuracy comparable with experiment for many single proteins. There is an expectation that these methods will have much wider application in computational structural biology. Here we summarize results from the most recent experiment, CASP15, in 2022, with an emphasis on new deep learning-driven progress. Other papers in this special issue of Proteins provide more detailed analysis. For single protein structures, the AlphaFold2 deep learning method is still superior to other approaches, but there are two points of note. First, although AlphaFold2 was the core of all the most successful methods, there was a wide variety of implementation and combination with other methods. Second, using the standard AlphaFold2 protocol and default parameters only produces the highest quality result for about two-thirds of the targets, and more extensive sampling is required for the others. The major advance in this CASP is the enormous increase in the accuracy of computed protein complexes, achieved by the use of deep learning methods, although overall these do not fully match the performance for single proteins. Here too, AlphaFold2-based methods perform best, and again more extensive sampling than the defaults is often required. Also of note are the encouraging early results on the use of deep learning to compute ensembles of macromolecular structures.
Critically for the usability of computed structures, for both single proteins and protein complexes, deep learning derived estimates of both local and global accuracy are of high quality; however, the estimates in interface regions are slightly less reliable. CASP15 also included computation of RNA structures for the first time. Here, the classical approaches produced better agreement with experiment than the new deep learning ones, and accuracy is limited. Also, for the first time, CASP included the computation of protein-ligand complexes, an area of special interest for drug design. Here too, classical methods were still superior to deep learning ones. Many new approaches were discussed at the CASP conference, and it is clear methods will continue to advance.
Affiliation(s)
| | - Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Maya Topf
- Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
| | | | - John Moult
- Institute for Bioscience and Biotechnology Research, Rockville, MD, USA, and Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| |
|
15
|
Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. bioRxiv 2023:2023.04.25.538330. [PMID: 37162955 PMCID: PMC10168427 DOI: 10.1101/2023.04.25.538330] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 05/11/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top-ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and X-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as non-canonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
Affiliation(s)
- Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Biophysics Program, Stanford University School of Medicine, CA USA
- Howard Hughes Medical Institute, Stanford University, CA USA
| | | | - Adam J. Simpkin
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
| | - Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV)
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
| | - Phillip Pham
- Department of Biochemistry, Stanford University School of Medicine, CA USA
| | - Ramya Rangan
- Biophysics Program, Stanford University School of Medicine, CA USA
| | - Fan Bu
- Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
| | - Ronan M. Keegan
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science, UK
| | - Maya Topf
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV)
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
| | - Daniel J. Rigden
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
| | - Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People’s Hospital, School of Medicine, Tongji University, Shanghai 200434, China
| | - Eric Westhof
- Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France
| |
|
16
|
Mahtha SK, Kumari K, Gaur V, Yadav G. Cavity architecture based modulation of ligand binding tunnels in plant START domains. Comput Struct Biotechnol J 2023; 21:3946-3963. [PMID: 37635766 PMCID: PMC10448341 DOI: 10.1016/j.csbj.2023.07.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2023] Open
Abstract
The Steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domain represents an evolutionarily conserved superfamily of lipid transfer proteins widely distributed across the tree of life. Despite its significant expansion in plants, knowledge about this domain in plants remains inadequate. In this work, we explore the role of cavity architectural modulations in START protein evolution and functional diversity. We use deep-learning approaches to generate plant START domain models, followed by surface accessibility studies and a comprehensive structural investigation of the rice START family. We validate 28 rice START domain models, delineate binding cavities, measure pocket volumes, and compare these with mammalian counterparts to understand evolution of binding preferences. Overall, plant START domains retain the ancestral α/β helix-grip signature, but we find subtle variation in cavity architectures, resulting in significantly smaller ligand-binding tunnels in the plant kingdom. We identify cavity lining residues (CLRs) responsible for reduction in ancestral tunnel space, and these appear to be class specific, and unique to plants, providing a mechanism for the observed shift in domain function. For instance, mammalian cavity lining residues A135, G181 and A192 have evolved to larger CLRs across the plant kingdom, contributing to smaller sizes, minimal STARTs being the largest, while members of type-IV HD-Zip family show almost complete obliteration of lipid binding cavities, consistent with their present-day DNA binding functions. In summary, this work quantifies plant START structural & functional divergence, bridging current knowledge gaps.
Affiliation(s)
| | - Kamlesh Kumari
- National Institute of Plant Genome Research, New Delhi 110067, India
| | - Vineet Gaur
- National Institute of Plant Genome Research, New Delhi 110067, India
| | - Gitanjali Yadav
- National Institute of Plant Genome Research, New Delhi 110067, India
| |
|
17
|
Wodak SJ, Vajda S, Lensink MF, Kozakov D, Bates PA. Critical Assessment of Methods for Predicting the 3D Structure of Proteins and Protein Complexes. Annu Rev Biophys 2023; 52:183-206. [PMID: 36626764 PMCID: PMC10885158 DOI: 10.1146/annurev-biophys-102622-084607] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 01/11/2023]
Abstract
Advances in a scientific discipline are often measured by small, incremental steps. In this review, we report on two intertwined disciplines in the protein structure prediction field, modeling of single chains and modeling of complexes, that have over decades emulated this pattern, as monitored by the community-wide blind prediction experiments CASP and CAPRI. However, over the past few years, dramatic advances were observed for the accurate prediction of single protein chains, driven by a surge of deep learning methodologies entering the prediction field. We review the main scientific developments that enabled these recent breakthroughs and feature the important role of blind prediction experiments in building up and nurturing the structure prediction field. We discuss how the new wave of artificial intelligence-based methods is impacting the fields of computational and experimental structural biology and highlight areas in which deep learning methods are likely to lead to future developments, provided that major challenges are overcome.
Affiliation(s)
- Shoshana J Wodak
- VIB-VUB Center for Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium;
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA;
- Department of Chemistry, Boston University, Boston, Massachusetts, USA
| | - Marc F Lensink
- Univ. Lille, CNRS, UMR 8576-UGSF-Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France;
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA;
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Paul A Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, United Kingdom;
| |
|
18
|
Gerloff DL, Ilina EI, Cialini C, Mata Salcedo U, Mittelbronn M, Müller T. Prediction and verification of glycosyltransferase activity by bioinformatics analysis and protein engineering. STAR Protoc 2023; 4:101905. [PMID: 36528856 PMCID: PMC9792956 DOI: 10.1016/j.xpro.2022.101905] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/29/2022] [Revised: 10/01/2022] [Accepted: 11/14/2022] [Indexed: 12/23/2022] Open
Abstract
A significant number of proteins are annotated as functionally uncharacterized proteins. Within this protocol, we describe how to use protein family multiple sequence alignments and structural bioinformatics resources to design loss-of-function mutations of previously uncharacterized proteins within the glycosyltransferase family. We detail approaches to determine target protein active sites using three-dimensional modeling. We generate active site mutants and quantify any changes in enzymatic function by a glycosyltransferase assay. With modifications, this protocol could be applied to other metal-dependent enzymes. For complete details on the use and execution of this protocol, please refer to Ilina et al. (2022).
Affiliation(s)
- Dietlind L Gerloff
- Foundation for Applied Molecular Evolution (FfAME), Alachua, FL 32615, USA
| | - Elena I Ilina
- Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg
| | - Camille Cialini
- Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg
| | - Uxue Mata Salcedo
- Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg
| | - Michel Mittelbronn
- Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg; National Center of Pathology (NCP), Laboratoire National de Santé (LNS), 3555 Dudelange, Luxembourg; Department of Life Sciences and Medicine (DLSM), University of Luxembourg, 4365 Esch sur Alzette, Luxembourg; Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg; Faculty of Science, Technology and Medicine (FSTM), University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg
| | - Tanja Müller
- Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg.
| |
|
19
|
Gniado N, Krawczyk-Balska A, Mehta P, Miszta P, Filipek S. Protein Homology Modeling for Effective Drug Design. Methods Mol Biol 2023; 2627:329-337. [PMID: 36959456 DOI: 10.1007/978-1-0716-2974-1_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Effective drug design, especially for combating multi-drug-resistant bacterial pathogens, requires increasingly sophisticated procedures to obtain novel lead-like compounds. New classes of enzymes should be explored, especially those that help bacteria overcome existing treatments. Homology modeling is useful for obtaining models of new enzymes; however, their active sites are sometimes present in closed conformations in the crystal structures, which are not suitable for drug design purposes. In such difficult cases, the combination of homology modeling, molecular dynamics simulations, and fragment screening can give satisfactory results.
Affiliation(s)
- Natalia Gniado
- Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Department of Molecular Microbiology, Biological and Chemical Research Centre, Faculty of Biology, University of Warsaw, Warsaw, Poland
| | - Agata Krawczyk-Balska
- Department of Molecular Microbiology, Biological and Chemical Research Centre, Faculty of Biology, University of Warsaw, Warsaw, Poland
| | - Pakhuri Mehta
- Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
| | - Przemysław Miszta
- Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
| | - Sławomir Filipek
- Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland.
| |
|
20
|
Chen C, Chen X, Morehead A, Wu T, Cheng J. 3D-equivariant graph neural networks for protein model quality assessment. Bioinformatics 2023; 39:6986970. [PMID: 36637199 PMCID: PMC10089647 DOI: 10.1093/bioinformatics/btad030] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 04/12/2022] [Revised: 11/28/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023]
Abstract
MOTIVATION: Quality assessment (QA) of predicted protein tertiary structure models plays an important role in ranking and using them. With the recent development of deep learning end-to-end protein structure prediction techniques for generating highly confident tertiary structures for most proteins, it is important to explore corresponding QA strategies to evaluate and select the structural models predicted by them, since these models have better quality and different properties than the models predicted by traditional tertiary structure prediction methods. RESULTS: We develop EnQA, a novel graph-based 3D-equivariant neural network method that is equivariant to rotation and translation of 3D objects, to estimate the accuracy of protein structural models by leveraging the structural features acquired from the state-of-the-art tertiary structure prediction method, AlphaFold2. We train and test the method on both traditional model datasets (e.g. the datasets of the Critical Assessment of Techniques for Protein Structure Prediction) and a new dataset of high-quality structural models predicted only by AlphaFold2 for the proteins whose experimental structures were released recently. Our approach achieves state-of-the-art performance on protein structural models predicted by both traditional protein structure prediction methods and the latest end-to-end deep learning method, AlphaFold2. It performs even better than the model QA scores provided by AlphaFold2 itself. The results illustrate that the 3D-equivariant graph neural network is a promising approach to the evaluation of protein structural models. Integrating AlphaFold2 features with other complementary sequence and structural features is important for improving protein model QA. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/BioinfoMachineLearning/EnQA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chen Chen
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
| | - Xiao Chen
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
| | - Alex Morehead
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
| | - Tianqi Wu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
| |
|
21
|
Improved inter-residue contact prediction via a hybrid generative model and dynamic loss function. Comput Struct Biotechnol J 2022; 20:6138-6148. [DOI: 10.1016/j.csbj.2022.11.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Revised: 11/07/2022] [Accepted: 11/07/2022] [Indexed: 11/13/2022] Open
|
22
|
Danazumi AU, Iliyasu Gital S, Idris S, BS Dibba L, Balogun EO, Górna MW. Immunoinformatic design of a putative multi-epitope vaccine candidate against Trypanosoma brucei gambiense. Comput Struct Biotechnol J 2022; 20:5574-5585. [PMID: 36284708 PMCID: PMC9576565 DOI: 10.1016/j.csbj.2022.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 09/13/2022] [Accepted: 10/02/2022] [Indexed: 11/28/2022] Open
Abstract
Human African trypanosomiasis (HAT) is a neglected tropical disease that is caused by flagellated parasites of the genus Trypanosoma. HAT imposes a significant socio-economic burden on many countries in sub-Saharan Africa and its control is hampered by several drawbacks ranging from the ineffectiveness of drugs, complex dosing regimens, drug resistance, and lack of a vaccine. Despite more than a century of research and investigations, the development of a vaccine to tackle HAT is still challenging due to the complex biology of the pathogens. Advancements in computational modeling coupled with the availability of an unprecedented amount of omics data from different organisms have allowed the design of new-generation vaccines that offer better antigenicity and safety profiles. One such new-generation approach is a multi-epitope vaccine (MEV) designed from a collection of antigenic peptides. A MEV can stimulate both cellular and humoral immune responses while avoiding possible allergenic reactions. Herein, we take advantage of this approach to design a MEV from conserved hypothetical plasma membrane proteins of Trypanosoma brucei gambiense, the trypanosome subspecies that is responsible for the west and central African forms of HAT. The designed MEV is 402 amino acids long (41.5 kDa). It is predicted to be antigenic, non-toxic, to assume a stable 3D conformation, and to interact with a key immune receptor. In addition, immune simulation foresaw adequate immune stimulation by the putative antigen and a lasting memory. Therefore, the designed chimeric vaccine represents a potential candidate that could be used to target HAT.
Affiliation(s)
- Ammar Usman Danazumi
- Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Warsaw, Poland; Faculty of Chemistry, Warsaw University of Technology, Warsaw, Poland; Groningen Research Institute of Pharmacy, University of Groningen, the Netherlands; Corresponding authors at: Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Warsaw, Poland (A.U. Danazumi, M. W. Górna).
| | | | - Salisu Idris
- Department of Biochemistry, Ahmadu Bello University, Zaria, Nigeria; Department of Medical Laboratory Science, Kazaure School of Health Technology, Jigawa, Nigeria
| | - Lamin BS Dibba
- Africa Centre of Excellence for Neglected Tropical Diseases and Forensic Biotechnology, Ahmadu Bello University, Zaria, Nigeria; Department of Physical and Natural Sciences, School of Arts and Sciences, University of the Gambia, Brikama Campus, P.O. Box 3530, Serrekunda, the Gambia
| | - Emmanuel Oluwadare Balogun
- Department of Biochemistry, Ahmadu Bello University, Zaria, Nigeria; Africa Centre of Excellence for Neglected Tropical Diseases and Forensic Biotechnology, Ahmadu Bello University, Zaria, Nigeria; Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA; Department of Biomedical Chemistry, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Maria Wiktoria Górna
- Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Warsaw, Poland; Corresponding authors at: Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Warsaw, Poland (A.U. Danazumi, M. W. Górna).
| |
|
23
|
Abstract
Acetylcholine is a central biological signal molecule present in all kingdoms of life. In humans, acetylcholine is the primary neurotransmitter of the peripheral nervous system; it mediates signal transmission at neuromuscular junctions. Here, we show that the opportunistic human pathogen Pseudomonas aeruginosa exhibits chemoattraction toward acetylcholine over a concentration range of 1 μM to 100 mM. The maximal magnitude of the response was superior to that of many other P. aeruginosa chemoeffectors. We demonstrate that this chemoattraction is mediated by the PctD (PA4633) chemoreceptor. Using microcalorimetry, we show that the PctD ligand-binding domain (LBD) binds acetylcholine with an equilibrium dissociation constant (KD) of 23 μM. It also binds choline and, with lower affinity, betaine. Highly sensitive responses to acetylcholine and choline, and less sensitive responses to betaine and l-carnitine, were observed in Escherichia coli expressing a chimeric receptor comprising the PctD-LBD fused to the Tar chemoreceptor signaling domain. We also identified the PacA (ECA_RS10935) chemoreceptor of the phytopathogen Pectobacterium atrosepticum, which binds choline and betaine but fails to recognize acetylcholine. To identify the molecular determinants for acetylcholine recognition, we report high-resolution structures of PctD-LBD (with bound acetylcholine and choline) and PacA-LBD (with bound betaine). We identified an amino acid motif in PctD-LBD that interacts with the acetylcholine tail. This motif is absent in PacA-LBD. Significant acetylcholine chemotaxis was also detected in the plant pathogens Agrobacterium tumefaciens and Dickeya solani. To the best of our knowledge, this is the first report of acetylcholine chemotaxis and extends the range of host signals perceived by bacterial chemoreceptors.
|