1
Liang F, Sun M, Xie L, Zhao X, Liu D, Zhao K, Zhang G. Recent advances and challenges in protein complex model accuracy estimation. Comput Struct Biotechnol J 2024; 23:1824-1832. [PMID: 38707538 PMCID: PMC11066466 DOI: 10.1016/j.csbj.2024.04.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 04/18/2024] [Accepted: 04/18/2024] [Indexed: 05/07/2024]
Abstract
Estimation of model accuracy (EMA) plays a crucial role in protein structure prediction, aiming to evaluate the quality of predicted protein structure models accurately and objectively. This process is not only key to screening candidate models that are close to the real structure, but also provides guidance for further optimization of protein structures. With the significant advancements made by AlphaFold2 in monomer structure prediction, the problem of single-domain protein structure prediction has been largely solved. Correspondingly, the importance of assessing the quality of single-domain protein models has decreased, and the research focus has shifted to estimation of model accuracy of protein complexes. In this review, our goal is to provide a comprehensive overview of the reference and statistical metrics, as well as representative methods, and the current challenges within four distinct facets (Topology Global Score, Interface Total Score, Interface Residue-Wise Score, and Tertiary Residue-Wise Score) in the field of complex EMA.
Affiliation(s)
- Lei Xie
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xuanfeng Zhao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Dong Liu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kailong Zhao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
2
Wu X, Lin H, Bai R, Duan H. Deep learning for advancing peptide drug development: Tools and methods in structure prediction and design. Eur J Med Chem 2024; 268:116262. [PMID: 38387334 DOI: 10.1016/j.ejmech.2024.116262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/06/2024] [Accepted: 02/17/2024] [Indexed: 02/24/2024]
Abstract
Peptides can bind challenging disease targets with high affinity and specificity, offering enormous opportunities for addressing unmet medical needs. However, peptides' unique features, including smaller size, increased structural flexibility, and limited data availability, pose additional challenges to the design process compared to proteins. This review explores the dynamic field of peptide therapeutics, leveraging deep learning to enhance structure prediction and design. Our exploration encompasses various facets of peptide research, ranging from dataset curation and handling to model development. As deep learning technologies become more refined, we channel our efforts into peptide structure prediction and design, aligning with the fundamental principles of structure-activity relationships in drug development. To guide researchers in harnessing the potential of deep learning to advance peptide drug development, we comprehensively explore current challenges and future directions of peptide therapeutics.
Affiliation(s)
- Xinyi Wu
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
- Huitian Lin
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
- Renren Bai
- School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, PR China
- Hongliang Duan
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China
3
Kiani YS, Jabeen I. Challenges of Protein-Protein Docking of the Membrane Proteins. Methods Mol Biol 2024; 2780:203-255. [PMID: 38987471 DOI: 10.1007/978-1-0716-3985-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Despite the recent advances in the determination of high-resolution membrane protein (MP) structures, the structural and functional characterization of MPs remains extremely challenging, mainly due to the hydrophobic nature, low abundance, poor expression, purification, and crystallization difficulties associated with MPs. The major hurdles for MP structure determination are associated with the expression, purification, and crystallization procedures. Although there have been significant advances in the experimental determination of MP structures, only a limited number of MP structures (fewer than approximately 1% of all structures) are available in the Protein Data Bank (PDB). Therefore, the structures of a large number of MPs remain unresolved, leaving much of the structural and functional information related to MPs unexplored. As a result, recent developments in drug discovery and significant biological interest have led to the development of several novel, low-cost, and time-efficient computational methods that overcome the limitations of experimental approaches, supplement experiments, and provide alternatives for the characterization of MPs. The fine-tuning and optimization of these computational approaches remain an ongoing endeavor.
Computational methods offer a potential way to elucidate structural features and augment currently available MP information. However, the use of computational modeling can be extremely challenging for MPs, mainly due to insufficient knowledge of (or gaps in) atomic structures of MPs. Despite the availability of numerous in silico methods for 3D structure determination, the applicability of these methods to MPs remains relatively low, since not all methods are well suited or adequate for MPs. However, sophisticated methods for MP structure prediction are constantly being developed and updated to integrate the modifications required for MPs.
Currently, different computational methods for (1) MP structure prediction, (2) stability analysis of MPs through molecular dynamics simulations, (3) modeling of MP complexes through docking, (4) prediction of interactions between MPs, and (5) prediction of interactions between MPs and their soluble partners are extensively used. Towards this end, MP docking is widely used. Notably, MP docking methods, although few in number, might show greater potential in terms of filling the knowledge gap. In this chapter, MP docking methods and associated challenges have been reviewed to improve the applicability, accuracy, and the ability to model macromolecular complexes.
Affiliation(s)
- Yusra Sajid Kiani
- School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Ishrat Jabeen
- School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
4
Kotev M, Diaz Gonzalez C. Molecular Dynamics and Other HPC Simulations for Drug Discovery. Methods Mol Biol 2024; 2716:265-291. [PMID: 37702944 DOI: 10.1007/978-1-0716-3449-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
High performance computing (HPC) is taking an increasingly important place in drug discovery. It makes possible the simulation of complex biochemical systems with high precision in a short time, thanks to the use of sophisticated algorithms. It promotes the advancement of knowledge in fields that are inaccessible or difficult to access through experimentation and it contributes to accelerating the discovery of drugs for unmet medical needs while reducing costs. Herein, we report how computational performance has evolved over the past years, and then we detail three domains where HPC is essential. Molecular dynamics (MD) is commonly used to explore the flexibility of proteins, thus generating a better understanding of different possible approaches to modulate their activity. Modeling and simulation of biopolymer complexes enables the study of protein-protein interactions (PPI) in healthy and disease states, thus helping the identification of targets of pharmacological interest. Virtual screening (VS) also benefits from HPC to predict in a short time, among millions or billions of virtual chemical compounds, the best potential ligands that will be tested in relevant assays to start a rational drug design process.
Affiliation(s)
- Martin Kotev
- Evotec SE, Integrated Drug Discovery, Molecular Architects, Campus Curie, Toulouse, France
5
Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. Proteins 2023; 91:1747-1770. [PMID: 37876231 PMCID: PMC10841292 DOI: 10.1002/prot.26602] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/25/2023] [Revised: 08/21/2023] [Accepted: 09/07/2023] [Indexed: 10/26/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and x-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as noncanonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
Affiliation(s)
- Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Biophysics Program, Stanford University School of Medicine, CA USA
- Howard Hughes Medical Institute, Stanford University, CA USA
- Adam J. Simpkin
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Phillip Pham
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Ramya Rangan
- Biophysics Program, Stanford University School of Medicine, CA USA
- Fan Bu
- Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
- Ronan M. Keegan
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science, UK
- Maya Topf
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Daniel J. Rigden
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai 200434, China
- Eric Westhof
- Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France
6
Lensink MF, Brysbaert G, Raouraoua N, Bates PA, Giulini M, Honorato RV, van Noort C, Teixeira JMC, Bonvin AMJJ, Kong R, Shi H, Lu X, Chang S, Liu J, Guo Z, Chen X, Morehead A, Roy RS, Wu T, Giri N, Quadir F, Chen C, Cheng J, Del Carpio CA, Ichiishi E, Rodriguez‐Lumbreras LA, Fernandez‐Recio J, Harmalkar A, Chu L, Canner S, Smanta R, Gray JJ, Li H, Lin P, He J, Tao H, Huang S, Roel‐Touris J, Jimenez‐Garcia B, Christoffer CW, Jain AJ, Kagaya Y, Kannan H, Nakamura T, Terashi G, Verburgt JC, Zhang Y, Zhang Z, Fujuta H, Sekijima M, Kihara D, Khan O, Kotelnikov S, Ghani U, Padhorny D, Beglov D, Vajda S, Kozakov D, Negi SS, Ricciardelli T, Barradas‐Bautista D, Cao Z, Chawla M, Cavallo L, Oliva R, Yin R, Cheung M, Guest JD, Lee J, Pierce BG, Shor B, Cohen T, Halfon M, Schneidman‐Duhovny D, Zhu S, Yin R, Sun Y, Shen Y, Maszota‐Zieleniak M, Bojarski KK, Lubecka EA, Marcisz M, Danielsson A, Dziadek L, Gaardlos M, Gieldon A, Liwo A, Samsonov SA, Slusarz R, Zieba K, Sieradzan AK, Czaplewski C, Kobayashi S, Miyakawa Y, Kiyota Y, Takeda‐Shitaka M, Olechnovic K, Valancauskas L, Dapkunas J, Venclovas C, Wallner B, Yang L, Hou C, He X, Guo S, Jiang S, Ma X, Duan R, Qui L, Xu X, Zou X, Velankar S, Wodak SJ. Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment. Proteins 2023; 91:1658-1683. [PMID: 37905971 PMCID: PMC10841881 DOI: 10.1002/prot.26609] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 07/07/2023] [Revised: 09/22/2023] [Accepted: 09/28/2023] [Indexed: 11/02/2023]
Abstract
We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homotrimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integrating advanced modeling tools enabled top-performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
Affiliation(s)
- Marc F. Lensink
- Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Guillaume Brysbaert
- Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Nessim Raouraoua
- Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Paul A. Bates
- Biomolecular Modeling Laboratory, The Francis Crick Institute, London, UK
- Marco Giulini
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Rodrigo V. Honorato
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Charlotte van Noort
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Joao M. C. Teixeira
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Alexandre M. J. J. Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Hang Shi
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Xufeng Lu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Jian Liu
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Zhiye Guo
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Xiao Chen
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Alex Morehead
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Raj S. Roy
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Tianqi Wu
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Nabin Giri
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Farhan Quadir
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Chen Chen
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Eichiro Ichiishi
- International University of Health and Welfare (IUHV Hospital), Nasushiobara‐City, Japan
- Luis A. Rodriguez‐Lumbreras
- Instituto de Ciencias de la Vida y del Vino (ICVV), CSIC ‐ Universidad de La Rioja ‐ Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Juan Fernandez‐Recio
- Instituto de Ciencias de la Vida y del Vino (ICVV), CSIC ‐ Universidad de La Rioja ‐ Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Ameya Harmalkar
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Lee‐Shin Chu
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Sam Canner
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Rituparna Smanta
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Jeffrey J. Gray
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Hao Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Huanyu Tao
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Sheng‐You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jorge Roel‐Touris
- Protein Design and Modeling Lab, Dept. of Structural Biology, Molecular Biology Institute of Barcelona (IBMB‐CSIC), Barcelona, Spain
- Anika J. Jain
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Yuki Kagaya
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Harini Kannan
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Dept. of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Tsukasa Nakamura
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Genki Terashi
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Jacob C. Verburgt
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Yuanyuan Zhang
- Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Zicong Zhang
- Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Hayato Fujuta
- Dept. of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Daisuke Kihara
- Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Surendra S. Negi
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, Texas, USA
- Zhen Cao
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Mohit Chawla
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Luigi Cavallo
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Department of Chemistry and Biology, University of Salerno, Fisciano, Italy
- Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Melyssa Cheung
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Johnathan D. Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Jessica Lee
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Ben Shor
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Tomer Cohen
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Matan Halfon
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Shaowen Zhu
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Rujie Yin
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas, USA
- Institute of Biosciences and Technology and Department of Translational Medical Sciences, Texas A&M University, Houston, Texas, USA
- Yuta Miyakawa
- School of Pharmacy, Kitasato University, Minato‐ku, Tokyo, Japan
- Yasuomi Kiyota
- School of Pharmacy, Kitasato University, Minato‐ku, Tokyo, Japan
- Kliment Olechnovic
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Lukas Valancauskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Justas Dapkunas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Ceslovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Bjorn Wallner
- Bioinformatics Division, Department of Physics, Chemistry, and Biology, Linkoping University, Linköping, Sweden
- Lin Yang
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- School of Aerospace, Mechanical and Mechatronic Engineering, The University of Sydney, New South Wales, Australia
- Chengyu Hou
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
- Xiaodong He
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Shenzhen STRONG Advanced Materials Research Institute Col, Ltd, Shenzhen, People's Republic of China
- Shuai Guo
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Shenda Jiang
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Xiaoliang Ma
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Liming Qui
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Dept. of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Dept. of Biochemistry, University of Missouri, Columbia, Missouri, USA
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Hinxton, Cambridge, UK
7
Roy RS, Liu J, Giri N, Guo Z, Cheng J. Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15. Proteins 2023; 91:1889-1902. [PMID: 37357816 PMCID: PMC10749984 DOI: 10.1002/prot.26542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/27/2023]
Abstract
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models is proven useful for estimating the quality of protein tertiary structural models, but it has been rarely applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and performed very well in estimating the global structure accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA.
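The two headline numbers in this abstract, the average per-target correlation and the average per-target ranking loss, can be illustrated with a short sketch. This is not the MULTICOM_qa code. It assumes Pearson correlation (the abstract does not name the coefficient) and the definition of ranking loss commonly used in EMA: the true quality of the best model in the pool minus the true quality of the model ranked first by the predicted scores. The function names and the toy scores are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranking_loss(predicted, true):
    """True score of the best model minus true score of the top-predicted model."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


def evaluate(targets):
    """Average both metrics over targets; `targets` maps name -> (predicted, true)."""
    corrs = [pearson(p, t) for p, t in targets.values()]
    losses = [ranking_loss(p, t) for p, t in targets.values()]
    return sum(corrs) / len(corrs), sum(losses) / len(losses)


# Toy data (hypothetical): predicted vs. true quality for the models of two targets.
toy = {
    "target_A": ([0.9, 0.5, 0.2], [0.8, 0.6, 0.1]),
    "target_B": ([0.3, 0.7, 0.6], [0.4, 0.5, 0.9]),
}
avg_corr, avg_loss = evaluate(toy)
print(f"average correlation = {avg_corr:.2f}, average ranking loss = {avg_loss:.2f}")
```

A ranking loss of zero for a target means the predicted scores put the truly best model first. Averaging per target, rather than pooling all models, keeps targets with many submitted models from dominating the result.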
Affiliation(s)
- Raj S. Roy
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Nabin Giri
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Zhiye Guo
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
8
Ozden B, Kryshtafovych A, Karaca E. The impact of AI-based modeling on the accuracy of protein assembly prediction: Insights from CASP15. Proteins 2023; 91:1636-1657. [PMID: 37861057 PMCID: PMC10873090 DOI: 10.1002/prot.26598] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/21/2023] [Revised: 09/12/2023] [Accepted: 09/14/2023] [Indexed: 10/21/2023]
Abstract
In CASP15, 87 predictors submitted around 11 000 models on 41 assembly targets. The community demonstrated exceptional performance in overall fold and interface contact predictions, achieving an impressive success rate of 90% (compared to 31% in CASP14). This remarkable accomplishment is largely due to the incorporation of DeepMind's AF2-Multimer approach into custom-built prediction pipelines. To evaluate the added value of participating methods, we compared the community models to the baseline AF2-Multimer predictor. In over 1/3 of cases, the community models were superior to the baseline predictor. The main reasons for this improved performance were the use of custom-built multiple sequence alignments, optimized AF2-Multimer sampling, and the manual assembly of AF2-Multimer-built subcomplexes. The best three groups, in order, are Zheng, Venclovas, and Wallner. Zheng and Venclovas reached a 73.2% success rate over all (41) cases, while Wallner attained 69.4% success rate over 36 cases. Nonetheless, challenges remain in predicting structures with weak evolutionary signals, such as nanobody-antigen, antibody-antigen, and viral complexes. Expectedly, modeling large complexes also remains challenging due to their high memory compute demands. In addition to the assembly category, we assessed the accuracy of modeling interdomain interfaces in the tertiary structure prediction targets. Models on seven targets featuring 17 unique interfaces were analyzed. Best predictors achieved a 76.5% success rate, with the UM-TBM group being the leader. In the interdomain category, we observed that the predictors faced challenges, as in the case of the assembly category, when the evolutionary signal for a given domain pair was weak or the structure was large. Overall, CASP15 witnessed unprecedented improvement in interface modeling, reflecting the AI revolution seen in CASP14.
Affiliation(s)
- Burcu Ozden
  - Izmir Biomedicine and Genome Center, Izmir, Türkiye
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylul University, Izmir, Türkiye
- Andriy Kryshtafovych
  - Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, USA
- Ezgi Karaca
  - Izmir Biomedicine and Genome Center, Izmir, Türkiye
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylul University, Izmir, Türkiye
|
9
|
van Keulen SC, Bonvin AMJJ. Improving the quality of co-evolution intermolecular contact prediction with DisVis. Proteins 2023; 91:1407-1416. [PMID: 37237441 DOI: 10.1002/prot.26514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 03/29/2023] [Accepted: 04/19/2023] [Indexed: 05/28/2023]
Abstract
The steep rise in protein sequences and structures has paved the way for bioinformatics approaches to predict residue-residue interactions in protein complexes. Multiple sequence alignments are commonly used in contact predictions to identify co-evolving residues. These contacts, however, often include false positives (FPs), which may impair their use to predict three dimensional structures of biomolecular complexes and affect the accuracy of the generated models. Previously, we have developed DisVis to identify FP in mass spectrometry cross-linking data. DisVis allows to assess the accessible interaction space between two proteins consistent with a set of distance restraints. Here, we investigate if a similar approach could be applied to co-evolution predicted contacts in order to improve their precision prior to using them for modeling. We analyze co-evolution contact predictions with DisVis for a set of 26 protein-protein complexes. The DisVis-reranked and the original co-evolution contacts are then used to model the complexes with our integrative docking software HADDOCK using different filtering scenarios. Our results show that HADDOCK is robust with respect to the precision of the predicted contacts due to the 50% random contact removal during docking and can enhance the quality of docking predictions when combined with DisVis filtering for low precision contact data. DisVis can thus have a beneficial effect on low quality data, but overall HADDOCK can accommodate FP restraints without negatively impacting the quality of the resulting models. Other more precision-sensitive docking protocols might, however, benefit from the increased precision of the predicted contacts after DisVis filtering.
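The robustness argument in this abstract rests on discarding a random half of the contact restraints in each docking trial, so that no single false positive is present in every trial. The sketch below illustrates only that idea; it is not HADDOCK code, and the function name, the `fraction` parameter, and the restraint tuples are hypothetical.

```python
import random


def drop_random_restraints(restraints, fraction=0.5, rng=None):
    """Return a random subset of restraints with `fraction` of them removed.

    Illustrative only: each docking trial would call this once, so a false
    positive restraint is absent from roughly half of the trials.
    """
    rng = rng or random.Random()
    n_keep = len(restraints) - int(len(restraints) * fraction)
    return rng.sample(restraints, n_keep)


# Hypothetical restraints: (residue in protein A, residue in protein B)
contacts = [(i, i + 100) for i in range(10)]
trial_restraints = drop_random_restraints(contacts, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps a trial reproducible without touching the global random state.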
Affiliation(s)
- Siri C van Keulen
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, the Netherlands
- Alexandre M J J Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, the Netherlands
|
10
|
Zhang L, Wang S, Hou J, Si D, Zhu J, Cao R. ComplexQA: a deep graph learning approach for protein complex structure assessment. Brief Bioinform 2023; 24:bbad287. [PMID: 37930021 DOI: 10.1093/bib/bbad287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 05/09/2023] [Accepted: 07/24/2023] [Indexed: 11/07/2023] Open
Abstract
MOTIVATION In recent years, the end-to-end deep learning method for single-chain protein structure prediction has achieved high accuracy. For example, the state-of-the-art method AlphaFold, developed by Google, has largely increased the accuracy of protein structure predictions to near experimental accuracy in some of the cases. At the same time, there are few methods that can evaluate the quality of protein complexes at the residue level. In particular, evaluating the quality of residues at the interface of protein complexes can lead to a wide range of applications, such as protein function analysis and drug design. In this paper, we introduce a new deep graph neural network-based method ComplexQA, to evaluate the local quality of interfaces for protein complexes by utilizing the residue-level structural information in 3D space and the sequence-level constraints. RESULTS We benchmark our method to other state-of-the-art quality assessment approaches on the HAF2 and DBM55-AF2 datasets (high-quality structural models predicted by AlphaFold-Multimer), and the BM5 docking dataset. The experimental results show that our proposed method achieves better or similar performance compared with other state-of-the-art methods, especially on difficult targets which only contain a few acceptable models. Our method is able to suggest a score for each interface residue, which demonstrates a powerful assessment tool for the ever-increasing number of protein complexes. AVAILABILITY https://github.com/Cao-Labs/ComplexQA.git. Contact: caora@plu.edu.
Affiliation(s)
- Lei Zhang
  - Department of Computer Science and Technology, AnHui University, Hefei, 230601, Anhui, China
- Sheng Wang
  - Department of Computer Science and Technology, AnHui University, Hefei, 230601, Anhui, China
- Jie Hou
  - Department of Computer Science, Saint. Louis University, Saint. Louis, 63103, MO, USA
- Dong Si
  - Division of Computing and Software Systems, University of Washington Bothell, Bothell, 98011, WA, USA
- Junyong Zhu
  - Department of Computer Science and Technology, AnHui University, Hefei, 230601, Anhui, China
- Renzhi Cao
  - Department of Humanities, Pacific Lutheran University, Tacoma, 98447, WA, USA
|
11
|
Ozden B, Kryshtafovych A, Karaca E. The Impact of AI-Based Modeling on the Accuracy of Protein Assembly Prediction: Insights from CASP15. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.10.548341. [PMID: 37503072 PMCID: PMC10369898 DOI: 10.1101/2023.07.10.548341] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 07/29/2023]
Abstract
In CASP15, 87 predictors submitted around 11,000 models on 41 assembly targets. The community demonstrated exceptional performance in overall fold and interface contact prediction, achieving an impressive success rate of 90% (compared to 31% in CASP14). This remarkable accomplishment is largely due to the incorporation of DeepMind's AF2-Multimer approach into custom-built prediction pipelines. To evaluate the added value of participating methods, we compared the community models to the baseline AF2-Multimer predictor. In over 1/3 of cases the community models were superior to the baseline predictor. The main reasons for this improved performance were the use of custom-built multiple sequence alignments, optimized AF2-Multimer sampling, and the manual assembly of AF2-Multimer-built subcomplexes. The best three groups, in order, are Zheng, Venclovas and Wallner. Zheng and Venclovas reached a 73.2% success rate over all (41) cases, while Wallner attained 69.4% success rate over 36 cases. Nonetheless, challenges remain in predicting structures with weak evolutionary signals, such as nanobody-antigen, antibody-antigen, and viral complexes. Expectedly, modeling large complexes remains also challenging due to their high memory compute demands. In addition to the assembly category, we assessed the accuracy of modeling interdomain interfaces in the tertiary structure prediction targets. Models on seven targets featuring 17 unique interfaces were analyzed. Best predictors achieved the 76.5% success rate, with the UM-TBM group being the leader. In the interdomain category, we observed that the predictors faced challenges, as in the case of the assembly category, when the evolutionary signal for a given domain pair was weak or the structure was large. Overall, CASP15 witnessed unprecedented improvement in interface modeling, reflecting the AI revolution seen in CASP14.
Affiliation(s)
- Burcu Ozden
  - Izmir Biomedicine and Genome Center, Izmir, Türkiye
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylul University, Izmir, Türkiye
- Andriy Kryshtafovych
  - Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, USA
- Ezgi Karaca
  - Izmir Biomedicine and Genome Center, Izmir, Türkiye
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylul University, Izmir, Türkiye
|
12
|
Marsan ES, Dreab A, Bayse CA. In silico insights into the dimer structure and deiodinase activity of type III iodothyronine deiodinase from bioinformatics, molecular dynamics simulations, and QM/MM calculations. J Biomol Struct Dyn 2023; 41:4819-4829. [PMID: 35579922 PMCID: PMC9878935 DOI: 10.1080/07391102.2022.2073271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Accepted: 04/27/2022] [Indexed: 01/28/2023]
Abstract
The homodimeric family of iodothyronine deiodinases (Dios) regioselectively remove iodine from thyroid hormones. Currently, structural data has only been reported for the monomer of the mus type III thioredoxin (Trx) fold catalytic domain (Dio3Trx), but the mode of dimerization has not yet been determined. Various groups have proposed dimer structures that are similar to the A-type and B-type dimerization modes of peroxiredoxins. Computational methods are used to compare the sequence of Dio3Trx to related proteins known to form A-type and B-type dimers. Sequence analysis and in silico protein-protein docking methods suggest that Dio3Trx is more consistent with proteins that adopt B-type dimerization. Molecular dynamics (MD) simulations of the refined Dio3Trx dimer constructed using the SymmDock and GalaxyRefineComplex databases indicate stable dimer formation along the β4α3 interface consistent with other Trx fold B-type dimers. Free energy calculations show that the dimer is stabilized by interdimer interactions between the β-sheets and α-helices. A comparison of MD simulations of the apo and thyroxine-bound dimers suggests that the active site binding pocket is not affected by dimerization. Determination of the transition state for deiodination of thyroxine from the monomer structure using QM/MM methods provides an activation barrier consistent with previous small model DFT studies. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Eric S Marsan
  - Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, VA
- Ana Dreab
  - Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, VA
- Craig A Bayse
  - Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, VA
|
13
|
Wodak SJ, Vajda S, Lensink MF, Kozakov D, Bates PA. Critical Assessment of Methods for Predicting the 3D Structure of Proteins and Protein Complexes. Annu Rev Biophys 2023; 52:183-206. [PMID: 36626764 PMCID: PMC10885158 DOI: 10.1146/annurev-biophys-102622-084607] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 01/11/2023]
Abstract
Advances in a scientific discipline are often measured by small, incremental steps. In this review, we report on two intertwined disciplines in the protein structure prediction field, modeling of single chains and modeling of complexes, that have over decades emulated this pattern, as monitored by the community-wide blind prediction experiments CASP and CAPRI. However, over the past few years, dramatic advances were observed for the accurate prediction of single protein chains, driven by a surge of deep learning methodologies entering the prediction field. We review the main scientific developments that enabled these recent breakthroughs and feature the important role of blind prediction experiments in building up and nurturing the structure prediction field. We discuss how the new wave of artificial intelligence-based methods is impacting the fields of computational and experimental structural biology and highlight areas in which deep learning methods are likely to lead to future developments, provided that major challenges are overcome.
Affiliation(s)
- Shoshana J Wodak
  - VIB-VUB Center for Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
  - Department of Chemistry, Boston University, Boston, Massachusetts, USA
- Marc F Lensink
  - Univ. Lille, CNRS, UMR 8576-UGSF-Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Dima Kozakov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Paul A Bates
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, London, United Kingdom
|
14
|
Roy RS, Liu J, Giri N, Guo Z, Cheng J. Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.08.531814. [PMID: 36945536 PMCID: PMC10028888 DOI: 10.1101/2023.03.08.531814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models is proven useful for estimating the quality of protein tertiary structural models, but it has been rarely applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and ranked first out of 24 predictors in estimating the global accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA. The source code of MULTICOM_qa is available at https://github.com/BioinfoMachineLearning/MULTICOM_qa .
Affiliation(s)
- Raj S. Roy
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jian Liu
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Nabin Giri
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Zhiye Guo
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
|
15
|
Barradas-Bautista D, Almajed A, Oliva R, Kalnis P, Cavallo L. Improving classification of correct and incorrect protein-protein docking models by augmenting the training set. BIOINFORMATICS ADVANCES 2023; 3:vbad012. [PMID: 36789292 PMCID: PMC9923443 DOI: 10.1093/bioadv/vbad012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/04/2023]
Abstract
Motivation Protein-protein interactions drive many relevant biological events, such as infection, replication and recognition. To control or engineer such events, we need to access the molecular details of the interaction provided by experimental 3D structures. However, such experiments take time and are expensive; moreover, the current technology cannot keep up with the high discovery rate of new interactions. Computational modeling, like protein-protein docking, can help to fill this gap by generating docking poses. Protein-protein docking generally consists of two parts, sampling and scoring. The sampling is an exhaustive search of the tridimensional space. The caveat of the sampling is that it generates a large number of incorrect poses, producing a highly unbalanced dataset. This limits the utility of the data to train machine learning classifiers. Results Using weak supervision, we developed a data augmentation method that we named hAIkal. Using hAIkal, we increased the labeled training data to train several algorithms. We trained and obtained different classifiers; the best classifier has 81% accuracy and 0.51 Matthews' correlation coefficient on the test set, surpassing the state-of-the-art scoring functions. Availability and implementation Docking models from Benchmark 5 are available at https://doi.org/10.5281/zenodo.4012018. Processed tabular data are available at https://repository.kaust.edu.sa/handle/10754/666961. Google colab is available at https://colab.research.google.com/drive/1vbVrJcQSf6\_C3jOAmZzgQbTpuJ5zC1RP?usp=sharing. Supplementary information Supplementary data are available at Bioinformatics Advances online.
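The abstract reports both accuracy and Matthews' correlation coefficient (MCC) because docking sets are dominated by incorrect poses, where accuracy alone can mislead. The sketch below uses the standard confusion-matrix formulas; the counts in the example are invented for illustration and are not results from the paper.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all poses that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Invented example: 10 correct poses among 100. A classifier that labels
# every pose "incorrect" still reaches 0.9 accuracy but has an MCC of 0.
always_negative = dict(tp=0, tn=90, fp=0, fn=10)
balanced_errors = dict(tp=5, tn=85, fp=5, fn=5)
```

With `balanced_errors` the accuracy is again 0.9, but the MCC rises to 4/9, which is why the two numbers are usually read together on unbalanced data.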
Affiliation(s)
- Ali Almajed
  - Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Romina Oliva
  - Department of Sciences and Technologies, University of Naples “Parthenope”, I-80143 Naples, Italy
- Panos Kalnis
  - Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Luigi Cavallo
  - Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
|
16
|
Duzgun Z, Kural BV, Orem A, Yildiz I. In silico investigation of the interactions of certain drugs proposed for the treatment of Covid-19 with the paraoxonase-1. J Biomol Struct Dyn 2023; 41:884-896. [PMID: 34895069 DOI: 10.1080/07391102.2021.2014971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Abstract
Coronavirus disease 2019 (Covid-19) has caused one of the biggest pandemics of modern times, infected over 240 million people and killed over 4.9 million people, and continues to do so. Although many drugs are widely recommended in the treatment of this disease, the interactions of these drugs with an anti-atherosclerotic enzyme, paraoxonase-1 (PON1), are not well known. In our study, we investigated the interactions of 18 different drugs, which are claimed to be effective against Covid-19, with the PON1 enzyme and its genetic variants L55M and Q192R with molecular docking, molecular dynamics simulation and free energy calculation method MM/PBSA. We found that ruxolitinib, dexamethasone, colchicine; dexamethasone, sitagliptin, baricitinib and galidesivir, ruxolitinib, hydroxychloroquine were the most effective compounds in binding PON1-w, PON1L55M and PON1Q192R respectively. Mainly, sitagliptin, galidesivir and hydroxychloroquine have attracted attention by showing very high affinity (<-300 kJ/mol) according to the MM/PBSA method. We concluded that the drug interactions should be considered and more attention should be paid in the use of these drugs. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Zekeriya Duzgun
  - Faculty of Medicine, Department of Medical Biology, Giresun University, Giresun, Turkey
- Birgül Vanizor Kural
  - Faculty of Medicine, Department of Biochemistry, Karadeniz Technical University, Trabzon, Turkey
- Asim Orem
  - Faculty of Medicine, Department of Biochemistry, Karadeniz Technical University, Trabzon, Turkey
- Ilkay Yildiz
  - Faculty of Pharmacy, Department of Pharmaceutical Chemistry, Ankara University, Ankara, Turkey
|
17
|
Lin P, Yan Y, Huang SY. DeepHomo2.0: improved protein-protein contact prediction of homodimers by transformer-enhanced deep learning. Brief Bioinform 2023; 24:6849483. [PMID: 36440949 DOI: 10.1093/bib/bbac499] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Revised: 10/08/2022] [Accepted: 10/21/2022] [Indexed: 11/30/2022] Open
Abstract
Protein-protein interactions play an important role in many biological processes. However, although structure prediction for monomer proteins has achieved great progress with the advent of advanced deep learning algorithms like AlphaFold, the structure prediction for protein-protein complexes remains an open question. Taking advantage of the Transformer model of ESM-MSA, we have developed a deep learning-based model, named DeepHomo2.0, to predict protein-protein interactions of homodimeric complexes by leveraging the direct-coupling analysis (DCA) and Transformer features of sequences and the structure features of monomers. DeepHomo2.0 was extensively evaluated on diverse test sets and compared with eight state-of-the-art methods including protein language model-based, DCA-based and machine learning-based methods. It was shown that DeepHomo2.0 achieved a high precision of >70% with experimental monomer structures and >60% with predicted monomer structures for the top 10 predicted contacts on the test sets and outperformed the other eight methods. Moreover, even the version without using structure information, named DeepHomoSeq, still achieved a good precision of >55% for the top 10 predicted contacts. Integrating the predicted contacts into protein docking significantly improved the structure prediction of realistic Critical Assessment of Protein Structure Prediction homodimeric complexes. DeepHomo2.0 and DeepHomoSeq are available at http://huanglab.phys.hust.edu.cn/DeepHomo2/.
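The precision figures quoted for the top 10 predicted contacts follow the usual top-k convention: rank residue pairs by predicted probability, keep the first k, and count how many are true contacts. A minimal sketch, assuming predictions arrive as a dictionary from residue pair to probability; the function name and data layout are hypothetical and not part of DeepHomo2.0.

```python
def top_k_precision(predicted, true_contacts, k=10):
    """Fraction of the k highest-scoring predicted pairs that are true contacts.

    predicted: dict mapping (residue_i, residue_j) -> predicted probability
    true_contacts: set of (residue_i, residue_j) pairs observed in the structure
    """
    ranked = sorted(predicted, key=predicted.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Invented toy data for illustration only.
scores = {(1, 2): 0.9, (3, 4): 0.8, (5, 6): 0.1}
native = {(1, 2), (5, 6)}
```

Dividing by the number of pairs actually ranked, rather than by k, avoids penalising targets with fewer than k predictions; other conventions exist.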
Affiliation(s)
- Peicong Lin
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Yumeng Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
|
18
|
Matamoros-Recio A, Mínguez-Toral M, Martín-Santamaría S. Modeling of Transmembrane Domain and Full-Length TLRs in Membrane Models. Methods Mol Biol 2023; 2700:3-38. [PMID: 37603172 DOI: 10.1007/978-1-0716-3366-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/22/2023]
Abstract
Toll-like receptors (TLRs), classified as pattern recognition receptors, have a primordial role in the activation of the innate immunity. In particular, TLR4 binds to lipopolysaccharides (LPS), a membrane constituent of Gram-negative bacteria, and, together with Myeloid Differentiation factor 2 (MD-2) protein, forms a heterodimeric complex which leads to the activation of the innate immune system response. Identification of TLRs has sparked great interest in the therapeutic manipulation of the innate immune system. In particular, TLR4 antagonists may be useful for the treatment of septic shock, certain autoimmune diseases, noninfectious inflammatory disorders, and neuropathic pain, and TLR4 agonists are under development as vaccine adjuvants in antitumoral treatments. Therefore, TLR4 has risen as a promising therapeutic target, and its modulation constitutes a highly relevant and active research area. Deep structural understanding of TLR4 signaling may help in the design and discovery of TLR4-modulating molecules with desirable therapeutic properties. Computational studies of the different independent domains composing the TLR4 were undertaken, to understand the differential domain organization of TLR4 in aqueous and membrane environments, including Liquid-disordered (Ld) and Liquid-ordered (Lo) membrane models, to account for the TLR4 recruitment in lipid rafts over activation. We modeled, by means of all-atom Molecular Dynamics (MD) simulations, the structural assembly of plausible full-length TLR4 models embedded into a realistic plasma membrane, accounting for the active (agonist) state of the TLR4, thus providing an analysis at both atomic/molecular and thermodynamic levels of the TLR4 assembly and biological activity. Our results unveil relevant molecular aspects involved in the mechanism of receptor activation, and adaptor recruitment in the innate immune pathways, and will promote the discovery of new TLR4 modulators and probes.
Affiliation(s)
- Alejandra Matamoros-Recio
  - Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas Margarita Salas, CIB-CSIC, Madrid, Spain
- Marina Mínguez-Toral
  - Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas Margarita Salas, CIB-CSIC, Madrid, Spain
- Sonsoles Martín-Santamaría
  - Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas Margarita Salas, CIB-CSIC, Madrid, Spain
|
19
|
Tsuchiya Y, Yamamori Y, Tomii K. Protein-protein interaction prediction methods: from docking-based to AI-based approaches. Biophys Rev 2022; 14:1341-1348. [PMID: 36570321 PMCID: PMC9759050 DOI: 10.1007/s12551-022-01032-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/30/2022] [Accepted: 11/30/2022] [Indexed: 12/23/2022] Open
Abstract
Protein-protein interactions (PPIs), such as protein-protein inhibitor, antibody-antigen complex, and supercomplexes play diverse and important roles in cells. Recent advances in structural analysis methods, including cryo-EM, for the determination of protein complex structures are remarkable. Nevertheless, much room remains for improvement and utilization of computational methods to predict PPIs because of the large number and great diversity of unresolved complex structures. This review introduces a wide array of computational methods, including our own, for estimating PPIs including antibody-antigen interactions, offering both historical and forward-looking perspectives.
Affiliation(s)
- Yuko Tsuchiya
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
- Yu Yamamori
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
|
20
|
Réau M, Renaud N, Xue LC, Bonvin AMJJ. DeepRank-GNN: a graph neural network framework to learn patterns in protein-protein interfaces. Bioinformatics 2022; 39:6845451. [PMID: 36420989 PMCID: PMC9805592 DOI: 10.1093/bioinformatics/btac759] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/25/2021] [Revised: 10/19/2022] [Accepted: 11/23/2022] [Indexed: 11/25/2022] Open
Abstract
MOTIVATION Gaining structural insights into the protein-protein interactome is essential to understand biological phenomena and extract knowledge for rational drug design or protein engineering. We have previously developed DeepRank, a deep-learning framework to facilitate pattern learning from protein-protein interfaces using convolutional neural network (CNN) approaches. However, CNN is not rotation invariant and data augmentation is required to desensitize the network to the input data orientation which dramatically impairs the computation performance. Representing protein-protein complexes as atomic- or residue-scale rotation invariant graphs instead enables using graph neural networks (GNN) approaches, bypassing those limitations. RESULTS We have developed DeepRank-GNN, a framework that converts protein-protein interfaces from PDB 3D coordinates files into graphs that are further provided to a pre-defined or user-defined GNN architecture to learn problem-specific interaction patterns. DeepRank-GNN is designed to be highly modularizable, easily customized and is wrapped into a user-friendly python3 package. Here, we showcase DeepRank-GNN's performance on two applications using a dedicated graph interaction neural network: (i) the scoring of docking poses and (ii) the discriminating of biological and crystal interfaces. In addition to the highly competitive performance obtained in those tasks as compared to state-of-the-art methods, we show a significant improvement in speed and storage requirement using DeepRank-GNN as compared to DeepRank. AVAILABILITY AND IMPLEMENTATION DeepRank-GNN is freely available from https://github.com/DeepRank/DeepRank-GNN. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
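The abstract's point that a graph representation is rotation invariant can be shown with a small sketch: if residues are nodes and edges carry inter-chain distances, rotating the whole complex leaves the graph unchanged. This is not the DeepRank-GNN API; the function name, the one-coordinate-per-residue input, and the 8.5 Å cutoff are assumptions made for illustration.

```python
from itertools import product
from math import dist


def interface_graph(chain_a, chain_b, cutoff=8.5):
    """Build a residue-level interface graph between two chains.

    chain_a, chain_b: dict mapping residue id -> (x, y, z) coordinate
    cutoff: assumed distance threshold in angstroms for an interface edge
    Returns (nodes, edges) where edges are (res_a, res_b, distance) tuples.
    """
    edges = []
    for (ra, ca), (rb, cb) in product(chain_a.items(), chain_b.items()):
        d = dist(ca, cb)
        if d <= cutoff:
            edges.append((ra, rb, d))
    nodes = {res for edge in edges for res in edge[:2]}
    return nodes, edges


def rotate_z90(coords):
    """Rotate every coordinate by 90 degrees about the z axis."""
    return {res: (-y, x, z) for res, (x, y, z) in coords.items()}


# Toy coordinates, invented for illustration.
chain_a = {"A1": (0.0, 0.0, 0.0), "A2": (20.0, 0.0, 0.0)}
chain_b = {"B1": (5.0, 0.0, 0.0)}
graph = interface_graph(chain_a, chain_b)
rotated = interface_graph(rotate_z90(chain_a), rotate_z90(chain_b))
```

Because only pairwise distances enter the graph, `graph` and `rotated` are identical, so no orientation-based data augmentation is needed for this representation.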
Affiliation(s)
- Li C Xue
- Center for Molecular and Biomolecular Informatics, Radboudumc, Nijmegen 6525 GA, The Netherlands
21
Sun H, Wang A, Wang L, Wang B, Tian G, Yang J, Liao M. Systematic Tracing of Susceptible Animals to SARS-CoV-2 by a Bioinformatics Framework. Front Microbiol 2022; 13:781770. [PMID: 35308363 PMCID: PMC8931700 DOI: 10.3389/fmicb.2022.781770] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/24/2021] [Accepted: 01/18/2022] [Indexed: 01/02/2023] Open
Abstract
Since the outbreak of SARS-CoV-2 in 2019, Chinese horseshoe bats have been considered a potential original host of SARS-CoV-2. In addition, cats, tigers, lions, minks, and ferrets have been naturally or experimentally infected with SARS-CoV-2. For the surveillance and control of this highly infectious disease, it is critical to trace susceptible animals and predict the consequences of potential mutations at the binding region of the viral spike protein and the host ACE2 protein. This study proposed a novel bioinformatics framework to systematically trace susceptible animals to SARS-CoV-2 and predict the binding affinity between susceptible animals’ mutated/un-mutated ACE2 receptors. As a result, we identified a few animals posing a potential risk of infection with SARS-CoV-2 using docking analysis of the ACE2 protein and the viral spike protein. The binding affinity of some of these species is weaker than that of humans but stronger than that of Chinese horseshoe bats. We also found that a few point mutations in the human ACE2 protein or the viral spike protein could significantly enhance their binding affinity, posing an enormous potential threat to public health. The ancestors of Omicron may have evolved rapidly through the accumulation of mutations while infecting a host and then jumped into humans. These findings indicate that if the epidemic expands, there may be a human-animal-human transmission route, which will increase the difficulty of disease prevention and control.
Affiliation(s)
- Hailiang Sun
- College of Veterinary Medicine, South China Agricultural University, Guangzhou, China
- Bing Wang
- School of Electrical and Information Engineering, Anhui University of Technology, Maanshan, China
- Jialiang Yang
- Geneis Co., Ltd., Beijing, China
- Academician Workstation, Changsha Medical University, Changsha, China
- *Correspondence: Jialiang Yang
- Ming Liao
- Institute of Animal Health, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- *Correspondence: Ming Liao
22
Yamamori Y, Tomii K. Application of Homology Modeling by Enhanced Profile-Profile Alignment and Flexible-Fitting Simulation to Cryo-EM Based Structure Determination. Int J Mol Sci 2022; 23:ijms23041977. [PMID: 35216093 PMCID: PMC8879198 DOI: 10.3390/ijms23041977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Revised: 02/07/2022] [Accepted: 02/09/2022] [Indexed: 12/03/2022] Open
Abstract
Application of cryo-electron microscopy (cryo-EM) is crucially important for ascertaining the atomic structure of large biomolecules such as ribosomes and protein complexes in membranes. Advances in cryo-EM technology and software have made it possible to obtain data with near-atomic resolution, but the method is still often capable of producing only a density map with up to medium resolution, either partially or entirely. Therefore, bridging the gap separating the density map and the atomic model is necessary. Herein, we propose a methodology for constructing atomic structure models based on cryo-EM maps with low-to-medium resolution. The method is a combination of sensitive and accurate homology modeling using our profile–profile alignment method with a flexible-fitting method using molecular dynamics simulation. As described herein, this study used benchmark applications to evaluate the model constructions of human two-pore channel 2 (one target protein in CASP13 with its structure determined using cryo-EM data) and the overall structure of Enterococcus hirae V-ATPase complex.
Affiliation(s)
- Yu Yamamori
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Kentaro Tomii
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Corresponding author
23
Roy RS, Quadir F, Soltanikazemi E, Cheng J. A deep dilated convolutional residual network for predicting interchain contacts of protein homodimers. Bioinformatics 2022; 38:1904-1910. [PMID: 35134816 PMCID: PMC8963319 DOI: 10.1093/bioinformatics/btac063] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/18/2021] [Revised: 01/17/2022] [Accepted: 01/31/2022] [Indexed: 11/23/2022] Open
Abstract
Motivation: Deep learning has revolutionized protein tertiary structure prediction recently. Cutting-edge deep learning methods such as AlphaFold can predict high-accuracy tertiary structures for most individual protein chains. However, the accuracy of predicting quaternary structures of protein complexes consisting of multiple chains is still relatively low due to a lack of advanced deep learning methods in the field. Because interchain residue–residue contacts can be used as distance restraints to guide quaternary structure modeling, here we develop a deep dilated convolutional residual network method (DRCon) to predict interchain residue–residue contacts in homodimers from residue–residue co-evolutionary signals derived from multiple sequence alignments of monomers, intrachain residue–residue contacts of monomers extracted from true/predicted tertiary structures or predicted by deep learning, and other sequence and structural features. Results: Tested on three homodimer test datasets (Homo_std dataset, DeepHomo dataset and CASP-CAPRI dataset), the precision of DRCon for top L/5 interchain contact predictions (L: length of monomer in a homodimer) is 43.46%, 47.10% and 33.50%, respectively, at a 6 Å contact threshold, which is substantially better than DeepHomo and DNCON2_inter and similar to Glinter. Moreover, our experiments demonstrate that using predicted tertiary structures or intrachain contacts of monomers in the unbound state as input, DRCon still performs well, even though its accuracy is lower than when true tertiary structures in the bound state are used as input. Finally, our case study shows that good interchain contact predictions can be used to build high-accuracy quaternary structure models of homodimers. Availability and implementation: The source code of DRCon is available at https://github.com/jianlin-cheng/DRCon. The datasets are available at https://zenodo.org/record/5998532#.YgF70vXMKsB.
Supplementary information: Supplementary data are available at Bioinformatics online.
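The top L/5 precision reported in this abstract can be sketched as follows. This is a minimal illustration and not code from the DRCon repository: the function name, the dictionary-based input format, and the toy scores are assumptions. The set of true contacts is taken as given here (the abstract defines it with a 6 Å threshold).

```python
def top_k_precision(scores, true_contacts, monomer_length, divisor=5):
    """Precision of the top L/divisor predicted interchain contacts.

    scores: dict mapping (i, j) residue pairs to a predicted contact probability.
    true_contacts: set of (i, j) pairs that are in contact in the reference structure.
    monomer_length: L, the length of the monomer in the homodimer.
    """
    k = max(1, monomer_length // divisor)
    # Rank residue pairs by predicted probability, highest first, and keep the top k.
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        raise ValueError("no predictions to evaluate")
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)

# Toy example: L = 10, so the top L/5 = 2 predictions are evaluated.
toy_scores = {(0, 5): 0.9, (1, 6): 0.8, (2, 7): 0.3, (3, 8): 0.1}
toy_truth = {(0, 5), (2, 7)}
precision = top_k_precision(toy_scores, toy_truth, monomer_length=10)
print(precision)
```

In the toy case only one of the two top-ranked pairs is a true contact, so the precision is one half; the correctly predicted pair (2, 7) does not count because it falls outside the top L/5.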
Affiliation(s)
- Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Elham Soltanikazemi
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
24
From complete cross-docking to partners identification and binding sites predictions. PLoS Comput Biol 2022; 18:e1009825. [PMID: 35089918 PMCID: PMC8827487 DOI: 10.1371/journal.pcbi.1009825] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/29/2021] [Revised: 02/09/2022] [Accepted: 01/11/2022] [Indexed: 11/19/2022] Open
Abstract
Proteins ensure their biological functions by interacting with each other. Hence, characterising protein interactions is fundamental for our understanding of the cellular machinery, and for improving medicine and bioengineering. Over the past years, a large body of experimental data has been accumulated on who interacts with whom and in what manner. However, these data are highly heterogeneous and sometimes contradictory, noisy, and biased. Ab initio methods provide a means of "blind" protein-protein interaction network reconstruction. Here, we report on a molecular cross-docking-based approach for the identification of protein partners. The docking algorithm uses a coarse-grained representation of the protein structures and treats them as rigid bodies. We applied the approach to a few hundred proteins, in their unbound conformations, and we systematically investigated the influence of several key ingredients, such as the size and quality of the interfaces, and the scoring function. We achieved a significant improvement compared to previous works, and a very high discriminative power on some specific functional classes. We provide a readout of the contributions of shape and physico-chemical complementarity, interface matching, and specificity to the predictions. In addition, we assessed the ability of the approach to account for multiple usages of protein surfaces, and we compared it with a sequence-based deep learning method. This work may contribute to guiding the exploitation of the large amounts of protein structural models now available toward the discovery of unexpected partners and the characterisation of their complex structures.
25
Verburgt J, Kihara D. Benchmarking of structure refinement methods for protein complex models. Proteins 2022; 90:83-95. [PMID: 34309909 PMCID: PMC8671191 DOI: 10.1002/prot.26188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2021] [Revised: 06/24/2021] [Accepted: 07/22/2021] [Indexed: 01/03/2023]
Abstract
Protein structure docking is the process in which the quaternary structure of a protein complex is predicted from the individual tertiary structures of the protein subunits. Protein docking is typically performed in two main steps. The subunits are first docked while keeping them rigid to form the complex, which is then followed by structure refinement. Structure refinement is crucial for the practical use of computational protein docking models, as it aims to correct the conformations of interacting residues and atoms at the interface. Here, we benchmarked the performance of eight existing protein structure refinement methods in the refinement of protein complex models. We show that the fraction of native contacts between subunits is by far the most straightforward metric to improve. However, backbone-dependent metrics based on the Root Mean Square Deviation proved more difficult to improve via refinement.
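The fraction of native contacts mentioned in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the benchmark's actual implementation: each residue is reduced to a single representative point, the 5 Å cutoff and the function names are hypothetical, and the coordinates are invented.

```python
import math

def contacts(chain_a, chain_b, cutoff=5.0):
    """Set of (i, j) index pairs whose representative points lie within cutoff (angstroms)."""
    return {
        (i, j)
        for i, p in enumerate(chain_a)
        for j, q in enumerate(chain_b)
        if math.dist(p, q) <= cutoff
    }

def fraction_native_contacts(native, model, cutoff=5.0):
    """Share of native interchain contacts that are also present in the model."""
    native_a, native_b = native
    model_a, model_b = model
    native_set = contacts(native_a, native_b, cutoff)
    if not native_set:
        raise ValueError("native structure has no interchain contacts at this cutoff")
    model_set = contacts(model_a, model_b, cutoff)
    return len(native_set & model_set) / len(native_set)

# Toy two-residue chains; the model displaces the second residue of chain B.
native = ([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)], [(0.0, 4.0, 0.0), (4.0, 4.0, 0.0)])
model = ([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)], [(0.0, 4.0, 0.0), (4.0, 9.0, 0.0)])
fnat = fraction_native_contacts(native, model)
print(fnat)
```

Because the metric counts contacts rather than measuring backbone displacement, moving one interface residue back into place raises it directly, which is consistent with the abstract's observation that it is easier to improve than RMSD-based metrics.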
Affiliation(s)
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA
26
Barradas-Bautista D, Cao Z, Vangone A, Oliva R, Cavallo L. A random forest classifier for protein-protein docking models. Bioinformatics Advances 2021; 2:vbab042. [PMID: 36699405 PMCID: PMC9710594 DOI: 10.1093/bioadv/vbab042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2021] [Revised: 11/11/2021] [Accepted: 12/06/2021] [Indexed: 01/28/2023]
Abstract
Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein-protein complexes obtained by popular docking software. To this aim, we generated 3 × 10^4 docking models for each of the 230 complexes in the protein-protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of ≈ 7 × 10^6 docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two algorithms and was selected for further optimization. Using a feature selection algorithm and optimizing the random forest hyperparameters allowed us to train and validate a random forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, as well as results of its comparative performance with machine learning methods recently developed in the field for the scoring of docking decoys, confirms its state-of-the-art ability to discriminate correct from incorrect decoys both in terms of global parameters and in terms of decoys ranked at the top positions. Supplementary information: Supplementary data are available at Bioinformatics Advances online. Software and data availability statement: The docking models are available at https://doi.org/10.5281/zenodo.4012018. The programs underlying this article will be shared on request to the corresponding authors.
Affiliation(s)
- Didier Barradas-Bautista
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia. To whom correspondence should be addressed.
- Zhen Cao
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
- Anna Vangone
- Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Munich Large Molecule Research, 82377 Penzberg, Germany
- Romina Oliva
- Department of Sciences and Technologies, University Parthenope of Naples, Centro Direzionale Isola C4, I-80143 Naples, Italy. To whom correspondence should be addressed.
- Luigi Cavallo
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia. To whom correspondence should be addressed.
27
Lensink MF, Brysbaert G, Mauri T, Nadzirin N, Velankar S, Chaleil RAG, Clarence T, Bates PA, Kong R, Liu B, Yang G, Liu M, Shi H, Lu X, Chang S, Roy RS, Quadir F, Liu J, Cheng J, Antoniak A, Czaplewski C, Giełdoń A, Kogut M, Lipska AG, Liwo A, Lubecka EA, Maszota-Zieleniak M, Sieradzan AK, Ślusarz R, Wesołowski PA, Zięba K, Del Carpio Muñoz CA, Ichiishi E, Harmalkar A, Gray JJ, Bonvin AMJJ, Ambrosetti F, Vargas Honorato R, Jandova Z, Jiménez-García B, Koukos PI, Van Keulen S, Van Noort CW, Réau M, Roel-Touris J, Kotelnikov S, Padhorny D, Porter KA, Alekseenko A, Ignatov M, Desta I, Ashizawa R, Sun Z, Ghani U, Hashemi N, Vajda S, Kozakov D, Rosell M, Rodríguez-Lumbreras LA, Fernandez-Recio J, Karczynska A, Grudinin S, Yan Y, Li H, Lin P, Huang SY, Christoffer C, Terashi G, Verburgt J, Sarkar D, Aderinwale T, Wang X, Kihara D, Nakamura T, Hanazono Y, Gowthaman R, Guest JD, Yin R, Taherzadeh G, Pierce BG, Barradas-Bautista D, Cao Z, Cavallo L, Oliva R, Sun Y, Zhu S, Shen Y, Park T, Woo H, Yang J, Kwon S, Won J, Seok C, Kiyota Y, Kobayashi S, Harada Y, Takeda-Shitaka M, Kundrotas PJ, Singh A, Vakser IA, Dapkūnas J, Olechnovič K, Venclovas Č, Duan R, Qiu L, Xu X, Zhang S, Zou X, Wodak SJ. Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment. Proteins 2021; 89:1800-1823. [PMID: 34453465 PMCID: PMC8616814 DOI: 10.1002/prot.26222] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 05/20/2021] [Revised: 07/24/2021] [Accepted: 08/05/2021] [Indexed: 12/19/2022]
Abstract
We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups, including eight automatic servers, submitted ~1250 models per target. Twenty groups, including six servers, participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70-75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance, with more groups submitting correct models for 70-80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
Collapse
Affiliation(s)
- Marc F Lensink
- CNRS UMR8576 UGSF, Institute for Structural and Functional Glycobiology, University of Lille, Lille, France
| | - Guillaume Brysbaert
- CNRS UMR8576 UGSF, Institute for Structural and Functional Glycobiology, University of Lille, Lille, France
| | - Théo Mauri
- CNRS UMR8576 UGSF, Institute for Structural and Functional Glycobiology, University of Lille, Lille, France
| | - Nurul Nadzirin
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | | | - Tereza Clarence
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Paul A Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Bin Liu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Guangbo Yang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Ming Liu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Hang Shi
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xufeng Lu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
| | - Anna Antoniak
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | | | - Artur Giełdoń
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | - Mateusz Kogut
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | | | - Adam Liwo
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | - Emilia A Lubecka
- Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk, Poland
| | | | | | - Rafał Ślusarz
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | - Patryk A Wesołowski
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
- Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, Gdansk, Poland
| | - Karolina Zięba
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | | | - Eiichiro Ichiishi
- International University of Health and Welfare Hospital (IUHW Hospital), Nasushiobara City, Japan
| | - Ameya Harmalkar
- Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Jeffrey J Gray
- Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Alexandre M J J Bonvin
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Francesco Ambrosetti
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Rodrigo Vargas Honorato
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Zuzana Jandova
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Brian Jiménez-García
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Panagiotis I Koukos
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Siri Van Keulen
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Charlotte W Van Noort
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Manon Réau
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Jorge Roel-Touris
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Sergei Kotelnikov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Innopolis University, Russia
| | - Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Kathryn A Porter
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Andrey Alekseenko
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Institute of Computer-Aided Design of the Russian Academy of Sciences, Moscow, Russia
| | - Mikhail Ignatov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Israel Desta
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Ryota Ashizawa
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Zhuyezi Sun
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Usman Ghani
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Nasser Hashemi
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Department of Chemistry, Boston University, Boston, Massachusetts, USA
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Mireia Rosell
- Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de la Rioja - Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Luis A Rodríguez-Lumbreras
- Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de la Rioja - Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Juan Fernandez-Recio
- Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de la Rioja - Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | | | - Sergei Grudinin
- Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France
| | - Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Hao Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
| | - Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Tsukasa Nakamura
- Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan
| | - Yuya Hanazono
- Institute for Quantum Life Science, National Institutes for Quantum and Radiological Science and Technology, Tokai, Ibaraki, Japan
| | - Ragul Gowthaman
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Johnathan D Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Ghazaleh Taherzadeh
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | | | - Zhen Cao
- King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Luigi Cavallo
- King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Romina Oliva
- University of Naples "Parthenope", Napoli, Italy
| | - Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA
| | - Shaowen Zhu
- Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA
| | - Taeyong Park
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Hyeonuk Woo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Jinsol Yang
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Sohee Kwon
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Jonghun Won
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Yasuomi Kiyota
- School of Pharmacy, Kitasato University, Minato-ku, Tokyo, Japan
| | | | - Yoshiki Harada
- School of Pharmacy, Kitasato University, Minato-ku, Tokyo, Japan
| | | | - Petras J Kundrotas
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
| | - Amar Singh
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
| | - Ilya A Vakser
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
| | - Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Liming Qiu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Shuang Zhang
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Xiaoqin Zou
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Department of Biochemistry, University of Missouri, Columbia, Missouri, USA
|
28
|
Abstract
The biological significance of proteins has drawn the scientific community to explore their characteristics. Such studies shed light on the interaction patterns and functions of proteins in living organisms. Because reliable experimental techniques are difficult to apply in practice, computational methods have been introduced for interaction prediction. Automated methods reduce these difficulties but cannot yet replace experimental studies, as the field is still evolving. Interaction prediction is a critical problem that demands highly accurate results, yet no existing method offers performance on par with experimental results. This article assesses the existing computational docking algorithms, their challenges, and their future scope. Blind docking techniques are helpful when no information other than the individual structures is available. As more complex structures are added to databases, information-driven approaches can be a good alternative. Artificial intelligence, which already dominates many major fields, is expected to take over this domain very shortly.
|
29
|
Xie J, Zheng J, Hong X, Tong X, Liu X, Song Q, Liu S, Liu S. Protein-DNA complex structure modeling based on structural template. Biochem Biophys Res Commun 2021; 577:152-157. [PMID: 34517213 DOI: 10.1016/j.bbrc.2021.09.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 10/20/2022]
Abstract
DNA binding is an important feature of proteins, and protein-DNA interaction is involved in many life processes. Various computational methods have been developed to predict protein-DNA complex structures because such structures are difficult to obtain experimentally. However, prediction of protein-DNA complexes remains challenging compared with prediction of protein-RNA complexes, which may be due to the large conformational changes between the bound and unbound structures of both protein and DNA. We extend PRIME 2.0 to PRIME 2.0.1 to model protein-DNA complex structures. By comparing sequence and structure alignment methods, we found that structure-based methods can find more templates than sequence-based methods. The results of all-to-all structure alignments showed that DNA structure plays an important role in the prediction of protein-DNA complex structures. By exploring the relationship between sequence and structure, we found that in protein-DNA interactions, numerous structures with dissimilar sequences have similar 3D structures and perform similar functions.
Affiliation(s)
- Juan Xie
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Jinfang Zheng
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Xu Hong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Xiaoxue Tong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Xudong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Qi Song
- Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, China
| | - Sen Liu
- Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, China
| | - Shiyong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China.
|
30
|
Soltanikazemi E, Quadir F, Roy RS, Guo Z, Cheng J. Distance-based reconstruction of protein quaternary structures from inter-chain contacts. Proteins 2021; 90:720-731. [PMID: 34716620 PMCID: PMC8816881 DOI: 10.1002/prot.26269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 09/25/2021] [Accepted: 10/12/2021] [Indexed: 12/21/2022]
Abstract
Predicting the quaternary structure of a protein complex is an important problem. Inter‐chain residue‐residue contact prediction can provide useful information to guide the ab initio reconstruction of quaternary structures. However, few methods have been developed to build quaternary structures from predicted inter‐chain contacts. Here, we develop the first method based on gradient descent optimization (GD) to build quaternary structures of protein dimers utilizing inter‐chain contacts as distance restraints. We evaluate GD on several datasets of homodimers and heterodimers using true/predicted contacts and monomer structures as input. GD consistently performs better than both simulated annealing and Markov Chain Monte Carlo simulation. Starting from an arbitrary quaternary structure randomly initialized from the tertiary structures of protein chains and using true inter‐chain contacts as input, GD can reconstruct high‐quality structural models for homodimers and heterodimers with average TM‐score ranging from 0.92 to 0.99 and average interface root mean square distance from 0.72 Å to 1.64 Å. On a dataset of 115 homodimers, using predicted inter‐chain contacts as restraints, the average TM‐score of the structural models built by GD is 0.76. For 46% of the homodimers, high‐quality structural models with TM‐score ≥ 0.9 are reconstructed from predicted contacts. There is a strong correlation between the quality of the reconstructed models and the precision and recall of predicted contacts. Only a moderate precision or recall of inter‐chain contact prediction is needed to build good structural models for most homodimers. Moreover, GD improves the quality of quaternary structures predicted by AlphaFold2 on a Critical Assessment of Techniques for Protein Structure Prediction–Critical Assessment of Predicted Interactions dataset.
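The idea of using inter‐chain contacts as distance restraints in a gradient descent search can be illustrated with a deliberately reduced sketch. It is not the authors' implementation: it optimizes only a rigid translation of one chain (rotations are omitted), uses a squared-hinge penalty, and the 8 Å cutoff, learning rate, step count, and all function names are assumptions chosen for illustration.

```python
import math

def restraint_loss_and_grad(chain_a, chain_b, contacts, shift, cutoff=8.0):
    """Squared-hinge penalty on contact pairs farther apart than `cutoff` (assumed, in Å).

    chain_a, chain_b: lists of (x, y, z) residue coordinates.
    contacts: list of (i, j) index pairs, residue i of A with residue j of B.
    shift: translation currently applied to every residue of chain B.
    Returns the loss and its gradient with respect to `shift`.
    """
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for i, j in contacts:
        # Vector from residue i of chain A to the shifted residue j of chain B.
        diff = [chain_b[j][k] + shift[k] - chain_a[i][k] for k in range(3)]
        dist = math.sqrt(sum(c * c for c in diff))
        excess = dist - cutoff
        if excess > 0.0:  # only violated restraints contribute
            loss += excess * excess
            for k in range(3):
                grad[k] += 2.0 * excess * diff[k] / dist
    return loss, grad

def dock_by_gradient_descent(chain_a, chain_b, contacts, lr=0.1, steps=500):
    """Plain gradient descent on the translation of chain B (hypothetical helper)."""
    shift = [0.0, 0.0, 0.0]
    for _ in range(steps):
        _, grad = restraint_loss_and_grad(chain_a, chain_b, contacts, shift)
        shift = [s - lr * g for s, g in zip(shift, grad)]
    loss, _ = restraint_loss_and_grad(chain_a, chain_b, contacts, shift)
    return shift, loss

if __name__ == "__main__":
    # Toy two-residue chains placed far apart; both contact pairs start violated.
    a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    b = [(30.0, 5.0, -3.0), (33.0, 6.0, -2.0)]
    shift, loss = dock_by_gradient_descent(a, b, [(0, 0), (1, 1)])
    print("translation:", shift, "residual loss:", loss)
```

A fuller treatment would also optimize the rotation of the moving chain and add terms that penalize steric clashes; those are left out here to keep the restraint-driven update visible.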
Affiliation(s)
- Elham Soltanikazemi
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Zhiye Guo
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
|
31
|
Gaber A, Pavšič M. Modeling and Structure Determination of Homo-Oligomeric Proteins: An Overview of Challenges and Current Approaches. Int J Mol Sci 2021; 22:9081. [PMID: 34445785 PMCID: PMC8396596 DOI: 10.3390/ijms22169081] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/30/2021] [Revised: 08/20/2021] [Accepted: 08/20/2021] [Indexed: 12/12/2022] Open
Abstract
Protein homo-oligomerization is a very common phenomenon, and approximately half of proteins form homo-oligomeric assemblies composed of identical subunits. The vast majority of such assemblies possess internal symmetry, which can either be exploited to help or pose challenges during structure determination. Moreover, aspects of symmetry are critical in the modeling of protein homo-oligomers either by docking or by homology-based approaches. Here, we first provide a brief overview of the nature of protein homo-oligomerization. Next, we describe how the symmetry of homo-oligomers is addressed by crystallographic and non-crystallographic symmetry operations, and how biologically relevant intermolecular interactions can be deciphered from the ordered array of molecules within protein crystals. Additionally, we describe the most important aspects of protein homo-oligomerization in structure determination by NMR. Finally, we give an overview of approaches aimed at modeling homo-oligomers using computational methods that specifically address their internal symmetry and allow the incorporation of other experimental data as spatial restraints to achieve higher model reliability.
|
32
|
Quadir F, Roy RS, Soltanikazemi E, Cheng J. DeepComplex: A Web Server of Predicting Protein Complex Structures by Deep Learning Inter-chain Contact Prediction and Distance-Based Modelling. Front Mol Biosci 2021; 8:716973. [PMID: 34497831 PMCID: PMC8419425 DOI: 10.3389/fmolb.2021.716973] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/29/2021] [Accepted: 08/12/2021] [Indexed: 11/13/2022] Open
Abstract
Proteins interact to form complexes. Predicting the quaternary structure of protein complexes is useful for protein function analysis, protein engineering, and drug design. However, few user-friendly tools leveraging the latest deep learning technology for inter-chain contact prediction and the distance-based modelling to predict protein quaternary structures are available. To address this gap, we develop DeepComplex, a web server for predicting structures of dimeric protein complexes. It uses deep learning to predict inter-chain contacts in a homodimer or heterodimer. The predicted contacts are then used to construct a quaternary structure of the dimer by the distance-based modelling, which can be interactively viewed and analysed. The web server is freely accessible and requires no registration. It can be easily used by providing a job name and an email address along with the tertiary structure for one chain of a homodimer or two chains of a heterodimer. The output webpage provides the multiple sequence alignment, predicted inter-chain residue-residue contact map, and predicted quaternary structure of the dimer. DeepComplex web server is freely available at http://tulip.rnet.missouri.edu/deepcomplex/web_index.html.
Affiliation(s)
| | | | | | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, United States
|
33
|
Gao F, Glaser J, Glotzer SC. The role of complementary shape in protein dimerization. Soft Matter 2021; 17:7376-7383. [PMID: 34304260 DOI: 10.1039/d1sm00468a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Shape guides colloidal nanoparticles to form complex assemblies, but its role in defining interfaces in biomolecular complexes is less clear. In this work, we isolate the role of shape in protein complexes by studying the reversible binding processes of 46 protein dimer pairs, and investigate when entropic effects from shape complementarity alone are sufficient to predict the native protein binding interface. We employ depletants using a generic, implicit depletion model to amplify the magnitude of the entropic forces arising from lock-and-key binding and isolate the effect of shape complementarity in protein dimerization. For 13% of the complexes studied here, protein shape is sufficient to predict native complexes as equilibrium assemblies. We elucidate the results by analyzing the importance of competing binding configurations and how it affects the assembly. A machine learning classifier, with a precision of 89.14% and a recall of 77.11%, is able to identify the cases where shape alone predicts the native protein interface.
Affiliation(s)
- Fengyi Gao
- Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA.
|
34
|
Quignot C, Postic G, Bret H, Rey J, Granger P, Murail S, Chacón P, Andreani J, Tufféry P, Guerois R. InterEvDock3: a combined template-based and free docking server with increased performance through explicit modeling of complex homologs and integration of covariation-based contact maps. Nucleic Acids Res 2021; 49:W277-W284. [PMID: 33978743 PMCID: PMC8265070 DOI: 10.1093/nar/gkab358] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/13/2021] [Revised: 04/09/2021] [Accepted: 04/23/2021] [Indexed: 12/19/2022] Open
Abstract
The InterEvDock3 protein docking server exploits the constraints of evolution by multiple means to generate structural models of protein assemblies. The server takes as input either several sequences or 3D structures of proteins known to interact. It returns a set of 10 consensus candidate complexes, together with interface predictions to guide further experimental validation interactively. Three key novelties were implemented in InterEvDock3 to help obtain more reliable models: users can (i) generate template-based structural models of assemblies using close and remote homologs of known 3D structure, detected through an automated search protocol, (ii) select the assembly models most consistent with contact maps from external methods that implement covariation-based contact prediction with or without deep learning and (iii) exploit a novel coevolution-based scoring scheme at atomic level, which leads to significantly higher free docking success rates. The performance of the server was validated on two large free docking benchmark databases, containing respectively 230 unbound targets (Weng dataset) and 812 models of unbound targets (PPI4DOCK dataset). Its effectiveness has also been proven on a number of challenging examples. The InterEvDock3 web interface is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock3/.
Affiliation(s)
- Chloé Quignot
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Guillaume Postic
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Hélène Bret
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Julien Rey
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Pierre Granger
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Samuel Murail
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Pablo Chacón
- Department of Biological Physical Chemistry, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
| | - Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Pierre Tufféry
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Raphaël Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
|
35
|
Kurcinski M, Kmiecik S, Zalewski M, Kolinski A. Protein-Protein Docking with Large-Scale Backbone Flexibility Using Coarse-Grained Monte-Carlo Simulations. Int J Mol Sci 2021; 22:7341. [PMID: 34298961 PMCID: PMC8306105 DOI: 10.3390/ijms22147341] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/19/2021] [Revised: 07/03/2021] [Accepted: 07/04/2021] [Indexed: 12/21/2022] Open
Abstract
Most protein–protein docking methods treat proteins as almost rigid objects, and usually only side-chain flexibility is taken into account. The few approaches that enable docking with a flexible backbone typically work in two steps, in which the search for protein–protein orientations and structure flexibility are simulated separately. In this work, we propose a new straightforward approach for docking sampling. It consists of a single simulation step during which one protein undergoes large-scale backbone rearrangements, rotations, and translations. Simultaneously, the other protein exhibits small backbone fluctuations. Such extensive sampling was possible at a reasonable computational cost using the CABS coarse-grained protein model and Replica Exchange Monte Carlo dynamics. In our proof-of-concept simulations of 62 protein–protein complexes, we obtained acceptable quality models for a significant number of cases.
|
36
|
Dapkūnas J, Olechnovič K, Venclovas Č. Modeling of protein complexes in CASP14 with emphasis on the interaction interface prediction. Proteins 2021; 89:1834-1843. [PMID: 34176161 PMCID: PMC9292421 DOI: 10.1002/prot.26167] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/01/2021] [Revised: 06/21/2021] [Accepted: 06/23/2021] [Indexed: 01/08/2023]
Abstract
The goal of CASP experiments is to monitor the progress in the protein structure prediction field. During the 14th CASP edition we aimed to test our capabilities of predicting structures of protein complexes. Our protocol for modeling protein assemblies included both template‐based modeling and free docking. Structural templates were identified using sensitive sequence‐based searches. If sequence‐based searches failed, we performed structure‐based template searches using selected CASP server models. In the absence of reliable templates we applied free docking starting from monomers generated by CASP servers. We evaluated and ranked models of protein complexes using an improved version of our protein structure quality assessment method, VoroMQA, taking into account both interaction interface and global structure scores. If reliable templates could be identified, generally accurate models of protein assemblies were generated with the exception of an antibody‐antigen interaction. The success of free docking mainly depended on the accuracy of initial subunit models and on the scoring of docking solutions. To put our overall results in perspective, we analyzed our performance in the context of other CASP groups. Although the subunits in our assembly models often were not of the top quality, these models had, overall, the best‐predicted intersubunit interfaces according to several accuracy measures. We attribute our relative success primarily to the emphasis on the interaction interface when modeling and scoring.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
|
37
|
Prévost C, Sacquin-Mora S. Moving pictures: Reassessing docking experiments with a dynamic view of protein interfaces. Proteins 2021; 89:1315-1323. [PMID: 34038009 DOI: 10.1002/prot.26152] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/12/2021] [Revised: 03/22/2021] [Accepted: 05/19/2021] [Indexed: 11/06/2022]
Abstract
The modeling of protein assemblies at the atomic level remains a central issue in structural biology, as protein interactions play a key role in numerous cellular processes. This problem is traditionally addressed using docking tools, where the quality of the models is based on their similarity to a single reference experimental structure. However, using a static reference does not take into account the dynamic nature of the protein interface. Here, we used all-atom classical Molecular Dynamics (MD) simulations to investigate the stability of the reference interface for three complexes that previously served as targets in the CAPRI competition. For each of these targets, we also ran MD simulations for ten models distributed over the High, Medium and Acceptable accuracy categories. To assess the quality of these models from a dynamic perspective, we set up new criteria which take into account the stability of the reference experimental protein interface. We show that, when the protein interfaces are allowed to evolve over time, the original ranking based on the static CAPRI criteria no longer holds, as over 50% of the docking models undergo a category change (toward either a better or a lower accuracy group) when their quality is reassessed using dynamic information.
Affiliation(s)
- Chantal Prévost
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
| | - Sophie Sacquin-Mora
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
|
38
|
Sulimov VB, Kutov DC, Taschilova AS, Ilin IS, Tyrtyshnikov EE, Sulimov AV. Docking Paradigm in Drug Design. Curr Top Med Chem 2021; 21:507-546. [PMID: 33292135 DOI: 10.2174/1568026620666201207095626] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/01/2020] [Revised: 09/28/2020] [Accepted: 10/16/2020] [Indexed: 11/22/2022]
Abstract
Docking is in demand for rational computer-aided structure-based drug design. A review of docking methods and programs is presented. Different types of docking programs are described. They include docking of non-covalent small ligands, protein-protein docking, supercomputer docking, quantum docking, the new generation of docking programs and the application of docking to the discovery of covalent inhibitors. Taking into account the threat of COVID-19, we present here a short review of docking applications to the discovery of inhibitors of SARS-CoV and SARS-CoV-2 target proteins, including our own result of the search for inhibitors of SARS-CoV-2 main protease using docking and quantum chemical post-processing. We conclude that docking is extremely important in the fight against COVID-19 during the development of antiviral drugs that act directly on SARS-CoV-2 target proteins.
Affiliation(s)
- Vladimir B Sulimov
- Research Computer Center of Lomonosov Moscow State University, Moscow, Russian Federation
| | - Danil C Kutov
- Research Computer Center of Lomonosov Moscow State University, Moscow, Russian Federation
| | - Anna S Taschilova
- Research Computer Center of Lomonosov Moscow State University, Moscow, Russian Federation
| | - Ivan S Ilin
- Research Computer Center of Lomonosov Moscow State University, Moscow, Russian Federation
| | - Eugene E Tyrtyshnikov
- Institute of Numerical Mathematics of Russian Academy of Sciences, Moscow, Russian Federation
| | - Alexey V Sulimov
- Research Computer Center of Lomonosov Moscow State University, Moscow, Russian Federation
|
39
|
Abstract
Biologists are increasingly aware of the importance of protein structure in revealing function. The computational tools now exist which allow researchers to model unknown proteins simply on the basis of their primary sequence. However, for the non-specialist bioinformatician, there is a dazzling array of terminology, acronyms, and competing computer software available for this process. This review is intended to highlight the key stages of computational protein structure prediction, as well as explain the reasons behind some of the procedures and list some established workarounds for common pitfalls. Thereafter follows a review of five one-stop servers for start-to-finish structure prediction.
|
40
|
Barradas-Bautista D, Cao Z, Cavallo L, Oliva R. The CASP13-CAPRI targets as case studies to illustrate a novel scoring pipeline integrating CONSRANK with clustering and interface analyses. BMC Bioinformatics 2020; 21:262. [PMID: 32938371 PMCID: PMC7493188 DOI: 10.1186/s12859-020-03600-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/07/2020] [Accepted: 06/10/2020] [Indexed: 08/27/2023] Open
Abstract
Background: Properly scoring protein-protein docking models to single out the correct ones is an open challenge, and is also an object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), a community-wide blind docking experiment. We introduced CONSRANK (CONSensus RANKing), the first pure consensus method, to the field. Also available as a web server, CONSRANK ranks docking models in an ensemble based on their ability to match the most frequent inter-residue contacts in it. We have been blindly testing CONSRANK in all the latest CAPRI rounds, where we showed it to perform competitively with the state-of-the-art energy and knowledge-based scoring functions. More recently, we developed Clust-CONSRANK, an algorithm introducing a contact-based clustering of the models as a preliminary step of the CONSRANK scoring process. In the latest CASP13-CAPRI joint experiment, we participated as scorers with a novel pipeline, combining both our scoring tools, CONSRANK and Clust-CONSRANK, with our interface analysis tool COCOMAPS. Selection of the 10 models for submission was guided by the strength of the emerging consensus, and their final ranking was assisted by results of the interface analysis.
Results: As a result of the above approach, we were by far the first scorer in the CASP13-CAPRI top-1 ranking, having high/medium quality models ranked at the top-1 position for the majority of targets (11 out of the total 19). We were also the first scorer in the top-10 ranking, on a par with another group, and the second scorer in the top-5 ranking. Further, we topped the ranking relative to the prediction of binding interfaces, among all the scorers and predictors. Using the CASP13-CAPRI targets as case studies, we illustrate here in detail the approach we adopted.
Conclusions: Introducing some flexibility in the final model selection and ranking, as well as differentiating the adopted scoring approach depending on the targets, were the key assets for our highly successful performance, as compared to previous CAPRI rounds. The approach we propose is entirely based on methods made available to the community and could thus be reproduced by any user.
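The consensus principle described in this abstract, ranking each model by how well it matches the most frequent inter-residue contacts in the ensemble, can be sketched in a few lines. This is an illustrative simplification and not the CONSRANK code: the scoring rule (mean ensemble frequency of a model's contacts), the function name, and the toy contact labels are all assumptions.

```python
from collections import Counter

def consensus_rank(models):
    """Rank docking models by the mean ensemble frequency of their contacts.

    models: dict mapping a model name to a set of inter-residue contacts,
            each contact being a (residue_in_chain_A, residue_in_chain_B) pair.
    Returns a list of (name, score) pairs, best score first.
    """
    n_models = len(models)
    # Count in how many models each contact appears.
    counts = Counter(c for contacts in models.values() for c in contacts)
    freq = {c: n / n_models for c, n in counts.items()}
    scores = {}
    for name, contacts in models.items():
        # Assumed scoring rule: average frequency over the model's own contacts.
        scores[name] = sum(freq[c] for c in contacts) / len(contacts) if contacts else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical ensemble of three models with made-up contact labels.
    ensemble = {
        "m1": {("A10", "B5"), ("A11", "B7")},
        "m2": {("A10", "B5"), ("A11", "B7"), ("A12", "B9")},
        "m3": {("A40", "B80")},
    }
    for name, score in consensus_rank(ensemble):
        print(name, round(score, 3))
```

In this toy ensemble, m1 ranks first because both of its contacts are shared with another model, while m3, whose single contact appears nowhere else, ranks last. No energy function or reference structure is involved, which is what makes the approach a pure consensus method.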
|
41
|
Desta IT, Porter KA, Xia B, Kozakov D, Vajda S. Performance and Its Limits in Rigid Body Protein-Protein Docking. Structure 2020; 28:1071-1081.e3. [PMID: 32649857 DOI: 10.1016/j.str.2020.06.006] [Citation(s) in RCA: 274] [Impact Index Per Article: 68.5] [Received: 01/14/2020] [Revised: 04/19/2020] [Accepted: 06/19/2020] [Indexed: 12/13/2022]
Abstract
The development of fast Fourier transform (FFT) algorithms enabled the sampling of billions of complex conformations and thus revolutionized protein-protein docking. FFT-based methods are now widely available and have been used in hundreds of thousands of docking calculations. Although the methods perform "soft" docking, which allows for some overlap of component proteins, the rigid body assumption clearly introduces limitations on accuracy and reliability. In addition, the method can work only with energy expressions represented by sums of correlation functions. In this paper we use a well-established protein-protein docking benchmark set to evaluate the results of these limitations by focusing on the performance of the docking server ClusPro, which implements one of the best rigid body methods. Furthermore, we explore the theoretical limits of accuracy when using established energy terms for scoring, provide comparison with flexible docking algorithms, and review the historical performance of servers in the CAPRI docking experiment.
Affiliation(s)
- Israel T Desta
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Kathryn A Porter
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Bing Xia
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794, USA
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA.
|
42
|
Duan R, Qiu L, Xu X, Ma Z, Merideth BR, Shyu CR, Zou X. Performance of human and server prediction in CAPRI rounds 38-45. Proteins 2020; 88:1110-1120. [PMID: 32483825 DOI: 10.1002/prot.25956] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/13/2020] [Revised: 03/26/2020] [Accepted: 05/27/2020] [Indexed: 11/11/2022]
Abstract
CAPRI challenges offer a variety of blind tests for protein-protein interaction prediction. In CAPRI Rounds 38-45, we generated a set of putative binding modes for each target with an FFT-based docking algorithm, and then scored and ranked these binding modes with a proprietary scoring function, ITScorePP. We have also developed a novel web server, Rebipp. The algorithm utilizes information retrieval to identify relevant biological information to significantly reduce the search space for a particular protein. In parallel, we have also constructed a GPU-based docking server, MDockPP, for protein-protein complex structure prediction. Here, the performance of our protocol in CAPRI rounds 38-45 is reported, which include 16 docking and scoring targets. Among them, three targets contain multiple interfaces: Targets 124, 125, and 136 have 2, 4, and 3 interfaces, respectively. In the predictor experiments, we predicted correct binding modes for nine targets, including one high-accuracy interface, six medium-accuracy binding modes, and six acceptable-accuracy binding modes. For the docking server prediction experiments, we predicted correct binding modes for eight targets, including one high-accuracy, three medium-accuracy, and five acceptable-accuracy binding modes.
Affiliation(s)
- Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Liming Qiu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Zhiwei Ma
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA.,Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
| | - Benjamin Ryan Merideth
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA.,Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
| | - Chi-Ren Shyu
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.,Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA.,Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA.,Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.,Department of Biochemistry, University of Missouri, Columbia, Missouri, USA
|
43
|
Yan Y, Tao H, He J, Huang SY. The HDOCK server for integrated protein–protein docking. Nat Protoc 2020; 15:1829-1852. [DOI: 10.1038/s41596-020-0312-x] [Citation(s) in RCA: 288] [Impact Index Per Article: 72.0] [Received: 09/01/2019] [Accepted: 02/03/2020] [Indexed: 12/27/2022]
|
44
|
Dudko HV, Urban VA, Davidovskii AI, Veresov VG. Structure-based modeling of turnover of Bcl-2 family proteins bound to voltage-dependent anion channel 2 (VDAC2): Implications for the mechanisms of proapoptotic activation of Bak and Bax in vivo. Comput Biol Chem 2020; 85:107203. [DOI: 10.1016/j.compbiolchem.2020.107203] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/31/2019] [Revised: 12/31/2019] [Accepted: 01/13/2020] [Indexed: 12/15/2022]
|
45
|
Kong R, Liu R, Xu X, Zhang D, Xu X, Shi H, Chang S. Template‐based modeling and ab‐initio docking using CoDock in CAPRI. Proteins 2020; 88:1100-1109. [DOI: 10.1002/prot.25892] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/31/2019] [Revised: 12/21/2019] [Accepted: 03/07/2020] [Indexed: 01/11/2023]
Affiliation(s)
- Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Ran‐Ran Liu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xi‐Ming Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Innovation Center for Marine Drug Screening and Evaluation, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
| | - Da‐Wei Zhang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xiao‐Shuang Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Hang Shi
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Innovation Center for Marine Drug Screening and Evaluation, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
|
46
|
Padhorny D, Porter KA, Ignatov M, Alekseenko A, Beglov D, Kotelnikov S, Ashizawa R, Desta I, Alam N, Sun Z, Brini E, Dill K, Schueler-Furman O, Vajda S, Kozakov D. ClusPro in rounds 38 to 45 of CAPRI: Toward combining template-based methods with free docking. Proteins 2020; 88:1082-1090. [PMID: 32142178 DOI: 10.1002/prot.25887] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/06/2019] [Revised: 02/27/2020] [Accepted: 03/04/2020] [Indexed: 01/01/2023]
Abstract
Targets in the protein docking experiment CAPRI (Critical Assessment of Predicted Interactions) generally present new challenges and contribute to new developments in methodology. In rounds 38 to 45 of CAPRI, most targets could be effectively predicted using template-based methods. However, the server ClusPro required structures rather than sequences as input, and hence we had to generate and dock homology models. The available templates also provided distance restraints that were directly used as input to the server. We show here that such an approach has some advantages. Free docking with template-based restraints using ClusPro reproduced some interfaces suggested by weak or ambiguous templates while not reproducing others, resulting in correct server predicted models. More recently we developed the fully automated ClusPro TBM server that performs template-based modeling and thus can use sequences rather than structures of component proteins as input. The performance of the server, freely available for noncommercial use at https://tbm.cluspro.org, is demonstrated by predicting the protein-protein targets of rounds 38 to 45 of CAPRI.
Affiliation(s)
- Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Kathryn A Porter
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Mikhail Ignatov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Andrey Alekseenko
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.,Institute of Computer Aided Design of the Russian Academy of Sciences, Moscow, Russia
| | - Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA.,Acpharis Inc., Holliston, Massachusetts, USA
| | - Sergei Kotelnikov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.,Innopolis University, Innopolis, Russia
| | - Ryota Ashizawa
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Israel Desta
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Nawsad Alam
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Zhuyezi Sun
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Ken Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.,Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York, USA.,Department of Chemistry, Stony Brook University, Stony Brook, New York, USA
| | - Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA.,Department of Chemistry, Boston University, Boston, Massachusetts, USA
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
|
47
|
Lensink MF, Nadzirin N, Velankar S, Wodak SJ. Modeling protein‐protein, protein‐peptide, and protein‐oligosaccharide complexes: CAPRI 7th edition. Proteins 2020; 88:916-938. [DOI: 10.1002/prot.25870] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 11/13/2019] [Revised: 12/19/2019] [Accepted: 12/26/2019] [Indexed: 12/19/2022]
Affiliation(s)
- Marc F. Lensink
- University of Lille, CNRS UMR8576 UGSF, Unité de Glycobiologie Structurale et Fonctionnelle, F‐59000 Lille, France
| | - Nurul Nadzirin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Wellcome Trust Genome Campus, Cambridge, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Wellcome Trust Genome Campus, Cambridge, UK
|
48
|
Jing X, Zeng H, Wang S, Xu J. A Web-Based Protocol for Interprotein Contact Prediction by Deep Learning. Methods Mol Biol 2020; 2074:67-80. [PMID: 31583631 DOI: 10.1007/978-1-4939-9873-9_6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 06/10/2023]
Abstract
Identifying residue-residue contacts in protein-protein interactions or complex is crucial for understanding protein and cell functions. DCA (direct-coupling analysis) methods shed some light on this, but they need many sequence homologs to yield accurate prediction. Inspired by the success of our deep-learning method for intraprotein contact prediction, we have developed RaptorX-ComplexContact, a web server for interprotein residue-residue contact prediction. Given a pair of interacting protein sequences, RaptorX-ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA) based on genomic distance and phylogeny information, respectively. Then, RaptorX-ComplexContact uses two deep convolutional residual neural networks (ResNet) to predict interprotein contacts from sequential features and coevolution information of paired MSAs. RaptorX-ComplexContact shall be useful for protein docking, protein-protein interaction prediction, and protein interaction network construction.
Affiliation(s)
- Xiaoyang Jing
- Toyota Technological Institute at Chicago, Chicago, IL, USA
- School of Computer Science, Fudan University, Shanghai, China
| | - Hong Zeng
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
| | - Sheng Wang
- Toyota Technological Institute at Chicago, Chicago, IL, USA
| | - Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, IL, USA
|
49
|
Abstract
Macromolecular complexes play a key role in cellular function. Predicting the structure and dynamics of these complexes is one of the key challenges in structural biology. Docking applications have traditionally been used to predict pairwise interactions between proteins. However, few methods exist for modeling multi-protein assemblies. Here we present two methods, CombDock and DockStar, that can predict multi-protein assemblies starting from subunit structural models. CombDock can assemble subunits without any assumptions about the pairwise interactions between subunits, while DockStar relies on the interaction graph or, alternatively, a homology model or a cryo-electron microscopy (EM) density map of the entire complex. We demonstrate the two methods using RNA polymerase II with 12 subunits and TRiC/CCT chaperonin with 16 subunits.
Affiliation(s)
- Dina Schneidman-Duhovny
- School of Computer Science and Engineering and the Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Haim J Wolfson
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
50
|
Kotelnikov S, Alekseenko A, Liu C, Ignatov M, Padhorny D, Brini E, Lukin M, Coutsias E, Dill KA, Kozakov D. Sampling and refinement protocols for template-based macrocycle docking: 2018 D3R Grand Challenge 4. J Comput Aided Mol Des 2019; 34:179-189. [PMID: 31879831 DOI: 10.1007/s10822-019-00257-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/12/2019] [Accepted: 11/19/2019] [Indexed: 12/25/2022]
Abstract
We describe a new template-based method for docking flexible ligands such as macrocycles to proteins. It combines Monte-Carlo energy minimization on the manifold, a fast manifold search method, with BRIKARD for complex flexible ligand searching, and with the MELD accelerator of Replica-Exchange Molecular Dynamics simulations for atomistic degrees of freedom. Here we test the method in the Drug Design Data Resource blind Grand Challenge competition. This method was among the best performers in the competition, giving sub-angstrom prediction quality for the majority of the targets.
Affiliation(s)
- Sergei Kotelnikov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA.,Innopolis University, Innopolis, Russia
| | - Andrey Alekseenko
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
| | - Cong Liu
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Chemistry, Stony Brook University, Stony Brook, NY, USA
| | - Mikhail Ignatov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA.,Institute for Advanced Computational Sciences, Stony Brook University, Stony Brook, NY, USA
| | - Dzmitry Padhorny
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
| | - Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
| | - Mark Lukin
- Department of Pharmacological Sciences, Stony Brook University, Stony Brook, NY, USA
| | - Evangelos Coutsias
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
| | - Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Chemistry, Stony Brook University, Stony Brook, NY, USA.,Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA
| | - Dima Kozakov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA.,Institute for Advanced Computational Sciences, Stony Brook University, Stony Brook, NY, USA
|