1
Mondal A, Singh B, Felkner RH, De Falco A, Swapna G, Montelione GT, Roth MJ, Perez A. A Computational Pipeline for Accurate Prioritization of Protein-Protein Binding Candidates in High-Throughput Protein Libraries. Angew Chem Int Ed Engl 2024; 63:e202405767. [PMID: 38588243 DOI: 10.1002/anie.202405767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 04/05/2024] [Accepted: 04/08/2024] [Indexed: 04/10/2024]
Abstract
Identifying the interactome for a protein of interest is challenging due to the large number of possible binders. High-throughput experimental approaches narrow down possible binding partners but often include false positives. Furthermore, they provide no information about what the binding region is (e.g., the binding epitope). We introduce a novel computational pipeline based on an AlphaFold2 (AF) Competitive Binding Assay (AF-CBA) to identify proteins that bind a target of interest from a pull-down experiment and the binding epitope. Our focus is on proteins that bind the Extraterminal (ET) domain of Bromo and Extraterminal domain (BET) proteins, but we also introduce nine additional systems to show transferability to other peptide-protein systems. We describe a series of limitations to the methodology based on intrinsic deficiencies of AF and AF-CBA to help users identify scenarios where the approach will be most useful. Given the method's speed and accuracy, we anticipate its broad applicability to identify binding epitope regions among potential partners, setting the stage for experimental verification.
Affiliation(s)
- Arup Mondal
  - Department of Chemistry and Quantum Theory Project, University of Florida, Leigh Hall 240, Gainesville, FL, USA
- Bhumika Singh
  - Department of Chemistry and Quantum Theory Project, University of Florida, Leigh Hall 240, Gainesville, FL, USA
- Roland H Felkner
  - Department of Pharmacology, Rutgers-Robert Wood Johnson Medical School, 675 Hoes Lane Rm 636, Piscataway, NJ 08854, USA
- Anna De Falco
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
- GVT Swapna
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
- Gaetano T Montelione
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
- Monica J Roth
  - Department of Pharmacology, Rutgers-Robert Wood Johnson Medical School, 675 Hoes Lane Rm 636, Piscataway, NJ 08854, USA
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Leigh Hall 240, Gainesville, FL, USA
2
Rao J, Xie J, Yuan Q, Liu D, Wang Z, Lu Y, Zheng S, Yang Y. A variational expectation-maximization framework for balanced multi-scale learning of protein and drug interactions. Nat Commun 2024; 15:4476. [PMID: 38796523 DOI: 10.1038/s41467-024-48801-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/14/2024] [Indexed: 05/28/2024] Open
Abstract
Protein functions are characterized by interactions with proteins, drugs, and other biomolecules. Understanding these interactions is essential for deciphering the molecular mechanisms underlying biological processes and developing new therapeutic strategies. Current computational methods mostly predict interactions based on either molecular network or structural information, without integrating them within a unified multi-scale framework. While a few multi-view learning methods are devoted to fusing the multi-scale information, these methods tend to rely intensively on a single scale and under-fit the others, likely owing to the imbalanced nature and inherent greediness of multi-scale learning. To alleviate the optimization imbalance, we present MUSE, a multi-scale representation learning framework based on a variational expectation maximization to optimize different scales in an alternating procedure over multiple iterations. This strategy efficiently fuses multi-scale information between atomic structure and molecular network scale through mutual supervision and iterative optimization. MUSE outperforms the current state-of-the-art models not only in molecular interaction (protein-protein, drug-protein, and drug-drug) tasks but also in protein interface prediction at the atomic structure scale. More importantly, the multi-scale learning framework shows potential for extension to other scales of computational drug discovery.
Affiliation(s)
- Jiahua Rao
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Jiancong Xie
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Qianmu Yuan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Deqin Liu
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Zhen Wang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Yutong Lu
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Shuangjia Zheng
  - Global Institute of Future Technology, Shanghai Jiao Tong University, Shanghai, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
  - Key Laboratory of Machine Intelligence and Advanced Computing (MOE), Sun Yat-sen University, Guangzhou, China
  - State Key Laboratory of Oncology in South China, Sun Yat-sen University, Guangzhou, China
3
Zhao H, Petrey D, Murray D, Honig B. ZEPPI: Proteome-scale sequence-based evaluation of protein-protein interaction models. Proc Natl Acad Sci U S A 2024; 121:e2400260121. [PMID: 38743624 PMCID: PMC11127014 DOI: 10.1073/pnas.2400260121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 04/18/2024] [Indexed: 05/16/2024] Open
Abstract
We introduce ZEPPI (Z-score Evaluation of Protein-Protein Interfaces), a framework to evaluate structural models of a complex based on sequence coevolution and conservation involving residues in protein-protein interfaces. The ZEPPI score is calculated by comparing metrics for an interface to those obtained from randomly chosen residues. Since contacting residues are defined by the structural model, this obviates the need to account for indirect interactions. Further, although ZEPPI relies on species-paired multiple sequence alignments, its focus on interfacial residues allows it to leverage quite shallow alignments. ZEPPI can be implemented on a proteome-wide scale and is applied here to millions of structural models of dimeric complexes in the Escherichia coli and human interactomes found in the PrePPI database. PrePPI's scoring function is based primarily on the evaluation of protein-protein interfaces, and ZEPPI adds a new feature to this analysis through the incorporation of evolutionary information. ZEPPI performance is evaluated through applications to experimentally determined complexes and to decoys from the CASP-CAPRI experiment. As we discuss, the standard CAPRI scores used to evaluate docking models are based on model quality and not on the ability to give yes/no answers as to whether two proteins interact. ZEPPI is able to detect weak signals from PPI models that the CAPRI scores define as incorrect and, similarly, to identify potential PPIs defined as low confidence by the current PrePPI scoring function. A number of examples that illustrate how the combination of PrePPI and ZEPPI can yield functional hypotheses are provided.
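The Z-score idea in this abstract (comparing an interface metric with the same metric computed on randomly chosen residues) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `interface_z_score`, the per-residue `scores` list (standing in for something like a conservation value per residue), and the uniform random sampling scheme are all hypothetical. The abstract also mentions coevolution, which involves residue pairs and is not modeled here.

```python
import random
import statistics


def interface_z_score(scores, interface_idx, n_samples=1000, seed=0):
    """Z-score of the mean score over interface residues, relative to
    the means of randomly chosen residue sets of the same size.

    scores        -- hypothetical per-residue metric (e.g. conservation)
    interface_idx -- indices of residues the structural model puts in contact
    """
    rng = random.Random(seed)
    k = len(interface_idx)
    observed = statistics.fmean([scores[i] for i in interface_idx])

    # Null distribution: mean score of k residues drawn at random.
    population = range(len(scores))
    null_means = [
        statistics.fmean([scores[i] for i in rng.sample(population, k)])
        for _ in range(n_samples)
    ]

    mu = statistics.fmean(null_means)
    sigma = statistics.stdev(null_means)
    if sigma == 0:
        raise ValueError("null distribution has zero variance")
    return (observed - mu) / sigma


# Toy usage: five high-scoring residues form the modeled interface.
toy_scores = [0.1] * 95 + [0.9] * 5
z = interface_z_score(toy_scores, [95, 96, 97, 98, 99])
```

A large positive value indicates that the modeled interface scores well above what random residue sets achieve; a value near zero indicates no signal. The fixed seed only makes the sketch reproducible.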
Affiliation(s)
- Haiqing Zhao
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032
- Donald Petrey
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032
- Diana Murray
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032
- Barry Honig
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032
  - Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, NY 10032
  - Department of Medicine, Columbia University, New York, NY 10032
  - Zuckerman Institute, Columbia University, New York, NY 10027
4
Grassmann G, Miotto M, Desantis F, Di Rienzo L, Tartaglia GG, Pastore A, Ruocco G, Monti M, Milanetti E. Computational Approaches to Predict Protein-Protein Interactions in Crowded Cellular Environments. Chem Rev 2024; 124:3932-3977. [PMID: 38535831 PMCID: PMC11009965 DOI: 10.1021/acs.chemrev.3c00550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 04/11/2024]
Abstract
Investigating protein-protein interactions is crucial for understanding cellular biological processes because proteins often function within molecular complexes rather than in isolation. While experimental and computational methods have provided valuable insights into these interactions, they often overlook a critical factor: the crowded cellular environment. This environment significantly impacts protein behavior, including structural stability, diffusion, and ultimately the nature of binding. In this review, we discuss theoretical and computational approaches that allow the modeling of biological systems to guide and complement experiments and can thus significantly advance the investigation, and possibly the predictions, of protein-protein interactions in the crowded environment of cell cytoplasm. We explore topics such as statistical mechanics for lattice simulations, hydrodynamic interactions, diffusion processes in high-viscosity environments, and several methods based on molecular dynamics simulations. By synergistically leveraging methods from biophysics and computational biology, we review the state of the art of computational methods to study the impact of molecular crowding on protein-protein interactions and discuss its potential revolutionizing effects on the characterization of the human interactome.
Affiliation(s)
- Greta Grassmann
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Sapienza University of Rome, Rome 00185, Italy
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Mattia Miotto
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Fausta Desantis
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Gian Gaetano Tartaglia
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
  - Center for Human Technologies, Genoa 16152, Italy
- Annalisa Pastore
  - Experiment Division, European Synchrotron Radiation Facility, Grenoble 38043, France
- Giancarlo Ruocco
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - Department of Physics, Sapienza University, Rome 00185, Italy
- Michele Monti
  - RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Edoardo Milanetti
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - Department of Physics, Sapienza University, Rome 00185, Italy
5
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024] Open
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- Fuhao Zhang
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
  - College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
- Chaojin Wu
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
6
Xiong D, Qiu Y, Zhao J, Zhou Y, Lee D, Gupta S, Torres M, Lu W, Liang S, Kang JJ, Eng C, Loscalzo J, Cheng F, Yu H. Structurally-informed human interactome reveals proteome-wide perturbations by disease mutations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.04.24.538110. [PMID: 37162909 PMCID: PMC10168245 DOI: 10.1101/2023.04.24.538110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Human genome sequencing studies have identified numerous loci associated with complex diseases. However, translating human genetic and genomic findings to disease pathobiology and therapeutic discovery remains a major challenge at multiscale interactome network levels. Here, we present a deep-learning-based ensemble framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that accurately predicts protein binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms, generating comprehensive structurally-informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods. We further systematically validated PIONEER predictions experimentally through generating 2,395 mutations and testing their impact on 6,754 mutation-interaction pairs, confirming the high quality and validity of PIONEER predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces after mapping mutations from ~60,000 germline exomes and ~36,000 somatic genomes. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from pan-cancer analysis of ~11,000 tumor whole-exomes across 33 cancer types. We show that PIONEER-predicted oncoPPIs are significantly associated with patient survival and drug responses from both cancer cell lines and patient-derived xenograft mouse models. We identify a landscape of PPI-perturbing tumor alleles upon ubiquitination by E3 ligases, and we experimentally validate the tumorigenic KEAP1-NRF2 interface mutation p.Thr80Lys in non-small cell lung cancer. We show that PIONEER-predicted PPI-perturbing alleles alter protein abundance and correlate with drug responses and patient survival in colon and uterine cancers as demonstrated by proteogenomic data from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.
Affiliation(s)
- Dapeng Xiong
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Yunguang Qiu
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Junfei Zhao
  - Department of Systems Biology, Herbert Irving Comprehensive Center, Columbia University, New York, NY 10032, USA
- Yadi Zhou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Dongjin Lee
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Shobhita Gupta
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
  - Biophysics Program, Cornell University, Ithaca, NY 14853, USA
- Mateo Torres
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Weiqiang Lu
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
- Siqi Liang
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Jin Joo Kang
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Charis Eng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Joseph Loscalzo
  - Channing Division of Network Medicine, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
7
Mondal A, Singh B, Felkner RH, De Falco A, Swapna GVT, Montelione GT, Roth MJ, Perez A. Sifting Through the Noise: A Computational Pipeline for Accurate Prioritization of Protein-Protein Binding Candidates in High-Throughput Protein Libraries. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.20.576374. [PMID: 38328039 PMCID: PMC10849530 DOI: 10.1101/2024.01.20.576374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Identifying the interactome for a protein of interest is challenging due to the large number of possible binders. High-throughput experimental approaches narrow down possible binding partners, but often include false positives. Furthermore, they provide no information about what the binding region is (e.g. the binding epitope). We introduce a novel computational pipeline based on an AlphaFold2 (AF) Competition Assay (AF-CBA) to identify proteins that bind a target of interest from a pull-down experiment, along with the binding epitope. Our focus is on proteins that bind the Extraterminal (ET) domain of Bromo and Extraterminal domain (BET) proteins, but we also introduce nine additional systems to show transferability to other peptide-protein systems. We describe a series of limitations to the methodology based on intrinsic deficiencies to AF and AF-CBA, to help users identify scenarios where the approach will be most useful. Given the speed and accuracy of the methodology, we expect it to be generally applicable to facilitate target selection for experimental verification starting from high-throughput protein libraries.
Affiliation(s)
- Arup Mondal
  - Department of Chemistry and Quantum Theory Project, University of Florida, Leigh Hall 240, Gainesville, FL
- Bhumika Singh
  - Department of Chemistry and Quantum Theory Project, University of Florida, Leigh Hall 240, Gainesville, FL
- Roland H. Felkner
  - Department of Pharmacology, Rutgers-Robert Wood Johnson Medical School, 675 Hoes Lane Rm 636, Piscataway, NJ 08854
- Anna De Falco
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, United States
- GVT Swapna
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, United States
- Gaetano T. Montelione
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, United States
- Monica J. Roth
  - Department of Pharmacology, Rutgers-Robert Wood Johnson Medical School, 675 Hoes Lane Rm 636, Piscataway, NJ 08854
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Leigh Hall 240, Gainesville, FL
8
Michalik I, Kuder KJ. Machine Learning Methods in Protein-Protein Docking. Methods Mol Biol 2024; 2780:107-126. [PMID: 38987466 DOI: 10.1007/978-1-0716-3985-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
An exponential increase in the number of publications that address artificial intelligence (AI) usage in life sciences has been noticed in recent years, while new modeling techniques are constantly being reported. The potential of these methods is vast, from understanding fundamental cellular processes to discovering new drugs and breakthrough therapies. Computational studies of protein-protein interactions, crucial for understanding the operation of biological systems, are no exception in this field. However, despite the rapid development of technology and the progress in developing new approaches, many aspects remain challenging to solve, such as predicting conformational changes in proteins, or more "trivial" issues such as high-quality data in huge quantities. Therefore, this chapter focuses on a short introduction to various AI approaches to study protein-protein interactions, followed by a description of the most up-to-date algorithms and programs used for this purpose. Yet, given the considerable pace of development in this hot area of computational science, at the time you read this chapter, the development of the algorithms described, or the emergence of new (and better) ones should come as no surprise.
Affiliation(s)
- Ilona Michalik
  - Department of Technology and Biotechnology of Drugs, Faculty of Pharmacy, Jagiellonian University Medical College, Kraków, Poland
- Kamil J Kuder
  - Department of Technology and Biotechnology of Drugs, Faculty of Pharmacy, Jagiellonian University Medical College, Kraków, Poland
9
Wu F, Lin C, Han Y, Zhou D, Chen K, Yang M, Xiao Q, Zhang H, Li W. Multi-omic analysis characterizes molecular susceptibility of receptors to SARS-CoV-2 spike protein. Comput Struct Biotechnol J 2023; 21:5583-5600. [PMID: 38034398 PMCID: PMC10681948 DOI: 10.1016/j.csbj.2023.11.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 11/05/2023] [Accepted: 11/05/2023] [Indexed: 12/02/2023] Open
Abstract
In the post COVID-19 era, new SARS-CoV-2 variant strains may continue emerging and long COVID is poised to be another public health challenge. Deciphering the molecular susceptibility of receptors to SARS-CoV-2 spike protein is critical for understanding the immune responses in COVID-19 and the rationale of multi-organ injuries. Currently, such systematic exploration remains limited. Here, we conduct multi-omic analysis of protein binding affinities, transcriptomic expressions, and single-cell atlases to characterize the molecular susceptibility of receptors to SARS-CoV-2 spike protein. Initial affinity analysis explains the domination of delta and omicron variants and demonstrates the strongest affinities between BSG (CD147) receptor and most variants. Further transcriptomic data analysis on 4100 experimental samples and single-cell atlases of 1.4 million cells suggest the potential involvement of BSG in multi-organ injuries and long COVID, and explain the high prevalence of COVID-19 in elders as well as the different risks for patients with underlying diseases. Correlation analysis validated moderate associations between BSG and viral RNA abundance in multiple cell types. Moreover, similar patterns were observed in primates and validated in proteomic expressions. Overall, our findings implicate important therapeutic targets for the development of receptor-specific vaccines and drugs for COVID-19.
Affiliation(s)
- Fanjie Wu
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Chenghao Lin
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Yutong Han
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Dingli Zhou
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Kang Chen
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Minglei Yang
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Department of Pathology, First Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, China
- Qinyuan Xiao
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Haiyue Zhang
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- Weizhong Li
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Key Laboratory of Tropical Disease Control of Ministry of Education, Sun Yat-Sen University, Guangzhou 510080, China
  - Center for Precision Medicine, Sun Yat-sen University, Guangzhou 510080, China
10
Liu Z, Zhu YH, Shen LC, Xiao X, Qiu WR, Yu DJ. Integrating unsupervised language model with multi-view multiple sequence alignments for high-accuracy inter-chain contact prediction. Comput Biol Med 2023; 166:107529. [PMID: 37748220 DOI: 10.1016/j.compbiomed.2023.107529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 08/30/2023] [Accepted: 09/19/2023] [Indexed: 09/27/2023]
Abstract
Accurate identification of inter-chain contacts in the protein complex is critical to determine the corresponding 3D structures and understand the biological functions. We proposed a new deep learning method, ICCPred, to deduce the inter-chain contacts from the amino acid sequences of the protein complex. This pipeline was built on the designed deep residual network architecture, integrating the pre-trained language model with three multiple sequence alignments (MSAs) from different biological views. Experimental results on 709 non-redundant benchmarking protein complexes showed that the proposed ICCPred significantly increased inter-chain contact prediction accuracy compared to the state-of-the-art approaches. Detailed data analyses showed that the significant advantage of ICCPred lies in the utilization of pre-trained transformer language models which can effectively extract the complementary co-evolution diversity from three MSAs. Meanwhile, the designed deep residual network enhances the correlation between the co-evolution diversity and the patterns of inter-chain contacts. These results demonstrated a new avenue for high-accuracy deep-learning inter-chain contact prediction that is applicable to large-scale protein-protein interaction annotations from sequence alone.
Affiliation(s)
- Zi Liu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing, 210094, China
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Yi-Heng Zhu
  - College of Artificial Intelligence, Nanjing Agricultural University, Nanjing, 210095, China
- Long-Chen Shen
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing, 210094, China
- Xuan Xiao
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Wang-Ren Qiu
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing, 210094, China
11
|
Mou M, Pan Z, Zhou Z, Zheng L, Zhang H, Shi S, Li F, Sun X, Zhu F. A Transformer-Based Ensemble Framework for the Prediction of Protein-Protein Interaction Sites. RESEARCH (WASHINGTON, D.C.) 2023; 6:0240. [PMID: 37771850 PMCID: PMC10528219 DOI: 10.34133/research.0240] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 08/02/2023] [Accepted: 09/08/2023] [Indexed: 09/30/2023]
Abstract
The identification of protein-protein interaction (PPI) sites is essential in the research of protein function and the discovery of new drugs. So far, a variety of computational tools based on machine learning have been developed to accelerate the identification of PPI sites. However, existing methods suffer from low predictive accuracy or a limited scope of application. Specifically, some methods learn only global or local sequential features, leading to low predictive accuracy, while others achieve improved performance by extracting residue interactions from structures but are limited in their application scope by their heavy dependence on precise structure information. There is an urgent need to develop a method that integrates comprehensive information to realize proteome-wide accurate profiling of PPI sites. Herein, a novel ensemble framework for PPI site prediction, EnsemPPIS, is proposed based on transformer and gated convolutional networks. EnsemPPIS can effectively capture not only global and local patterns but also residue interactions. Specifically, EnsemPPIS is unique in (a) extracting residue interactions from protein sequences with a transformer and (b) further integrating global and local sequential features with an ensemble learning strategy. Compared with various existing methods, EnsemPPIS exhibited either superior performance or broader applicability on multiple PPI site prediction tasks. Moreover, pattern analysis based on the interpretability of EnsemPPIS demonstrated that EnsemPPIS was fully capable of learning residue interactions within the local structure of PPI sites using only sequence information. The web server of EnsemPPIS is freely available at http://idrblab.org/ensemppis.
Affiliation(s)
- Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Ziqi Pan
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Zhimeng Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Lingyan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Hanyu Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Shuiyang Shi
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Fengcheng Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Xiuna Sun
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
12
|
Morehead A, Chen C, Sedova A, Cheng J. DIPS-Plus: The enhanced database of interacting protein structures for interface prediction. Sci Data 2023; 10:509. [PMID: 37537186 PMCID: PMC10400622 DOI: 10.1038/s41597-023-02409-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 07/24/2023] [Indexed: 08/05/2023] Open
Abstract
In this work, we expand on a dataset recently introduced for protein interface prediction (PIP), the Database of Interacting Protein Structures (DIPS), to present DIPS-Plus, an enhanced, feature-rich dataset of 42,112 complexes for machine learning of protein interfaces. While the original DIPS dataset contains only the Cartesian coordinates for atoms contained in the protein complex along with their types, DIPS-Plus contains multiple residue-level features including surface proximities, half-sphere amino acid compositions, and new profile hidden Markov model (HMM)-based sequence features for each amino acid, providing researchers a curated feature bank for training protein interface prediction methods. We demonstrate through rigorous benchmarks that training an existing state-of-the-art (SOTA) model for PIP on DIPS-Plus yields new SOTA results, surpassing the performance of some of the latest models trained on residue-level and atom-level encodings of protein complexes to date.
Affiliation(s)
- Alex Morehead
- University of Missouri, Electrical Engineering & Computer Science, Columbia, MO, 65211, USA.
| | - Chen Chen
- University of Missouri, Electrical Engineering & Computer Science, Columbia, MO, 65211, USA
| | - Ada Sedova
- Oak Ridge National Laboratory, Oak Ridge, TN, 37830, USA
| | - Jianlin Cheng
- University of Missouri, Electrical Engineering & Computer Science, Columbia, MO, 65211, USA
|
13
|
Roche R, Moussad B, Shuvo MH, Bhattacharya D. E(3) equivariant graph neural networks for robust and accurate protein-protein interaction site prediction. PLoS Comput Biol 2023; 19:e1011435. [PMID: 37651442 PMCID: PMC10499216 DOI: 10.1371/journal.pcbi.1011435] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/10/2023] [Revised: 09/13/2023] [Accepted: 08/15/2023] [Indexed: 09/02/2023] Open
Abstract
Artificial intelligence-powered protein structure prediction methods have led to a paradigm-shift in computational structural biology, yet contemporary approaches for predicting the interfacial residues (i.e., sites) of protein-protein interaction (PPI) still rely on experimental structures. Recent studies have demonstrated benefits of employing graph convolution for PPI site prediction, but ignore symmetries naturally occurring in 3-dimensional space and act only on experimental coordinates. Here we present EquiPPIS, an E(3) equivariant graph neural network approach for PPI site prediction. EquiPPIS employs symmetry-aware graph convolutions that transform equivariantly with translation, rotation, and reflection in 3D space, providing richer representations for molecular data compared to invariant convolutions. EquiPPIS substantially outperforms state-of-the-art approaches based on the same experimental input, and exhibits remarkable robustness by attaining better accuracy with predicted structural models from AlphaFold2 than what existing methods can achieve even with experimental structures. Freely available at https://github.com/Bhattacharya-Lab/EquiPPIS, EquiPPIS enables accurate PPI site prediction at scale.
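The equivariance property named in this abstract can be checked numerically on a toy example. The sketch below is not the EquiPPIS layer; it assumes a simplified coordinate update of the kind used in E(n)-equivariant graph networks, with a made-up distance gate `exp(-d^2)` standing in for a learned function. The names `coordinate_update`, `rigid_transform`, and `max_abs_difference` are hypothetical helpers introduced for illustration.

```python
import math


def coordinate_update(coords, step=0.1):
    """Toy update: x_i' = x_i + step * sum_j (x_i - x_j) * exp(-|x_i - x_j|^2).

    The gate depends only on distances, so the update commutes with
    rotations, reflections, and translations of the input.
    """
    updated = []
    for i, xi in enumerate(coords):
        shift = [0.0, 0.0, 0.0]
        for j, xj in enumerate(coords):
            if i == j:
                continue
            diff = [a - b for a, b in zip(xi, xj)]
            gate = math.exp(-sum(d * d for d in diff))  # assumed stand-in for a learned gate
            for axis in range(3):
                shift[axis] += gate * diff[axis]
        updated.append([xi[axis] + step * shift[axis] for axis in range(3)])
    return updated


def rigid_transform(coords, angle, translation, reflect=False):
    """Optionally mirror x, rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    moved = []
    for x, y, z in coords:
        if reflect:
            x = -x
        moved.append([c * x - s * y + translation[0],
                      s * x + c * y + translation[1],
                      z + translation[2]])
    return moved


def max_abs_difference(a, b):
    """Largest coordinate-wise gap between two point sets of equal shape."""
    return max(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


# Made-up points, not data from the paper.
points = [[0.0, 0.0, 0.0], [1.0, 0.5, -0.3], [-0.7, 1.2, 0.9], [0.4, -1.1, 0.6]]
pose = dict(angle=0.8, translation=(2.0, -1.0, 0.5), reflect=True)

update_then_move = rigid_transform(coordinate_update(points), **pose)
move_then_update = coordinate_update(rigid_transform(points, **pose))
print(max_abs_difference(update_then_move, move_then_update))
```

Both orders of operation give the same coordinates up to floating-point error, which is what "transforms equivariantly" means for this update. An invariant feature, such as a pairwise distance, would instead stay unchanged under the transform.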
Affiliation(s)
- Rahmatullah Roche
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Bernard Moussad
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Md Hossain Shuvo
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Debswapna Bhattacharya
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
|
14
|
Belkevich AE, Pascual HG, Fakhouri AM, Ball DG, Knutson BA. Distinct Interaction Modes for the Eukaryotic RNA Polymerase Alpha-like Subunits. Mol Cell Biol 2023; 43:269-282. [PMID: 37222571 PMCID: PMC10251799 DOI: 10.1080/10985549.2023.2210023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 03/26/2023] [Accepted: 04/12/2023] [Indexed: 05/25/2023] Open
Abstract
Eukaryotic DNA-dependent RNA polymerases (Pols I-III) encode two distinct alpha-like heterodimers where one is shared between Pols I and III, and the other is unique to Pol II. Human alpha-like subunit mutations are associated with several diseases including Treacher Collins Syndrome (TCS), 4H leukodystrophy, and primary ovarian insufficiency. Yeast is commonly used to model human disease mutations, yet it remains unclear whether the alpha-like subunit interactions are functionally similar between yeast and human homologs. To examine this, we mutated several regions of the yeast and human small alpha-like subunits and used biochemical and genetic assays to establish the regions and residues required for heterodimerization with their corresponding large alpha-like subunits. Here we show that different regions of the small alpha-like subunits serve differential roles in heterodimerization, in a polymerase- and species-specific manner. We found that the small human alpha-like subunits are more sensitive to mutations, including a "humanized" yeast that we used to characterize the molecular consequence of the TCS-causing POLR1D G52E mutation. These findings help explain why some alpha subunit associated disease mutations have little to no effect when made in their yeast orthologs and offer a better yeast model to assess the molecular basis of POLR1D associated disease mutations.
Affiliation(s)
- Alana E. Belkevich
- Department of Biochemistry and Molecular Biology, SUNY Upstate Medical University, Syracuse, New York, USA
| | - Haleigh G. Pascual
- Department of Biochemistry and Molecular Biology, SUNY Upstate Medical University, Syracuse, New York, USA
| | - Aula M. Fakhouri
- Department of Biochemistry and Molecular Biology, SUNY Upstate Medical University, Syracuse, New York, USA
| | - David G. Ball
- Department of Biochemistry and Molecular Biology, SUNY Upstate Medical University, Syracuse, New York, USA
| | - Bruce A. Knutson
- Department of Biochemistry and Molecular Biology, SUNY Upstate Medical University, Syracuse, New York, USA
|
15
|
Sunny S, Prakash PB, Gopakumar G, Jayaraj PB. DeepBindPPI: Protein-Protein Binding Site Prediction Using Attention Based Graph Convolutional Network. Protein J 2023:10.1007/s10930-023-10121-9. [PMID: 37198346 DOI: 10.1007/s10930-023-10121-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/25/2023] [Indexed: 05/19/2023]
Abstract
Due to the importance of protein-protein interactions in the defence mechanisms of the living body, attempts have been made to investigate their attributes, including, but not limited to, binding affinity and binding region. Contemporary strategies for binding site prediction largely resort to deep learning techniques but have turned out to be low-precision models. As laboratory experiments for drug discovery tasks utilize this information, increased false positives devalue the computational methods. This emphasizes the need to develop enhanced strategies. DeepBindPPI employs deep learning techniques to predict the binding regions of proteins, particularly antigen-antibody interaction sites. The results obtained are applied in a docking environment to confirm their correctness. An integration of a graph convolutional network with an attention mechanism predicts interacting amino acids with improved precision. The model learns the determining factors in interaction from a general pool of proteins and is then fine-tuned using antigen-antibody data. Comparison of the proposed method with existing techniques shows that the developed model has comparable performance. The use of a separate spatial network clearly improved the precision of the proposed method from 0.4 to 0.5. An attempt to utilize the interface information for docking using the HDOCK server gives promising results, with high-quality structures appearing in the top 10 ranks.
Affiliation(s)
- Sharon Sunny
- Department of CSE, National Institute of Technology, Calicut, Kerala, 673601, India.
| | | | - G Gopakumar
- Department of CSE, National Institute of Technology, Calicut, Kerala, 673601, India
| | - P B Jayaraj
- Department of CSE, National Institute of Technology, Calicut, Kerala, 673601, India
|
16
|
Saldinger JC, Raymond M, Elvati P, Violi A. Domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles. NATURE COMPUTATIONAL SCIENCE 2023; 3:393-402. [PMID: 38177838 DOI: 10.1038/s43588-023-00438-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/02/2022] [Accepted: 03/24/2023] [Indexed: 01/06/2024]
Abstract
Although challenging, the accurate and rapid prediction of nanoscale interactions has broad applications for numerous biological processes and material properties. While several models have been developed to predict the interaction of specific biological components, they use system-specific information that hinders their application to more general materials. Here we present NeCLAS, a general and efficient machine learning pipeline that predicts the location of nanoscale interactions, providing human-intelligible predictions. NeCLAS outperforms current nanoscale prediction models for generic nanoparticles up to 10-20 nm, reproducing interactions for biological and non-biological systems. Two aspects contribute to these results: a low-dimensional representation of nanoparticles and molecules (to reduce the effect of data uncertainty), and environmental features (to encode the physicochemical neighborhood at multiple scales). This framework has several applications, from basic research to rapid prototyping and design in nanobiotechnology.
Affiliation(s)
| | - Matt Raymond
- Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
| | - Paolo Elvati
- Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
| | - Angela Violi
- Chemical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA.
- Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Biophysics Program, University of Michigan, Ann Arbor, MI, USA.
|
17
|
Durham J, Zhang J, Humphreys IR, Pei J, Cong Q. Recent advances in predicting and modeling protein-protein interactions. Trends Biochem Sci 2023; 48:527-538. [PMID: 37061423 DOI: 10.1016/j.tibs.2023.03.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/01/2022] [Revised: 03/03/2023] [Accepted: 03/17/2023] [Indexed: 04/17/2023]
Abstract
Protein-protein interactions (PPIs) drive biological processes, and disruption of PPIs can cause disease. With recent breakthroughs in structure prediction and a deluge of genomic sequence data, computational methods to predict PPIs and model spatial structures of protein complexes are now approaching the accuracy of experimental approaches for permanent interactions and show promise for elucidating transient interactions. As we describe here, the key to this success is rich evolutionary information deciphered from thousands of homologous sequences that coevolve in interacting partners. This covariation signal, revealed by sophisticated statistical and machine learning (ML) algorithms, predicts physiological interactions. Accurate artificial intelligence (AI)-based modeling of protein structures promises to provide accurate 3D models of PPIs at a proteome-wide scale.
Affiliation(s)
- Jesse Durham
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Ian R Humphreys
- Department of Biochemistry, University of Washington, Seattle, WA, USA; Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Jimin Pei
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA.
|
18
|
Rui H, Ashton KS, Min J, Wang C, Potts PR. Protein-protein interfaces in molecular glue-induced ternary complexes: classification, characterization, and prediction. RSC Chem Biol 2023; 4:192-215. [PMID: 36908699 PMCID: PMC9994104 DOI: 10.1039/d2cb00207h] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/27/2022] [Accepted: 01/02/2023] [Indexed: 01/04/2023] Open
Abstract
Molecular glues are a class of small molecules that stabilize the interactions between proteins. Naturally occurring molecular glues are present in many areas of biology where they serve as central regulators of signaling pathways. Importantly, several clinical compounds act as molecular glue degraders that stabilize interactions between E3 ubiquitin ligases and target proteins, leading to their degradation. Molecular glues hold promise as a new generation of therapeutic agents, including those molecular glue degraders that can redirect the protein degradation machinery in a precise way. However, rational discovery of molecular glues is difficult in part due to the lack of understanding of the protein-protein interactions they stabilize. In this review, we summarize the structures of known molecular glue-induced ternary complexes and the interface properties. Detailed analysis shows different mechanisms of ternary structure formation. Additionally, we also review computational approaches for predicting protein-protein interfaces and highlight the promises and challenges. This information will ultimately help inform future approaches for rational molecular glue discovery.
Affiliation(s)
- Huan Rui
- Center for Research Acceleration by Digital Innovation, Amgen Research, Thousand Oaks, CA 91320, USA
| | - Kate S Ashton
- Medicinal Chemistry, Amgen Research, Thousand Oaks, CA 91320, USA
| | - Jaeki Min
- Induced Proximity Platform, Amgen Research, Thousand Oaks, CA 91320, USA
| | - Connie Wang
- Digital, Technology & Innovation, Amgen, Thousand Oaks, CA 91320, USA
|
19
|
Lin P, Yan Y, Huang SY. DeepHomo2.0: improved protein-protein contact prediction of homodimers by transformer-enhanced deep learning. Brief Bioinform 2023; 24:6849483. [PMID: 36440949 DOI: 10.1093/bib/bbac499] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Revised: 10/08/2022] [Accepted: 10/21/2022] [Indexed: 11/30/2022] Open
Abstract
Protein-protein interactions play an important role in many biological processes. However, although structure prediction for monomer proteins has achieved great progress with the advent of advanced deep learning algorithms like AlphaFold, the structure prediction for protein-protein complexes remains an open question. Taking advantage of the Transformer model of ESM-MSA, we have developed a deep learning-based model, named DeepHomo2.0, to predict protein-protein interactions of homodimeric complexes by leveraging the direct-coupling analysis (DCA) and Transformer features of sequences and the structure features of monomers. DeepHomo2.0 was extensively evaluated on diverse test sets and compared with eight state-of-the-art methods including protein language model-based, DCA-based and machine learning-based methods. It was shown that DeepHomo2.0 achieved a high precision of >70% with experimental monomer structures and >60% with predicted monomer structures for the top 10 predicted contacts on the test sets and outperformed the other eight methods. Moreover, even the version without using structure information, named DeepHomoSeq, still achieved a good precision of >55% for the top 10 predicted contacts. Integrating the predicted contacts into protein docking significantly improved the structure prediction of realistic Critical Assessment of Protein Structure Prediction homodimeric complexes. DeepHomo2.0 and DeepHomoSeq are available at http://huanglab.phys.hust.edu.cn/DeepHomo2/.
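The "precision for the top 10 predicted contacts" reported in this abstract is a ranking metric: the fraction of the ten highest-scoring residue pairs that are real contacts. A minimal sketch of that calculation follows, assuming predictions arrive as a mapping from residue-index pairs to scores. The function name `top_k_precision` and the toy numbers are illustrative only and are not taken from DeepHomo2.0.

```python
def top_k_precision(scores, true_contacts, k=10):
    """Fraction of the k highest-scoring residue pairs that are true contacts.

    scores: dict mapping (i, j) residue-index pairs to predicted contact scores.
    true_contacts: set of (i, j) pairs observed in the reference structure.
    If fewer than k pairs are scored, precision is taken over the pairs available.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank pairs by descending score; ties are broken by pair index for determinism.
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair in top if pair in true_contacts)
    return hits / len(top)


# Toy example with made-up scores and contacts.
predicted = {(1, 40): 0.95, (2, 41): 0.90, (3, 42): 0.80, (5, 50): 0.40, (7, 60): 0.10}
observed = {(1, 40), (3, 42), (7, 60)}
print(top_k_precision(predicted, observed, k=3))
```

In this toy case two of the three top-ranked pairs are true contacts. How a "true contact" is defined (for example, a distance cutoff between residues) is a benchmark choice that the sketch leaves to the caller.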
Affiliation(s)
- Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| |
Collapse
|
20
|
Yan Y, Huang T. The Interactome of Protein, DNA, and RNA. Methods Mol Biol 2023; 2695:89-110. [PMID: 37450113 DOI: 10.1007/978-1-0716-3346-5_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
Proteins participate in many processes of the organism and are very important for maintaining the health of the organism. However, proteins cannot function independently in the body. They must interact with proteins, DNA, RNA, and other substances to perform biological functions and maintain the body's health. At present, there are many experimental methods and software tools that can detect and predict the interaction between proteins and other substances. There are also many databases that record the interaction between proteins and other substances. This article describes protein-protein, protein-DNA, and protein-RNA interactions in detail by introducing some commonly used experimental methods, the software tools produced with the accumulation of experimental data and the rapid development of machine learning, and the related databases that record the relationship between proteins and some substances. Through this review, we hope that the analysis and summary of these aspects will make it convenient for researchers to conduct further research on protein interactions.
Affiliation(s)
- Yuyao Yan
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Tao Huang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China.
|
21
|
Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022; 21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open
Abstract
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods making use of structures are scattered across literature and cover a number of different applications and scopes; while some try to address questions and tasks within a single protein family, others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.
Affiliation(s)
- Janani Durairaj
- Biozentrum, University of Basel, Basel, Switzerland
- Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
| | - Dick de Ridder
- Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
| | - Aalt D.J. van Dijk
- Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
|
22
|
Williams NP, Rodrigues CHM, Truong J, Ascher DB, Holien JK. DockNet: high-throughput protein-protein interface contact prediction. Bioinformatics 2022; 39:6885444. [PMID: 36484688 PMCID: PMC9825772 DOI: 10.1093/bioinformatics/btac797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 10/27/2022] [Accepted: 12/08/2022] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION: Over 300 000 protein-protein interaction (PPI) pairs have been identified in the human proteome and targeting these is fast becoming the next frontier in drug design. Predicting PPI sites, however, is a challenging task that traditionally requires computationally expensive and time-consuming docking simulations. A major weakness of modern protein docking algorithms is the inability to account for protein flexibility, which ultimately leads to relatively poor results. RESULTS: Here, we propose DockNet, an efficient Siamese graph-based neural network method which predicts contact residues between two interacting proteins. Unlike other methods that only utilize a protein's surface or treat the protein structure as a rigid body, DockNet incorporates the entire protein structure and places no limits on protein flexibility during an interaction. Predictions are modeled at the residue level, based on a diverse set of input node features including residue type, surface accessibility, residue depth, secondary structure, pharmacophore and torsional angles. DockNet is comparable to current state-of-the-art methods, achieving an area under the curve (AUC) value of up to 0.84 on an independent test set (DB5), can be applied to a variety of different protein structures and can be utilized in situations where accurate unbound protein structures cannot be obtained. AVAILABILITY AND IMPLEMENTATION: DockNet is available at https://github.com/npwilliams09/docknet and an easy-to-use webserver at https://biosig.lab.uq.edu.au/docknet. All other data underlying this article are available in the article and in its online supplementary material. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | | | - Jia Truong
- STEM College, RMIT University, Melbourne, VIC, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia; School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
|
23
|
Li M, Wu Z, Wang W, Lu K, Zhang J, Zhou Y, Chen Z, Li D, Zheng S, Chen P, Wang B. Protein-Protein Interaction Sites Prediction Based on an Under-Sampling Strategy and Random Forest Algorithm. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3646-3654. [PMID: 34705656 DOI: 10.1109/tcbb.2021.3123269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Computational methods for predicting protein-protein interaction sites can avoid the high cost and long turnaround of traditional experimental approaches. However, the serious class imbalance between interface and non-interface residues on the protein sequences limits the prediction performance of these methods. This work therefore proposes a new strategy, NearMiss-based under-sampling for imbalanced datasets and Random Forest classification (NM-RF), to predict protein interaction sites. Herein, the residues on protein sequences are represented by PSSM-derived features, hydropathy index (HI) and relative solvent accessibility (RSA). To resolve the class imbalance problem, an under-sampling method based on the NearMiss algorithm is adopted to remove some non-interface residues, and then the random forest algorithm is used to perform binary classification on the balanced feature datasets. Experiments show that the accuracy of the NM-RF model reaches 87.6% and 84.3% on Dtestset72 and PDBtestset164 respectively, demonstrating the effectiveness of the proposed NM-RF method in differentiating interface from non-interface residues.
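The under-sampling step described in this abstract can be illustrated with a small sketch. The abstract does not say which NearMiss variant was used, so the code assumes NearMiss-1 as it is commonly described: keep the majority-class samples whose mean distance to their k nearest minority-class samples is smallest. The function name `nearmiss1_undersample` and the two-dimensional toy feature vectors are hypothetical, and the Random Forest classification step is not shown.

```python
import math


def nearmiss1_undersample(majority, minority, k=3):
    """Keep as many majority samples as there are minority samples, choosing
    those with the smallest mean distance to their k nearest minority samples
    (NearMiss-1, assumed here; the source does not name the variant)."""
    if not minority:
        raise ValueError("minority class must not be empty")
    k = min(k, len(minority))
    scored = []
    for index, sample in enumerate(majority):
        distances = sorted(math.dist(sample, other) for other in minority)
        mean_nearest = sum(distances[:k]) / k
        scored.append((mean_nearest, index))
    scored.sort()
    # Preserve the original ordering of the retained samples.
    keep = sorted(index for _, index in scored[:len(minority)])
    return [majority[i] for i in keep]


# Toy feature vectors (made up); real inputs would be per-residue feature
# vectors such as the PSSM, HI, and RSA features named in the abstract.
non_interface = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (6.0, 6.0), (0.2, 0.0)]
interface = [(0.0, 0.1), (0.1, 0.0)]

balanced_non_interface = nearmiss1_undersample(non_interface, interface, k=2)
print(balanced_non_interface)
```

After this step the two classes have equal size, and the retained non-interface samples are the ones lying closest to the interface samples in feature space, which is the region where a classifier has the most to learn.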
|
24
|
Liao J, Wang Q, Wu F, Huang Z. In Silico Methods for Identification of Potential Active Sites of Therapeutic Targets. Molecules 2022; 27:7103. [PMID: 36296697 PMCID: PMC9609013 DOI: 10.3390/molecules27207103] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Revised: 08/12/2022] [Accepted: 08/25/2022] [Indexed: 07/30/2023] Open
Abstract
Target identification is an important step in drug discovery, and computer-aided drug target identification methods are attracting more attention than traditional drug target identification methods, which are time-consuming and costly. Computer-aided drug target identification methods can greatly reduce the search scope of experimental targets and the associated costs by identifying disease-related targets and their binding sites and evaluating the druggability of the predicted active sites for clinical trials. In this review, we introduce the principles of computer-based active site identification methods, including the identification of binding sites and assessment of druggability. We provide some guidelines for selecting methods for the identification of binding sites and assessment of druggability. In addition, we list the databases and tools commonly used with these methods, present examples of individual and combined applications, and compare the methods and tools. Finally, we discuss the challenges and limitations of binding site identification and druggability assessment at the current stage and provide some recommendations and future perspectives.
Affiliation(s)
- Jianbo Liao
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- The Second School of Clinical Medicine, Guangdong Medical University, Dongguan 523808, China
| | - Qinyu Wang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
| | - Fengxu Wu
- Hubei Key Laboratory of Wudang Local Chinese Medicine Research, School of Pharmaceutical Sciences, Hubei University of Medicine, Shiyan 442000, China
| | - Zunnan Huang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang 524023, China
|
25
|
Hao W, Dian M, Zhou Y, Zhong Q, Pang W, Li Z, Zhao Y, Ma J, Lin X, Luo R, Li Y, Jia J, Shen H, Huang S, Dai G, Wang J, Sun Y, Xiao D. Autophagy induction promoted by m6A reader YTHDF3 through translation upregulation of FOXO3 mRNA. Nat Commun 2022; 13:5845. [PMID: 36195598 PMCID: PMC9532426 DOI: 10.1038/s41467-022-32963-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 03/22/2021] [Accepted: 08/24/2022] [Indexed: 12/08/2022] Open
Abstract
Autophagy is crucial for maintaining cellular energy homeostasis and for cells to adapt to nutrient deficiency, and nutrient sensors regulating autophagy have been reported previously. However, the role of epitranscriptomic modifications such as m6A in the regulation of starvation-induced autophagy is unclear. Here, we show that the m6A reader YTHDF3 is essential for autophagy induction. m6A modification is up-regulated to promote autophagosome formation and lysosomal degradation upon nutrient deficiency. METTL3 depletion leads to a loss of functional m6A modification and inhibits YTHDF3-mediated autophagy flux. YTHDF3 promotes autophagy by recognizing m6A modification sites around the stop codon of FOXO3 mRNA. YTHDF3 also recruits eIF3a and eIF4B to facilitate FOXO3 translation, subsequently initiating autophagy. Overall, our study demonstrates that the epitranscriptome regulator YTHDF3 functions as a nutrient responder, providing a glimpse into the post-transcriptional RNA modifications that regulate metabolic homeostasis.
Affiliation(s)
- WeiChao Hao
- Department of Oncology, The First Affiliated Hospital of Guangdong Pharmaceutical University, 510080, Guangzhou, China
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
| | - MeiJuan Dian
- Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, 510515, Guangzhou, China
- Institute of Comparative Medicine & Laboratory Animal Center, Southern Medical University, 510515, Guangzhou, China
| | - Ying Zhou
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
- Institute of Comparative Medicine & Laboratory Animal Center, Southern Medical University, 510515, Guangzhou, China
| | - QiuLing Zhong
- Department of Neurobiology, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
| | - WenQian Pang
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
| | - ZiJian Li
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
| | - YaYan Zhao
- Department of Oncology, The First Affiliated Hospital of Guangdong Pharmaceutical University, 510080, Guangzhou, China
| | - JiaCheng Ma
- Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, 10084, Beijing, China
| | - XiaoLin Lin
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
- Cancer Center, Integrated Hospital of Traditional Chinese Medicine, Southern Medical University, 510315, Guangzhou, China
| | - RenRu Luo
- School of Medicine, Shenzhen Campus of Sun Yat-sen University, 518107, Guangdong, China
| | - YongLong Li
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
- Institute of Comparative Medicine & Laboratory Animal Center, Southern Medical University, 510515, Guangzhou, China
| | - JunShuang Jia
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
| | - HongFen Shen
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
| | - ShiHao Huang
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
- Institute of Comparative Medicine & Laboratory Animal Center, Southern Medical University, 510515, Guangzhou, China
| | - GuanQi Dai
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China
- Institute of Comparative Medicine & Laboratory Animal Center, Southern Medical University, 510515, Guangzhou, China
| | - JiaHong Wang
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China.
| | - Yan Sun
- Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, 510080, Guangzhou, China.
| | - Dong Xiao
- Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China.
- Institute of Comparative Medicine & Laboratory Animal Center, Southern Medical University, 510515, Guangzhou, China.
- National Demonstration Center for Experimental Education of Basic Medical Sciences, Southern Medical University, 510515, Guangzhou, China.
|
26
|
Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:biom12091246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important tool in computational biology for elucidating protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein–ligand binding, including allosteric effects, protein–protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
|
27
|
Pozzati G, Kundrotas P, Elofsson A. Scoring of protein–protein docking models utilizing predicted interface residues. Proteins 2022; 90:1493-1505. [PMID: 35246997 PMCID: PMC9314140 DOI: 10.1002/prot.26330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Revised: 02/23/2022] [Accepted: 02/28/2022] [Indexed: 11/08/2022]
Abstract
Scoring docking solutions is a difficult task, and many methods have been developed for this purpose. In docking, only a handful of the hundreds of thousands of models generated by docking algorithms are acceptable, causing difficulties when developing scoring functions. Today's best scoring functions can significantly increase the number of top-ranked models but still fail for most targets. Here, we examine the possibility of utilizing predicted interface residues to score docking models generated during the scan stage of a docking algorithm. Many methods have been developed to infer the regions of a protein surface that interact with another protein, but most have not been benchmarked using docking algorithms. This study systematically tests different interface prediction methods for scoring >300,000 low-resolution rigid-body template-free docking decoys. Overall, we find that contact-based interface prediction by BIPSPI is the best method to score docking solutions, with >12% of first-ranked docking models being acceptable. Additional experiments indicated precision as a high-importance metric when estimating interface prediction quality, focusing on docking constraints production. Finally, we discuss several limitations of adopting interface predictions as constraints in a docking protocol.
Affiliation(s)
- Gabriele Pozzati
- Department of Biochemistry and Biophysics and Science for Life Laboratory Stockholm University Solna Sweden
| | - Petras Kundrotas
- Department of Biochemistry and Biophysics and Science for Life Laboratory Stockholm University Solna Sweden
- Center for Bioinformatics and Department of Molecular Biosciences University of Kansas Lawrence Kansas USA
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory Stockholm University Solna Sweden
|
28
|
Multi-task learning to leverage partially annotated data for PPI interface prediction. Sci Rep 2022; 12:10487. [PMID: 35729253 PMCID: PMC9213449 DOI: 10.1038/s41598-022-13951-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Accepted: 05/31/2022] [Indexed: 11/29/2022] Open
Abstract
Protein-protein interactions (PPIs) are crucial for protein function; nevertheless, predicting the residues in PPI interfaces from the protein sequence remains a challenging problem. In addition, structure-based functional annotations, such as PPI interface annotations, are scarce: residue-based PPI interface annotations are available for only about one-third of all protein structures. To use a deep learning strategy, we have to overcome this limited data availability. Here we use a multi-task learning strategy that can handle missing data. We start with the multi-task model architecture and adapt it to carefully handle missing data in the cost function. As related learning tasks we include prediction of secondary structure, solvent accessibility, and buried residues. Our results show that the multi-task learning strategy significantly outperforms single-task approaches. Moreover, only the multi-task strategy is able to learn effectively over a dataset extended with structural feature data, without additional PPI annotations. The multi-task setup becomes even more important when the fraction of PPI annotations is very small: the multi-task learner trained on only one-eighth of the PPI annotations (with data extension) reaches the same performance as the single-task learner trained on all PPI annotations. Thus, we show that the multi-task learning strategy can be beneficial for a small training dataset where the protein's functional properties of interest are only partially annotated.
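One common way to "handle missing data in the cost function" is to mask unlabeled positions out of each task's loss. The sketch below shows that idea under stated assumptions: it is not the authors' implementation, every task is simplified to a binary label per residue, `None` marks a missing annotation, and the names `masked_bce` and `multitask_loss` are hypothetical.

```python
import math

def masked_bce(preds, labels):
    """Mean binary cross-entropy over positions that have a label.
    Positions labelled None (missing annotation) are skipped.
    Predictions are assumed to lie strictly between 0 and 1."""
    terms = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
             for p, y in zip(preds, labels) if y is not None]
    return sum(terms) / len(terms) if terms else 0.0

def multitask_loss(preds_by_task, labels_by_task, weights=None):
    """Weighted sum of per-task masked losses; a task with no labels
    for this protein contributes zero."""
    weights = weights or {task: 1.0 for task in preds_by_task}
    return sum(weights[task] * masked_bce(preds_by_task[task],
                                          labels_by_task[task])
               for task in preds_by_task)

# A protein with no PPI annotations at all, but with buried-residue labels.
preds = {"ppi": [0.9, 0.2], "buried": [0.5, 0.5]}
labels = {"ppi": [None, None], "buried": [1, 0]}
total = multitask_loss(preds, labels)
```

Because the unannotated PPI task contributes nothing here, the protein still provides a training signal through the auxiliary task, which is the property the abstract relies on when extending the dataset with structural feature data.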
|
29
|
Zhang W, Meng Q, Wang J, Guo F. HDIContact: a novel predictor of residue-residue contacts on hetero-dimer interfaces via sequential information and transfer learning strategy. Brief Bioinform 2022; 23:6599074. [PMID: 35653713 DOI: 10.1093/bib/bbac169] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/17/2022] [Revised: 03/07/2022] [Accepted: 04/16/2022] [Indexed: 11/12/2022] Open
Abstract
Proteins maintain the functional order of the cell by interacting with other proteins. Determining the structure of protein complexes gives biological insight for research on diseases and drugs. Recently, a breakthrough has been made in protein monomer structure prediction. However, due to the limited number of known protein structures and homologous sequences for complexes, predicting residue-residue contacts on hetero-dimer interfaces is still a challenge. In this study, we have developed a deep learning framework, called HDIContact, for inferring inter-protein residue contacts from sequence information. We used a transfer learning strategy to produce a two-dimensional (2D) Multiple Sequence Alignment (MSA) embedding based on patterns of the concatenated MSA, which could reduce the influence of MSA noise caused by mismatched sequences or low homology. For the MSA 2D embedding, HDIContact used a two-channel Bi-directional Long Short-Term Memory (BiLSTM) network to capture the 2D context of residue pairs. Our comprehensive assessment on the Escherichia coli (E. coli) test dataset showed that HDIContact outperformed other state-of-the-art methods, with a top precision of 65.96%, an Area Under the Receiver Operating Characteristic curve (AUROC) of 83.08% and an Area Under the Precision Recall curve (AUPR) of 25.02%. In addition, we analyzed the potential of HDIContact for human-virus protein-protein complexes, achieving a top-five precision of 80% on O75475-P04584, related to Human Immunodeficiency Virus. All experiments indicated that our method is a valuable tool for predicting inter-protein residue contacts, which would be helpful for understanding protein-protein interaction mechanisms.
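The "top-five precision" quoted in this abstract is a top-k precision: the fraction of the k highest-scoring predicted residue pairs that are true contacts. A minimal sketch follows; the function name `top_k_precision`, the residue indices, and the scores are made up for illustration and are not taken from the paper.

```python
def top_k_precision(scored_pairs, true_contacts, k):
    """Fraction of the k highest-scoring residue pairs that are true contacts."""
    if k <= 0 or k > len(scored_pairs):
        raise ValueError("k must be between 1 and the number of scored pairs")
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    hits = sum(1 for pair, _score in ranked[:k] if pair in true_contacts)
    return hits / k

# Hypothetical predictions: ((residue in chain A, residue in chain B), score)
predictions = [
    ((12, 40), 0.97), ((12, 41), 0.91), ((30, 8), 0.88),
    ((5, 77), 0.80), ((13, 40), 0.74), ((60, 2), 0.10),
]
true_contacts = {(12, 40), (12, 41), (5, 77), (13, 40)}

precision = top_k_precision(predictions, true_contacts, k=5)
```

In this toy case four of the five top-ranked pairs are true contacts, so a top-five precision of 80% corresponds to exactly one false positive among the five most confident predictions.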
Affiliation(s)
- Wei Zhang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Qiaozhen Meng
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
30
|
Lim H, Cankara F, Tsai CJ, Keskin O, Nussinov R, Gursoy A. Artificial intelligence approaches to human-microbiome protein–protein interactions. Curr Opin Struct Biol 2022; 73:102328. [DOI: 10.1016/j.sbi.2022.102328] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/18/2021] [Revised: 12/01/2021] [Accepted: 12/31/2021] [Indexed: 02/08/2023]
|
31
|
Lee D, Xiong D, Wierbowski S, Li L, Liang S, Yu H. Deep learning methods for 3D structural proteome and interactome modeling. Curr Opin Struct Biol 2022; 73:102329. [PMID: 35139457 PMCID: PMC8957610 DOI: 10.1016/j.sbi.2022.102329] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/24/2021] [Revised: 12/05/2021] [Accepted: 12/31/2021] [Indexed: 12/19/2022]
Abstract
Bolstered by recent methodological and hardware advances, deep learning has increasingly been applied to biological problems and structural proteomics. Such approaches have achieved remarkable improvements over traditional machine learning methods in tasks ranging from protein contact map prediction to protein folding, prediction of protein-protein interaction interfaces, and characterization of protein-drug binding pockets. In particular, emergence of ab initio protein structure prediction methods including AlphaFold2 has revolutionized protein structural modeling. From a protein function perspective, numerous deep learning methods have facilitated deconvolution of the exact amino acid residues and protein surface regions responsible for binding other proteins or small molecule drugs. In this review, we provide a comprehensive overview of recent deep learning methods applied in structural proteomics.
|
32
|
Casadio R, Martelli PL, Savojardo C. Machine learning solutions for predicting protein–protein interactions. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Rita Casadio
- Biocomputing Group University of Bologna Bologna Italy
|
33
|
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. Methods in Molecular Biology (Clifton, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
| | - Tâp Ha-Duong
- Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France.
|
34
|
Comparative Genomic Analysis of Statistically Significant Genomic Islands of Helicobacter pylori strains for better understanding the disease prognosis. Biosci Rep 2022; 42:230988. [PMID: 35258077 PMCID: PMC8935386 DOI: 10.1042/bsr20212084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Revised: 02/25/2022] [Accepted: 03/07/2022] [Indexed: 11/17/2022] Open
Abstract
Bacterial virulence factors are often located in genomic islands (GIs). Helicobacter pylori, a highly diverse organism, is reported to be associated with several gastrointestinal diseases, such as gastritis, gastric cancer, peptic ulcer, and duodenal ulcer. A novel similarity score-based comparative analysis of the GIs of fifty H. pylori strains gave a clear picture of the various factors that promote disease progression. Two putative pathogenic GIs were identified in some of the H. pylori strains. One GI, carrying a putative labile enterotoxin and other dynamin-like proteins (DLPs), is predicted to increase toxin release through membrane vesicle formation. Another island contains virulence-associated protein D (vapD), a component of a type-II toxin-antitoxin system (TAs), which enhances the severity of H. pylori infection. Besides the well-known virulence factors CagA and VacA, several GIs were identified that appear to have a direct or indirect impact on H. pylori clinical outcomes. One such GI, containing lipopolysaccharide (LPS) biosynthesis genes, was found to be directly connected with disease development by inhibiting the immune response. Another, collagenase-containing GI worsens ulcers by slowing the healing process. A GI consisting of the fliD operon was found to be connected to flagellar assembly and biofilm production. By residing in biofilms, bacteria can evade antibiotic therapy, resulting in chronic infection. Along with the well-studied CagA and VacA virulence genes, it is equally important to study these newly identified virulence factors to better understand the prognosis of H. pylori-induced disease.
|
35
|
BIPSPI+: Mining Type-Specific Datasets of Protein Complexes to Improve Protein Binding Site Prediction. J Mol Biol 2022; 434:167556. [DOI: 10.1016/j.jmb.2022.167556] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/27/2021] [Revised: 03/12/2022] [Accepted: 03/16/2022] [Indexed: 11/20/2022]
|
36
|
Mahbub S, Bayzid MS. EGRET: edge aggregated graph attention networks and transfer learning improve protein-protein interaction site prediction. Brief Bioinform 2022; 23:6518045. [PMID: 35106547 DOI: 10.1093/bib/bbab578] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/13/2021] [Revised: 11/25/2021] [Accepted: 12/16/2021] [Indexed: 12/18/2022] Open
Abstract
MOTIVATION: Protein-protein interactions (PPIs) are central to most biological processes. However, reliable identification of PPI sites using conventional experimental methods is slow and expensive. Therefore, great efforts are being put into computational methods to identify PPI sites.
RESULTS: We present Edge Aggregated GRaph Attention NETwork (EGRET), a highly accurate deep learning-based method for PPI site prediction, where we have used an edge aggregated graph attention network to effectively leverage the structural information. We, for the first time, have used transfer learning in PPI site prediction. Our proposed edge aggregated network, together with transfer learning, has achieved notable improvement over the best alternate methods. Furthermore, we systematically investigated EGRET's network behavior to provide insights about the causes of its decisions.
AVAILABILITY: EGRET is freely available as an open source project at https://github.com/Sazan-Mahbub/EGRET.
CONTACT: shams_bayzid@cse.buet.ac.bd.
Affiliation(s)
- Sazan Mahbub
- Department of Computer Science University of Maryland, College Park, Maryland 20742, USA
| | - Md Shamsuzzoha Bayzid
- Department of Computer Science and Engineering Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
|
37
|
Abstract
The biological significance of proteins has drawn the scientific community to explore their characteristics, and these studies have shed light on the interaction patterns and functions of proteins in living organisms. Although experimental techniques are reliable, their practical difficulties have paved the way for computational methods of interaction prediction. Automated methods reduce these difficulties but cannot yet replace experimental studies, as the field is still evolving. Interaction prediction is a critical problem that demands highly accurate results, yet none of the existing methods offers performance on par with experimental results. This article assesses existing computational docking algorithms, their challenges, and their future scope. Blind docking techniques are helpful when no information other than the individual structures is available. As more and more complex structures are added to databases, information-driven approaches can be a good alternative. Artificial intelligence, already dominant in many major fields, is expected to take over this domain very shortly.
|
38
|
Antonia AL, Barnes AB, Martin AT, Wang L, Ko DC. Variation in Leishmania chemokine suppression driven by diversification of the GP63 virulence factor. PLoS Negl Trop Dis 2021; 15:e0009224. [PMID: 34710089 PMCID: PMC8577781 DOI: 10.1371/journal.pntd.0009224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/10/2021] [Revised: 11/09/2021] [Accepted: 10/17/2021] [Indexed: 11/18/2022] Open
Abstract
Leishmaniasis is a neglected tropical disease with diverse outcomes ranging from self-healing lesions, to progressive non-healing lesions, to metastatic spread and destruction of mucous membranes. Although resolution of cutaneous leishmaniasis is a classic example of type-1 immunity leading to self-healing lesions, an excess of type-1 related inflammation can contribute to immunopathology and metastatic spread. Leishmania genetic diversity can contribute to variation in polarization and robustness of the immune response through differences in both pathogen sensing by the host and immune evasion by the parasite. In this study, we observed a difference in parasite chemokine suppression between the Leishmania (L.) subgenus and the Viannia (V.) subgenus, which is associated with severe immune-mediated pathology such as mucocutaneous leishmaniasis. While Leishmania (L.) subgenus parasites utilize the virulence factor and metalloprotease glycoprotein-63 (gp63) to suppress the type-1 associated host chemokine CXCL10, L. (V.) panamensis did not suppress CXCL10. To understand the molecular basis for the inter-species variation in chemokine suppression, we used in silico modeling to identify a putative CXCL10-binding site on GP63. The putative CXCL10 binding site is in a region of gp63 under significant positive selection, and it varies from the L. major wild-type sequence in all gp63 alleles identified in the L. (V.) panamensis reference genome. Mutating wild-type L. (L.) major gp63 to the L. (V.) panamensis sequence at the putative binding site impaired cleavage of CXCL10 but not a non-specific protease substrate. Notably, Viannia clinical isolates confirmed that L. (V.) panamensis primarily encodes non-CXCL10-cleaving gp63 alleles. In contrast, L. (V.) braziliensis has an intermediate level of activity, consistent with this species having more equal proportions of both alleles. 
Our results demonstrate how parasite genetic diversity can contribute to variation in immune responses to Leishmania spp. infection that may play critical roles in the outcome of infection.
Author summary: Leishmaniasis is a neglected tropical disease caused by Leishmania parasites and spread by the bites of infected sand flies. Most cases of leishmaniasis present as self-healing sores that are resolved by a balanced immune response. Other cases of leishmaniasis involve spread to sites distant from the original bite, including damage of the inner surfaces of the mouth and nose. These cases of leishmaniasis involve an excessive immune response. Leishmania parasites produce virulence factor proteins, such as GP63, to trick the immune system into mounting a weaker response. GP63 specifically degrades signaling proteins that attract and activate certain immune cells. Here, we demonstrate that Leishmania parasite species have evolved to differ in their ability to degrade signaling proteins. In Leishmania species known to cause more immune-mediated tissue damage, the GP63 virulence factor has evolved to not degrade specific immune signaling proteins, thus attracting and activating more immune cells. Our results demonstrate how diversity among Leishmania parasite species can contribute to variation in immune responses that may play critical roles in the outcome of infection.
Affiliation(s)
- Alejandro L. Antonia
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, Durham, North Carolina, United States of America
| | - Alyson B. Barnes
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, Durham, North Carolina, United States of America
| | - Amelia T. Martin
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, Durham, North Carolina, United States of America
| | - Liuyang Wang
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, Durham, North Carolina, United States of America
| | - Dennis C. Ko
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, Durham, North Carolina, United States of America
- Division of Infectious Diseases, Department of Medicine, School of Medicine, Duke University, Durham, North Carolina, United States of America
|
39
|
Yan Y, Huang SY. Accurate prediction of inter-protein residue-residue contacts for homo-oligomeric protein complexes. Brief Bioinform 2021; 22:bbab038. [PMID: 33693482 PMCID: PMC8425427 DOI: 10.1093/bib/bbab038] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 11/14/2020] [Revised: 01/09/2021] [Indexed: 12/14/2022] Open
Abstract
Protein-protein interactions play a fundamental role in all cellular processes. Therefore, determining the structure of protein-protein complexes is crucial to understanding their molecular mechanisms and developing drugs that target protein-protein interactions. Recently, deep learning has led to a breakthrough in intra-protein contact prediction, achieving unusually high accuracy in recent Critical Assessment of protein Structure Prediction (CASP) challenges. However, due to the limited number of known homologous protein-protein interactions and the challenge of generating joint multiple sequence alignments of two interacting proteins, advances in inter-protein contact prediction remain limited. Here, we propose a deep learning model, named DeepHomo, to predict inter-protein residue-residue contacts across homo-oligomeric protein interfaces. Unlike previous deep learning approaches, we integrated the intra-protein distance map and inter-protein docking pattern, in addition to evolutionary coupling, sequence conservation, and physico-chemical information of monomers. DeepHomo was extensively tested on both experimentally determined structures and realistic CASP-Critical Assessment of Predicted Interaction (CAPRI) targets. DeepHomo achieved a high precision of >60% for the top predicted contact and outperformed state-of-the-art direct-coupling analysis and machine learning-based approaches. Integrating predicted inter-chain contacts into protein-protein docking significantly improved the docking accuracy on the benchmark dataset of realistic homo-dimeric targets from CASP-CAPRI experiments. DeepHomo is available at http://huanglab.phys.hust.edu.cn/DeepHomo/.
Affiliation(s)
- Yumeng Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
|
40
|
Enhancement of SARS-CoV-2 Receptor Binding Domain-CR3022 Human Antibody Binding Affinity via In Silico Engineering Approach. Journal of Medical Microbiology and Infectious Diseases 2021. [DOI: 10.52547/jommid.9.3.156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
41
|
PPAR-Responsive Elements Enriched with Alu Repeats May Contribute to Distinctive PPARγ-DNMT1 Interactions in the Genome. Cancers (Basel) 2021; 13:cancers13163993. [PMID: 34439147 PMCID: PMC8391462 DOI: 10.3390/cancers13163993] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/28/2021] [Revised: 08/02/2021] [Accepted: 08/05/2021] [Indexed: 01/11/2023] Open
Abstract
Simple Summary: This study aimed to explore the potential role of PPARγ–DNMT1 interaction through PPAR-responsive elements (PPREs), which we have found to be enriched with Alu repeats. Apart from protein–protein interactions and co-expression in multiple cancer types, we exclusively described a prognostic role for PPARγ in uveal melanoma (UM).
Background: PPARγ (peroxisome proliferator-activated receptor gamma) is involved in the pathology of numerous diseases, including UM and other types of cancer. Emerging evidence suggests that an interaction between PPARγ and DNMTs (DNA methyltransferases) plays a role in cancer that is yet to be defined. Methods: The configuration of the repeating elements was performed with CAP3 and MAFFT, and the structural modelling was conducted with HDOCK. An evolutionary action scores algorithm was used to identify oncogenic variants. A systematic bioinformatic appraisal of PPARγ and DNMT1 was performed across 29 tumor types and UM available in The Cancer Genome Atlas (TCGA). Results: PPAR-responsive elements (PPREs) enriched with Alu repeats are associated with different genomic regions, particularly the promoter region of DNMT1. PPARγ–DNMT1 co-expression is significantly associated with several cancers. The C-termini of PPARγ and DNMT1 appear to be the potential protein–protein interaction sites where disease-specific mutations may directly impair the respective protein functions. Furthermore, PPARγ expression could be identified as an additional prognostic marker for UM. Conclusions: We hypothesize that the function of PPARγ requires an additional contribution of Alu repeats, which may directly influence the DNMT1 network. Regarding UM, PPARγ appears to be an additional discriminatory prognostic marker, in particular in disomy 3 tumors.
|
42
|
Beg AZ, Farhat N, Khan AU. Designing multi-epitope vaccine candidates against functional amyloids in Pseudomonas aeruginosa through immunoinformatic and structural bioinformatics approach. Infection Genetics and Evolution 2021; 93:104982. [PMID: 34186254 DOI: 10.1016/j.meegid.2021.104982] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 06/09/2021] [Accepted: 06/24/2021] [Indexed: 10/21/2022]
Abstract
Pseudomonas aeruginosa (P. aeruginosa) displays high drug resistance and biofilm-mediated adaptability, which makes its infections difficult to treat. Alternative intervention methods and targets have made the treatment of such infections manageable. One of the biofilm components, functional amyloids of Pseudomonas (Fap), is positively correlated with the virulence and mucoidy phenotype found in infections in cystic fibrosis (CF) patients. Extracellular accessibility, conservation across P. aeruginosa isolates, and linkage with the lung infection phenotype in CF patients make Fap a promising intervention target. Furthermore, the reported effect of bacterial amyloid on neuronal function and immune response makes it a targetable candidate. In the current study, the Fap C protein and its immediate interactions were explored to extract antigenic T-cell and B-cell epitopes. A combination of epitopes and peptide adjuvants was linked to derive vaccine candidate structures. The vaccine candidates were validated for antigenicity, allergenicity, physicochemical properties, stability, and interactions with TLRs and MHC alleles. Immunosimulation studies demonstrated that the vaccines elicit a Th1-dominated response, which can assist in a good prognosis of infection in CF patients.
Affiliation(s)
- Ayesha Z Beg
  - Medical Microbiology and Molecular Biology Lab., Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India
- Nabeela Farhat
  - Medical Microbiology and Molecular Biology Lab., Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India
- Asad U Khan
  - Medical Microbiology and Molecular Biology Lab., Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India; Centre for Bioinformatic on Antimicrobial Resistance, IBU, Aligarh Muslim University, Aligarh, India.
|
43
|
Dai B, Bailey-Kellogg C. Protein Interaction Interface Region Prediction by Geometric Deep Learning. Bioinformatics 2021; 37:2580-2588. [PMID: 33693581 PMCID: PMC8428585 DOI: 10.1093/bioinformatics/btab154] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/03/2020] [Revised: 01/10/2021] [Accepted: 03/02/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Protein-protein interactions drive wide-ranging molecular processes, and characterizing at the atomic level how proteins interact (beyond just the fact that they interact) can provide key insights into understanding and controlling this machinery. Unfortunately, experimental determination of three-dimensional protein complex structures remains difficult and does not scale to the increasingly large sets of proteins whose interactions are of interest. Computational methods are thus required to meet the demands of large-scale, high-throughput prediction of how proteins interact, but unfortunately both physical modeling and machine learning methods suffer from poor precision and/or recall. RESULTS: In order to improve performance in predicting protein interaction interfaces, we leverage the best properties of both data- and physics-driven methods to develop a unified Geometric Deep Neural Network, "PInet" (Protein Interface Network). PInet consumes pairs of point clouds encoding the structures of two partner proteins, in order to predict their structural regions mediating interaction. To make such predictions, PInet learns and utilizes models capturing both geometrical and physicochemical molecular surface complementarity. In application to a set of benchmarks, PInet simultaneously predicts the interface regions on both interacting proteins, achieving performance equivalent to or even much better than the state-of-the-art predictor for each dataset. Furthermore, since PInet is based on joint segmentation of a representation of protein surfaces, its predictions are meaningful in terms of the underlying physical complementarity driving molecular recognition. AVAILABILITY: PInet scripts and models are available at https://github.com/FTD007/PInet. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bowen Dai
  - Computer Science Department, Dartmouth, Hanover, 03755, United States
|
44
|
Jaiswal G, Yaduvanshi S, Kumar V. A potential peptide inhibitor of SARS-CoV-2 S and human ACE2 complex. J Biomol Struct Dyn 2021; 40:6671-6681. [PMID: 33645443 PMCID: PMC7938657 DOI: 10.1080/07391102.2021.1889665] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/22/2022]
Abstract
The disease COVID-19 has caused a heavy socio-economic burden, and there is an immediate need to control it. The disease is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus. Viral entry into human cells depends on the attachment of the spike (S) protein, via its receptor binding domain (RBD), to the human cell receptor angiotensin-converting enzyme 2 (hACE2). Thus, blocking the virus attachment to hACE2 could serve as a potential therapeutic strategy for viral infection. We have designed a peptide inhibitor (ΔABP-α2) targeting the RBD of the S protein using an in-silico approach. Docking studies and computed affinities suggested that the peptide inhibitor binds at the RBD with ∼95-fold higher affinity than hACE2. Molecular dynamics (MD) simulation confirms the stable binding of inhibitor to hACE2. Immunoinformatics studies suggest that the peptide is non-immunogenic and non-toxic. Thus, the proposed peptide could serve as a potential blocker of viral attachment. Communicated by Ramaswamy H. Sarma
Affiliation(s)
- Grijesh Jaiswal
  - Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University, Noida, India
- Shivani Yaduvanshi
  - Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University, Noida, India
- Veerendra Kumar
  - Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University, Noida, India
|
45
|
Kowalski A. A survey of human histone H1 subtypes interaction networks: Implications for histones H1 functioning. Proteins 2021; 89:792-810. [PMID: 33550666 DOI: 10.1002/prot.26059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2020] [Revised: 12/23/2020] [Accepted: 01/31/2021] [Indexed: 11/08/2022]
Abstract
To show the spectrum of activity of histone H1 subtypes (H1.1-H1.5) realized through protein-protein interactions, data selected from APID resources were processed with sequence-based bioinformatics approaches. Histone H1 subtypes participate in over half a thousand interactions with nuclear and cytosolic proteins (ComPPI database) engaged in enzymatic activity and the binding of nucleic acids and proteins (SIFTER tool). Small-scale networks of H1 subtypes (STRING network) have similar topological parameters (P > .05), which are, however, different for network hubs between subtypes H1.1 and H1.4 and subtypes H1.3 and H1.5 (P < .05) (Cytoscape software). Based on enriched GO terms (g:Profiler toolset) of interacting proteins, the molecular functions and biological processes of network hubs are related to RNA binding and ribosome biogenesis (subtypes H1.1 and H1.4), cell cycle and cell division (subtypes H1.3 and H1.5), and protein ubiquitination and degradation (subtype H1.2). The residue propensity (BIPSPI predictor) and secondary structures of interacting surfaces (GOR algorithm), as well as the value of the equilibrium dissociation constant (ISLAND predictor), indicate that H1 subtype interactions are transient in terms of stability and medium-strong in terms of binding strength. Histone H1 subtypes bind interacting partners in an intrinsic disorder-dependent mode (FoldIndex, PrDOS predictor), according to the coupled folding and binding and mutual synergistic folding mechanisms. These results provide evidence that multifunctional H1 subtypes operate via protein interactions in the networks of crucial cellular processes and, therefore, confirm a new histone H1 paradigm relating to its functioning in protein-protein interaction networks.
Affiliation(s)
- Andrzej Kowalski
  - Division of Medical Biology, Institute of Biology, Jan Kochanowski University in Kielce, Kielce, Poland
|
46
|
Abstract
Biological processes are often mediated by complexes formed between proteins and various biomolecules. The 3D structures of such protein-biomolecule complexes provide insights into the molecular mechanism of their action. The structure of these complexes can be predicted by various computational methods. Choosing an appropriate method for modelling depends on the category of biomolecule that a protein interacts with and the availability of structural information about the protein and its interacting partner. We intend for the contents of this chapter to serve as a guide to which software would be the most appropriate for the type of data at hand and the kind of 3D complex structure required. In particular, we have dealt with protein-small molecule ligand, protein-peptide, protein-protein, and protein-nucleic acid interactions. Most, if not all, model building protocols perform some sampling and scoring. Typically, several alternate conformations and configurations of the interactors are sampled. Each such sample is then scored for optimization. To boost the confidence in these predicted models, their assessment using other independent scoring schemes besides the inbuilt/default ones would prove to be helpful. This chapter also lists such software and serves as a guide to gauge the fidelity of modelled structures of biomolecular complexes.
|
47
|
Sun HL, Zhu AC, Gao Y, Terajima H, Fei Q, Liu S, Zhang L, Zhang Z, Harada BT, He YY, Bissonnette MB, Hung MC, He C. Stabilization of ERK-Phosphorylated METTL3 by USP5 Increases m6A Methylation. Mol Cell 2020; 80:633-647.e7. [PMID: 33217317 DOI: 10.1016/j.molcel.2020.10.026] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 11/21/2019] [Revised: 08/31/2020] [Accepted: 10/16/2020] [Indexed: 12/14/2022]
Abstract
N6-methyladenosine (m6A) is the most abundant mRNA modification and is installed by the METTL3-METTL14-WTAP methyltransferase complex. Although the importance of m6A methylation in mRNA metabolism has been well documented recently, regulation of the m6A machinery remains obscure. Through a genome-wide CRISPR screen, we identify the ERK pathway and USP5 as positive regulators of m6A deposition. We find that ERK phosphorylates METTL3 at S43/S50/S525 and WTAP at S306/S341, followed by deubiquitination by USP5, resulting in stabilization of the m6A methyltransferase complex. Lack of METTL3/WTAP phosphorylation reduces decay of m6A-labeled pluripotent factor transcripts and traps mouse embryonic stem cells in the pluripotent state. The same phosphorylation can also be found in ERK-activated human cancer cells and contributes to tumorigenesis. Our study reveals an unrecognized function of ERK in regulating m6A methylation.
Affiliation(s)
- Hui-Lung Sun
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Allen C Zhu
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA; Medical Scientist Training Program, The University of Chicago, Chicago, IL 60637, USA
- Yawei Gao
  - Clinical and Translational Research Center of Shanghai First Maternity and Infant Hospital, Shanghai Key Laboratory of Signaling and Disease Research, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Hideki Terajima
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Qili Fei
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Shun Liu
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Linda Zhang
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Zijie Zhang
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Bryan T Harada
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Yu-Ying He
  - Department of Medicine, Section of Dermatology, University of Chicago, Chicago, IL 60637, USA
- Marc B Bissonnette
  - Department of Medicine, The University of Chicago, Chicago, IL 60637, USA
- Chuan He
  - Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA; Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA; Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA.
|
48
|
Jaiswal G, Kumar V. In-silico design of a potential inhibitor of SARS-CoV-2 S protein. PLoS One 2020; 15:e0240004. [PMID: 33002032 PMCID: PMC7529220 DOI: 10.1371/journal.pone.0240004] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/19/2020] [Accepted: 09/18/2020] [Indexed: 12/25/2022] Open
Abstract
The SARS-CoV-2 virus has caused a pandemic and is a public health emergency of international concern. As of now, no registered therapies are available for the treatment of coronavirus infection. The viral infection depends on the attachment of the spike (S) glycoprotein to the human cell receptor angiotensin-converting enzyme 2 (ACE2). We have designed a protein inhibitor (ΔABP-D25Y) targeting the S protein using a computational approach. The inhibitor consists of two α-helical peptides homologous to the protease domain (PD) of ACE2. Docking studies and molecular dynamics simulation revealed that the inhibitor binds exclusively at the ACE2 binding site of the S protein. The computed binding affinity of the inhibitor is higher than that of ACE2, and thus the inhibitor will likely outcompete ACE2 for binding to the S protein. Hence, the proposed inhibitor ΔABP-D25Y could be a potential blocker of S protein and receptor binding domain (RBD) attachment.
Affiliation(s)
- Grijesh Jaiswal
  - Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University, Noida Uttar Pradesh, India
- Veerendra Kumar
  - Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University, Noida Uttar Pradesh, India
|
49
|
Kumar R, Donakonda S, Müller SA, Bötzel K, Höglinger GU, Koeglsperger T. FGF2 Affects Parkinson's Disease-Associated Molecular Networks Through Exosomal Rab8b/Rab31. Front Genet 2020; 11:572058. [PMID: 33101391 PMCID: PMC7545478 DOI: 10.3389/fgene.2020.572058] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/12/2020] [Accepted: 09/02/2020] [Indexed: 01/24/2023] Open
Abstract
Ras-associated binding (Rab) proteins are small GTPases that regulate the trafficking of membrane components during endocytosis and exocytosis, including the release of extracellular vesicles (EVs). Parkinson’s disease (PD) is one of the most prevalent neurodegenerative disorders in the elderly population, where pathological proteins such as alpha-synuclein (α-Syn) are transmitted in EVs from one neuron to another and ultimately across brain regions, thereby facilitating the spreading of pathology. We recently demonstrated that fibroblast growth factor-2 (FGF2) enhances the release of EVs and delineated the proteomic signature of FGF2-triggered EVs in cultured primary hippocampal neurons. Out of 235 significantly upregulated proteins, we found that FGF2 specifically enriched EVs for the two Rab family members Rab8b and Rab31. Consequently, we investigated the interactions of Rab8b and Rab31 using a network analysis approach in order to estimate the global influence of their enrichment in EVs. To achieve this, we have demarcated a protein–protein interaction network (PPiN) for these Rabs and identified the proteins associated with PD in various cellular components of the central nervous system (CNS), in different brain regions, and in the enteric nervous system (ENS). A total of 126 direct or indirect interactions were reported for the two Rab candidates, out of which 114 are Rab8b interactions and 54 are Rab31 interactions, ultimately resulting in individual interaction scores (IS) of 90.48 and 42.86%, respectively. In conclusion, these results demonstrate for the first time the relevance of FGF2-induced Rab enrichment in EVs and its potential to regulate PD pathophysiology.
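The interaction scores quoted in this abstract appear to be each Rab's share of the 126 reported interactions. A minimal sketch of that arithmetic, assuming IS = (interactions of one Rab) / (total interactions) × 100, a definition inferred from the numbers rather than stated in the abstract (the function name `interaction_score` is hypothetical):

```python
def interaction_score(n_rab: int, n_total: int) -> float:
    """Percentage of all reported interactions that involve one Rab protein.

    Assumed definition, inferred from the abstract's figures.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return round(100.0 * n_rab / n_total, 2)


TOTAL = 126  # direct or indirect interactions reported for both Rabs
rab8b_is = interaction_score(114, TOTAL)
rab31_is = interaction_score(54, TOTAL)

# If 126 counts the union of both interaction sets (an assumption),
# inclusion-exclusion gives the number of interactions shared by both Rabs.
shared = 114 + 54 - TOTAL

print(rab8b_is, rab31_is, shared)
```

Under this reading the two scores need not sum to 100%, because an interaction partner can be counted for both Rab8b and Rab31.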
Affiliation(s)
- Rohit Kumar
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany; Faculty of Medicine, Klinikum Rechts der Isar, Technical University of Munich, Munich, Germany; Department of Neurology, Ludwig Maximilian University, Munich, Germany
- Sainitin Donakonda
  - Institute of Immunology and Experimental Oncology, Technical University of Munich, Munich, Germany
- Stephan A Müller
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
- Kai Bötzel
  - Department of Neurology, Ludwig Maximilian University, Munich, Germany
- Günter U Höglinger
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany; Faculty of Medicine, Klinikum Rechts der Isar, Technical University of Munich, Munich, Germany; Department of Neurology, Hannover Medical School, Hanover, Germany
- Thomas Koeglsperger
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany; Department of Neurology, Ludwig Maximilian University, Munich, Germany
|
50
|
Savojardo C, Martelli PL, Casadio R. Protein–Protein Interaction Methods and Protein Phase Separation. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-011720-104428] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
In the last decade, newly developed experimental methods have made it possible to highlight that macromolecules in the cell milieu physically interact to support physiology. This has shifted the problem of protein–protein interaction from a microscopic, electron-density scale to a mesoscopic one. Further, there is now increasing evidence that proteins in the nucleus and in the cytoplasm can aggregate in membraneless organelles for different physiological reasons. In this scenario, it is urgent to address the problem of biomolecule functional annotation with efficient computational methods, suited to extract knowledge from reliable data and transfer information across different domains of investigation. Here, we review the present state of the art of our knowledge of protein–protein interaction and the computational methods that differently implement it. Furthermore, we explore experimental and computational features of a set of proteins involved in phase separation.
Affiliation(s)
- Castrense Savojardo
  - Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Rita Casadio
  - Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
  - Institute of Biomembranes, Bioenergetics, and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), 70126 Bari, Italy
|