1
Viswanathan R, Carroll M, Roffe A, Fajardo JE, Fiser A. Computational prediction of multiple antigen epitopes. Bioinformatics 2024; 40:btae556. [PMID: 39271143 PMCID: PMC11453099 DOI: 10.1093/bioinformatics/btae556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Revised: 08/08/2024] [Accepted: 09/11/2024] [Indexed: 09/15/2024]
Abstract
MOTIVATION: Identifying antigen epitopes is essential in medical applications, such as immunodiagnostic reagent discovery, vaccine design, and drug development. Computational approaches can complement low-throughput, time-consuming, and costly experimental determination of epitopes. Currently available prediction methods, however, have moderate success predicting epitopes, which limits their applicability. Epitope prediction is further complicated by the fact that multiple epitopes may be located on the same antigen and complete experimental data is often unavailable.
RESULTS: Here, we introduce the antigen epitope prediction program ISPIPab that combines information from two feature-based methods and a docking-based method. We demonstrate that ISPIPab outperforms each of its individual classifiers as well as other state-of-the-art methods, including those designed specifically for epitope prediction. By combining the prediction algorithm with hierarchical clustering, we show that we can effectively capture epitopes that align with available experimental data while also revealing additional novel targets for future experimental investigations.
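The general idea of combining several per-residue predictors and then grouping high-scoring residues into separate candidate epitopes can be sketched as follows. This is only an illustration under stated assumptions, not the ISPIPab implementation: the plain average used as the combiner, the score cutoff of 0.5, the 6.0 Å linkage threshold, and all toy scores and coordinates are hypothetical.

```python
import math

def combine_scores(score_lists):
    """Average per-residue scores from several predictors (a stand-in combiner)."""
    n_methods = len(score_lists)
    return [sum(column) / n_methods for column in zip(*score_lists)]

def single_linkage_clusters(points, threshold):
    """Agglomerative single-linkage clustering, stopped once the closest
    pair of clusters is farther apart than `threshold`."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical residue coordinates (angstroms) and per-method scores.
coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (30, 0, 0), (34, 0, 0), (60, 0, 0)]
feature_a = [0.9, 0.8, 0.7, 0.6, 0.7, 0.1]
feature_b = [0.8, 0.6, 0.5, 0.7, 0.9, 0.2]
docking = [0.7, 0.7, 0.6, 0.8, 0.5, 0.0]

combined = combine_scores([feature_a, feature_b, docking])
selected = [i for i, score in enumerate(combined) if score >= 0.5]
groups = single_linkage_clusters([coords[i] for i in selected], threshold=6.0)
# Map cluster members back to the original residue indices.
epitope_clusters = [[selected[i] for i in group] for group in groups]
print(epitope_clusters)
```

In this toy case the five selected residues fall into two spatially separate groups, which is the kind of outcome that would be read as two candidate epitopes on one antigen.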
Affiliation(s)
- Rajalakshmi Viswanathan
  - Department of Chemistry and Biochemistry, Yeshiva College, New York, NY 10033, United States
- Moshe Carroll
  - Department of Chemistry and Biochemistry, Yeshiva College, New York, NY 10033, United States
- Alexandra Roffe
  - Department of Chemistry and Biochemistry, Stern College for Women, New York, NY 10016, United States
- Jorge E Fajardo
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, United States
- Andras Fiser
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, United States
2
Wang C, Wang J, Song W, Luo G, Jiang T. EpiScan: accurate high-throughput mapping of antibody-specific epitopes using sequence information. NPJ Syst Biol Appl 2024; 10:101. [PMID: 39251627 PMCID: PMC11383971 DOI: 10.1038/s41540-024-00432-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2023] [Accepted: 08/27/2024] [Indexed: 09/11/2024]
Abstract
The identification of antibody-specific epitopes on virus proteins is crucial for vaccine development and drug design. Nonetheless, traditional wet-lab approaches for the identification of epitopes are both costly and labor-intensive, underscoring the need for efficient and cost-effective computational tools. Here, EpiScan, an attention-based deep learning framework for predicting antibody-specific epitopes, is presented. EpiScan adopts a multi-input and single-output strategy by designing independent blocks for different parts of antibodies, including the variable heavy chain (VH), variable light chain (VL), complementarity-determining regions (CDRs), and framework regions (FRs). The block predictions are weighted and integrated for the prediction of potential epitopes. Using multiple experimental data samples, we show that EpiScan, which only uses antibody sequence information, can accurately map epitopes on specific antigen structures. The antibody-specific epitopes on the receptor binding domain (RBD) of SARS coronavirus 2 (SARS-CoV-2) were located by EpiScan, and a potentially valuable vaccine epitope was identified. EpiScan can expedite the epitope mapping process for high-throughput antibody sequencing data, supporting vaccine design and drug development. Availability: For the convenience of wet-lab researchers, the source code and web server of EpiScan are publicly available at https://github.com/gzBiomedical/EpiScan.
Affiliation(s)
- Chuan Wang
  - School of Life Sciences, Sun Yat-sen University, Guangzhou, China
  - Guangzhou National Laboratory, Guangzhou, China
- Wenjun Song
  - Guangzhou National Laboratory, Guangzhou, China
  - Institute of Integration of Traditional and Western Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Guanzheng Luo
  - School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Taijiao Jiang
  - Guangzhou National Laboratory, Guangzhou, China
  - State Key Laboratory of Respiratory Disease, The Key laboratory of Advanced Interdisciplinary Studies Center, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
3
Zheng Y, Li Q, Freiberger MI, Song H, Hu G, Zhang M, Gu R, Li J. Predicting the Dynamic Interaction of Intrinsically Disordered Proteins. J Chem Inf Model 2024; 64:6768-6777. [PMID: 39163306 DOI: 10.1021/acs.jcim.4c00930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/22/2024]
Abstract
Intrinsically disordered proteins (IDPs) participate in various biological processes. Interactions involving IDPs are usually dynamic and are affected by their inherent conformational fluctuations. Comprehensive characterization of these interactions based on current techniques is challenging. Here, we present GSALIDP, a GraphSAGE-embedded LSTM network, to capture the dynamic nature of IDP-involved interactions and predict their behaviors. This framework models multiple conformations of an IDP as a dynamic graph, which can effectively describe the fluctuation of its flexible conformation. The dynamic interaction between IDPs is studied, and the data sets of IDP conformations and their interactions are obtained through atomistic molecular dynamics (MD) simulations. IDP residues are encoded through a series of features, including their frustration. GSALIDP can effectively predict the interaction sites of an IDP and the contact residue pairs between IDPs. Its performance in predicting IDP interactions is on par with or even better than that of conventional models in predicting the interaction of structural proteins. To the best of our knowledge, this is the first model to extend protein interaction prediction to IDP-involved interactions.
Affiliation(s)
- Yuchuan Zheng
  - School of Physics, Zhejiang University, Hangzhou 310058, PR China
- Qixiu Li
  - School of Physics, Zhejiang University, Hangzhou 310058, PR China
- Maria I Freiberger
  - Protein Physiology Lab, Departamento de Quimica Biologica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires-CONICET-IQUIBICEN, Buenos Aires C1428EGA, Argentina
- Haoyu Song
  - School of Physics, Zhejiang University, Hangzhou 310058, PR China
- Guorong Hu
  - School of Physics, Zhejiang University, Hangzhou 310058, PR China
- Moxin Zhang
  - School of Physics, Zhejiang University, Hangzhou 310058, PR China
- Ruoxu Gu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, PR China
- Jingyuan Li
  - School of Physics, Zhejiang University, Hangzhou 310058, PR China
4
Carroll M, Rosenbaum E, Viswanathan R. Computational Methods to Predict Conformational B-Cell Epitopes. Biomolecules 2024; 14:983. [PMID: 39199371 PMCID: PMC11352882 DOI: 10.3390/biom14080983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Revised: 08/04/2024] [Accepted: 08/08/2024] [Indexed: 09/01/2024]
Abstract
Accurate computational prediction of B-cell epitopes can greatly enhance biomedical research and rapidly advance efforts to develop therapeutics, monoclonal antibodies, vaccines, and immunodiagnostic reagents. Previous research efforts have primarily focused on the development of computational methods to predict linear epitopes rather than conformational epitopes; however, the latter are much more predominant biologically. Several conformational B-cell epitope prediction methods have recently been published, but their predictive performance is weak. Here, we present a review of the latest computational methods and assess their performance on a diverse test set of 29 non-redundant unbound antigen structures. Our results demonstrate that ISPIPab performs better than most methods and compares favorably with other recent antigen-specific methods. Finally, we suggest new strategies and opportunities to improve computational predictions of conformational B-cell epitopes.
Affiliation(s)
- R. Viswanathan
  - Department of Chemistry and Biochemistry, Yeshiva College, Yeshiva University, New York, NY 10033, USA; (M.C.); (E.R.)
5
Viswanathan R, Carroll M, Roffe A, Fajardo JE, Fiser A. Computational Prediction of Multiple Antigen Epitopes. bioRxiv 2024:2024.08.08.607232. [PMID: 39211281 PMCID: PMC11360938 DOI: 10.1101/2024.08.08.607232] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/04/2024]
Abstract
Motivation: Identifying antigen epitopes is essential in medical applications, such as immunodiagnostic reagent discovery, vaccine design, and drug development. Computational approaches can complement low-throughput, time-consuming, and costly experimental determination of epitopes. Currently available prediction methods, however, have moderate success predicting epitopes, which limits their applicability. Epitope prediction is further complicated by the fact that multiple epitopes may be located on the same antigen and complete experimental data is often unavailable.
Results: Here, we introduce the antigen epitope prediction program ISPIPab that combines information from two feature-based methods and a docking-based method. We demonstrate that ISPIPab outperforms each of its individual classifiers as well as other state-of-the-art methods, including those designed specifically for epitope prediction. By combining the prediction algorithm with hierarchical clustering, we show that we can effectively capture epitopes that align with available experimental data while also revealing additional novel targets for future experimental investigations.
Contact: raji@yu.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
6
Feng Z, Huang W, Li H, Zhu H, Kang Y, Li Z. DGCPPISP: a PPI site prediction model based on dynamic graph convolutional network and two-stage transfer learning. BMC Bioinformatics 2024; 25:252. [PMID: 39085781 PMCID: PMC11293074 DOI: 10.1186/s12859-024-05864-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Accepted: 07/10/2024] [Indexed: 08/02/2024]
Abstract
BACKGROUND: Proteins play a pivotal role in a diverse array of biological processes, making the precise prediction of protein-protein interaction (PPI) sites critical to numerous disciplines including biology, medicine and pharmacy. While deep learning methods have progressively been implemented for the prediction of PPI sites within proteins, enhancing their predictive performance remains an arduous challenge.
RESULTS: In this paper, we propose a novel PPI site prediction model (DGCPPISP) based on a dynamic graph convolutional neural network and a two-stage transfer learning strategy. Initially, we implement transfer learning from dual perspectives, namely feature input and model training, which serve to supply efficacious prior knowledge for our model. Subsequently, we construct a network designed for the second stage of training, which is built on the foundation of dynamic graph convolution.
CONCLUSIONS: To evaluate its effectiveness, the performance of the DGCPPISP model is scrutinized using two benchmark datasets. The ensuing results demonstrate that DGCPPISP outperforms competing methods. Specifically, DGCPPISP surpasses the second-best method, EGRET, by margins of 5.9%, 10.1%, and 13.3% for F1-measure, AUPRC, and MCC metrics respectively on Dset_186_72_PDB164. Similarly, on Dset_331, it surpasses the runner-up method, HN-PPISP, by 14.5%, 19.8%, and 29.9% respectively.
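Two of the metrics named above, F1-measure and MCC, have standard definitions in terms of confusion-matrix counts, and a minimal sketch of them is given here for orientation. It is not the authors' evaluation code; the function names and the toy labels are hypothetical, and AUPRC is left out because it needs ranked scores rather than binary predictions.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = interface residue)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def f1_measure(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical per-residue labels for one protein chain.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
toy_f1 = f1_measure(tp, fp, fn)
toy_mcc = mcc(tp, tn, fp, fn)
print(round(toy_f1, 3), round(toy_mcc, 3))
```

MCC uses all four counts, which is one reason it is often reported alongside F1 for PPI site prediction, where interface residues are typically a minority class.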
Affiliation(s)
- Zijian Feng
  - Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, 313000, Zhejiang, China
  - College of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, Zhejiang, China
- Weihong Huang
  - Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, 313000, Zhejiang, China
  - College of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, Zhejiang, China
- Haohao Li
  - College of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, Zhejiang, China
- Hancan Zhu
  - School of Mathematics, Physics and Information, Shaoxing University, Shaoxing, 312000, Zhejiang, China
- Yanlei Kang
  - Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, 313000, Zhejiang, China
- Zhong Li
  - Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, 313000, Zhejiang, China
  - College of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, Zhejiang, China
7
Hu CW, Wang A, Fan D, Worth M, Chen Z, Huang J, Xie J, Macdonald J, Li L, Jiang J. OGA mutant aberrantly hydrolyzes O-GlcNAc modification from PDLIM7 to modulate p53 and cytoskeleton in promoting cancer cell malignancy. Proc Natl Acad Sci U S A 2024; 121:e2320867121. [PMID: 38838015 PMCID: PMC11181094 DOI: 10.1073/pnas.2320867121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 05/10/2024] [Indexed: 06/07/2024]
Abstract
O-GlcNAcase (OGA) is the only human enzyme that catalyzes the hydrolysis (deglycosylation) of O-linked beta-N-acetylglucosaminylation (O-GlcNAcylation) from numerous protein substrates. OGA has broad implications in many challenging diseases including cancer. However, its role in cell malignancy remains mostly unclear. Here, we report that a cancer-derived point mutation in OGA's noncatalytic stalk domain aberrantly modulates the OGA interactome and substrate deglycosylation toward a specific set of proteins. Interestingly, our quantitative proteomic studies uncovered that the OGA stalk domain mutant preferentially deglycosylated protein substrates with +2 proline in the sequence relative to the O-GlcNAcylation site. One of the most dysregulated substrates is PDZ and LIM domain protein 7 (PDLIM7), which is associated with the tumor suppressor p53. We found that the aberrantly deglycosylated PDLIM7 suppressed p53 gene expression and accelerated p53 protein degradation by promoting complex formation with the E3 ubiquitin ligase MDM2. Moreover, deglycosylated PDLIM7 significantly up-regulated the actin-rich membrane protrusions on the cell surface, augmenting cancer cell motility and aggressiveness. These findings revealed an important but previously unappreciated role of OGA's stalk domain in protein substrate recognition and functional modulation during malignant cell progression.
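The "+2 proline" description above is a positional statement: the residue two positions toward the C-terminus of the modified site is a proline. A minimal sketch of that check is shown here, assuming 0-based indexing; the function name and the toy peptide are hypothetical and are not taken from the study.

```python
def has_plus2_proline(sequence, site):
    """Return True if the residue two positions C-terminal to `site` (0-based) is proline."""
    target = site + 2
    return 0 <= site and target < len(sequence) and sequence[target] == "P"

# Hypothetical peptide: S at index 2 is followed by P at index 4 (+2),
# while T at index 5 has K at its +2 position.
toy_peptide = "AGSVPTLK"
serine_hit = has_plus2_proline(toy_peptide, 2)
threonine_hit = has_plus2_proline(toy_peptide, 5)
print(serine_hit, threonine_hit)
```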
Affiliation(s)
- Chia-Wei Hu
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- Ao Wang
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- Dacheng Fan
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- Matthew Worth
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- Zhengwei Chen
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
- Junfeng Huang
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- Jinshan Xie
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- John Macdonald
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
- Lingjun Li
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
- Jiaoyang Jiang
  - Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705
8
Gong Y, Li R, Liu Y, Wang J, Cao B, Fu X, Li R, Chen DZ. MR2CPPIS: Accurate prediction of protein-protein interaction sites based on multi-scale Res2Net with coordinate attention mechanism. Comput Biol Med 2024; 176:108543. [PMID: 38744015 DOI: 10.1016/j.compbiomed.2024.108543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 04/09/2024] [Accepted: 04/28/2024] [Indexed: 05/16/2024]
Abstract
Proteins play a vital role in various biological processes and achieve their functions through protein-protein interactions (PPIs). Thus, accurate identification of PPI sites is essential. Traditional biological methods for identifying PPIs are costly, labor-intensive, and time-consuming. The development of computational prediction methods for PPI sites offers promising alternatives. Most known deep learning (DL) methods employ layer-wise multi-scale CNNs to extract features from protein sequences. However, these methods usually neglect the spatial positions and hierarchical information embedded within protein sequences, which are crucial for PPI site prediction. In this paper, we propose MR2CPPIS, a novel sequence-based DL model that utilizes the multi-scale Res2Net with coordinate attention mechanism to exploit multi-scale features and enhance PPI site prediction capability. We leverage the multi-scale Res2Net to expand the receptive field for each network layer, thus capturing multi-scale information of protein sequences at a granular level. To further explore the local contextual features of each target residue, we employ a coordinate attention block to characterize the precise spatial position information, enabling the network to effectively extract long-range dependencies. We evaluate MR2CPPIS on three public benchmark datasets (Dset 72, Dset 186, and PDBset 164), achieving state-of-the-art performance. The source codes are available at https://github.com/YyinGong/MR2CPPIS.
Affiliation(s)
- Yinyin Gong
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Rui Li
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Yan Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Jilong Wang
  - Peng Cheng Laboratory, Shenzhen, 518066, China
- Buwen Cao
  - College of Information and Electronic Engineering, Hunan City University, Yiyang, 413002, China
- Xiangzheng Fu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Renfa Li
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Danny Z Chen
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
9
Graef J, Ehrt C, Reim T, Rarey M. Database-Driven Identification of Structurally Similar Protein-Protein Interfaces. J Chem Inf Model 2024; 64:3332-3349. [PMID: 38470439 DOI: 10.1021/acs.jcim.3c01462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Analyzing the similarity of protein interfaces in protein-protein interactions gives new insights into protein function and assists in discovering new drugs. Usually, tools that assess the similarity focus on the interactions between two protein interfaces, while sometimes we only have one predicted interface. Herein, we present PiMine, a database-driven protein interface similarity search. It compares interface residues of one or two interacting chains by calculating and searching tetrahedral geometric patterns of α-carbon atoms and calculating physicochemical and shape-based similarity. On a dedicated, tailor-made dataset, we show that PiMine outperforms commonly used comparison tools in terms of early enrichment when considering interfaces of sequentially and structurally unrelated proteins. In an application example, we demonstrate its usability for protein interaction partner prediction by comparing predicted interfaces to known protein-protein interfaces.
Affiliation(s)
- Joel Graef
  - Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Christiane Ehrt
  - Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Thorben Reim
  - Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Matthias Rarey
  - Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
10
Yuan Q, Tian C, Yang Y. Genome-scale annotation of protein binding sites via language model and geometric deep learning. eLife 2024; 13:RP93695. [PMID: 38630609 PMCID: PMC11023698 DOI: 10.7554/elife.93695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
Revealing protein binding sites with other molecules, such as nucleic acids, peptides, or small ligands, sheds light on disease mechanism elucidation and novel drug design. With the explosive growth of proteins in sequence databases, accurately and efficiently identifying these binding sites from sequences has become essential. However, current methods mostly rely on expensive multiple sequence alignments or experimental protein structures, limiting their genome-scale applications. In addition, these methods have not fully explored the geometry of the protein structures. Here, we propose GPSite, a multi-task network for simultaneously predicting binding residues of DNA, RNA, peptide, protein, ATP, HEM, and metal ions on proteins. GPSite was trained on informative sequence embeddings and predicted structures from protein language models, while comprehensively extracting residual and relational geometric contexts in an end-to-end manner. Experiments demonstrate that GPSite substantially surpasses state-of-the-art sequence-based and structure-based approaches on various benchmark datasets, even when the structures are not well-predicted. The low computational cost of GPSite enables rapid genome-scale binding residue annotations for over 568,000 sequences, providing opportunities to unveil unexplored associations of binding sites with molecular functions, biological processes, and genetic variants. The GPSite webserver and annotation database can be freely accessed at https://bio-web1.nscc-gz.cn/app/GPSite.
Affiliation(s)
- Qianmu Yuan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Chong Tian
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
11
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024]
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- Fuhao Zhang
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
  - College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
- Chaojin Wu
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
12
Kalidasan V, Suresh D, Zulkifle N, Hwei YS, Kok Hoong L, Rajasuriar R, Theva Das K. Investigating D-Amino Acid Oxidase Expression and Interaction Network Analyses in Pathways Associated With Cellular Stress: Implications in the Biology of Aging. Bioinform Biol Insights 2024; 18:11779322241234772. [PMID: 38425413 PMCID: PMC10903195 DOI: 10.1177/11779322241234772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 02/07/2024] [Indexed: 03/02/2024]
Abstract
D-amino acid oxidase (DAO) is a flavoenzyme that metabolizes D-amino acids by oxidative deamination, producing hydrogen peroxide (H2O2) as a by-product. The generation of intracellular H2O2 may alter the redox-homeostasis mechanism of cells and increase the oxidative stress levels in tissues, which is associated with the pathogenesis of age-related diseases and organ decline. This study uses an in silico approach to investigate the effect of DAO knockdown by clustered regularly interspaced short palindromic repeats (CRISPR) on its protein-protein interactions (PPIs) and their potential roles in the process of aging. The target sequence and guide RNA of DAO were designed using the CCTop database. Subsequent steps comprised PPI analysis using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses, Reactome biological pathway analysis, protein docking using the GalaxyTongDock database, and structure analysis. The translated target sequence of DAO spans amino acids 43 to 50. The 10 proteins that were predicted to interact with DAO are involved in peroxisome pathways: acyl-coenzyme A oxidase 1 (ACOX1), alanine-glyoxylate and serine-pyruvate aminotransferase (AGXT), catalase (CAT), carnitine O-acetyltransferase (CRAT), glyceronephosphate O-acyltransferase (GNPAT), hydroxyacid oxidase 1 (HAO1), hydroxyacid oxidase 2 (HAO2), trans-L-3-hydroxyproline dehydratase (L3HYPDH), polyamine oxidase (PAOX), and pipecolic acid and sarcosine oxidase (PIPOX). In summary, DAO mutation would most likely reduce activity with its interacting proteins that generate H2O2. However, DAO mutation may result in peroxisomal disorders, and thus, alternative techniques should be considered for an in vivo approach.
Affiliation(s)
- V Kalidasan
  - Department of Biomedical Sciences, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Kepala Batas, Malaysia
- Darshinie Suresh
  - Department of Biomedical Sciences, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Kepala Batas, Malaysia
  - Department of Biological Sciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Nurulisa Zulkifle
  - Department of Biomedical Sciences, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Kepala Batas, Malaysia
- Yap Siew Hwei
  - Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Leong Kok Hoong
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Universiti Malaya, Kuala Lumpur, Malaysia
- Reena Rajasuriar
  - Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
  - Centre of Excellence for Research in AIDS (CERiA), Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Kumitaa Theva Das
  - Department of Biomedical Sciences, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Kepala Batas, Malaysia
13
Fu X, Yuan Y, Qiu H, Suo H, Song Y, Li A, Zhang Y, Xiao C, Li Y, Dou L, Zhang Z, Cui F. AGF-PPIS: A protein-protein interaction site predictor based on an attention mechanism and graph convolutional networks. Methods 2024; 222:142-151. [PMID: 38242383 DOI: 10.1016/j.ymeth.2024.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/21/2023] [Revised: 01/04/2024] [Accepted: 01/13/2024] [Indexed: 01/21/2024]
Abstract
Protein-protein interactions play an important role in various biological processes. Interaction among proteins has a wide range of applications. Therefore, the correct identification of protein-protein interaction sites is crucial. In this paper, we propose a novel predictor for protein-protein interaction sites, AGF-PPIS, where we utilize a multi-head self-attention mechanism (introducing a graph structure), a graph convolutional network, and a feed-forward neural network. We use the Euclidean distances between protein residues to generate the corresponding protein graph as the input of AGF-PPIS. On the independent test dataset Test_60, AGF-PPIS achieves superior performance over comparative methods in terms of seven different evaluation metrics (ACC, precision, recall, F1-score, MCC, AUROC, AUPRC), which demonstrates the validity and superiority of the proposed AGF-PPIS model. The source code and the steps for usage of AGF-PPIS are available at https://github.com/fxh1001/AGF-PPIS.
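The graph construction mentioned in this abstract (a protein graph derived from Euclidean distances between residues) can be illustrated with a minimal sketch. This is not the AGF-PPIS implementation: the function name `distance_graph`, the use of one representative coordinate per residue, and the 14 Å cutoff are all assumptions made for illustration.

```python
import math


def distance_graph(coords, cutoff=14.0):
    """Build a residue distance matrix and a cutoff-based adjacency matrix.

    coords: one (x, y, z) tuple per residue (assumed here to be a single
            representative atom per residue, e.g. C-alpha).
    cutoff: hypothetical distance threshold in angstroms; two residues are
            linked when their distance is at most this value.
    """
    n = len(coords)
    # Pairwise Euclidean distances between residue coordinates.
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    # Residues are connected if they are within the cutoff; no self-loops.
    adj = [
        [1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]
    return dist, adj


# Three toy residues: the first two are 5 angstroms apart, the third is far away.
dist, adj = distance_graph([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (30.0, 0.0, 0.0)])
```

A graph neural network would then consume `adj` (or the raw `dist` values as edge features) together with per-residue node features.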
Affiliation(s)
- Xiuhao Fu
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Ye Yuan
- Beidahuang Industry Group General Hospital, Harbin 150001, China
| | - Haoye Qiu
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Haodong Suo
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Yingying Song
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Anqi Li
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Yupeng Zhang
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Cuilin Xiao
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Yazi Li
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Lijun Dou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH 44106, USA
| | - Zilong Zhang
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Feifei Cui
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
|
14
|
Marsili L, Davis JL, Espay AJ, Gilthorpe J, Williams C, Kauffman MA, Porollo A. SOD1-Related Cerebellar Ataxia and Motor Neuron Disease: Cp Variant as Functional Modifier? CEREBELLUM (LONDON, ENGLAND) 2024; 23:205-209. [PMID: 36757662 DOI: 10.1007/s12311-023-01527-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 02/01/2023] [Indexed: 02/10/2023]
Abstract
We describe a novel superoxide dismutase (SOD1) mutation-associated clinical phenotype of cerebellar ataxia and motor neuron disease with a variant in the ceruloplasmin (Cp) gene, which may have contributed to a multi-factorial phenotype, as supported by genetic and protein structure analyses.
Affiliation(s)
- Luca Marsili
- James J. and Joan A. Gardner Center for Parkinson's Disease and Movement Disorders, Department of Neurology, University of Cincinnati, Cincinnati, OH, USA
| | - Jennie L Davis
- Valley Neuroscience Institute, University of Washington-Valley Medical Center, Renton, WA, USA
| | - Alberto J Espay
- James J. and Joan A. Gardner Center for Parkinson's Disease and Movement Disorders, Department of Neurology, University of Cincinnati, Cincinnati, OH, USA
| | - Jonathan Gilthorpe
- Department of Integrative Medical Biology, Umeå University, Umeå, Sweden
| | - Chloe Williams
- Department of Integrative Medical Biology, Umeå University, Umeå, Sweden
| | - Marcelo A Kauffman
- Consultorio Y Laboratorio de Neurogenética, Centro Universitario de Neurología José María Ramos Mejía, Buenos Aires, Argentina
| | - Aleksey Porollo
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Department of Pediatrics, University of Cincinnati, Cincinnati, OH, USA
|
15
|
Pitman C, Santiago-McRae E, Lohia R, Bassi K, Joseph TT, Hansen MEB, Brannigan G. The blobulator: a webtool for identification and visual exploration of hydrophobic modularity in protein sequences. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.15.575761. [PMID: 38293114 PMCID: PMC10827107 DOI: 10.1101/2024.01.15.575761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
MOTIVATION Clusters of hydrophobic residues are known to promote structured protein stability and drive protein aggregation. Recent work has shown that identifying contiguous hydrophobic residue clusters (termed "blobs") is useful in both intrinsically disordered protein (IDP) simulation and human genome studies. However, a graphical interface was unavailable. RESULTS Here, we present the blobulator: an interactive and intuitive web interface to detect intrinsic modularity in any protein sequence based on hydrophobicity. We demonstrate three use cases of the blobulator and show how identifying blobs with biologically relevant parameters provides useful information about a globular protein, two orthologous membrane proteins, and an IDP. Other potential applications are discussed, including predicting protein segments with critical roles in tertiary interactions, providing a definition of local order and disorder with clear edges, and aiding in predicting protein features from sequence. AVAILABILITY The blobulator GUI can be found at www.blobulator.branniganlab.org, and the source code with a pip-installable command line tool can be found on GitHub at www.GitHub.com/BranniganLab/blobulator.
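The idea of detecting contiguous hydrophobic residue clusters can be sketched in a few lines. This is a simplified illustration and not the blobulator's actual algorithm: it assumes the Kyte-Doolittle hydropathy scale, applies no smoothing, and uses a hypothetical threshold and minimum length; the function name `hydrophobic_blobs` is also invented here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_blobs(seq, threshold=0.0, min_len=4):
    """Return (start, end) index pairs (end exclusive) of contiguous runs of
    residues whose hydropathy exceeds `threshold` and whose length is at
    least `min_len`. Both parameters are illustrative assumptions."""
    blobs = []
    start = None
    for i, aa in enumerate(seq):
        if KD[aa] > threshold:
            if start is None:
                start = i  # a new hydrophobic run begins here
        else:
            if start is not None and i - start >= min_len:
                blobs.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        blobs.append((start, len(seq)))
    return blobs


blobs = hydrophobic_blobs("KKKLLLLLKKK")
```

Raising `threshold` or `min_len` yields fewer, more strongly hydrophobic segments, which is the kind of parameter exploration the abstract describes doing interactively.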
Affiliation(s)
- Connor Pitman
- Center for Computational and Integrative Biology, Rutgers University-Camden, 201 Broadway, 08103, NJ, USA
| | - Ezry Santiago-McRae
- Center for Computational and Integrative Biology, Rutgers University-Camden, 201 Broadway, 08103, NJ, USA
| | - Ruchi Lohia
- Department of Physiology, University of Toronto, 1 King's College Circle, M5S 1A8, Toronto, Ontario, Canada
| | - Kaitlin Bassi
- Center for Computational and Integrative Biology, Rutgers University-Camden, 201 Broadway, 08103, NJ, USA
| | - Thomas T Joseph
- Department of Anesthesiology and Critical Care, Perelman School of Medicine, University of Pennsylvania, JMB 305, 3620 Hamilton Walk, 19104, PA, USA
| | - Matthew E B Hansen
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, 19104, PA, USA
| | - Grace Brannigan
- Center for Computational and Integrative Biology, Rutgers University-Camden, 201 Broadway, 08103, NJ, USA
- Department of Physics, Rutgers University-Camden, 201 Broadway, 08103, NJ, USA
|
16
|
Zanon A, Guida M, Lavdas AA, Corti C, Castelo Rueda MP, Negro A, Pramstaller PP, Domingues FS, Hicks AA, Pichler I. Intracellular delivery of Parkin-RING0-based fragments corrects Parkin-induced mitochondrial dysfunction through interaction with SLP-2. J Transl Med 2024; 22:59. [PMID: 38229174 PMCID: PMC10790385 DOI: 10.1186/s12967-024-04850-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 01/02/2024] [Indexed: 01/18/2024] Open
Abstract
BACKGROUND Loss-of-function mutations in the PRKN gene, encoding Parkin, are the most common cause of autosomal recessive Parkinson's disease (PD). We have previously identified mitochondrial Stomatin-like protein 2 (SLP-2), which functions in the assembly of respiratory chain proteins, as a Parkin-binding protein. Selective knockdown of either Parkin or SLP-2 led to reduced mitochondrial and neuronal function in neuronal cells and Drosophila, where a double knockdown led to a further worsening of Parkin-deficiency phenotypes. Here, we investigated the minimal Parkin region involved in the Parkin-SLP-2 interaction and explored the ability of Parkin-fragments and peptides from this minimal region to restore mitochondrial function. METHODS In fibroblasts, human induced pluripotent stem cell (hiPSC)-derived neurons, and neuroblastoma cells the interaction between Parkin and SLP-2 was investigated, and the Parkin domain responsible for the binding to SLP-2 was mapped. High resolution respirometry, immunofluorescence analysis and live imaging were used to analyze mitochondrial function. RESULTS Using a proximity ligation assay, we quantitatively assessed the Parkin-SLP-2 interaction in skin fibroblasts and hiPSC-derived neurons. When PD-associated PRKN mutations were present, we detected a significantly reduced interaction between the two proteins. We found a preferential binding of SLP-2 to the N-terminal part of Parkin, with the highest affinity for the RING0 domain. Computational modeling based on the crystal structure of Parkin protein predicted several potential binding sites for SLP-2 within the Parkin RING0 domain. Amongst these, three binding sites were observed to overlap with natural PD-causing missense mutations, which we demonstrated interfere substantially with the binding of Parkin to SLP-2. Finally, delivery of the isolated Parkin RING0 domain and a Parkin mini-peptide, conjugated to cell-permeant and mitochondrial transporters, rescued compromised mitochondrial function in Parkin-deficient neuroblastoma cells and hiPSC-derived neurons with endogenous, disease-causing PRKN mutations. CONCLUSIONS These findings place further emphasis on the importance of the protein-protein interaction between Parkin and SLP-2 for the maintenance of optimal mitochondrial function. The possibility of restoring an abolished binding to SLP-2 by delivering the Parkin RING0 domain or the Parkin mini-peptide involved in this specific protein-protein interaction into cells might represent a novel organelle-specific therapeutic approach for correcting mitochondrial dysfunction in Parkin-linked PD.
Affiliation(s)
- Alessandra Zanon
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| | - Marianna Guida
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| | - Alexandros A Lavdas
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| | - Corrado Corti
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| | | | - Alessandro Negro
- Department of Biomedical Sciences, University of Padova, Padua, Italy
| | - Peter P Pramstaller
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
- Department of Neurology, University Medical Center Schleswig-Holstein, Campus Lübeck, Lübeck, Germany
| | - Francisco S Domingues
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| | - Andrew A Hicks
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| | - Irene Pichler
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
|
17
|
Ding H, Li X, Han P, Tian X, Jing F, Wang S, Song T, Fu H, Kang N. MEG-PPIS: a fast protein-protein interaction site prediction method based on multi-scale graph information and equivariant graph neural network. Bioinformatics 2024; 40:btae269. [PMID: 38640481 PMCID: PMC11252844 DOI: 10.1093/bioinformatics/btae269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 03/19/2024] [Accepted: 04/17/2024] [Indexed: 04/21/2024] Open
Abstract
MOTIVATION Protein-protein interaction sites (PPIS) are crucial for deciphering protein action mechanisms and for related medical research, making their identification a key issue in protein action research. Recent studies have shown that graph neural networks achieve outstanding performance in predicting PPIS. However, these studies often neglect the modeling of information at different scales in the graph and the symmetry of protein molecules within three-dimensional space. RESULTS In response to this gap, this article proposes the MEG-PPIS approach, a PPIS prediction method based on multi-scale graph information and an E(n) equivariant graph neural network (EGNN). There are two channels in MEG-PPIS: the original graph and the subgraph obtained by graph pooling. The model can iteratively update the features of the original graph and subgraph through the weight-sharing EGNN. Subsequently, the max-pooling operation aggregates the updated features of the original graph and subgraph. Ultimately, the model feeds node features into the prediction layer to obtain prediction results. Comparative assessments against other methods on benchmark datasets reveal that MEG-PPIS achieves the best performance across all evaluation metrics and the fastest runtime. Furthermore, specific case studies demonstrate that our method can predict more true positive and true negative sites than the current best method, indicating that our model achieves better performance in the PPIS prediction task. AVAILABILITY AND IMPLEMENTATION The data and code are available at https://github.com/dhz234/MEG-PPIS.git.
Affiliation(s)
- Hongzhen Ding
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Xue Li
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Peifu Han
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Xu Tian
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Fengrui Jing
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Shuang Wang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Tao Song
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Hanjiao Fu
- School of Humanities and Law, China University of Petroleum (East China), Qingdao, Shandong 266580, China
| | - Na Kang
- The Ninth Department of Health Care Administration, the Second Medical Center, Chinese PLA General Hospital, Beijing, 100853, China
|
18
|
Hosseini S, Golding GB, Ilie L. Seq-InSite: sequence supersedes structure for protein interaction site prediction. Bioinformatics 2024; 40:btad738. [PMID: 38212995 PMCID: PMC10796176 DOI: 10.1093/bioinformatics/btad738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 11/17/2023] [Accepted: 01/10/2024] [Indexed: 01/13/2024] Open
Abstract
MOTIVATION Proteins accomplish cellular functions by interacting with each other, which makes the prediction of interaction sites a fundamental problem. As experimental methods are expensive and time consuming, computational prediction of the interaction sites has been studied extensively. Structure-based programs are the most accurate, while the sequence-based ones are much more widely applicable, as the sequences available outnumber the structures by two orders of magnitude. Ideally, we would like a tool that has the quality of the former and the applicability of the latter. RESULTS We provide here the first solution that achieves these two goals. Our new sequence-based program, Seq-InSite, greatly surpasses the performance of sequence-based models, matching the quality of state-of-the-art structure-based predictors, thus effectively superseding the need for models requiring structure. The predictive power of Seq-InSite is illustrated using an analysis of evolutionary conservation for four protein sequences. AVAILABILITY AND IMPLEMENTATION Seq-InSite is freely available as a web server at http://seq-insite.csd.uwo.ca/ and as free source code, including trained models and all datasets used for training and testing, at https://github.com/lucian-ilie/Seq-InSite.
Affiliation(s)
- SeyedMohsen Hosseini
- Department of Computer Science, University of Western Ontario, London, ON N6A 5B7, Canada
| | - G Brian Golding
- Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
| | - Lucian Ilie
- Department of Computer Science, University of Western Ontario, London, ON N6A 5B7, Canada
|
19
|
Kiani YS, Jabeen I. Challenges of Protein-Protein Docking of the Membrane Proteins. Methods Mol Biol 2024; 2780:203-255. [PMID: 38987471 DOI: 10.1007/978-1-0716-3985-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Despite the recent advances in the determination of high-resolution membrane protein (MP) structures, the structural and functional characterization of MPs remains extremely challenging, mainly due to the hydrophobic nature, low abundance, poor expression, purification, and crystallization difficulties associated with MPs. The major hurdles for MP structure determination are associated with the expression, purification, and crystallization procedures. Although there have been significant advances in the experimental determination of MP structures, only a limited number of MP structures (less than approximately 1% of all) are available in the Protein Data Bank (PDB). Therefore, the structures of a large number of MPs still remain unresolved, leaving much of the structural and functional information related to MPs unexplored. As a result, recent developments in the drug discovery realm and the significant biological contemplation have led to the development of several novel, low-cost, and time-efficient computational methods that overcome the limitations of experimental approaches, supplement experiments, and provide alternatives for the characterization of MPs. The fine-tuning and optimization of these computational approaches remains an ongoing endeavor. Computational methods offer a potential way for the elucidation of structural features and the augmentation of currently available MP information. However, the use of computational modeling can be extremely challenging for MPs, mainly due to insufficient knowledge of (or gaps in) atomic structures of MPs. Despite the availability of numerous in silico methods for 3D structure determination, the applicability of these methods to MPs remains relatively low since not all methods are well suited or adequate for MPs. However, sophisticated methods for MP structure predictions are constantly being developed and updated to integrate the modifications required for MPs. Currently, different computational methods for (1) MP structure prediction, (2) stability analysis of MPs through molecular dynamics simulations, (3) modeling of MP complexes through docking, (4) prediction of interactions between MPs, and (5) MP interactions with their soluble partners are extensively used. Towards this end, MP docking is widely used. Notably, MP docking methods, although few in number, might show greater potential in terms of filling the knowledge gap. In this chapter, MP docking methods and associated challenges have been reviewed to improve the applicability, accuracy, and the ability to model macromolecular complexes.
Affiliation(s)
- Yusra Sajid Kiani
- School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
| | - Ishrat Jabeen
- School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
|
20
|
Zeng X, Meng FF, Li X, Zhong KY, Jiang B, Li Y. GHGPR-PPIS: A graph convolutional network for identifying protein-protein interaction site using heat kernel with Generalized PageRank techniques and edge self-attention feature processing block. Comput Biol Med 2024; 168:107683. [PMID: 37984202 DOI: 10.1016/j.compbiomed.2023.107683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 10/10/2023] [Accepted: 11/06/2023] [Indexed: 11/22/2023]
Abstract
Accurately pinpointing protein-protein interaction sites (PPIS) on the molecular level is of utmost significance for annotating protein function and comprehending the mechanisms underpinning various diseases. Numerous computational methods for predicting PPIS have emerged, and they have mitigated the labor and time constraints associated with traditional experimental methods. However, the predictive accuracy of these methods has yet to reach the desired threshold. In this context, we proposed a graph-based computational model called GHGPR-PPIS. This model leveraged a graph convolutional network using heat kernel (GraphHeat) in conjunction with Generalized PageRank techniques (GHGPR) to predict PPIS. Additionally, building upon the GHGPR framework, we devised an edge self-attention feature processing block, further augmenting the performance of the model. Experimental findings demonstrated that GHGPR-PPIS surpassed all competing state-of-the-art models when evaluated on the benchmark test set. On two distinct independent test sets and a specific protein chain, GHGPR-PPIS consistently demonstrated superior generalization performance and practical applicability compared to the comparative model, AGAT-PPIS. Lastly, leveraging the t-SNE dimensionality reduction algorithm and a clustering visualization technique, we carried out an interpretability analysis of the effectiveness of GHGPR-PPIS by comparing the outputs from different stages of the model.
Affiliation(s)
- Xin Zeng
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
| | - Fan-Fang Meng
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
| | - Xin Li
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
| | - Kai-Yang Zhong
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
| | - Bei Jiang
- Yunnan Key Laboratory of Screening and Research on Anti-pathogenic Plant Resources from Western Yunnan, Dali University, Dali, 671000, China
| | - Yi Li
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
|
21
|
Vottero P, Olivetti EC, D'Agostino LC, Di Grazia L, Vezzetti E, Aminpour M, Tuszynski JA, Marcolin F. Understanding the contagiousness of Covid-19 strains: A geometric approach. J Mol Graph Model 2024; 126:108670. [PMID: 37984193 DOI: 10.1016/j.jmgm.2023.108670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/22/2023]
Abstract
Protein-protein interaction occurs on surface patches with some degree of complementary geometric and chemical features. Building on this understanding, this study endeavors to characterize the spike protein of the SARS-CoV-2 virus at the morphological and geometrical levels in its Alpha, Delta, and Omicron variants. In particular, the affinity between different SARS-CoV-2 spike proteins and the ACE2 receptor present on the membrane of the human respiratory system cells is investigated. To achieve an adequate degree of geometrical accuracy, the 3D depth maps of the proteins under examination are filtered by developing an ad-hoc convolutional filter with a kernel implemented as a sphere of varying radius, simulating a ball rolling on the surface (similar to the 'rolling ball' filter). This ball ideally models a hypothetical molecule that could interface with the protein and is inspired by the geometric approach to macromolecule-ligand interactions proposed by Kuntz et al. in 1982. The aim is to mitigate the imperfections and to obtain a smoother surface that could be studied from a geometrical perspective for binding purposes. A set of geometric descriptors, borrowed from the 3D face analysis context, is then mapped point-by-point onto protein depth maps. Following a feature extraction phase inspired by Histogram of Oriented Gradients and Local Binary Patterns, the final histogram features are used as input for a Support Vector Machine classifier to automatically classify the proteins according to their surface affinity, where a similarity in shape is observed between ACE2 and the spike protein of the SARS-CoV-2 Omicron variant. Finally, Root Mean Square Error analysis is used to quantify the geometrical affinity between the ACE2 receptor and the respective Receptor Binding Domains of the three SARS-CoV-2 variants, culminating in a geometrical explanation for the higher contagiousness of Omicron relative to the other variants under study.
Affiliation(s)
- Paola Vottero
- Department of Biomedical Engineering, University of Alberta, Edmonton, AB, T6G 2V2, Canada
| | - Elena Carlotta Olivetti
- Department of Management and Production Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy
| | - Lucia Chiara D'Agostino
- Department of Management and Production Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy
| | - Luca Di Grazia
- Department of Computer Science, University of Stuttgart, Universitätsstr. 38, 70569, Stuttgart, Germany
| | - Enrico Vezzetti
- Department of Management and Production Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy
| | - Maral Aminpour
- Department of Biomedical Engineering, University of Alberta, Edmonton, AB, T6G 2V2, Canada
| | - Jacek Adam Tuszynski
- Department of Physics, University of Alberta, Edmonton, AB, T6G 2H7, Canada; Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy; Department of Data Science and Engineering, The Silesian University of Technology, Gliwice, Poland
| | - Federica Marcolin
- Department of Management and Production Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy
|
22
|
Cong H, Liu H, Cao Y, Liang C, Chen Y. Protein-protein interaction site prediction by model ensembling with hybrid feature and self-attention. BMC Bioinformatics 2023; 24:456. [PMID: 38053020 DOI: 10.1186/s12859-023-05592-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Accepted: 11/30/2023] [Indexed: 12/07/2023] Open
Abstract
BACKGROUND Protein-protein interactions (PPIs) are crucial in various biological functions and cellular processes. Thus, many computational approaches have been proposed to predict PPI sites. Although significant progress has been made, these methods still have limitations in encoding the characteristics of each amino acid in sequences. Many feature extraction methods rely on the sliding window technique, which simply merges all the features of residues into a vector. The importance of some key residues may be weakened in the feature vector, leading to poor performance. RESULTS We propose a novel sequence-based method for PPI site prediction. The new network model, PPINet, contains multiple feature processing paths. For a residue, PPINet extracts the features of the targeted residue and its context separately. These two types of features are processed by two paths in the network and combined to form a protein representation, where the two types of features are of relatively equal importance. The model ensembling technique is applied to make use of more features. The base models are trained with different features and then ensembled via stacking. In addition, a data balancing strategy is presented, by which our model achieves a significant improvement on highly unbalanced data. CONCLUSION The proposed method is evaluated on a fused dataset constructed from Dset_186, Dset_72, and PDBset_164, as well as the public Dset_448 dataset. Our method outperforms current state-of-the-art methods. In the most important metrics, AUPRC and recall, it surpasses the second-best method on the latter dataset by 6.9% and 4.7%, respectively. We also demonstrated that the improvement is essentially due to using the ensemble model and, especially, the hybrid feature. We share our code for reproducibility and future research at https://github.com/CandiceCong/StackingPPINet.
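The contrast drawn in this abstract, between merging a whole sliding window into one vector and handling the target residue separately from its context, can be sketched as follows. This is an illustrative helper only, not code from PPINet: the function name `target_and_context`, the zero-padding at sequence ends, and the window size are assumptions.

```python
def target_and_context(features, index, half_window=2):
    """Split a sliding window into the target residue's features and the
    features of its neighbours.

    features:    list of per-residue feature vectors (lists of floats).
    index:       position of the target residue.
    half_window: number of neighbours taken on each side (hypothetical value).
    Positions that fall outside the sequence are zero-padded (an assumption).
    """
    dim = len(features[0])
    pad = [0.0] * dim
    target = features[index]
    context = []
    for j in range(index - half_window, index + half_window + 1):
        if j == index:
            continue  # the target residue is kept apart from its context
        context.append(features[j] if 0 <= j < len(features) else pad)
    return target, context


# Toy one-dimensional features for a three-residue sequence.
target, context = target_and_context([[1.0], [2.0], [3.0]], index=0, half_window=1)
```

Keeping `target` and `context` as separate inputs lets a model weight them independently, rather than diluting the target residue inside one concatenated window vector.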
Affiliation(s)
- Hanhan Cong
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
| | - Hong Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
| | - Yi Cao
- School of Information Science and Engineering, University of Jinan, Jinan, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Yuehui Chen
- School of Information Science and Engineering, University of Jinan, Jinan, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
|
23
|
Yuan M, Shen A, Fu K, Guan J, Ma Y, Qiao Q, Wang M. ProteinMAE: masked autoencoder for protein surface self-supervised learning. Bioinformatics 2023; 39:btad724. [PMID: 38019955 PMCID: PMC10713117 DOI: 10.1093/bioinformatics/btad724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 10/27/2023] [Accepted: 11/28/2023] [Indexed: 12/01/2023] Open
Abstract
SUMMARY The biological functions of proteins are determined by the chemical and geometric properties of their surfaces. Recently, with the rapid progress of deep learning, a series of learning-based surface descriptors have been proposed and have achieved strong performance in many tasks such as protein design and protein-protein interaction prediction. However, they are still limited by the problem of label scarcity, since the labels are typically obtained through wet-lab experiments. Inspired by the success of self-supervised learning in natural language processing and computer vision, we introduce ProteinMAE, a self-supervised framework specifically designed for protein surface representation to mitigate label scarcity. Specifically, we propose an efficient network and utilize a large amount of accessible unlabeled protein data to pretrain it by self-supervised learning. Then we use the pretrained weights as initialization and fine-tune the network on downstream tasks. To demonstrate the effectiveness of our method, we conduct experiments on three different downstream tasks: binding site identification on the protein surface, ligand-binding protein pocket classification, and protein-protein interaction prediction. The experiments show that our method not only improves the network's performance on all downstream tasks, but also achieves performance competitive with state-of-the-art methods. Moreover, our proposed network also exhibits significant advantages in terms of computational cost, requiring less than a tenth of the memory of previous methods. AVAILABILITY AND IMPLEMENTATION https://github.com/phdymz/ProteinMAE.
Affiliation(s)
- Mingzhi Yuan
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Ao Shen
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Kexue Fu
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Jiaming Guan
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Yingfan Ma
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Qin Qiao
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Manning Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| |
|
24
|
Fang Y, Jiang Y, Wei L, Ma Q, Ren Z, Yuan Q, Wei DQ. DeepProSite: structure-aware protein binding site prediction using ESMFold and pretrained language model. Bioinformatics 2023; 39:btad718. [PMID: 38015872 PMCID: PMC10723037 DOI: 10.1093/bioinformatics/btad718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 11/04/2023] [Accepted: 11/27/2023] [Indexed: 11/30/2023] Open
Abstract
MOTIVATION Identifying the functional sites of a protein, such as the binding sites of proteins, peptides, or other biological components, is crucial for understanding related biological processes and for drug design. However, existing sequence-based methods have limited predictive accuracy, as they only consider sequence-adjacent contextual features and lack structural information. RESULTS In this study, DeepProSite is presented as a new framework for identifying protein binding sites that utilizes both protein structure and sequence information. DeepProSite first generates protein structures from ESMFold and sequence representations from pretrained language models. It then uses a Graph Transformer and formulates binding site prediction as graph node classification. In predicting protein-protein/peptide binding sites, DeepProSite outperforms state-of-the-art sequence- and structure-based methods on most metrics. Moreover, DeepProSite maintains its performance when predicting unbound structures, in contrast to competing structure-based prediction methods. DeepProSite is also extended to the prediction of binding sites for nucleic acids and other ligands, verifying its generalization capability. Finally, an online server for predicting multiple types of residues is established as the implementation of the proposed DeepProSite. AVAILABILITY AND IMPLEMENTATION The datasets and source code can be accessed at https://github.com/WeiLab-Biology/DeepProSite. The proposed DeepProSite can be accessed at https://inner.wei-group.net/DeepProSite/.
Affiliation(s)
- Yitian Fang
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200040, China
- Peng Cheng Laboratory, Shenzhen 518055, China
| | - Yi Jiang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
| | - Leyi Wei
- School of Software, Shandong University, Jinan, Shandong 250100, China
| | - Qin Ma
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
| | | | - Qianmu Yuan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200040, China
- Peng Cheng Laboratory, Shenzhen 518055, China
| |
|
25
|
Li X, Wang GA, Wei Z, Wang H, Zhu X. Protein-DNA interface hotspots prediction based on fusion features of embeddings of protein language model and handcrafted features. Comput Biol Chem 2023; 107:107970. [PMID: 37866116 DOI: 10.1016/j.compbiolchem.2023.107970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/06/2023] [Accepted: 10/07/2023] [Indexed: 10/24/2023]
Abstract
The identification of hotspot residues at protein-DNA binding interfaces plays a crucial role in areas such as drug discovery and disease treatment. Although experimental methods such as alanine scanning mutagenesis have been developed to determine the hotspot residues on protein-DNA interfaces, they are both inefficient and costly. Therefore, it is highly necessary to develop efficient and accurate computational methods for predicting hotspot residues. Several computational methods have been developed; however, they are mainly based on handcrafted features, which may not be able to represent all the information of proteins. In this regard, we propose a model called PDH-EH, which utilizes fused features of embeddings extracted from a protein language model (PLM) and handcrafted features. After extracting a total of 1141 feature dimensions, we used minimum redundancy maximum relevance (mRMR) to select the optimal feature subset. Based on the optimal feature subset, several different learning algorithms such as Random Forest, Support Vector Machine, and XGBoost were used to build the models. The cross-validation results on the training dataset show that the model built by using Random Forest achieves the highest AUROC. Further evaluation on the independent test set shows that our model outperforms the existing state-of-the-art models. Moreover, the effectiveness and interpretability of embeddings extracted from the PLM were demonstrated in our analysis. The code and datasets used in this study are available at: https://github.com/lixiangli01/PDH-EH.
Affiliation(s)
- Xiang Li
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| | - Gang-Ao Wang
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| | - Zhuoyu Wei
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| | - Hong Wang
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| | - Xiaolei Zhu
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| |
|
26
|
Nikam R, Yugandhar K, Gromiha MM. DeepBSRPred: deep learning-based binding site residue prediction for proteins. Amino Acids 2023; 55:1305-1316. [PMID: 36574037 DOI: 10.1007/s00726-022-03228-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 12/15/2022] [Indexed: 12/28/2022]
Abstract
MOTIVATION Protein-protein interactions (PPIs) govern several cellular activities. Amino acid residues located at the interface are known as binding sites, and information about binding sites helps to understand the binding affinities and functions of protein-protein complexes. RESULTS We have developed a deep neural network-based method, DeepBSRPred, for predicting the binding sites using protein sequence information and predicted structures from AlphaFold2. Specific sequence and structure-based features include position-specific scoring matrix (PSSM), solvent accessible surface area, conservation score and amino acid properties, and residue depth, respectively. Our method predicted the binding sites with an average F1 score of 0.73 in a dataset of 1236 proteins. Further, we compared the performance with other existing methods in the literature using four benchmark datasets and our method outperformed those methods. AVAILABILITY AND IMPLEMENTATION The DeepBSRPred web server can be found at https://web.iitm.ac.in/bioinfo2/deepbsrpred/index.html, along with all datasets used in this study. The trained models, the DeepBSRPred standalone source code, and the feature computation pipeline are freely available at https://web.iitm.ac.in/bioinfo2/deepbsrpred/download.html.
Affiliation(s)
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
| | - Kumar Yugandhar
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Department of Computational Biology, Cornell University, New York, NY, USA
| | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan
| |
|
27
|
Mou M, Pan Z, Zhou Z, Zheng L, Zhang H, Shi S, Li F, Sun X, Zhu F. A Transformer-Based Ensemble Framework for the Prediction of Protein-Protein Interaction Sites. Research (Wash D C) 2023; 6:0240. [PMID: 37771850 PMCID: PMC10528219 DOI: 10.34133/research.0240] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 08/02/2023] [Accepted: 09/08/2023] [Indexed: 09/30/2023]
Abstract
The identification of protein-protein interaction (PPI) sites is essential in the research of protein function and the discovery of new drugs. So far, a variety of computational tools based on machine learning have been developed to accelerate the identification of PPI sites. However, existing methods suffer from low predictive accuracy or a limited scope of application. Specifically, some methods learned only global or local sequential features, leading to low predictive accuracy, while others achieved improved performance by extracting residue interactions from structures but were limited in their application scope because of their heavy dependence on precise structure information. There is an urgent need to develop a method that integrates comprehensive information to realize proteome-wide accurate profiling of PPI sites. Herein, EnsemPPIS, a novel ensemble framework for PPI site prediction, was therefore proposed based on transformer and gated convolutional networks. EnsemPPIS can effectively capture not only global and local patterns but also residue interactions. Specifically, EnsemPPIS was unique in (a) extracting residue interactions from protein sequences with a transformer and (b) further integrating global and local sequential features with the ensemble learning strategy. Compared with various existing methods, EnsemPPIS exhibited either superior performance or broader applicability on multiple PPI site prediction tasks. Moreover, pattern analysis based on the interpretability of EnsemPPIS demonstrated that EnsemPPIS was fully capable of learning residue interactions within the local structure of PPI sites using only sequence information. The web server of EnsemPPIS is freely available at http://idrblab.org/ensemppis.
Affiliation(s)
- Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Ziqi Pan
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Zhimeng Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Lingyan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Hanyu Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Shuiyang Shi
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Fengcheng Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Xiuna Sun
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| |
|
28
|
Wu H, Han J, Zhang S, Xin G, Mou C, Liu J. Spatom: a graph neural network for structure-based protein-protein interaction site prediction. Brief Bioinform 2023; 24:bbad345. [PMID: 37779247 DOI: 10.1093/bib/bbad345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 08/22/2023] [Accepted: 09/13/2023] [Indexed: 10/03/2023] Open
Abstract
Accurate identification of protein-protein interaction (PPI) sites remains a computational challenge. We propose Spatom, a novel framework for PPI site prediction. This framework first defines a weighted digraph for a protein structure to precisely characterize the spatial contacts of residues, then performs a weighted digraph convolution to aggregate both spatial local and global information and finally adds an improved graph attention layer to drive the predicted sites to form more continuous region(s). Spatom was tested on a diverse set of challenging protein-protein complexes and demonstrated the best performance among all the compared methods. Furthermore, when tested on multiple popular proteins in a case study, Spatom clearly identifies the interaction interfaces and captures the majority of hotspots. Spatom is expected to contribute to the understanding of protein interactions and drug designs targeting protein binding.
Affiliation(s)
- Haonan Wu
- School of Mathematics and Statistics, Shandong University, Weihai 264209, China
- School of Mathematics, Shandong University, Jinan 250100, China
| | - Jiyun Han
- School of Mathematics and Statistics, Shandong University, Weihai 264209, China
| | - Shizhuo Zhang
- School of Mathematics and Statistics, Shandong University, Weihai 264209, China
| | - Gaojia Xin
- School of Mathematics and Statistics, Shandong University, Weihai 264209, China
| | - Chaozhou Mou
- School of Mathematics and Statistics, Shandong University, Weihai 264209, China
| | - Juntao Liu
- School of Mathematics and Statistics, Shandong University, Weihai 264209, China
| |
|
29
|
Liu T, Gao H, Ren X, Xu G, Liu B, Wu N, Luo H, Wang Y, Tu T, Yao B, Guan F, Teng Y, Huang H, Tian J. Protein-protein interaction and site prediction using transfer learning. Brief Bioinform 2023; 24:bbad376. [PMID: 37870286 DOI: 10.1093/bib/bbad376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 09/14/2023] [Accepted: 10/02/2023] [Indexed: 10/24/2023] Open
Abstract
Advanced language models have enabled us to recognize protein-protein interactions (PPIs) and interaction sites using protein sequences or structures. Here, we trained the MindSpore ProteinBERT (MP-BERT) model, a Bidirectional Encoder Representations from Transformers model, using protein pairs as inputs, making it suitable for identifying PPIs and their respective interaction sites. The pretrained model (MP-BERT) was fine-tuned as MPB-PPI (MP-BERT on PPI) and demonstrated its superiority over the state-of-the-art models on diverse benchmark datasets for predicting PPIs. Moreover, the model's capability to recognize PPIs was evaluated across multiple organisms. An amalgamated organism model was designed, exhibiting a high level of generalization across the majority of organisms and attaining an accuracy of 92.65%. The model was also customized to predict interaction site propensity by fine-tuning it with PPI site data as MPB-PPISP. Our method facilitates the prediction of both PPIs and their interaction sites, thereby illustrating the potency of transfer learning in dealing with the protein pair task.
Affiliation(s)
- Tuoyu Liu
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Han Gao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Xiaopu Ren
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Guoshun Xu
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Bo Liu
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Ningfeng Wu
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Huiying Luo
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Yuan Wang
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Tao Tu
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Bin Yao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Feifei Guan
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Yue Teng
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing 100071, China
| | - Huoqing Huang
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Jian Tian
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| |
|
30
|
Grudman S, Fajardo JE, Fiser A. Optimal selection of suitable templates in protein interface prediction. Bioinformatics 2023; 39:btad510. [PMID: 37603727 PMCID: PMC10491951 DOI: 10.1093/bioinformatics/btad510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 07/11/2023] [Accepted: 08/18/2023] [Indexed: 08/23/2023] Open
Abstract
MOTIVATION Molecular-level classification of protein-protein interfaces can greatly assist in functional characterization and rational drug design. The most accurate protein interface predictions rely on finding homologous proteins with known interfaces since most interfaces are conserved within the same protein family. The accuracy of these template-based prediction approaches depends on the correct choice of suitable templates. Choosing the right templates in the immunoglobulin superfamily (IgSF) is challenging because its members share low sequence identity and display a wide range of alternative binding sites despite structural homology. RESULTS We present a new approach to predict protein interfaces. First, template-specific, informative evolutionary profiles are established using a mutual information-based approach. Next, based on the similarity of residue level conservation scores derived from the evolutionary profiles, a query protein is hierarchically clustered with all available template proteins in its superfamily with known interface definitions. Once clustered, a subset of the most closely related templates is selected, and an interface prediction is made. These initial interface predictions are subsequently refined by extensive docking. This method was benchmarked on 51 IgSF proteins and can predict nontrivial interfaces of IgSF proteins with an average and median F-score of 0.64 and 0.78, respectively. We also provide a way to assess the confidence of the results. The average and median F-scores increase to 0.8 and 0.81, respectively, if 27% of low confidence cases and 17% of medium confidence cases are removed. Lastly, we provide residue level interface predictions, protein complexes, and confidence measurements for singletons in the IgSF. AVAILABILITY AND IMPLEMENTATION Source code is freely available at: https://gitlab.com/fiserlab.org/interdct_with_refinement.
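To make the F-scores reported in the abstract above concrete, the following is a minimal Python sketch of a residue-level F-score between a predicted interface and a reference interface. It assumes the usual F1 definition (harmonic mean of precision and recall); the function name `interface_f_score` and the residue numbers are hypothetical and are not taken from the paper or its source code.

```python
def interface_f_score(predicted, reference):
    """Return the F1 score between two collections of residue numbers.

    Assumption: an interface is represented as a set of residue indices,
    and the F-score is the harmonic mean of precision and recall.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        # No overlap (or an empty prediction/reference): score is zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical residue numbers, for illustration only.
predicted_interface = {10, 11, 12, 40}
reference_interface = {11, 12, 13, 14}
print(interface_f_score(predicted_interface, reference_interface))
```

In this toy case two of the four predicted residues are correct and two of the four reference residues are recovered, so precision and recall are both 0.5. Averaging such per-protein scores over a benchmark gives summary figures of the kind quoted in the abstract.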
Affiliation(s)
- Steven Grudman
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - J Eduardo Fajardo
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Andras Fiser
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| |
|
31
|
Bhowmik D, Bhuyan A, Gunalan S, Kothandan G, Kumar D. In silico and immunoinformatics based multiepitope subunit vaccine design for protection against visceral leishmaniasis. J Biomol Struct Dyn 2023:1-22. [PMID: 37655736 DOI: 10.1080/07391102.2023.2252901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 08/22/2023] [Indexed: 09/02/2023]
Abstract
Visceral leishmaniasis (VL) is a vector-borne neglected tropical protozoan disease with high fatality and no certified vaccine. Conventional vaccine preparation is challenging and tedious. In this work, we designed a global multiepitope subunit vaccine against VL using an immunoinformatics approach based on the extensively conserved epitopic regions of the PrimPol protein of Leishmania donovani. The protein consists of four subunits, which were analyzed and studied; of these, the DNA primase large subunit and DNA polymerase α subunit B were evaluated as antigens by Vaxijen 2.0. The multiepitope vaccine design includes a single β-defensin adjuvant, eight CTL epitopes, eight HTL epitopes, seven linear BCL epitopes and one discontinuous BCL epitope to induce innate, cellular and humoral immune responses against VL. The Expasy ProtParam tool characterized the physicochemical parameters of the vaccine, and SOLpro predicted our vaccine construct to be soluble upon expression. We also modeled the stable tertiary structure of our vaccine construct through Robetta modeling for molecular docking studies with toll-like receptor proteins through HADDOCK 2.4. Molecular dynamics simulations revealed an intact vaccine-TLR8 complex, supporting our vaccine design's immunogenicity. In addition, immune simulation of our vaccine by the C-ImmSim server demonstrated the potency of the multiepitope vaccine construct to induce a proper immune response for host defense. Codon optimization and in silico cloning of our vaccine further assured high expression. Our study produced a potential multiepitope vaccine candidate against VL that could help eradicate the disease in the future after clinical investigations. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Deep Bhowmik
- Department of Microbiology, Assam University, Silchar, Assam, India
| | - Achyut Bhuyan
- Department of Microbiology, Assam University, Silchar, Assam, India
| | - Seshan Gunalan
- Biopolymer Modelling Laboratory, Centre of Advanced Study in Crystallography and Biophysics, Guindy Campus, University of Madras, Chennai, India
| | - Gugan Kothandan
- Biopolymer Modelling Laboratory, Centre of Advanced Study in Crystallography and Biophysics, Guindy Campus, University of Madras, Chennai, India
| | - Diwakar Kumar
- Department of Microbiology, Assam University, Silchar, Assam, India
| |
|
32
|
Roche R, Moussad B, Shuvo MH, Bhattacharya D. E(3) equivariant graph neural networks for robust and accurate protein-protein interaction site prediction. PLoS Comput Biol 2023; 19:e1011435. [PMID: 37651442 PMCID: PMC10499216 DOI: 10.1371/journal.pcbi.1011435] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/10/2023] [Revised: 09/13/2023] [Accepted: 08/15/2023] [Indexed: 09/02/2023] Open
Abstract
Artificial intelligence-powered protein structure prediction methods have led to a paradigm-shift in computational structural biology, yet contemporary approaches for predicting the interfacial residues (i.e., sites) of protein-protein interaction (PPI) still rely on experimental structures. Recent studies have demonstrated benefits of employing graph convolution for PPI site prediction, but ignore symmetries naturally occurring in 3-dimensional space and act only on experimental coordinates. Here we present EquiPPIS, an E(3) equivariant graph neural network approach for PPI site prediction. EquiPPIS employs symmetry-aware graph convolutions that transform equivariantly with translation, rotation, and reflection in 3D space, providing richer representations for molecular data compared to invariant convolutions. EquiPPIS substantially outperforms state-of-the-art approaches based on the same experimental input, and exhibits remarkable robustness by attaining better accuracy with predicted structural models from AlphaFold2 than what existing methods can achieve even with experimental structures. Freely available at https://github.com/Bhattacharya-Lab/EquiPPIS, EquiPPIS enables accurate PPI site prediction at scale.
Affiliation(s)
- Rahmatullah Roche
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Bernard Moussad
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Md Hossain Shuvo
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Debswapna Bhattacharya
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| |
|
33
|
Slough MM, Li R, Herbert AS, Lasso G, Kuehne AI, Monticelli SR, Bakken RR, Liu Y, Ghosh A, Moreau AM, Zeng X, Rey FA, Guardado-Calvo P, Almo SC, Dye JM, Jangra RK, Wang Z, Chandran K. Two point mutations in protocadherin-1 disrupt hantavirus recognition and afford protection against lethal infection. Nat Commun 2023; 14:4454. [PMID: 37488123 PMCID: PMC10366084 DOI: 10.1038/s41467-023-40126-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 07/06/2023] [Indexed: 07/26/2023] Open
Abstract
Andes virus (ANDV) and Sin Nombre virus (SNV) are the etiologic agents of severe hantavirus cardiopulmonary syndrome (HCPS) in the Americas for which no FDA-approved countermeasures are available. Protocadherin-1 (PCDH1), a cadherin-superfamily protein recently identified as a critical host factor for ANDV and SNV, represents a new antiviral target; however, its precise role remains to be elucidated. Here, we use computational and experimental approaches to delineate the binding surface of the hantavirus glycoprotein complex on PCDH1's first extracellular cadherin repeat domain. Strikingly, a single amino acid residue in this PCDH1 surface influences the host species-specificity of SNV glycoprotein-PCDH1 interaction and cell entry. Mutation of this and a neighboring residue substantially protects Syrian hamsters from pulmonary disease and death caused by ANDV. We conclude that PCDH1 is a bona fide entry receptor for ANDV and SNV whose direct interaction with hantavirus glycoproteins could be targeted to develop new interventions against HCPS.
Affiliation(s)
- Megan M Slough
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Rong Li
- Department of Animal, Dairy and Veterinary Sciences, Utah State University, Logan, UT, USA
| | - Andrew S Herbert
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
| | - Gorka Lasso
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Ana I Kuehne
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
| | - Stephanie R Monticelli
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
- The Geneva Foundation, Tacoma, WA, USA
| | - Russell R Bakken
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
| | - Yanan Liu
- Department of Animal, Dairy and Veterinary Sciences, Utah State University, Logan, UT, USA
| | - Agnidipta Ghosh
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Alicia M Moreau
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
| | - Xiankun Zeng
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
| | - Félix A Rey
- Institut Pasteur, Université Paris Cité, CNRS UMR3569, Structural Virology Unit, F-75015, Paris, France
| | - Pablo Guardado-Calvo
- Institut Pasteur, Université Paris Cité, CNRS UMR3569, Structural Virology Unit, F-75015, Paris, France
- Institut Pasteur, Université Paris Cité, Structural Biology of Infectious Diseases Unit, F-75015, Paris, France
| | - Steven C Almo
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY, USA
| | - John M Dye
- United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, MD, USA
| | - Rohit K Jangra
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, NY, USA.
- Microbiology and Immunology, Louisiana State University Health Sciences Center-Shreveport, Shreveport, LA, USA.
| | - Zhongde Wang
- Department of Animal, Dairy and Veterinary Sciences, Utah State University, Logan, UT, USA.
| | - Kartik Chandran
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, NY, USA.
| |
|
34
|
Saldinger JC, Raymond M, Elvati P, Violi A. Domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles. NATURE COMPUTATIONAL SCIENCE 2023; 3:393-402. [PMID: 38177838 DOI: 10.1038/s43588-023-00438-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/02/2022] [Accepted: 03/24/2023] [Indexed: 01/06/2024]
Abstract
Although challenging, the accurate and rapid prediction of nanoscale interactions has broad applications for numerous biological processes and material properties. While several models have been developed to predict the interaction of specific biological components, they use system-specific information that hinders their application to more general materials. Here we present NeCLAS, a general and efficient machine learning pipeline that predicts the location of nanoscale interactions, providing human-intelligible predictions. NeCLAS outperforms current nanoscale prediction models for generic nanoparticles up to 10-20 nm, reproducing interactions for biological and non-biological systems. Two aspects contribute to these results: a low-dimensional representation of nanoparticles and molecules (to reduce the effect of data uncertainty), and environmental features (to encode the physicochemical neighborhood at multiple scales). This framework has several applications, from basic research to rapid prototyping and design in nanobiotechnology.
Affiliation(s)
| | - Matt Raymond
- Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
| | - Paolo Elvati
- Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
| | - Angela Violi
- Chemical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA.
- Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Biophysics Program, University of Michigan, Ann Arbor, MI, USA.
| |
|
35
|
Sharkia R, Jain S, Mahajnah M, Habib C, Azem A, Al-Shareef W, Zalan A. PTRH2 Gene Variants: Recent Review of the Phenotypic Features and Their Bioinformatics Analysis. Genes (Basel) 2023; 14:genes14051031. [PMID: 37239392 DOI: 10.3390/genes14051031] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Revised: 04/25/2023] [Accepted: 04/28/2023] [Indexed: 05/28/2023] Open
Abstract
Peptidyl-tRNA hydrolase 2 (PTRH2) is an evolutionarily highly conserved mitochondrial protein. The biallelic mutations in the PTRH2 gene have been suggested to cause a rare autosomal recessive disorder characterized by an infantile-onset multisystem neurologic endocrine and pancreatic disease (IMNEPD). Patients with IMNEPD present varying clinical manifestations, including global developmental delay associated with microcephaly, growth retardation, progressive ataxia, distal muscle weakness with ankle contractures, demyelinating sensorimotor neuropathy, sensorineural hearing loss, and abnormalities of thyroid, pancreas, and liver. In the current study, we conducted an extensive literature review with an emphasis on the variable clinical spectrum and genotypes in patients. Additionally, we reported on a new case with a previously documented mutation. A bioinformatics analysis of the various PTRH2 gene variants was also carried out from a structural perspective. It appears that the most common clinical characteristics among all patients include motor delay (92%), neuropathy (90%), distal weakness (86.4%), intellectual disability (84%), hearing impairment (80%), ataxia (79%), and deformity of head and face (~70%). The less common characteristics include hand deformity (64%), cerebellar atrophy/hypoplasia (47%), and pancreatic abnormality (35%), while the least common appear to be diabetes mellitus (~30%), liver abnormality (~22%), and hypothyroidism (16%). Three missense mutations were revealed in the PTRH2 gene, the most common one being Q85P, which was shared by four different Arab communities and was presented in our new case. Moreover, four different nonsense mutations in the PTRH2 gene were detected. It may be concluded that disease severity depends on the PTRH2 gene variant, as most of the clinical features are manifested by nonsense mutations, while only the common features are presented by missense mutations. 
A bioinformatics analysis of the various PTRH2 gene variants also suggested the mutations to be deleterious, as they seem to disrupt the structural conformation of the enzyme, leading to loss of stability and functionality.
Affiliation(s)
- Rajech Sharkia
- Unit of Human Biology and Genetics, Triangle Regional Research and Development Center, Kfar Qari 30075, Israel
- Unit of Natural Sciences, Beit-Berl Academic College, Beit-Berl 4490500, Israel
| | - Sahil Jain
- Department of Biochemistry and Molecular Biology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
| | - Muhammad Mahajnah
- The Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa 31096, Israel
- Child Neurology and Development Center, Hillel Yaffe Medical Center, Hadera 38100, Israel
| | - Clair Habib
- Genetics Institute, Rambam Health Care Campus, Haifa 31096, Israel
| | - Abdussalam Azem
- Department of Biochemistry and Molecular Biology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
| | - Wasif Al-Shareef
- Unit of Human Biology and Genetics, Triangle Regional Research and Development Center, Kfar Qari 30075, Israel
| | - Abdelnaser Zalan
- Unit of Human Biology and Genetics, Triangle Regional Research and Development Center, Kfar Qari 30075, Israel
| |
|
36
|
Krapp LF, Abriata LA, Cortés Rodriguez F, Dal Peraro M. PeSTo: parameter-free geometric deep learning for accurate prediction of protein binding interfaces. Nat Commun 2023; 14:2175. [PMID: 37072397 PMCID: PMC10113261 DOI: 10.1038/s41467-023-37701-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Accepted: 03/28/2023] [Indexed: 04/20/2023] Open
Abstract
Proteins are essential molecular building blocks of life, responsible for most biological functions as a result of their specific molecular interactions. However, predicting their binding interfaces remains a challenge. In this study, we present a geometric transformer that acts directly on atomic coordinates labeled only with element names. The resulting model, the Protein Structure Transformer (PeSTo), surpasses the current state of the art in predicting protein-protein interfaces and can also predict and differentiate between interfaces involving nucleic acids, lipids, ions, and small molecules with high confidence. Its low computational cost enables processing high volumes of structural data, such as molecular dynamics ensembles, allowing for the discovery of interfaces that remain otherwise inconspicuous in static experimentally solved structures. Moreover, the growing foldome provided by de novo structural predictions can be easily analyzed, providing new opportunities to uncover unexplored biology.
Affiliation(s)
- Lucien F Krapp
- Institute of Bioengineering, School of Life Sciences, Ecole Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics (SIB), Lausanne, 1015, Switzerland
| | - Luciano A Abriata
- Institute of Bioengineering, School of Life Sciences, Ecole Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics (SIB), Lausanne, 1015, Switzerland
| | - Fabio Cortés Rodriguez
- Institute of Bioengineering, School of Life Sciences, Ecole Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics (SIB), Lausanne, 1015, Switzerland
| | - Matteo Dal Peraro
- Institute of Bioengineering, School of Life Sciences, Ecole Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics (SIB), Lausanne, 1015, Switzerland.
| |
|
37
|
Hu CW, Wang A, Fan D, Worth M, Chen Z, Huang J, Xie J, Macdonald J, Li L, Jiang J. Cancer-derived mutation in the OGA stalk domain promotes cell malignancy through dysregulating PDLIM7 and p53. RESEARCH SQUARE 2023:rs.3.rs-2709128. [PMID: 36993758 PMCID: PMC10055641 DOI: 10.21203/rs.3.rs-2709128/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
O-GlcNAcase (OGA) is the sole enzyme that hydrolyzes O-GlcNAcylation from thousands of proteins and is dysregulated in many diseases including cancer. However, the substrate recognition and pathogenic mechanisms of OGA remain largely unknown. Here we report the first discovery of a cancer-derived point mutation on the OGA's non-catalytic stalk domain that aberrantly regulated a small set of OGA-protein interactions and O-GlcNAc hydrolysis in critical cellular processes. We uncovered a novel cancer-promoting mechanism in which the OGA mutant preferentially hydrolyzed the O-GlcNAcylation from modified PDLIM7 and promoted cell malignancy by down-regulating p53 tumor suppressor in different types of cells through transcription inhibition and MDM2-mediated ubiquitination. Our study revealed the OGA deglycosylated PDLIM7 as a novel regulator of p53-MDM2 pathway, offered the first set of direct evidence on OGA substrate recognition beyond its catalytic site, and illuminated new directions to interrogate OGA's precise role without perturbing global O-GlcNAc homeostasis for biomedical applications.
Affiliation(s)
| | - Ao Wang
- University of Wisconsin-Madison
| | - Jiaoyang Jiang
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison
| |
|
38
|
Chen WF, Wang HF, Wang Y, Liu ZG, Xu BH. AmAtg2B-Mediated Lipophagy Regulates Lipolysis of Pupae in Apis mellifera. Int J Mol Sci 2023; 24:2096. [PMID: 36768418 PMCID: PMC9916532 DOI: 10.3390/ijms24032096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2022] [Revised: 12/23/2022] [Accepted: 12/29/2022] [Indexed: 01/21/2023] Open
Abstract
Lipophagy plays an important role in regulating lipid metabolism in mammals. The exact function of autophagy-related protein 2 (Atg2) has been investigated in mammals, but research on the existence and functions of Atg2 in Apis mellifera (AmAtg2) is still limited. Here, autophagy occurred in honeybee pupae, which targeted lipid droplets (LDs) in fat body, namely lipophagy, which was verified by co-localization of LDs with microtubule-associated protein 1A/1B light chain 3 beta (LC3). Moreover, AmAtg2 homolog B (AmAtg2B) was expressed specifically in pupal fat body, which indicated that AmAtg2B might have special function in fat body. Further, AmAtg2B antibody neutralization and AmAtg2B knock-down were undertaken to verify the functions in pupae. Results showed that low expression of AmAtg2B at the protein and transcriptional levels led to lipophagy inhibition, which down-regulated the expression levels of proteins and genes related to lipolysis. Altogether, results in this study systematically revealed that AmAtg2B interfered with lipophagy and then caused abnormal lipolysis in the pupal stage.
Affiliation(s)
- Bao-Hua Xu
- College of Animal Science and Technology, Shandong Agricultural University, Tai’an 271018, China
| |
|
39
|
Kang Y, Xu Y, Wang X, Pu B, Yang X, Rao Y, Chen J. HN-PPISP: a hybrid network based on MLP-Mixer for protein-protein interaction site prediction. Brief Bioinform 2023; 24:6833645. [PMID: 36403092 DOI: 10.1093/bib/bbac480] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2022] [Revised: 09/16/2022] [Accepted: 10/09/2022] [Indexed: 11/21/2022] Open
Abstract
MOTIVATION Biological experimental approaches to protein-protein interaction (PPI) site prediction are critical for understanding the mechanisms of biochemical processes but are time-consuming and laborious. With the development of Deep Learning (DL) techniques, the most popular Convolutional Neural Networks (CNN)-based methods have been proposed to address these problems. Although significant progress has been made, these methods still have limitations in encoding the characteristics of each amino acid in protein sequences. Current methods cannot efficiently explore the nature of Position Specific Scoring Matrix (PSSM), secondary structure and raw protein sequences by processing them all together. For PPI site prediction, how to effectively model the PPI context with attention to prediction remains an open problem. In addition, the long-distance dependencies of PPI features are important, which is very challenging for many CNN-based methods because the innate ability of CNN is difficult to outperform auto-regressive models like Transformers. RESULTS To effectively mine the properties of PPI features, a novel hybrid neural network named HN-PPISP is proposed, which integrates a Multi-layer Perceptron Mixer (MLP-Mixer) module for local feature extraction and a two-stage multi-branch module for global feature capture. The model merits Transformer, TextCNN and Bi-LSTM as a powerful alternative for PPI site prediction. On the one hand, this is the first application of an advanced Transformer (i.e. MLP-Mixer) with a hybrid network for sequence-based PPI prediction. On the other hand, unlike existing methods that treat global features altogether, the proposed two-stage multi-branch hybrid module firstly assigns different attention scores to the input features and then encodes the feature through different branch modules. 
In the first stage, different improved attention modules are hybridized to extract features from the raw protein sequences, secondary structure and PSSM, respectively. In the second stage, a multi-branch network is designed to aggregate information from both branches in parallel. The two branches encode the features and extract dependencies through several operations such as TextCNN, Bi-LSTM and different activation functions. Experimental results on real-world public datasets show that our model consistently achieves state-of-the-art performance over seven remarkable baselines. AVAILABILITY The source code of HN-PPISP model is available at https://github.com/ylxu05/HN-PPISP.
Affiliation(s)
- Yan Kang
- National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China
| | - Yulong Xu
- National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China
| | - Xinchao Wang
- National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China
| | - Bin Pu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, P.R. China
| | - Xuekun Yang
- National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China
| | - Yulong Rao
- National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China
| | - Jianguo Chen
- School of Software Engineering, Sun Yat-Sen University, Zhuhai, 519082, P.R. China
| |
|
40
|
Hou Z, Yang Y, Ma Z, Wong KC, Li X. Learning the protein language of proteome-wide protein-protein binding sites via explainable ensemble deep learning. Commun Biol 2023; 6:73. [PMID: 36653447 PMCID: PMC9849350 DOI: 10.1038/s42003-023-04462-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Accepted: 01/11/2023] [Indexed: 01/20/2023] Open
Abstract
Protein-protein interactions (PPIs) govern cellular pathways and processes, by significantly influencing the functional expression of proteins. Therefore, accurate identification of protein-protein interaction binding sites has become a key step in the functional analysis of proteins. However, since most computational methods are designed based on biological features, there are no available protein language models to directly encode amino acid sequences into distributed vector representations to model their characteristics for protein-protein binding events. Moreover, the number of experimentally detected protein interaction sites is much smaller than that of protein-protein interactions or protein sites in protein complexes, resulting in unbalanced data sets that leave room for improvement in their performance. To address these problems, we develop an ensemble deep learning model (EDLM)-based protein-protein interaction (PPI) site identification method (EDLMPPI). Evaluation results show that EDLMPPI outperforms state-of-the-art techniques including several PPI site prediction models on three widely-used benchmark datasets including Dset_448, Dset_72, and Dset_164, which demonstrated that EDLMPPI is superior to those PPI site prediction models by nearly 10% in terms of average precision. In addition, the biological and interpretable analyses provide new insights into protein binding site identification and characterization mechanisms from different perspectives. The EDLMPPI webserver is available at http://www.edlmppi.top:5002/.
Affiliation(s)
- Zilong Hou
- School of Artificial Intelligence, Jilin University, Jilin, China
| | - Yuning Yang
- Information Science and Technology, Northeast Normal University, Jilin, China
| | - Zhiqiang Ma
- Information Science and Technology, Northeast Normal University, Jilin, China
| | - Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| | - Xiangtao Li
- School of Artificial Intelligence, Jilin University, Jilin, China.
| |
|
41
|
Li K, Quan L, Jiang Y, Li Y, Zhou Y, Wu T, Lyu Q. ctP2ISP: Protein-Protein Interaction Sites Prediction Using Convolution and Transformer With Data Augmentation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:297-306. [PMID: 35213314 DOI: 10.1109/tcbb.2022.3154413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Protein-protein interactions are the basis of many cellular biological processes, such as cellular organization, signal transduction, and immune response. Identifying protein-protein interaction sites is essential for understanding the mechanisms of various biological processes, disease development, and drug design. However, it remains a challenging task to make accurate predictions, as the small amount of training data and severe imbalanced classification reduce the performance of computational methods. We design a deep learning method named ctP2ISP to improve the prediction of protein-protein interaction sites. ctP2ISP employs Convolution and Transformer to extract information and enhance information perception so that semantic features can be mined to identify protein-protein interaction sites. A weighting loss function with different sample weights is designed to suppress the preference of the model toward multi-category prediction. To efficiently reuse the information in the training set, a preprocessing of data augmentation with an improved sample-oriented sampling strategy is applied. The trained ctP2ISP was evaluated against current state-of-the-art methods on six public datasets. The results show that ctP2ISP outperforms all other competing methods on the balance metrics: F1, MCC, and AUPRC. In particular, our prediction on open tests related to viruses may also be consistent with biological insights. The source code and data can be obtained from https://github.com/lennylv/ctP2ISP.
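The abstract above mentions a loss function with different sample weights but does not give its form. As a rough illustration only, one common way to weight samples under class imbalance is a class-weighted binary cross-entropy. The function name `weighted_bce` and the weights `pos_weight=5.0` and `neg_weight=1.0` are assumptions made for this sketch and are not taken from ctP2ISP.

```python
import math

def weighted_bce(probs, labels, pos_weight=5.0, neg_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with per-class sample weights.

    pos_weight and neg_weight are illustrative values (assumptions),
    not parameters reported for ctP2ISP.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            # interface residue (rare class): error is up-weighted
            total += -pos_weight * math.log(p)
        else:
            # non-interface residue (common class)
            total += -neg_weight * math.log(1.0 - p)
    return total / len(labels)
```

With equal weights this reduces to ordinary binary cross-entropy; raising `pos_weight` makes a missed interface residue cost more than a false alarm of the same confidence.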
|
42
|
Interplay between C1-inhibitor and group IIA secreted phospholipase A2 impairs their respective function. Immunol Res 2023; 71:70-82. [PMID: 36385678 PMCID: PMC9845149 DOI: 10.1007/s12026-022-09331-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2022] [Accepted: 10/14/2022] [Indexed: 11/18/2022]
Abstract
High levels of human group IIA secreted phospholipase A2 (hGIIA) have been associated with various inflammatory disease conditions. We have recently shown that hGIIA activity and concentration are increased in the plasma of patients with hereditary angioedema due to C1-inhibitor deficiency (C1-INH-HAE) and negatively correlate with C1-INH plasma activity. In this study, we analyzed whether the presence of both hGIIA and C1-INH impairs their respective function on immune cells. hGIIA, but not recombinant and plasma-derived C1-INH, stimulates the production of IL-6, CXCL8, and TNF-α from peripheral blood mononuclear cells (PBMCs). PBMC activation mediated by hGIIA is blocked by RO032107A, a specific hGIIA inhibitor. Interestingly, C1-INH inhibits the hGIIA-induced production of IL-6, TNF-α, and CXCL8, while it does not affect hGIIA enzymatic activity. On the other hand, hGIIA reduces the capacity of C1-INH at inhibiting C1-esterase activity. Spectroscopic and molecular docking studies suggest a possible interaction between hGIIA and C1-INH but further experiments are needed to confirm this hypothesis. Together, these results provide evidence for a new interplay between hGIIA and C1-INH, which may be important in the pathophysiology of hereditary angioedema.
|
43
|
Wang S, Chen W, Han P, Li X, Song T. RGN: Residue-Based Graph Attention and Convolutional Network for Protein-Protein Interaction Site Prediction. J Chem Inf Model 2022; 62:5961-5974. [PMID: 36398714 DOI: 10.1021/acs.jcim.2c01092] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
The prediction of a protein-protein interaction site (PPI site) plays a very important role in the biochemical process, and lots of computational methods have been proposed in the past. However, the majority of the past methods are time consuming and lack accuracy. Hence, coming up with an effective computational method is necessary. In this article, we present a novel computational model called RGN (residue-based graph attention and convolutional network) to predict PPI sites. In our paper, the protein is treated as a graph. The amino acid can be seen as the node in the graph structure. The position-specific scoring matrix, hidden Markov model, hydrogen bond estimation algorithm, and ProtBert are applied as node features. The edges are decided by the spatial distance between the amino acids. Then, we utilize a residue-based graph convolutional network and graph attention network to further extract the deeper feature. Finally, the processed node feature is fed into the prediction layer. We show the superiority of our model by comparing it with the other four protein structure-based methods and five protein sequence-based methods. Our model obtains the best performance on all the evaluation metrics (accuracy, precision, recall, F1 score, Matthews correlation coefficient, area under the receiver operating characteristic curve, and area under the precision recall curve). We also conduct a case study to demonstrate that extracting the protein information from the protein structure perspective is effective and points out the difficult aspect of PPI site prediction.
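The graph construction described above (residues as nodes, edges decided by spatial distance) can be sketched as follows. This is a minimal illustration, not the RGN implementation: the function name `build_residue_graph`, the use of one coordinate per residue, and the 8 Å cutoff are all assumptions, since the abstract does not state them.

```python
import math

def build_residue_graph(coords, cutoff=8.0):
    """Return an adjacency list linking residues whose coordinates lie
    within `cutoff` angstroms of each other.

    coords: one (x, y, z) tuple per residue (e.g. a C-alpha position).
    cutoff: hypothetical distance threshold, chosen for illustration.
    """
    n = len(coords)
    adjacency = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency

# Toy example: three residues along a line plus one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
graph = build_residue_graph(coords)
```

In this toy input the first three residues are mutually connected and the fourth has no neighbours. Node features such as those named in the abstract would be attached to each index before the graph is passed to a graph neural network.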
Affiliation(s)
- Shuang Wang
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Wenqi Chen
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Peifu Han
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Xue Li
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Tao Song
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Madrid 28031, Spain
| |
|
44
|
Li M, Wu Z, Wang W, Lu K, Zhang J, Zhou Y, Chen Z, Li D, Zheng S, Chen P, Wang B. Protein-Protein Interaction Sites Prediction Based on an Under-Sampling Strategy and Random Forest Algorithm. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3646-3654. [PMID: 34705656 DOI: 10.1109/tcbb.2021.3123269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
The computational methods of protein-protein interaction sites prediction can effectively avoid the shortcomings of high cost and time in traditional experimental approaches. However, the serious class imbalance between interface and non-interface residues on the protein sequences limits the prediction performance of these methods. This work therefore proposed a new strategy, NearMiss-based under-sampling for unbalancing datasets and Random Forest classification (NM-RF), to predict protein interaction sites. Herein, the residues on protein sequences were represented by the PSSM-derived features, hydropathy index (HI) and relative solvent accessibility (RSA). In order to resolve the class imbalance problem, an under-sampling method based on NearMiss algorithm is adopted to remove some non-interface residues, and then the random forest algorithm is used to perform binary classification on the balanced feature datasets. Experiments show that the accuracy of NM-RF model reaches 87.6% and 84.3% on Dtestset72 and PDBtestset164 respectively, which demonstrate the effectiveness of the proposed NM-RF method in differentiating the interface or non-interface residues.
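The under-sampling step described above can be illustrated with a small sketch. The abstract does not say which NearMiss variant is used, so this assumes the NearMiss-1 rule: keep the majority-class samples whose average distance to their k nearest minority-class samples is smallest. The function name `nearmiss_undersample` and the value of `k` are hypothetical.

```python
import math

def nearmiss_undersample(majority, minority, n_keep, k=3):
    """Keep the n_keep majority samples that are closest, on average,
    to their k nearest minority samples (NearMiss-1 style rule).

    majority, minority: lists of equal-length feature tuples.
    """
    def score(sample):
        # distances from this majority sample to every minority sample
        dists = sorted(math.dist(sample, m) for m in minority)
        nearest = dists[:k]
        return sum(nearest) / len(nearest)

    ranked = sorted(majority, key=score)
    return ranked[:n_keep]

# Toy example: 2 interface residues (minority), 4 non-interface (majority).
minority = [(0.0, 0.0), (1.0, 0.0)]
majority = [(0.5, 0.5), (10.0, 10.0), (0.0, 1.0), (20.0, 0.0)]
kept = nearmiss_undersample(majority, minority, n_keep=2, k=2)
```

The retained majority samples plus all minority samples would then form the balanced set passed to a classifier such as a random forest, which is not shown here.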
|
45
|
Evaluation of the Effectiveness of Derived Features of AlphaFold2 on Single-Sequence Protein Binding Site Prediction. BIOLOGY 2022; 11:biology11101454. [PMID: 36290358 PMCID: PMC9598995 DOI: 10.3390/biology11101454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 09/30/2022] [Accepted: 09/30/2022] [Indexed: 11/06/2022]
Abstract
Simple Summary With the development of artificial intelligence, researchers can roughly predict the crystal structure of a protein by computer without the need for biological experiments, which provides new ideas and solutions to problems, such as protein-protein interaction and drug-target predictions. In this study, we proposed strategies to combine predicted protein structures with deep learning networks and evaluated them on different protein binding site prediction tasks. Our computational experiment results showed that all proposed strategies could effectively encode structural information for deep learning models. Abstract Though AlphaFold2 has attained considerably high precision on protein structure prediction, it is reported that directly inputting coordinates into deep learning networks cannot achieve desirable results on downstream tasks. Thus, how to process and encode the predicted results into effective forms that deep learning models can understand to improve the performance of downstream tasks is worth exploring. In this study, we tested the effects of five processing strategies of coordinates on two single-sequence protein binding site prediction tasks. These five strategies are spatial filtering, the singular value decomposition of a distance map, calculating the secondary structure feature, and the relative accessible surface area feature of proteins. The computational experiment results showed that all strategies were suitable and effective methods to encode structural information for deep learning models. In addition, by performing a case study of a mutated protein, we showed that the spatial filtering strategy could introduce structural changes into HHblits profiles and deep learning networks when protein mutation happens. In sum, this work provides new insight into the downstream tasks of protein-molecule interaction prediction, such as predicting the binding residues of proteins and estimating the effects of mutations.
|
46
|
Soleymani F, Paquet E, Viktor H, Michalowski W, Spinello D. Protein-protein interaction prediction with deep learning: A comprehensive review. Comput Struct Biotechnol J 2022; 20:5316-5341. [PMID: 36212542 PMCID: PMC9520216 DOI: 10.1016/j.csbj.2022.08.070] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2022] [Revised: 08/29/2022] [Accepted: 08/30/2022] [Indexed: 11/15/2022] Open
Abstract
Most proteins perform their biological function by interacting with themselves or other molecules. Thus, one may obtain biological insights into protein functions, disease prevalence, and therapy development by identifying protein-protein interactions (PPI). However, finding the interacting and non-interacting protein pairs through experimental approaches is labour-intensive and time-consuming, owing to the variety of proteins. Hence, protein-protein interaction and protein-ligand binding problems have drawn attention in the fields of bioinformatics and computer-aided drug discovery. Deep learning methods paved the way for scientists to predict the 3-D structure of proteins from genomes, predict the functions and attributes of a protein, and modify and design new proteins to provide desired functions. This review focuses on recent deep learning methods applied to problems including predicting protein functions, protein-protein interaction and their sites, protein-ligand binding, and protein design.
Affiliation(s)
- Farzan Soleymani
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON, Canada
| | - Eric Paquet
- National Research Council, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada
| | - Herna Viktor
- School of Electrical Engineering and Computer Science, University of Ottawa, ON, Canada
| | - Davide Spinello
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON, Canada
| |
|
47
|
Murph M, Singh S, Schvarzstein M. A combined in silico and in vivo approach to the structure-function annotation of SPD-2 provides mechanistic insight into its functional diversity. Cell Cycle 2022; 21:1958-1979. [PMID: 35678569 PMCID: PMC9415446 DOI: 10.1080/15384101.2022.2078458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2021] [Revised: 04/10/2022] [Accepted: 05/04/2022] [Indexed: 11/03/2022] Open
Abstract
Centrosomes are organelles that function as hubs of microtubule nucleation and organization, with key roles in organelle positioning, asymmetric cell division, ciliogenesis, and signaling. Aberrant centrosome number, structure or function is linked to neurodegenerative diseases, developmental abnormalities, ciliopathies, and tumor development. A major regulator of centrosome biogenesis and function in C. elegans is the conserved Spindle-defective protein 2 (SPD-2), a homolog of the human CEP-192 protein. CeSPD-2 is required for centrosome maturation, centriole duplication, spindle assembly and possibly cell polarity establishment. Despite its importance, the specific molecular mechanism of CeSPD-2 regulation and function is poorly understood. Here, we combined computational analysis with cell biology approaches to uncover possible structure-function relationships of CeSPD-2 that may shed mechanistic light on its function. Domain prediction analysis corroborated and refined previously identified coiled-coils and ASH (Aspm-SPD-2 Hydin) domains and identified new domains: a GEF domain, an Ig-like domain, and a PDZ-like domain. In addition to these predicted structural features, CeSPD-2 is also predicted to be intrinsically disordered. Surface electrostatic maps identified a large basic region unique to the ASH domain of CeSPD-2. This basic region overlaps with most of the residues predicted to be involved in protein-protein interactions. In vivo, ASH::GFP localized to centrosomes and centrosome-associated microtubules. Our analysis groups ASH domains, PapD, Usher chaperone domains, and Major Sperm Protein (MSP) domains into a single superfold within the larger Immunoglobulin superfamily. 
This study lays the groundwork for designing rational hypothesis-based experiments to uncover the mechanisms of CeSPD-2 function in vivo. Abbreviations: AIR, Aurora kinase; ASH, Aspm-SPD-2 Hydin; ASP, Abnormal Spindle Protein; ASPM, Abnormal Spindle-like Microcephaly-associated Protein; CC, coiled-coil; CDK, Cyclin-dependent Kinase; Ce, Caenorhabditis elegans; CEP, Centrosomal Protein; CPAP, centrosomal P4.1-associated protein; D, Drosophila; GAP, GTPase activating protein; GEF, GTPase guanine nucleotide exchange factor; Hs, Homo sapiens/Human; Ig, Immunoglobulin; MAP, Microtubule associated Protein; MSP, Major Sperm Protein; MDP, Major Sperm Domain-Containing Protein; OCRL-1, Golgi endocytic trafficking protein Inositol polyphosphate 5-phosphatase; PAR, abnormal embryonic PARtitioning of the cytosol; PCM, Pericentriolar material; PCMD, pericentriolar matrix deficient; PDZ, PSD95/Dlg-1/zo-1; PLK, Polo like kinase; RMSD, Root Mean Square Deviation; SAS, Spindle assembly abnormal proteins; SPD, Spindle-defective protein; TRAPP, TRAnsport Protein Particle; Xe, Xenopus; ZYG, zygote defective protein.
Affiliation(s)
- Mikaela Murph
  - Department of Biology, City University of New York, Brooklyn College, New York, NY, USA
- Shaneen Singh
  - Department of Biology, City University of New York, Brooklyn College, New York, NY, USA
  - Department of Biology, The Graduate Center at City University of New York, New York, NY, USA
  - Department of Biochemistry, The Graduate Center at City University of New York, New York, NY, USA
- Mara Schvarzstein
  - Department of Biology, City University of New York, Brooklyn College, New York, NY, USA
  - Department of Biology, The Graduate Center at City University of New York, New York, NY, USA
  - Department of Biochemistry, The Graduate Center at City University of New York, New York, NY, USA
|
48
|
Staphylococcus aureus Exfoliative Toxin E, Oligomeric State and Flip of P186: Implications for Its Action Mechanism. Int J Mol Sci 2022; 23:ijms23179857. [PMID: 36077258 PMCID: PMC9456352 DOI: 10.3390/ijms23179857] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/02/2022] [Revised: 08/23/2022] [Accepted: 08/26/2022] [Indexed: 11/17/2022] Open
Abstract
Staphylococcal exfoliative toxins (ETs) are glutamyl endopeptidases that specifically cleave the Glu381-Gly382 bond in the ectodomains of desmoglein 1 (Dsg1) via complex action mechanisms. To date, four ETs have been identified in different Staphylococcus aureus strains and ETE is the most recently characterized. The unusual properties of ETs have been attributed to a unique structural feature, i.e., the 180° flip of the carbonyl oxygen (O) of the nonconserved residue 192/186 (ETA/ETE numbering), not conducive to the oxyanion hole formation. We report the crystal structure of ETE determined at 1.61 Å resolution, in which P186(O) adopts two conformations displaying a 180° rotation. This finding, together with free energy calculations, supports the existence of a dynamic transition between the conformations under the tested conditions. Moreover, enzymatic assays showed no significant differences in the esterolytic efficiency of ETE and ETE/P186G, a mutant predicted to possess a functional oxyanion hole, thus downplaying the influence of the flip on the activity. Finally, we observed the formation of ETE homodimers in solution and the predicted homodimeric structure revealed the participation of a characteristic nonconserved loop in the interface and the partial occlusion of the protein active site, suggesting that monomerization is required for enzymatic activity.
|
49
|
ProB-Site: Protein Binding Site Prediction Using Local Features. Cells 2022; 11:cells11132117. [PMID: 35805201 PMCID: PMC9266162 DOI: 10.3390/cells11132117] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/19/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 01/16/2023] Open
Abstract
Protein–protein interactions (PPIs) are responsible for various essential biological processes. This information can help develop a new drug against diseases. Various experimental methods have been employed for this purpose; however, their application is limited by their cost and time consumption. Alternatively, computational methods are considered viable means to achieve this crucial task. Various techniques have been explored in the literature using the sequential information of amino acids in a protein sequence, including machine learning and deep learning techniques. The current efficiency of interaction-site prediction still has growth potential. Hence, a deep neural network-based model, ProB-site, is proposed. ProB-site utilizes sequential information of a protein to predict its binding sites. The proposed model uses evolutionary information and predicted structural information extracted from sequential information of proteins, generating three unique feature sets for every amino acid in a protein sequence. Then, these feature sets are fed to their respective sub-CNN architecture to acquire complex features. Finally, the acquired features are concatenated and classified using fully connected layers. This methodology performed better than state-of-the-art techniques because of the selection of the best features and contemplation of local information of each amino acid.
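The pipeline described in this abstract (three per-residue feature sets, one sub-CNN per feature set, concatenation, then a fully connected classifier) can be illustrated with a minimal pure-Python sketch. This is not the ProB-site implementation: the function names, channel counts (4, 3 and 1), kernel width, single dense layer and random untrained weights are all assumptions made for illustration only.

```python
import math
import random


def conv1d_same(features, kernels, biases):
    """Zero-padded 1D convolution along the sequence, followed by ReLU.

    features: L x C_in list of per-residue feature vectors.
    kernels:  C_out x K x C_in weights (K is the odd window width).
    Returns an L x C_out list, so each residue sees its local neighbours.
    """
    length, c_in = len(features), len(features[0])
    width = len(kernels[0])
    half = width // 2
    out = []
    for i in range(length):
        row = []
        for kernel, bias in zip(kernels, biases):
            acc = bias
            for j in range(width):
                pos = i + j - half
                if 0 <= pos < length:  # positions outside the chain count as zero
                    acc += sum(kernel[j][c] * features[pos][c] for c in range(c_in))
            row.append(max(0.0, acc))  # ReLU
        out.append(row)
    return out


def random_kernels(rng, c_out, width, c_in):
    """Hypothetical helper: untrained weights, for demonstration only."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(c_in)]
             for _ in range(width)] for _ in range(c_out)]


def predict_binding(feature_sets, branch_params, dense_w, dense_b):
    """Run one conv branch per feature set, concatenate, classify per residue."""
    branches = [conv1d_same(feats, kernels, biases)
                for feats, (kernels, biases) in zip(feature_sets, branch_params)]
    probs = []
    for i in range(len(feature_sets[0])):
        merged = [v for branch in branches for v in branch[i]]  # concatenation
        logit = dense_b + sum(w * v for w, v in zip(dense_w, merged))
        probs.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid
    return probs


rng = random.Random(0)
seq_len = 6
channel_counts = (4, 3, 1)  # assumed sizes of the three feature sets
feature_sets = [[[rng.random() for _ in range(c)] for _ in range(seq_len)]
                for c in channel_counts]
branch_params = [(random_kernels(rng, 2, 3, c), [0.0, 0.0]) for c in channel_counts]
dense_w = [rng.uniform(-1.0, 1.0) for _ in range(2 * len(channel_counts))]
probs = predict_binding(feature_sets, branch_params, dense_w, 0.0)
print([round(p, 3) for p in probs])
```

Each branch only looks at a window of neighbouring residues, which is one simple way to read the abstract's emphasis on local information; a trained model would learn the kernel and dense weights from labelled binding sites.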
|
50
|
Tubiana J, Schneidman-Duhovny D, Wolfson HJ. ScanNet: A web server for structure-based prediction of protein binding sites with geometric deep learning. J Mol Biol 2022; 434:167758. [DOI: 10.1016/j.jmb.2022.167758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 07/18/2022] [Accepted: 07/19/2022] [Indexed: 11/28/2022]
|