1
Daou L, Hanna EM. Predicting protein complexes in protein interaction networks using Mapper and graph convolution networks. Comput Struct Biotechnol J 2024; 23:3595-3609. [PMID: 39493503 PMCID: PMC11530816 DOI: 10.1016/j.csbj.2024.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 10/04/2024] [Accepted: 10/04/2024] [Indexed: 11/05/2024] Open
Abstract
Protein complexes are groups of interacting proteins that are central to multiple biological processes. Studying protein complexes can enhance our understanding of cellular functions and malfunctions and thus support the development of effective disease treatments. High-throughput experimental techniques allow the generation of large-scale protein-protein interaction datasets. Accordingly, various computational approaches to predict protein complexes from protein-protein interactions were presented in the literature. They are typically based on networks in which nodes and edges represent proteins and their interactions, respectively. State-of-the-art approaches mainly rely on clustering static networks to identify complexes. However, since protein interactions are highly dynamic in nature, recent approaches seek to model such dynamics by typically integrating gene expression data and identifying protein complexes accordingly. We propose MComplex, a method that utilizes time-series gene expression with interaction data to generate a temporal network which is passed to a generative adversarial network whose generator is a graph convolutional network. This creates embeddings which are then analyzed using a modified graph-based version of the Mapper algorithm to predict corresponding protein complexes. We test our approach on multiple benchmark datasets and compare identified complexes against gold-standard protein complex datasets. Our results show that MComplex outperforms existing methods in several evaluation aspects, namely recall and maximum matching ratio as well as a composite score covering aggregated evaluation measures. The code and data are available for free download from https://github.com/LeonardoDaou/MComplex.
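The idea of combining time-series expression with interaction data can be illustrated with a minimal sketch. The abstract does not say how MComplex builds its temporal network, so the rule used here (keep an interaction at a time point only when both proteins are expressed at or above a fixed threshold) is an assumption, as are the names `temporal_snapshots`, `expression`, and `threshold`.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def temporal_snapshots(
    edges: Iterable[Edge],
    expression: Dict[str, List[float]],
    threshold: float,
) -> List[Set[Edge]]:
    """Return one edge set per time point.

    Assumed rule (not taken from the paper): an interaction is present at
    time t only if both proteins have expression >= threshold at t.
    """
    edge_list = list(edges)
    # Use the shortest series so every protein has a value at each time point.
    n_times = min((len(series) for series in expression.values()), default=0)
    snapshots: List[Set[Edge]] = []
    for t in range(n_times):
        active = {p for p, series in expression.items() if series[t] >= threshold}
        snapshots.append({(a, b) for a, b in edge_list if a in active and b in active})
    return snapshots


# Toy data, purely illustrative.
toy_edges = [("A", "B"), ("B", "C")]
toy_expression = {"A": [1.0, 0.1], "B": [0.9, 0.8], "C": [0.2, 0.7]}
toy_snapshots = temporal_snapshots(toy_edges, toy_expression, threshold=0.5)
```

In this toy case the A-B interaction is present only at the first time point and the B-C interaction only at the second, which is the kind of dynamics a static network cannot represent.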
Affiliation(s)
- Leonardo Daou
  - Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
- Eileen Marie Hanna
  - Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
2
Vasović LM, Pavlović-Lažetić GM, Kovačević JJ, Beljanski MV, Uversky VN. Intrinsically disordered proteins and liquid-liquid phase separation in SARS-CoV-2 interactomes. J Cell Biochem 2024; 125:e30502. [PMID: 37992221 DOI: 10.1002/jcb.30502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 10/12/2023] [Accepted: 11/06/2023] [Indexed: 11/24/2023]
Abstract
This paper discusses the properties of proteins and their relations in the interactomes of selected subsets of the SARS-CoV-2 proteome: the membrane protein, the nonstructural proteins, and, finally, the full proteome. Protein disorder according to several measures, liquid-liquid phase separation probabilities, and protein node degrees in the interaction networks were singled out as the features of interest. Additionally, viral interactomes were combined with the interactome of human lung tissue so as to examine if the new connections in the resulting viral-host interactome are linked to protein disorder. Correlation analysis shows that there is no clear relationship between raw features of interest, whereas there is a positive correlation between the protein disorder and its neighborhood mean disorder. There are also indications that highly connected viral hubs tend to be on average more ordered than proteins with a small number of connections. This is in contrast to previous similar studies conducted on eukaryotic interactomes and possibly raises new questions in research on viral interactomes.
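The comparison between a protein's disorder and its neighborhood mean disorder can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not name the correlation coefficient or the disorder measure, so Pearson correlation, the function names, and the toy scores are all hypothetical.

```python
from math import sqrt
from typing import Dict, List


def neighborhood_means(
    adjacency: Dict[str, List[str]], score: Dict[str, float]
) -> Dict[str, float]:
    """Mean score over each node's neighbors; isolated nodes are skipped."""
    return {
        node: sum(score[n] for n in nbrs) / len(nbrs)
        for node, nbrs in adjacency.items()
        if nbrs
    }


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


# Toy path graph A-B-C-D with made-up disorder scores.
toy_adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_disorder = {"A": 0.1, "B": 0.2, "C": 0.8, "D": 0.9}

means = neighborhood_means(toy_adjacency, toy_disorder)
nodes = sorted(means)
r = pearson([toy_disorder[n] for n in nodes], [means[n] for n in nodes])
```

A positive `r` on real data would correspond to the pattern the abstract reports, in which disordered proteins tend to have disordered interaction partners.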
Affiliation(s)
- Lazar M Vasović
  - Faculty of Mathematics, University of Belgrade, Belgrade, Serbia
- Miloš V Beljanski
  - BioLab, Institute of General and Physical Chemistry, Belgrade, Serbia
- Vladimir N Uversky
  - Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
3
Akbarzadeh S, Coşkun Ö, Günçer B. Studying protein-protein interactions: Latest and most popular approaches. J Struct Biol 2024; 216:108118. [PMID: 39214321 DOI: 10.1016/j.jsb.2024.108118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 08/20/2024] [Accepted: 08/23/2024] [Indexed: 09/04/2024]
Abstract
PPIs, or protein-protein interactions, are essential for many biological processes. According to the findings, abnormal PPIs have been linked to several diseases, such as cancer and infectious and neurological disorders. Consequently, focusing on PPIs is a path toward disease treatment and a crucial tool for producing novel medications. Many methods exist to investigate PPIs, including low- and high-throughput studies. Since many PPIs have been discovered using in vitro and in vivo experimental approaches, the use of computational methods to predict PPIs has grown due to the expanding scale of PPI data and the intrinsic complexity of interacting mechanisms. Recognizing PPI networks offers a systematic means of predicting protein functions, and pathways that are included. These investigations can help uncover the underlying molecular mechanisms of complex phenotypes and clarify the biological processes related to health and diseases. Therefore, our goal in this study is to provide an overview of the latest and most popular approaches for investigating PPIs. We also overview some important clinical approaches based on the PPIs and how these interactions can be targeted.
Affiliation(s)
- Sama Akbarzadeh
  - Department of Biophysics, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Türkiye; Institute of Graduate Studies in Health Sciences, Istanbul University, Istanbul, Türkiye
- Özlem Coşkun
  - Department of Biophysics, Faculty of Medicine, Çanakkale Onsekiz Mart University, Çanakkale, Türkiye
- Başak Günçer
  - Department of Biophysics, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Türkiye
4
Wang S, Dong K, Liang D, Zhang Y, Li X, Song T. MIPPIS: protein-protein interaction site prediction network with multi-information fusion. BMC Bioinformatics 2024; 25:345. [PMID: 39497043 PMCID: PMC11536593 DOI: 10.1186/s12859-024-05964-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2024] [Accepted: 10/21/2024] [Indexed: 11/06/2024] Open
Abstract
BACKGROUND: The prediction of protein-protein interaction sites plays a crucial role in biochemical processes. Investigating the interaction between viruses and receptor proteins through biological techniques aids in understanding disease mechanisms and guides the development of corresponding drugs. While various methods have been proposed in the past, they often suffer from drawbacks such as long processing times, high costs, and low accuracy. RESULTS: Addressing these challenges, we propose a novel protein-protein interaction site prediction network based on multi-information fusion. In our approach, the initial amino acid features are depicted by the position-specific scoring matrix, hidden Markov model, dictionary of protein secondary structure, and one-hot encoding. Simultaneously, we adopt a multi-channel approach to extract deep-level amino acid features from different perspectives. The graph convolutional network channel effectively extracts spatial structural information. The bidirectional long short-term memory channel treats the amino acid sequence as natural language, capturing the protein's primary structure information. The ProtT5 protein large language model channel outputs a more comprehensive amino acid embedding representation, providing a robust complement to the two aforementioned channels. Finally, the obtained amino acid features are fed into the prediction layer for the final prediction. CONCLUSION: Compared with six protein structure-based methods and six protein sequence-based methods, our model achieves optimal performance across evaluation metrics, including accuracy, precision, F1, Matthews correlation coefficient, and area under the precision recall curve, which demonstrates the superiority of our model.
Affiliation(s)
- Shuang Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Kaiyu Dong
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Dingming Liang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Yunjing Zhang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Xue Li
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - Department of Artificial Intelligence, Polytechnical University of Madrid, Madrid, 28031, Spain
5
Khan DA, Adhikary T, Sultana MT, Toukir IA. A comprehensive identification of potential molecular targets and small drugs candidate for melanoma cancer using bioinformatics and network-based screening approach. J Biomol Struct Dyn 2024; 42:7349-7369. [PMID: 37534476 DOI: 10.1080/07391102.2023.2240409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Accepted: 07/17/2023] [Indexed: 08/04/2023]
Abstract
Melanoma is the third most common malignant skin tumor and has increased in morbidity and mortality over the previous decade due to its rapid spread into the bloodstream or lymphatic system. This study used integrated bioinformatics and network-based methodologies to reliably identify molecular targets and small molecular medicines that may be more successful for Melanoma diagnosis, prognosis and treatment. The statistical LIMMA approach utilized for bioinformatics analysis in this study found 246 common differentially expressed genes (cDEGs) between case and control samples from two microarray gene-expression datasets (GSE130244 and GSE15605). Protein-protein interaction network study revealed 15 cDEGs (PTK2, STAT1, PNO1, CXCR4, WASL, FN1, RUNX2, SOCS3, ITGA4, GNG2, CDK6, BRAF, AGO2, GTF2H1 and AR) to be critical in the development of melanoma (KGs). According to regulatory network analysis, the most important transcriptional and post-transcriptional regulators of DEGs and hub-DEGs are ten transcription factors and three miRNAs. We discovered the pathogenetic mechanisms of MC by studying DEGs' biological processes, molecular function, cellular components and KEGG pathways. We used molecular docking and dynamics modeling to select the four most expressed genes responsible for melanoma malignancy to identify therapeutic candidates. Then, utilizing the Connectivity Map (CMap) database, we analyzed the top 4-hub-DEGs-guided repurposable drugs. We validated four melanoma cancer drugs (Fisetin, Epicatechin Gallate, 1237586-97-8 and PF 431396) using molecular dynamics simulation with their target proteins. As a result, the results of this study may provide resources to researchers and medical professionals for the wet-lab validation of MC diagnosis, prognosis and treatments. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Dhrubo Ahmed Khan
  - Department of Genetic Engineering and Biotechnology, Jashore University of Science and Technology, Jashore, Bangladesh
- Tonmoy Adhikary
  - Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
- Mst Tania Sultana
  - Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
- Imran Ahamed Toukir
  - Department of Chemical Engineering, Jashore University of Science and Technology, Jashore, Bangladesh
6
Bogdanova EA, Novoseletsky VN. ProBAN: Neural network algorithm for predicting binding affinity in protein-protein complexes. Proteins 2024; 92:1127-1136. [PMID: 38722047 DOI: 10.1002/prot.26700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 03/22/2024] [Accepted: 04/26/2024] [Indexed: 08/07/2024]
Abstract
Determining binding affinities in protein-protein and protein-peptide complexes is a challenging task that directly impacts the development of peptide and protein pharmaceuticals. Although several models have been proposed to predict the value of the dissociation constant and the Gibbs free energy, they are currently not capable of making stable predictions with high accuracy, in particular for complexes consisting of more than two molecules. In this work, we present ProBAN, a new method for predicting binding affinity in protein-protein complexes based on a deep convolutional neural network. Prediction is carried out for the spatial structures of complexes, presented in the format of a 4D tensor, which includes information about the location of atoms and their abilities to participate in various types of interactions realized in protein-protein and protein-peptide complexes. The effectiveness of the model was assessed both on an internal test data set containing complexes consisting of three or more molecules, as well as on an external test for the PPI-Affinity service. As a result, we managed to achieve the best prediction quality on these data sets among all the analyzed models: on the internal test, Pearson correlation R = 0.6, MAE = 1.60, on the external test, R = 0.55, MAE = 1.75. The open-source code, the trained ProBAN model, and the collected dataset are freely available at the following link https://github.com/EABogdanova/ProBAN.
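The two quantities named in the abstract, the dissociation constant and the Gibbs free energy, are linked by the standard relation ΔG = RT ln(Kd), with Kd taken relative to a 1 M standard state. The sketch below shows that conversion together with the mean absolute error used as an evaluation metric. It is not ProBAN code: the abstract does not state which quantity or units the model predicts, so the function names, the kcal/mol units, and the default temperature of 298.15 K are assumptions.

```python
import math
from typing import Sequence

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_delta_g(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from Kd in mol/L: dG = R*T*ln(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average absolute difference between predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# A nanomolar binder corresponds to roughly -12.3 kcal/mol at 298.15 K.
delta_g_nanomolar = kd_to_delta_g(1e-9)
```

Because ΔG depends on the logarithm of Kd, a tenfold error in Kd shifts ΔG by about 1.4 kcal/mol at room temperature, which gives some scale for interpreting an MAE if it is expressed in those units.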
7
Pollet L, Xia Y. Structure-guided Evolutionary Analysis of Interactome Network Rewiring at Single Residue Resolution in Yeasts. J Mol Biol 2024; 436:168641. [PMID: 38844045 DOI: 10.1016/j.jmb.2024.168641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 04/30/2024] [Accepted: 06/01/2024] [Indexed: 06/16/2024]
Abstract
Protein-protein interactions (PPIs) are known to rewire extensively during evolution leading to lineage-specific and species-specific changes in molecular processes. However, the detailed molecular evolutionary mechanisms underlying interactome network rewiring are not well-understood. Here, we combine high-confidence PPI data, high-resolution three-dimensional structures of protein complexes, and homology-based structural annotation transfer to construct structurally-resolved interactome networks for the two yeasts S. cerevisiae and S. pombe. We then classify PPIs according to whether they are preserved or different between the two yeast species and compare site-specific evolutionary rates of interfacial versus non-interfacial residues for these different categories of PPIs. We find that residues in PPI interfaces evolve significantly more slowly than non-interfacial residues when using lineage-specific measures of evolutionary rate, but not when using non-lineage-specific measures. Furthermore, both lineage-specific and non-lineage-specific evolutionary rate measures can distinguish interfacial residues from non-interfacial residues for preserved PPIs between the two yeasts, but only the lineage-specific measure is appropriate for rewired PPIs. Finally, both lineage-specific and non-lineage-specific evolutionary rate measures are appropriate for elucidating structural determinants of protein evolution for residues outside of PPI interfaces. Overall, our results demonstrate that unlike tertiary structures of single proteins, PPIs and PPI interfaces can be highly volatile in their evolution, thus requiring the use of lineage-specific measures when studying their evolution. These results yield insight into the evolutionary design principles of PPIs and the mechanisms by which interactions are preserved or rewired between species, improving our understanding of the molecular evolution of PPIs and PPI interfaces at the residue level.
Affiliation(s)
- Léah Pollet
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
- Yu Xia
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
8
Fatima T, Mubasher MM, Rehman HM, Niyazi S, Alanzi AR, Kalsoom M, Khalid S, Bashir H. Computational modeling study of IL-15-NGR peptide fusion protein: a targeted therapeutics for hepatocellular carcinoma. AMB Express 2024; 14:91. [PMID: 39133343 PMCID: PMC11319546 DOI: 10.1186/s13568-024-01747-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 07/25/2024] [Indexed: 08/13/2024] Open
Abstract
The primary challenge to improving existing cancer treatment is to develop drugs that specifically target tumor cells. The NGR peptide is a tumor-homing peptide that selectively targets cancer cells, while interleukin 15 is a pleiotropic cytokine with anticancer properties. This study computationally engineered an IL15-NGR fusion peptide by linking the homing peptide NGR with the targeting peptide IL-15. After evaluating and validating the chimeric peptide, we docked it to the IL-15Rα/IL-15Rβ/γc heterodimer receptor and examined the interactions and binding energy; lastly, molecular dynamics simulations were performed. The secondary and tertiary structures, along with physicochemical properties of the designed IL-15-NGR chimeric protein, were predicted using the GOR IV, trRosetta and ProtParam online servers, respectively. The quality and 3D structure validation were confirmed via ProSA-web and SAVES 6.0 analysis, which predicted an ERRAT score of 96.72%, with 97.6% of residues in the Ramachandran plot, validating its structure. Finally, docking, MD simulations and interaction analysis were performed using ClusPro 2.0, GROMACS and PDBsum, which exhibited significant hydrogen bonding and salt bridges, confirming the formation of a stable docked complex. These results were further corroborated by simulation analysis, which demonstrated a stable and dynamic behavior of the docked complex in a biological environment. The SOLUPROT tool predicted a high expression value of 0.844 for the fusion protein in E. coli. These findings suggest efficient expression of the IL15-NGR fusion protein if its gene is inserted into E. coli and indicate its potential as a safe and effective anticancer treatment, paving the way for targeted therapeutic interventions.
Affiliation(s)
- Tehreem Fatima
  - Centre for Applied Molecular Biology (CAMB), University of the Punjab, 87-West canal, Bank Road, Lahore, 53700, Pakistan
- Hafiz Muhammad Rehman
  - Centre for Applied Molecular Biology (CAMB), University of the Punjab, 87-West canal, Bank Road, Lahore, 53700, Pakistan
  - University Institute of Medical Lab Technology, Faculty of Allied health sciences, The University of Lahore, Lahore, 54590, Pakistan
- Sakina Niyazi
  - School of Biotechnology, IFTM University, Moradabad, 244102, India
- Abdullah R Alanzi
  - Department of Pharmacognosy, College of Pharmacy, King Saud University, Riyadh, 11451, Saudi Arabia
- Maria Kalsoom
  - Centre for Applied Molecular Biology (CAMB), University of the Punjab, 87-West canal, Bank Road, Lahore, 53700, Pakistan
- Sania Khalid
  - Centre for Applied Molecular Biology (CAMB), University of the Punjab, 87-West canal, Bank Road, Lahore, 53700, Pakistan
- Hamid Bashir
  - Centre for Applied Molecular Biology (CAMB), University of the Punjab, 87-West canal, Bank Road, Lahore, 53700, Pakistan
9
Cao L, Yang W, Duan X, Shao Y, Zhang Z, Wang C, Sun K, Zhang M, Li H, Harada KH, Yang B. Novel analysis of functional relationship linking moyamoya disease to moyamoya syndrome. Heliyon 2024; 10:e34600. [PMID: 39149038 PMCID: PMC11325278 DOI: 10.1016/j.heliyon.2024.e34600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 07/10/2024] [Accepted: 07/12/2024] [Indexed: 08/17/2024] Open
Abstract
Objective: The aim of this study was to elucidate the genetic pathways associated with Moyamoya disease (MMD) and Moyamoya syndrome (MMS), compare the functional activities, and validate relevant related genes in an independent dataset. Methods: We conducted a comprehensive search for genetic studies on MMD and MMS across multiple databases and identified related genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed for these genes. Commonly shared genes were selected for further validation in the independent dataset, GSE189993. The Sangerbox platform was used to perform statistical analysis and visualize the results. P<0.05 indicated a statistically significant result. Results: We included 52 MMD and 51 MMS-related publications and identified 126 and 51 relevant genes, respectively. GO analysis for MMD showed significant enrichment in cytokine activity, cell membrane receptors, enzyme binding, and immune activity. A broader range of terms was enriched for MMS. KEGG pathway analysis for MMD highlighted immune and cellular activities, and pathways related to MMS prominently featured inflammation and metabolic disorders. Notably, nine overlapping genes were identified and validated. The expressions of RNF213, PTPN11, and MTHFR demonstrated significant differences in GSE189993. A combined receiver operating characteristic curve showed high diagnostic accuracy (AUC = 0.918). Conclusions: The findings indicate a close relationship of MMD with immune activity and MMS with inflammation, metabolic processes and other environmental factors in a given genetic background. Differentiating between MMD and MMS can enhance the understanding of their pathophysiology and inform the strategies for their diagnoses and treatment.
Affiliation(s)
- Lei Cao
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Wenzhi Yang
  - School of Life Science, Zhengzhou University, Zhengzhou, 450000, China
- Xiaozong Duan
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Yipu Shao
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Zhizhong Zhang
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Chenchao Wang
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Kaiwen Sun
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Manxia Zhang
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Hongwei Li
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Kouji H Harada
  - Department of Health and Environmental Sciences, Kyoto University Graduate School of Medicine, Kyoto, 6068501, Japan
- Bo Yang
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
10
Yadav AK, Murthy TPK, Divyashri G, Prasad N D, Prakash S, Vaishnavi V V, Shukla R, Singh TR. Computational screening of pathogenic missense nsSNPs in heme oxygenase 1 (HMOX1) gene and their structural and functional consequences. J Biomol Struct Dyn 2024; 42:5072-5091. [PMID: 37434323 DOI: 10.1080/07391102.2023.2231553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 06/07/2023] [Indexed: 07/13/2023]
Abstract
Heme Oxygenase 1 (HMOX1) is a cytoprotective enzyme, exhibiting the highest activity in the spleen, catalyzing the heme ring breakdown into products of biological significance: biliverdin, CO, and Fe2+. In vascular cells, HMOX1 possesses strong anti-apoptotic, antioxidant, anti-proliferative, anti-inflammatory, and immunomodulatory actions. The majority of these activities are crucial for the prevention of atherogenesis. Single amino acid substitutions in proteins generated by missense non-synonymous single nucleotide polymorphisms (nsSNPs) in the protein-encoding regions of genes are potent enough to cause significant medical challenges due to the alteration of protein structure and function. The current study aimed at characterizing and analyzing high-risk nsSNPs associated with the human HMOX1 gene. Preliminary screening of the total available 288 missense SNPs was performed through the lens of deleteriousness and stability prediction tools. Finally, a total of seven nsSNPs (Y58D, A131T, Y134H, F166S, F167S, R183S and M186V) were found to be most deleterious by all tools that are present at highly conserved positions. Molecular dynamics simulations (MDS) analysis explained the mutational effects on the dynamic action of the wild-type and mutant proteins. In a nutshell, R183S (rs749644285) was identified as a highly detrimental mutation that could significantly render the enzymatic activity of HMOX1. The finding of this computational analysis might help subject the experimental confirmatory analysis to characterize the role of nsSNPs in HMOX1. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Arvind Kumar Yadav
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
- T P Krishna Murthy
  - Department of Biotechnology, Ramaiah Institute of Technology, Bengaluru, Karnataka, India
- Gangaraju Divyashri
  - Department of Biotechnology, Ramaiah Institute of Technology, Bengaluru, Karnataka, India
- Durga Prasad N
  - Department of Biotechnology, Ramaiah Institute of Technology, Bengaluru, Karnataka, India
- Sriraksha Prakash
  - Department of Biotechnology, Ramaiah Institute of Technology, Bengaluru, Karnataka, India
- Vijaya Vaishnavi V
  - Department of Biotechnology, Ramaiah Institute of Technology, Bengaluru, Karnataka, India
- Rohit Shukla
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
- Tiratha Raj Singh
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
11
Zhang F, Chang S, Wang B, Zhang X. DSSGNN-PPI: A Protein-Protein Interactions prediction model based on Double Structure and Sequence graph neural networks. Comput Biol Med 2024; 177:108669. [PMID: 38833802 DOI: 10.1016/j.compbiomed.2024.108669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 04/04/2024] [Accepted: 05/26/2024] [Indexed: 06/06/2024]
Abstract
The process of experimentally confirming complex interaction networks among proteins is time-consuming and laborious. This study aims to address Protein-Protein Interactions (PPIs) prediction based on graph neural networks (GNN). A novel multilevel prediction model for PPIs named DSSGNN-PPI (Double Structure and Sequence GNN for PPIs) is designed. Initially, a distance graph between amino acid residues is constructed. Subsequently, the distance graph is fed into an underlying graph attention network module. This enables us to efficiently learn vector representations that encode the three-dimensional structure of proteins and simultaneously aggregate key local patterns and overall topological information to obtain graph embedding that adequately represent local and global structural features. In addition, the embedding representations that reflect sequence properties are obtained. Two features are fused to construct high-level protein complex networks, which are fed into the designed gated graph attention network to extract complex topological patterns. By combining heterogeneous multi-source information from downstream structure graph and upstream sequence models, the understanding of PPIs is comprehensively enhanced. A series of evaluation results validate the remarkable effectiveness of DSSGNN-PPI framework in enhancing the prediction of multi-type interactions among proteins. The multilevel representation learning and information fusion strategies provide a new effective solution paradigm for structural biology problems. The source code for DSSGNN-PPI has been hosted on GitHub and is available at https://github.com/cstudy1/DSSGNN-PPI.
Affiliation(s)
- Fan Zhang
  - Huaihe Hospital of Henan University, Kaifeng 475004, China; School of Computer and Information Engineering, Henan University, Kaifeng 475004, China
- Sheng Chang
  - School of Computer and Information Engineering, Henan University, Kaifeng 475004, China
- Binjie Wang
  - Huaihe Hospital of Henan University, Kaifeng 475004, China
- Xinhong Zhang
  - School of Software, Henan University, Kaifeng, 475004, China
12
Fan Z, Zhao H, Zhou J, Li D, Fan Y, Bi Y, Ji S. A versatile attention-based neural network for chemical perturbation analysis and its potential to aid surgical treatment: A experimental study. Int J Surg 2024; 110:01279778-990000000-01656. [PMID: 39017949 PMCID: PMC11634177 DOI: 10.1097/js9.0000000000001781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 05/30/2024] [Indexed: 07/18/2024]
Abstract
Deep learning models have emerged as rapid, accurate, and effective approaches for clinical decisions. Through a combination of drug screening and deep learning models, drugs that may benefit patients before and after surgery can be discovered to reduce the risk of complications or speed recovery. However, most existing drug prediction methods have high data requirements and lack interpretability, which has a limited role in adjuvant surgical treatment. To address these limitations, we propose the attention-based convolution transpositional interfusion network (ACTIN) for flexible and efficient drug discovery. ACTIN leverages the graph convolution and the transformer mechanism, utilizing drug and transcriptome data to assess the impact of chemical pharmacophores containing certain elements on gene expression. Remarkably, just with only 393 training instances, only one-tenth of the other models, ACTIN achieves state-of-the-art performance, demonstrating its effectiveness even with limited data. By incorporating chemical element embedding disparity and attention mechanism-based parameter analysis, it identifies the possible pharmacophore containing certain elements that could interfere with specific cell lines, which is particularly valuable for screening useful pharmacophores for new drugs tailored to adjuvant surgical treatment. To validate its reliability, we conducted comprehensive examinations by utilizing transcriptome data from the lung tissue of fatal COVID-19 patients as additional input for ACTIN, we generated novel lead chemicals that align with clinical evidence. In summary, ACTIN offers insights into the perturbation biases of elements within pharmacophore on gene expression, which holds the potential for guiding the development of new drugs that benefit surgical treatment.
Affiliation(s)
- Zheqi Fan
  - Department of Orthopaedics, The First Medical Centre, Chinese PLA General Hospital, Beijing
- Houming Zhao
  - Department of Urology, The Third Medical Center, Chinese PLA General Hospital, Beijing
- Jingcheng Zhou
  - Senior Department of Otolaryngology-Head and Neck Surgery, The Sixth Medical Center, Chinese PLA General Hospital, Beijing
- Dingchang Li
  - Department of General Surgery, The First Medical Centre, Chinese PLA General Hospital, Beijing
- Yunlong Fan
  - Department of Dermatology, The Seventh Medical Center, Chinese PLA General Hospital, Beijing
- Yiming Bi
  - Graduate School of PLA Medical College, Chinese PLA General Hospital, Beijing, People’s Republic of China
- Shuaifei Ji
  - Graduate School of PLA Medical College, Chinese PLA General Hospital, Beijing, People’s Republic of China
|
13
|
Gong Y, Li R, Liu Y, Wang J, Cao B, Fu X, Li R, Chen DZ. MR2CPPIS: Accurate prediction of protein-protein interaction sites based on multi-scale Res2Net with coordinate attention mechanism. Comput Biol Med 2024; 176:108543. [PMID: 38744015 DOI: 10.1016/j.compbiomed.2024.108543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 04/09/2024] [Accepted: 04/28/2024] [Indexed: 05/16/2024]
Abstract
Proteins play a vital role in various biological processes and achieve their functions through protein-protein interactions (PPIs). Thus, accurate identification of PPI sites is essential. Traditional biological methods for identifying PPIs are costly, labor-intensive, and time-consuming. The development of computational prediction methods for PPI sites offers promising alternatives. Most known deep learning (DL) methods employ layer-wise multi-scale CNNs to extract features from protein sequences. However, these methods usually neglect the spatial positions and hierarchical information embedded within protein sequences, which are crucial for PPI site prediction. In this paper, we propose MR2CPPIS, a novel sequence-based DL model that utilizes the multi-scale Res2Net with coordinate attention mechanism to exploit multi-scale features and enhance PPI site prediction capability. We leverage the multi-scale Res2Net to expand the receptive field for each network layer, thus capturing multi-scale information of protein sequences at a granular level. To further explore the local contextual features of each target residue, we employ a coordinate attention block to characterize the precise spatial position information, enabling the network to effectively extract long-range dependencies. We evaluate MR2CPPIS on three public benchmark datasets (Dset 72, Dset 186, and PDBset 164), achieving state-of-the-art performance. The source code is available at https://github.com/YyinGong/MR2CPPIS.
Affiliation(s)
- Yinyin Gong
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Rui Li
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Yan Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Jilong Wang
  - Peng Cheng Laboratory, Shenzhen, 518066, China
- Buwen Cao
  - College of Information and Electronic Engineering, Hunan City University, Yiyang, 413002, China
- Xiangzheng Fu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Renfa Li
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China
- Danny Z Chen
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
|
14
|
Lin B, Luo X, Liu Y, Jin X. A comprehensive review and comparison of existing computational methods for protein function prediction. Brief Bioinform 2024; 25:bbae289. [PMID: 39003530 PMCID: PMC11246557 DOI: 10.1093/bib/bbae289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 05/18/2024] [Indexed: 07/15/2024] Open
Abstract
Protein function prediction is critical for understanding the cellular physiological and biochemical processes, and it opens up new possibilities for advancements in fields such as disease research and drug discovery. During the past decades, with the exponential growth of protein sequence data, many computational methods for predicting protein function have been proposed. Therefore, a systematic review and comparison of these methods are necessary. In this study, we divide these methods into four different categories, including sequence-based methods, 3D structure-based methods, PPI network-based methods and hybrid information-based methods. Furthermore, their advantages and disadvantages are discussed, and then their performance is comprehensively evaluated and compared. Finally, we discuss the challenges and opportunities present in this field.
Affiliation(s)
- Baohui Lin
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, Guangdong 518118, China
- Xiaoling Luo
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Shenzhen, Guangdong, China
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518061, China
- Yumeng Liu
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, Guangdong 518118, China
- Xiaopeng Jin
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, Guangdong 518118, China
|
15
|
González-Esparragoza D, Carrasco-Carballo A, Rosas-Murrieta NH, Millán-Pérez Peña L, Luna F, Herrera-Camacho I. In Silico Analysis of Protein-Protein Interactions of Putative Endoplasmic Reticulum Metallopeptidase 1 in Schizosaccharomyces pombe. Curr Issues Mol Biol 2024; 46:4609-4629. [PMID: 38785548 PMCID: PMC11120530 DOI: 10.3390/cimb46050280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/26/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024] Open
Abstract
Ermp1 is a putative metalloprotease from Schizosaccharomyces pombe and a member of the Fxna peptidases. Although their function is unknown, orthologous proteins from rats and humans have been associated with the maturation of ovarian follicles and increased ER stress. This study proposes the first prediction of its PPIs by comparing interologs between humans and yeasts, together with molecular docking and dynamics of the M28 domain of Ermp1 with possible target proteins. As a result, 45 proteins are proposed that could interact with the metalloprotease. Most of these proteins are related to the transport of Ca²⁺ and the metabolism of amino acids and proteins. Docking and molecular dynamics suggest that the M28 domain of Ermp1 could hydrolyze leucine and methionine residues of Amk2, Ypt5 and Pex12. These results could support future experimental investigations of other Fxna peptidases, such as human ERMP1.
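The interolog comparison mentioned in this abstract can be illustrated with a minimal sketch: if two proteins interact in one species and each has an ortholog in another species, the ortholog pair is proposed as a candidate interaction. The function name `predict_interologs`, the toy identifiers, and the ortholog table below are hypothetical and not taken from the paper, whose pipeline also includes docking and molecular dynamics that are not shown here.

```python
from itertools import product


def predict_interologs(source_ppis, orthologs):
    """Transfer interactions from a source species to a target species.

    source_ppis: iterable of (a, b) protein pairs known to interact
                 in the source species.
    orthologs:   dict mapping a source protein to the set of its
                 orthologs in the target species.
    Returns a set of unordered candidate pairs (frozensets).
    """
    predicted = set()
    for a, b in source_ppis:
        # Every combination of orthologs of a and b is a candidate.
        for x, y in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if x != y:  # skip self-pairs
                predicted.add(frozenset((x, y)))
    return predicted


# Toy data with made-up identifiers (H* = source, Y* = target).
human_ppis = [("H1", "H2"), ("H2", "H3"), ("H4", "H5")]
human_to_yeast = {"H1": {"Y1"}, "H2": {"Y2", "Y2b"}, "H3": {"Y3"}}

pairs = predict_interologs(human_ppis, human_to_yeast)
for pair in sorted(tuple(sorted(p)) for p in pairs):
    print(pair)
```

Pairs whose source proteins have no ortholog in the table (here H4 and H5) produce no candidates, which is why ortholog coverage bounds the number of predictions.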
Affiliation(s)
- Dalia González-Esparragoza
  - Laboratorio de Bioquímica y Biología Molecular, Centro de Química del Instituto de Ciencias (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
  - Laboratorio de Elucidación y Síntesis en Química Orgánica, Instituto de Ciencias de la Universidad Autónoma de Puebla (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
- Alan Carrasco-Carballo
  - Laboratorio de Elucidación y Síntesis en Química Orgánica, Instituto de Ciencias de la Universidad Autónoma de Puebla (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
  - Consejo Nacional de Humanidades Ciencia y Tecnología, Instituto de Ciencias de la Universidad Autónoma de Puebla (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
- Nora H. Rosas-Murrieta
  - Laboratorio de Bioquímica y Biología Molecular, Centro de Química del Instituto de Ciencias (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
- Lourdes Millán-Pérez Peña
  - Laboratorio de Bioquímica y Biología Molecular, Centro de Química del Instituto de Ciencias (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
- Felix Luna
  - Laboratorio de Neuroendocrinología, Facultad de Ciencias Químicas, Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
- Irma Herrera-Camacho
  - Laboratorio de Bioquímica y Biología Molecular, Centro de Química del Instituto de Ciencias (ICUAP), Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico
|
16
|
Wechsler D, Bascompte J. Mechanistic interactions as the origin of modularity in biological networks. Proc Biol Sci 2024; 291:20240269. [PMID: 38628127 PMCID: PMC11021940 DOI: 10.1098/rspb.2024.0269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 03/15/2024] [Indexed: 04/19/2024] Open
Abstract
Biological networks are often modular. Explanations for this peculiarity either assume an adaptive advantage of a modular design such as higher robustness, or attribute it to neutral factors such as constraints underlying network assembly. Interestingly, most insights on the origin of modularity stem from models in which interactions are either determined by highly simplistic mechanisms, or have no mechanistic basis at all. Yet, empirical knowledge suggests that biological interactions are often mediated by complex structural or behavioural traits. Here, we investigate the origins of modularity using a model in which interactions are determined by potentially complex traits. Specifically, we model system elements (such as the species in an ecosystem) as finite-state machines (FSMs), and determine their interactions by means of communication between the corresponding FSMs. Using this model, we show that modularity probably emerges for free. We further find that the more modular an interaction network is, the less complex are the traits that mediate the interactions. Altogether, our results suggest that the conditions for modularity to evolve may be much broader than previously thought.
Affiliation(s)
- Daniel Wechsler
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 19, CH-8057 Zurich, Switzerland
- Jordi Bascompte
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 19, CH-8057 Zurich, Switzerland
|
17
|
Kim DY, Shin DY, Oh S, Kim I, Kim EJ. Gene Expression and DNA Methylation Profiling Suggest Potential Biomarkers for Azacitidine Resistance in Myelodysplastic Syndrome. Int J Mol Sci 2024; 25:4723. [PMID: 38731939 PMCID: PMC11083267 DOI: 10.3390/ijms25094723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 04/22/2024] [Accepted: 04/22/2024] [Indexed: 05/13/2024] Open
Abstract
Myelodysplastic syndrome/neoplasm (MDS) comprises a group of heterogeneous hematopoietic disorders that present with genetic mutations and/or cytogenetic changes and, in the advanced stage, exhibit wide-ranging gene hypermethylation. Patients with higher-risk MDS are typically treated with repeated cycles of hypomethylating agents, such as azacitidine. However, some patients fail to respond to this therapy, and fewer than 50% show hematologic improvement. In this context, we focused on the potential use of epigenetic data in clinical management to aid in diagnostic and therapeutic decision-making. First, we used the F-36P MDS cell line to establish an azacitidine-resistant F-36P cell line. We performed expression profiling of azacitidine-resistant and parental F-36P cells and used biological and bioinformatics approaches to analyze candidate azacitidine-resistance-related genes and pathways. Eighty candidate genes were identified and found to encode proteins previously linked to cancer, chronic myeloid leukemia, and transcriptional misregulation in cancer. Interestingly, 24 of the candidate genes had promoter methylation patterns that were inversely correlated with azacitidine resistance, suggesting that DNA methylation status may contribute to azacitidine resistance. In particular, the DNA methylation status and/or mRNA expression levels of four genes (AMER1, HSPA2, NCX1, and TNFRSF10C) may contribute to the clinical effects of azacitidine in MDS. Our study identifies candidate diagnostic genes for azacitidine resistance in MDS, which could help monitor treatment effectiveness as azacitidine therapy progresses in newly diagnosed MDS patients.
Affiliation(s)
- Da Yeon Kim
  - Division of Radiation Biomedical Research, Korea Institute of Radiological and Medical Sciences, Seoul 01812, Republic of Korea
  - Department of Radiological and Medico-Oncological Sciences, University of Science and Technology, Daejeon 34113, Republic of Korea
- Dong-Yeop Shin
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
  - Center for Medical Innovation, Biomedical Research Institute, Seoul National University Hospital, Seoul 03080, Republic of Korea
  - Division of Hematology and Medical Oncology, Department of Internal Medicine, Seoul National University Hospital, Seoul 03080, Republic of Korea
- Somi Oh
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
- Inho Kim
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
  - Division of Hematology and Medical Oncology, Department of Internal Medicine, Seoul National University Hospital, Seoul 03080, Republic of Korea
- Eun Ju Kim
  - Division of Radiation Biomedical Research, Korea Institute of Radiological and Medical Sciences, Seoul 01812, Republic of Korea
  - Department of Radiological and Medico-Oncological Sciences, University of Science and Technology, Daejeon 34113, Republic of Korea
  - Institute for Molecular Bioscience, The University of Queensland, Carmody Rd., St Lucia, Brisbane, QLD 4072, Australia
  - Genomics and Machine Learning Lab, QIMR Berghofer Medical Research Institute, Herston Rd., Herston, Brisbane, QLD 4006, Australia
|
18
|
Su Z, Griffin B, Emmons S, Wu Y. Prediction of interactions between cell surface proteins by machine learning. Proteins 2024; 92:567-580. [PMID: 38050713 DOI: 10.1002/prot.26648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 11/15/2023] [Accepted: 11/20/2023] [Indexed: 12/06/2023]
Abstract
Cells detect changes in their external environments or communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions in order to fulfill their functions. The interactions between cell surface proteins are highly dynamic and, thus, challenging to detect using traditional experimental techniques. Here, we tackle this challenge using a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains in the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high-dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile so that the probability of interaction between the query proteins could be predicted. We tested our models on an experimentally derived dataset that contains 564 cell surface proteins in humans. The cross-validation results show that we can achieve higher than 70% accuracy in identifying the PPIs within this dataset. We then applied this method to a group of 46 cell surface proteins in Caenorhabditis elegans. We screened every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins in addition to current state-of-the-art experimental techniques. The tool is freely accessible for use by the scientific community. Moreover, the general framework of the machine learning classification can also be extended to study the interactions of proteins in other domain superfamilies.
Affiliation(s)
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
- Brian Griffin
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA
- Scott Emmons
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
|
19
|
Idrees S, Paudel KR, Sadaf T, Hansbro PM. Uncovering domain motif interactions using high-throughput protein-protein interaction detection methods. FEBS Lett 2024; 598:725-742. [PMID: 38439692 DOI: 10.1002/1873-3468.14841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 02/18/2024] [Indexed: 03/06/2024]
Abstract
Protein-protein interactions (PPIs) are often mediated by short linear motifs (SLiMs) in one protein and a domain in another, known as domain-motif interactions (DMIs). During the past decade, SLiMs have been studied to find their role in cellular functions such as post-translational modifications, regulatory processes, protein scaffolding, cell cycle progression, cell adhesion, cell signalling and substrate selection for proteasomal degradation. This review provides a comprehensive overview of the current PPI detection techniques and resources, focusing on their relevance to capturing interactions mediated by SLiMs. We also address the challenges associated with capturing DMIs. Moreover, a case study analysing the BioGrid database as a source of DMI prediction revealed significant known DMI enrichment in different PPI detection methods. Overall, current high-throughput PPI detection methods can be a reliable source for predicting DMIs.
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Tayyaba Sadaf
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Philip M Hansbro
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
|
20
|
Singh S, Pandey AK, Prajapati VK. From genome to clinic: The power of translational bioinformatics in improving human health. Adv Protein Chem Struct Biol 2024; 139:1-25. [PMID: 38448133 DOI: 10.1016/bs.apcsb.2023.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
Translational bioinformatics (TBI) has transformed healthcare by providing personalized medicine and tailored treatment options by integrating genomic data and clinical information. In recent years, TBI has bridged the gap between genome and clinical data because of significant advances in informatics like quantum computing and utilizing state-of-the-art technologies. This chapter discusses the power of translational bioinformatics in improving human health, from uncovering disease-causing genes and variations to establishing new therapeutic techniques. We discuss key application areas of bioinformatics in clinical genomics, such as data sources and methods used in translational bioinformatics, the impact of translational bioinformatics on human health, and how machine learning and artificial intelligence are being used to mine vast amounts of data for drug development and precision medicine. We also look at the problems, constraints, and ethical concerns connected with exploiting genomic data and the future of translational bioinformatics and its potential impact on medicine and human health. Ultimately, this chapter emphasizes the great potential of translational bioinformatics to alter healthcare and enhance patient outcomes.
Affiliation(s)
- Satyendra Singh
  - Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, Bandarsindri, Kishangarh, Ajmer, Rajasthan, India
- Anurag Kumar Pandey
  - College of Biotechnology, Sardar Vallabhbhai Patel University of Agriculture and Technology, Meerut, Uttar Pradesh, India
- Vijay Kumar Prajapati
  - Department of Biochemistry, University of Delhi South Campus, Dhaula Kuan, New Delhi, India
|
21
|
Bao W, Liu Y, Chen B. Oral_voting_transfer: classification of oral microorganisms' function proteins with voting transfer model. Front Microbiol 2024; 14:1277121. [PMID: 38384719 PMCID: PMC10879614 DOI: 10.3389/fmicb.2023.1277121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 12/19/2023] [Indexed: 02/23/2024] Open
Abstract
Introduction: The oral microbial community is a typical example of the human body's highly complex microbial ecosystems. Oral microorganisms take part in human diseases, including oral cavity inflammation, mucosal disease, periodontal disease, tooth decay, and oral cancer. Oral microbes can also contribute to endocrine, digestive, and nervous system disorders, such as diabetes, digestive system diseases, and Alzheimer's disease. The proteins of oral microbes play significant roles in these serious diseases, so a good knowledge of oral microbes can help in analyzing the progression of related diseases. Moreover, high-dimensional features and imbalanced data add to the complexity of oral microbial problems, which can hardly be solved with traditional experimental methods. Methods: To deal with these challenges, we proposed a novel method, oral_voting_transfer, for such classification problems in the field of oral microorganisms. The method employed three features to classify five oral microorganisms, including Streptococcus mutans, Staphylococcus aureus, abiotrophy adjacent, bifidobacterial, and Capnocytophaga. First, we took a highly effective model that successfully classifies organelle proteins and transferred it to the oral microorganisms. Then, several classification methods were treated as local classifiers. Finally, the results were obtained by voting over the transfer classifiers and the local ones. Results and discussion: The proposed method performed well on the five oral microorganisms. The oral_voting_transfer is a standalone tool, and all its source code is publicly available at https://github.com/baowz12345/voting_transfer.
Affiliation(s)
- Wenzheng Bao
  - School of Information Engineering, Xuzhou University of Technology, Xuzhou, China
- Yujun Liu
  - School of Information Engineering, Xuzhou University of Technology, Xuzhou, China
- Baitong Chen
  - The Affiliated Xuzhou Municipal Hospital of Xuzhou Medical University, Xuzhou, China
  - Department of Stomatology, Xuzhou First People’s Hospital, Xuzhou, China
|
22
|
Fu X, Yuan Y, Qiu H, Suo H, Song Y, Li A, Zhang Y, Xiao C, Li Y, Dou L, Zhang Z, Cui F. AGF-PPIS: A protein-protein interaction site predictor based on an attention mechanism and graph convolutional networks. Methods 2024; 222:142-151. [PMID: 38242383 DOI: 10.1016/j.ymeth.2024.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/21/2023] [Revised: 01/04/2024] [Accepted: 01/13/2024] [Indexed: 01/21/2024] Open
Abstract
Protein-protein interactions play an important role in various biological processes. Interaction among proteins has a wide range of applications. Therefore, the correct identification of protein-protein interaction sites is crucial. In this paper, we propose a novel predictor for protein-protein interaction sites, AGF-PPIS, where we utilize a multi-head self-attention mechanism (introducing a graph structure), a graph convolutional network, and a feed-forward neural network. We use the Euclidean distance between protein residues to generate the corresponding protein graph as the input of AGF-PPIS. On the independent test dataset Test_60, AGF-PPIS achieves superior performance over comparative methods in terms of seven different evaluation metrics (ACC, precision, recall, F1-score, MCC, AUROC, AUPRC), which demonstrates the validity and superiority of the proposed AGF-PPIS model. The source code and the steps for usage of AGF-PPIS are available at https://github.com/fxh1001/AGF-PPIS.
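The distance-based graph construction described in this abstract can be sketched as follows. This is a minimal illustration, not the AGF-PPIS implementation: the function name `residue_graph`, the 8.0 cutoff, and the toy coordinates are assumptions, and the abstract does not state which atom represents a residue or which threshold the authors use.

```python
import math


def residue_graph(coords, cutoff=8.0):
    """Build a residue contact graph from 3D coordinates.

    coords: list of (x, y, z) positions, one per residue
            (for example one representative atom each; an assumption).
    cutoff: distance threshold for drawing an edge (hypothetical value).
    Returns (dist, adj): the pairwise Euclidean distance matrix and a
    0/1 adjacency matrix without self-loops.
    """
    n = len(coords)
    # Pairwise Euclidean distances between residue positions.
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]
    # Connect two distinct residues when they lie within the cutoff.
    adj = [[1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
           for i in range(n)]
    return dist, adj


# Toy coordinates: the first two residues are close, the third is far away.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
dist, adj = residue_graph(coords)
print(dist[0][1])  # distance between residues 0 and 1
print(adj)
```

An adjacency matrix of this kind, together with per-residue features, is the usual input shape for a graph convolutional layer.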
Affiliation(s)
- Xiuhao Fu
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Ye Yuan
  - Beidahuang Industry Group General Hospital, Harbin 150001, China
- Haoye Qiu
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Haodong Suo
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Yingying Song
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Anqi Li
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Yupeng Zhang
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Cuilin Xiao
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Yazi Li
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Lijun Dou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH 44106, USA
- Zilong Zhang
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Feifei Cui
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
|
23
|
Meng W, Pan H, Sha Y, Zhai X, Xing A, Lingampelly SS, Sripathi SR, Wang Y, Li K. Metabolic Connectome and Its Role in the Prediction, Diagnosis, and Treatment of Complex Diseases. Metabolites 2024; 14:93. [PMID: 38392985 PMCID: PMC10890086 DOI: 10.3390/metabo14020093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/17/2024] [Accepted: 01/25/2024] [Indexed: 02/25/2024] Open
Abstract
The interconnectivity of advanced biological systems is essential for their proper functioning. In modern connectomics, biological entities such as proteins, genes, RNA, DNA, and metabolites are often represented as nodes, while the physical, biochemical, or functional interactions between them are represented as edges. Among these entities, metabolites are particularly significant as they exhibit a closer relationship to an organism's phenotype compared to genes or proteins. Moreover, the metabolome has the ability to amplify small proteomic and transcriptomic changes, even those from minor genomic changes. Metabolic networks, which consist of complex systems comprising hundreds of metabolites and their interactions, play a critical role in biological research by mediating energy conversion and chemical reactions within cells. This review provides an introduction to common metabolic network models and their construction methods. It also explores the diverse applications of metabolic networks in elucidating disease mechanisms, predicting and diagnosing diseases, and facilitating drug development. Additionally, it discusses potential future directions for research in metabolic networks. Ultimately, this review serves as a valuable reference for researchers interested in metabolic network modeling, analysis, and their applications.
Affiliation(s)
- Weiyu Meng
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Hongxin Pan
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Yuyang Sha
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Xiaobing Zhai
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Abao Xing
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Srinivasa R. Sripathi
  - Henderson Ocular Stem Cell Laboratory, Retina Foundation of the Southwest, Dallas, TX 75231, USA
- Yuefei Wang
  - National Key Laboratory of Chinese Medicine Modernization, State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin 301617, China
- Kefeng Li
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
|
24
|
Czerczak-Kwiatkowska K, Kaminska M, Fraczyk J, Majsterek I, Kolesinska B. Searching for EGF Fragments Recreating the Outer Sphere of the Growth Factor Involved in Receptor Interactions. Int J Mol Sci 2024; 25:1470. [PMID: 38338748 PMCID: PMC10855902 DOI: 10.3390/ijms25031470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 01/21/2024] [Accepted: 01/22/2024] [Indexed: 02/12/2024] Open
Abstract
The aims of this study were to determine whether peptide microarrays obtained using the SPOT technique (immobilized on cellulose) and specific polyclonal antibodies can be used to select fragments that reconstruct the outer sphere of proteins, and to ascertain whether the selected peptide fragments can be useful in the study of their protein-protein and/or peptide-protein interactions. Using this approach, epidermal growth factor (EGF) fragments responsible for the interaction with the EGF receptor (EGFR) were sought. A library of EGF fragments immobilized on cellulose was obtained using triazine condensing reagents. Experiments on the interactions with EGFR confirmed the high affinity of the selected peptide fragments. Biological tests on cells showed the lack of cytotoxicity of the EGF fragments. Selected EGF fragments can be used in various areas of medicine.
Affiliation(s)
- Katarzyna Czerczak-Kwiatkowska
- Faculty of Chemistry, Institute of Organic Chemistry, Lodz University of Technology, Zeromskiego 116, 90-924 Lodz, Poland; (K.C.-K.); (J.F.)
- Marta Kaminska
- Division of Biophysics, Institute of Materials Science and Engineering, Lodz University of Technology, Stefanowskiego 1/15, 90-924 Lodz, Poland;
- Justyna Fraczyk
- Faculty of Chemistry, Institute of Organic Chemistry, Lodz University of Technology, Zeromskiego 116, 90-924 Lodz, Poland; (K.C.-K.); (J.F.)
- Ireneusz Majsterek
- Department of Clinical Chemistry and Biochemistry, Medical University of Lodz, Narutowicza 60, 90-136 Lodz, Poland;
- Beata Kolesinska
- Faculty of Chemistry, Institute of Organic Chemistry, Lodz University of Technology, Zeromskiego 116, 90-924 Lodz, Poland; (K.C.-K.); (J.F.)
25
Mansouri A, Yousef MS, Kowsar R, Miyamoto A. Homology Modeling, Molecular Dynamics Simulation, and Prediction of Bovine TLR2 Heterodimerization. Int J Mol Sci 2024; 25:1496. [PMID: 38338775 PMCID: PMC10855669 DOI: 10.3390/ijms25031496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/12/2024] Open
Abstract
Toll-like receptor 2 (TLR2) is a major membrane-bound receptor with ligand and species specificity that activates the host immune response. Heterodimerization of TLR2 with TLR1 (TLR2/1) or TLR6 (TLR2/6), triggered by ligand binding, is essential to initiating the signaling pathway. Bovine TLR2 (bTLR2) heterodimerization has not been defined yet compared with human and mouse TLR2s (hTLR2 and mTLR2). The aim of the present study was to model bovine TLRs (TLRs 1, 2 and 6) and create the heterodimeric forms of the bovine TLR2 using molecular dynamics (MD) simulations. We compared the intermolecular interactions in bTLR2/1-PAM3 and bTLR2/6-PAM2 with the hTLR2 and mTLR2 complexes through docking simulations and subsequent MD analyses. The present computational findings showed that bTLR2 dimerization could have a biological function and activate the immune response, similar to hTLR2 and mTLR2. Agonists and antagonists that are designed for hTLR2 and mTLR2 can target bTLR2. However, the experimental approaches to comparing the functional immune response of TLR2 across species were missing in the present study. This computational study provides a structural analysis of the bTLR2 interaction with bTLR1 and bTLR6 in the presence of an agonist/antagonist and reveals the three-dimensional structure of bTLR2 dimerization. The present findings could guide future experimental studies targeting bTLR2 with different ligands and lipopeptides.
Affiliation(s)
- Alireza Mansouri
- Global AgroMedicine Research Center (GAMRC), Obihiro University of Agriculture and Veterinary Medicine, Obihiro 080-8555, Japan; (A.M.); (M.S.Y.)
- Mohamed Samy Yousef
- Global AgroMedicine Research Center (GAMRC), Obihiro University of Agriculture and Veterinary Medicine, Obihiro 080-8555, Japan; (A.M.); (M.S.Y.)
- Department of Theriogenology, Faculty of Veterinary Medicine, Assiut University, Assiut 71515, Egypt
- Rasoul Kowsar
- Department of Animal Sciences, College of Agriculture, Isfahan University of Technology, Isfahan 84156-83111, Iran;
- Akio Miyamoto
- Global AgroMedicine Research Center (GAMRC), Obihiro University of Agriculture and Veterinary Medicine, Obihiro 080-8555, Japan; (A.M.); (M.S.Y.)
26
Choubey J, Wolkenhauer O, Chatterjee T. Systems Biology Approach to Analyze Microarray Datasets for Identification of Disease-Causing Genes: Case Study of Oral Squamous Cell Carcinoma. Methods Mol Biol 2024; 2719:13-31. [PMID: 37803110 DOI: 10.1007/978-1-0716-3461-5_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2023]
Abstract
The discovery of potential disease-causing genes can aid medical progress. The post-genomic era has made this a more difficult task. Modern high-throughput methods have not solved the problem of identifying disease genes. Conventional methods cannot be used to investigate many rare or lethal diseases. Monitoring gene expression values in different samples using microarray technology is one of the best and most accurate ways to identify disease-causing genes. One of the most recent advances in experimental molecular biology is microarrays, which allow researchers to simultaneously monitor the expression levels of thousands of genes. Statistical analysis of microarray data might aid gene discovery by revealing pathways related to the target gene and facilitating identification of candidate genes. Systems biology, an interdisciplinary approach, has emerged as a crucial analytic tool with the potential to reveal previously unidentified causes and consequences of human illness. Genetic, environmental, immunological, and neurological factors have been implicated in the development of complex disorders such as cancer. Because of this, it is important to approach the study of such diseases from a novel perspective. The systems biology approach allows us to rapidly identify disease-causing genes and assess their viability as therapeutic targets. This chapter demonstrates systems biology approaches to identify candidate genes using public databases. Oral squamous cell carcinoma (OSCC) is used as a model disease to show how systems biology can be used successfully to identify and prioritize disease genes.
Affiliation(s)
- Olaf Wolkenhauer
- Department of Systems Biology & Bioinformatics, University of Rostock, Rostock, Germany
27
Gopalakrishnan S, Venkatraman S. Prediction of influential proteins and enzymes of certain diseases using a directed unimodular hypergraph. Mathematical Biosciences and Engineering: MBE 2024; 21:325-345. [PMID: 38303425 DOI: 10.3934/mbe.2024015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Protein-protein interaction (PPI) analysis based on mathematical modeling is an efficient means of identifying hub proteins, corresponding enzymes and many underlying structures. In this paper, a method for the analysis of PPIs is introduced and used to analyze protein interactions in diseases such as Parkinson's, COVID-19 and diabetes mellitus. A directed hypergraph is used to represent PPIs. A novel directed hypergraph depth-first search algorithm is introduced to find the longest paths. The minor hypergraph, which represents the longest paths, reduces the dimension of the directed hypergraph and results in the unimodular hypergraph. The unimodular hypergraph property clusters related influential proteins and enzymes, thereby providing potential avenues for disease treatment.
Affiliation(s)
- Sathyanarayanan Gopalakrishnan
- Department of Mathematics, Srinivasa Ramanujan Centre, School of Arts, Sciences, Humanities and Education, SASTRA Deemed University, Thanjavur, India
- Swaminathan Venkatraman
- Department of Mathematics, School of Arts, Sciences, Humanities and Education, SASTRA Deemed University, Thanjavur, India
28
Wang H, Liu Z, Meng L, Zhang X. Comprehensive bioinformation analysis of differentially expressed genes in recurrent pregnancy loss. Hum Fertil 2023; 26:1015-1022. [PMID: 35306956 DOI: 10.1080/14647273.2022.2045636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2021] [Accepted: 01/25/2022] [Indexed: 11/04/2022]
Abstract
Recurrent pregnancy loss (RPL) occurs frequently, and its causes are complex. The aetiology of nearly 50% of RPL cases is still unknown. This study aimed to ascertain differentially expressed genes (DEGs) and pathways by comprehensive bioinformatics analysis. We downloaded the gene expression microarray dataset GSE165004 from the Gene Expression Omnibus (GEO). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on selected genes using the R programming language. A protein-protein interaction (PPI) network was constructed with the Search Tool for the Retrieval of Interacting Genes (STRING). Our analysis revealed that 1,869 genes were differentially expressed between the RPL and control groups. GO analysis revealed that the type 1 interferon and glycoprotein-related biological processes played irreplaceable roles, while KEGG enrichment analysis revealed that the cAMP signalling pathway and the prolactin signalling pathway played important roles. We further found that many DEGs in the RPL group were closely related to endometrial decidualization, such as IL17RD, IL16, SOX4, CREBBP, and POFUT1, and that Notch1 and RBPJ in the Notch signalling pathway family were down-regulated in the RPL group. The results provide valuable information on the pathogenesis of RPL.
Affiliation(s)
- Huaibin Wang
- School of Public Health, North China University of Science and Technology, Tangshan, P.R. China
- Zhao Liu
- School of Public Health, North China University of Science and Technology, Tangshan, P.R. China
- Lijun Meng
- Department of Environmental and Chemical Engineering, Tangshan University, Tangshan, P.R. China
- Xiujun Zhang
- School of Public Health, North China University of Science and Technology, Tangshan, P.R. China
29
Wang M, Gao Y, Chen H, Shen Y, Cheng J, Wang G. Bioinformatics strategies to identify differences in molecular biomarkers for ischemic stroke and myocardial infarction. Medicine (Baltimore) 2023; 102:e35919. [PMID: 37986378 PMCID: PMC10659606 DOI: 10.1097/md.0000000000035919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 10/11/2023] [Accepted: 10/12/2023] [Indexed: 11/22/2023] Open
Abstract
Ischemic strokes (ISs) are commonly treated by intravenous thrombolysis using a recombinant tissue plasminogen activator; however, successful treatment can only occur within 3 hours after the stroke. Therefore, it is crucial to determine the causes and underlying molecular mechanisms, identify molecular biomarkers for early diagnosis, and develop precise preventive treatments for strokes. We aimed to clarify the differences in gene expression, molecular mechanisms, and drug prediction approaches between IS and myocardial infarction (MI) using comprehensive bioinformatics analysis. The pathogenesis of these diseases was explored to provide directions for future clinical research. The IS (GSE58294 and GSE16561) and MI (GSE60993 and GSE141512) datasets were downloaded from the Gene Expression Omnibus database. IS and MI transcriptome data were analyzed using bioinformatics methods, and the differentially expressed genes (DEGs) were screened. A protein-protein interaction network was constructed using the STRING database and visualized using Cytoscape, and the candidate genes with high confidence scores were identified using Degree, MCC, EPC, and DMNC in the cytoHubba plug-in. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Network Analyst 3.0 was used to construct transcription factor (TF)-gene and microRNA (miRNA)-gene regulatory networks of the identified candidate genes. The DrugBank 5.0 database was used to identify gene-drug interactions. After bioinformatics analysis of IS and MI microarray data, 115 and 44 DEGs were obtained in IS and MI, respectively. Moreover, 8 hub genes, 2 miRNAs, and 3 TFs for IS and 8 hub genes, 13 miRNAs, and 2 TFs for MI were screened. The molecular pathology of IS and MI differed in terms of GO and KEGG enrichment pathways, TFs, miRNAs, and drugs. These findings provide possible directions for the diagnosis of IS and MI in the future.
Affiliation(s)
- Min Wang
- School of Clinical Medicine, Dali University, Dali, Yunnan, P.R. China
- Yuan Gao
- School of Clinical Medicine, Zhengzhou University, Zhengzhou, Henan, P.R. China
- Huaqiu Chen
- Xichang People’s Hospital, Xichang, Sichuan, P.R. China
- Ying Shen
- The First Hospital of Liangshan, Xichang, Sichuan, P.R. China
- Jianjie Cheng
- The First Affiliated Hospital of Dali University, Yunnan, P.R. China
- Guangming Wang
- School of Clinical Medicine, Dali University, Dali, Yunnan, P.R. China
- Center of Genetic Testing, The First Affiliated Hospital of Dali University, Dali, Yunnan, P.R. China
30
Santos TG, Silva KS, Lima RM, Silva LC, Pereira M. State of the art in protein-protein interactions within the fungi kingdom. Future Microbiol 2023; 18:1119-1131. [PMID: 37540069 DOI: 10.2217/fmb-2022-0274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023] Open
Abstract
Proteins rarely exert their function by themselves. Protein-protein interactions (PPIs) regulate virtually every biological process that takes place in a cell. Such interactions are targets for new therapeutic agents against all sorts of diseases, through the screening and design of a variety of inhibitors. Here we discuss several aspects of PPIs that contribute to prediction of protein function and drug discovery. As the high-throughput techniques continue to release biological data, targets for fungal therapeutics that rely on PPIs are being proposed worldwide. Computational approaches have reduced the time taken to develop new therapeutic approaches. The near future brings the possibility of developing new PPI and interaction network inhibitors and a revolution in the way we treat fungal diseases.
Affiliation(s)
- Thaynara G Santos
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Kleber Sf Silva
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Raisa M Lima
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Lívia C Silva
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Maristela Pereira
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
31
Huang T, Jiang N, Song Y, Pan H, Du A, Yu B, Li X, He J, Yuan K, Wang Z. Bioinformatics and system biology approach to identify the influences of SARS-CoV-2 on metabolic unhealthy obese patients. Front Mol Biosci 2023; 10:1274463. [PMID: 37877121 PMCID: PMC10591333 DOI: 10.3389/fmolb.2023.1274463] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/10/2023] [Accepted: 09/25/2023] [Indexed: 10/26/2023] Open
Abstract
Introduction: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has posed a significant challenge to individuals' health. Increasing evidence shows that patients with metabolic unhealthy obesity (MUO) and COVID-19 have more severe complications and a higher mortality rate. However, the molecular mechanisms underlying the association between MUO and COVID-19 are poorly understood. Methods: We sought to reveal the relationship between MUO and COVID-19 using bioinformatics and systems biology analysis approaches. Two datasets (GSE196822 and GSE152991) were employed to extract differentially expressed genes (DEGs) and to identify common hub genes, shared pathways, transcriptional regulatory networks, gene-disease relationships and candidate drugs. Results: Based on the 65 common DEGs identified, the complement-related pathways and neutrophil degranulation-related functions were found to be mainly affected. The hub genes identified included SPI1, CD163, C1QB, SIGLEC1, C1QA, ITGAM, CD14, FCGR1A, VSIG4 and C1QC. From the interaction network analysis, 65 transcription factors (TFs) were found to be the regulatory signals. Some infections, inflammation and liver diseases were found to be most coordinated with the hub genes. Importantly, Paricalcitol, 3,3',4,4',5-Pentachlorobiphenyl, PD 98059, Medroxyprogesterone acetate, Dexamethasone and Tretinoin HL60 UP showed potential as therapeutic agents against COVID-19 and MUO. Conclusion: This study provides new clues and references to treat both COVID-19 and MUO.
Affiliation(s)
- Kefei Yuan
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Zhen Wang
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
32
Hasib RA, Ali MC, Rahman MH, Ahmed S, Sultana S, Summa SZ, Shimu MSS, Afrin Z, Jamal MAHM. Integrated gene expression profiling and functional enrichment analyses to discover biomarkers and pathways associated with Guillain-Barré syndrome and autism spectrum disorder to identify new therapeutic targets. J Biomol Struct Dyn 2023; 42:11299-11321. [PMID: 37776011 DOI: 10.1080/07391102.2023.2262586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 09/17/2023] [Indexed: 10/01/2023]
Abstract
Guillain-Barré syndrome (GBS) is one of the most prominent acute immune-mediated peripheral neuropathies, while autism spectrum disorders (ASD) are a group of heterogeneous neurodevelopmental disorders. The complete mechanism underlying the neuropathophysiology of these disorders is still unclear. Even after recent breakthroughs in molecular biology, the link between GBS and ASD remains a mystery. Therefore, we applied well-established bioinformatic techniques to identify potential biomarkers and drug candidates for GBS and ASD. Seventeen common differentially expressed genes (DEGs) were identified for these two disorders, and these guided the rest of the research. The common genes were used to identify the protein-protein interaction (PPI) network and pathways associated with both disorders. Based on the PPI network, hub gene and module analysis identified two common DEGs, namely CXCL9 and CXCL10, which are vital in predicting the top drug candidates. Furthermore, coregulatory networks of TF-gene and TF-miRNA were built to detect the regulatory biomolecules. Among the drug candidates, imatinib had the highest docking and MM-GBSA scores with the well-known chemokine receptor CXCR3 and remained stable during the 100 ns molecular dynamics simulation, as validated by principal component analysis and the dynamic cross-correlation map. This study predicted the gene-based disease network for GBS and ASD and suggested prospective drug candidates. However, more in-depth research is required for clinical validation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Rizone Al Hasib
- Department of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh
- Laboratory of Medical and Environmental Biotechnology Islamic University, Kushtia, Bangladesh
- Md Chayan Ali
- Department of Biochemistry, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Md Habibur Rahman
- Department of Computer Science and Engineering, Islamic University, Kushtia, Bangladesh
- Center for Advanced Bioinformatics and Artificial Intelligent Research, Islamic University, Kushtia, Bangladesh
- Sabbir Ahmed
- Department of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh
- Shaharin Sultana
- Department of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh
- Laboratory of Medical and Environmental Biotechnology Islamic University, Kushtia, Bangladesh
- Sadia Zannat Summa
- Department of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh
- Laboratory of Medical and Environmental Biotechnology Islamic University, Kushtia, Bangladesh
- Zinia Afrin
- Department of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh
- Mohammad Abu Hena Mostofa Jamal
- Department of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh
- Laboratory of Medical and Environmental Biotechnology Islamic University, Kushtia, Bangladesh
33
Su Z, Griffin B, Emmons S, Wu Y. Prediction of Interactions between Cell Surface Proteins by Machine Learning. bioRxiv: The Preprint Server for Biology 2023:2023.09.12.557337. [PMID: 37745607 PMCID: PMC10515853 DOI: 10.1101/2023.09.12.557337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Cells detect changes in their external environment or communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions in order to fulfill their functions. The interactions between cell surface proteins are highly dynamic and thus challenging to detect using traditional experimental techniques. Here we tackle this challenge with a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains of the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells, or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high-dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile, so that the probability of interaction between the query proteins can be predicted. We tested our models on an experimentally derived dataset that contains 564 human cell surface proteins. The cross-validation results show that we can achieve higher than 70% accuracy in identifying the PPIs within this dataset. We then applied this method to a group of 46 cell surface proteins in C. elegans. We screened every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins, complementing current state-of-the-art experimental techniques. The tool is freely accessible for use by the scientific community. Moreover, the general framework of the machine learning classification can also be extended to study interactions of proteins in other domain superfamilies.
Affiliation(s)
- Zhaoqian Su
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
- Brian Griffin
- Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
- Scott Emmons
- Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
- Yinghao Wu
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
34
Tanouye FT, Alves JR, Spinozzi F, Itri R. Unveiling protein-protein interaction potential through Monte Carlo simulation combined with small-angle X-ray scattering. Int J Biol Macromol 2023; 248:125869. [PMID: 37473888 DOI: 10.1016/j.ijbiomac.2023.125869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 07/06/2023] [Accepted: 07/15/2023] [Indexed: 07/22/2023]
Abstract
Protein interactions are investigated under different conditions of lysozyme concentration, temperature and ionic strength by means of in-solution small-angle X-ray scattering (SAXS) experiments and Monte Carlo (MC) simulations. Initially, experimental data were analysed through a Hard-Sphere Double Yukawa (HSDY) model combined with the Random Phase Approximation (RPA), a closure relationship commonly used in the literature for monodisperse systems. Using MC, we found that the HSDY/RPA modelling fails to describe the protein-protein pair potential for moderate and dense systems at low ionic strength, mainly due to inherent distortions of the RPA. Our SAXS/MC results show that lysozyme concentrations between 2 mg/mL (diluted) and 20 mg/mL (not crowded) present a similar protein-protein pair potential, preserving a surface net charge of around 7 e, a protein diameter of 28 Å, a decay range of the attractive well potential of 3 Å and a depth of the well potential varying from 1 to 5 kBT depending on temperature and salt addition. Notably, we propose a novel method to analyse SAXS data from interacting proteins through MC simulations, which overcomes the deficiencies introduced by the use of a closure relationship. Furthermore, this new methodology of combining SAXS with MC simulations is a step toward investigating more complex systems, such as mixtures of proteins of distinct species with different molecular weights (and hence sizes) and surface net charges, in low-, moderate- and very high-density systems.
Affiliation(s)
- Francesco Spinozzi
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Italy
35
Molla R, Joshi PN, Reddy NC, Biswas D, Rai V. Protein-Protein Interaction in Multicomponent Reaction Enables Chemoselective, Site-Selective, and Modular Labeling of Native Proteins. Org Lett 2023; 25:6385-6390. [PMID: 37603545 DOI: 10.1021/acs.orglett.3c02405] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 08/23/2023]
Abstract
A protein's pool of functionalities presents a formidable challenge for its single-site modification. Here, we report a method to harness protein-protein interaction (PPI) to drive selective modification. It involves the chemoselective reversible generation of reactive intermediates and utilizes PPI-specificity to drive the subsequent site-selective irreversible step. The disintegrate (DIN) theory-driven multicomponent aza-Morita-Baylis-Hillman (aza-MBH) reaction offers homogeneous and modular single-site protein modification capable of late-stage mono- and dual-probe installation.
Affiliation(s)
- Rajib Molla
- Department of Chemistry, Indian Institute of Science Education and Research (IISER) Bhopal, Bhopal Bypass Road, Bhauri, Bhopal, 462 066, M.P., India
- Pralhad N Joshi
- Department of Chemistry, Indian Institute of Science Education and Research (IISER) Bhopal, Bhopal Bypass Road, Bhauri, Bhopal, 462 066, M.P., India
- Neelesh C Reddy
- Department of Chemistry, Indian Institute of Science Education and Research (IISER) Bhopal, Bhopal Bypass Road, Bhauri, Bhopal, 462 066, M.P., India
- Dwaipayan Biswas
- Department of Chemistry, Indian Institute of Science Education and Research (IISER) Bhopal, Bhopal Bypass Road, Bhauri, Bhopal, 462 066, M.P., India
- Vishal Rai
- Department of Chemistry, Indian Institute of Science Education and Research (IISER) Bhopal, Bhopal Bypass Road, Bhauri, Bhopal, 462 066, M.P., India
36
Song Y, Huang T, Pan H, Du A, Wu T, Lan J, Zhou X, Lv Y, Xue S, Yuan K. The influence of COVID-19 on colorectal cancer was investigated using bioinformatics and systems biology techniques. Front Med (Lausanne) 2023; 10:1169562. [PMID: 37457582 PMCID: PMC10348756 DOI: 10.3389/fmed.2023.1169562] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/19/2023] [Accepted: 06/15/2023] [Indexed: 07/18/2023] Open
Abstract
Introduction: Coronavirus disease 2019 (COVID-19) is a highly contagious global pandemic that poses a serious threat to human health. Colorectal cancer (CRC) is a risk factor for COVID-19 infection. Therefore, it is vital to investigate the intrinsic link between these two diseases. Methods: In this work, bioinformatics and systems biology techniques were used to detect the mutual pathways, molecular biomarkers, and potential drugs between COVID-19 and CRC. Results: A total of 161 common differentially expressed genes (DEGs) were identified based on the RNA sequencing datasets of the two diseases. Functional analysis was performed using ontology keywords, and pathway analysis was also performed. The common DEGs were further utilized to create a protein-protein interaction (PPI) network and to identify hub genes and key modules. The datasets revealed transcription factor-gene interactions and DEG-miRNA co-regulatory networks of the common DEGs, and were also used to predict possible drugs. The ten predicted drugs include troglitazone, estradiol, progesterone, calcitriol, genistein, dexamethasone, lucanthone, resveratrol, retinoic acid, and phorbol 12-myristate 13-acetate, some of which have been investigated as potential CRC and COVID-19 therapies. Discussion: By clarifying the relationship between COVID-19 and CRC, we hope to provide novel clues and promising therapeutic drugs to treat these two illnesses.
Affiliation(s)
- Yujia Song
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Tengda Huang
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Hongyuan Pan
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Animal Science and Technology, Guangxi University, Nanning, China
- Ao Du
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Tian Wu
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Animal Science and Technology, Guangxi University, Nanning, China
- Jiang Lan
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Xinyi Zhou
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Yue Lv
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Shuai Xue
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Kefei Yuan
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
|
37
|
Stark R. Protein-mediated interactions in the dynamic regulation of acute inflammation. BIOCELL 2023; 47:1191-1198. [PMID: 37261220 PMCID: PMC10231872 DOI: 10.32604/biocell.2023.027838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 02/09/2023] [Indexed: 06/02/2023]
Abstract
Protein-mediated interactions are the fundamental mechanism through which cells regulate health and disease. These interactions require physical contact between proteins and their respective targets of interest. These targets include not only other proteins but also nucleic acids and other important molecules as well. These proteins are often involved in multibody complexes that work dynamically to regulate cellular health and function. Various techniques have been adapted to study these important interactions, such as affinity-based assays, mass spectrometry, and fluorescent detection. The application of these techniques has led to a greater understanding of how protein interactions are responsible for both the instigation and resolution of acute inflammatory diseases. These pursuits aim to provide opportunities to target specific protein interactions to alleviate acute inflammation.
Affiliation(s)
- Ryan Stark
- Department of Pediatric Critical Care Medicine, Vanderbilt University Medical Center, 2200 Children’s Way, 5121 Doctors’ Office Tower, Nashville, TN 37232-9075
|
38
|
High-Throughput Sequencing Reveals That Rotundine Inhibits Colorectal Cancer by Regulating Prognosis-Related Genes. J Pers Med 2023; 13:jpm13030550. [PMID: 36983731 PMCID: PMC10052610 DOI: 10.3390/jpm13030550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 03/13/2023] [Accepted: 03/14/2023] [Indexed: 03/22/2023] Open
Abstract
Background: Rotundine is an herbal medicine with anti-cancer effects. However, little is known about the anti-cancer effect of rotundine on colorectal cancer. Therefore, our study aimed to investigate the specific molecular mechanism of rotundine inhibition of colorectal cancer. Methods: MTT and cell scratch assays were performed to investigate the effects of rotundine on the viability, migration, and invasion ability of SW480 cells. Changes in cell apoptosis were analyzed by flow cytometry. DEGs were detected by high-throughput sequencing after the action of rotundine on SW480 cells, and the DEGs were subjected to functional enrichment analysis. Bioinformatics analyses were performed to screen out prognosis-related DEGs of COAD, followed by enrichment analysis of the prognosis-related DEGs. Furthermore, prognostic models were constructed, including ROC analysis, risk curve analysis, PCA and t-SNE, Nomo analysis, and Kaplan–Meier prognostic analysis. Results: In this study, we showed that rotundine concentrations of 50 μM, 100 μM, 150 μM, and 200 μM inhibited the proliferation, migration, and invasion of SW480 cells in a time- and concentration-dependent manner. Rotundine did not induce SW480 cell apoptosis. Compared to the control group, high-throughput sequencing showed that there were 385 DEGs in the SW480 group, and these DEGs were associated with the Hippo signaling pathway. In addition, 16 of the DEGs were significantly associated with poorer prognosis in COAD, with MEF2B, CCDC187, PSD2, RGS16, PLXDC1, HELB, ASIC3, PLCH2, IGF2BP3, CLHC1, DNHD1, SACS, H1-4, ANKRD36, and ZNF117 being highly expressed in COAD and ARV1 being lowly expressed. Prognosis-related DEGs were mainly enriched in cancer-related pathways and biological functions, such as inositol phosphate metabolism, enterobactin transmembrane transporter activity, and enterobactin transport. Prognostic modeling also showed that these 16 DEGs could be used as predictors of overall survival prognosis in COAD patients.
Conclusions: Rotundine inhibits the development and progression of colorectal cancer by regulating the expression of these prognosis-related genes. Our findings could further provide new directions for the treatment of colorectal cancer.
|
39
|
Wu Z, Guo M, Jin X, Chen J, Liu B. CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction. Bioinformatics 2023; 39:7072461. [PMID: 36883697 PMCID: PMC10032634 DOI: 10.1093/bioinformatics/btad123] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/30/2022] [Revised: 02/28/2023] [Accepted: 03/05/2023] [Indexed: 03/09/2023] Open
Abstract
MOTIVATION Protein function annotation is fundamental to understanding biological mechanisms. The abundant genome-scale protein-protein interaction (PPI) networks, together with other protein biological attributes, provide rich information for annotating protein functions. As PPI networks and biological attributes describe protein functions from different perspectives, it is highly challenging to cross-fuse them for protein function prediction. Recently, several methods have combined PPI networks and protein attributes via graph neural networks (GNNs). However, GNNs may inherit or even magnify the bias caused by noisy edges in PPI networks. Moreover, GNNs with many stacked layers may suffer from over-smoothing of node representations. RESULTS We develop a novel protein function prediction method, CFAGO, to integrate single-species PPI networks and protein biological attributes via a multi-head attention mechanism. CFAGO is first pre-trained with an encoder-decoder architecture to capture the universal protein representation of the two sources. It is then fine-tuned to learn more effective protein representations for protein function prediction. Benchmark experiments on human and mouse datasets show that CFAGO outperforms state-of-the-art single-species network-based methods by at least 7.59%, 6.90%, and 11.68% in terms of m-AUPR, M-AUPR, and Fmax, respectively, demonstrating that cross-fusion by a multi-head attention mechanism can greatly improve protein function prediction. We further evaluate the quality of the captured protein representations in terms of the Davies-Bouldin score; the results show that protein representations cross-fused by the multi-head attention mechanism are at least 2.7% better than the original and concatenated representations. We believe CFAGO is an effective tool for protein function prediction. AVAILABILITY AND IMPLEMENTATION The source code of CFAGO and experiments data are available at: http://bliulab.net/CFAGO/.
Affiliation(s)
- Zhourun Wu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
- Mingyue Guo
- School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China
- Xiaopeng Jin
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, Guangdong 518118, China
- Junjie Chen
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
- Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
|
40
|
Aybey E, Gümüş Ö. SENSDeep: An Ensemble Deep Learning Method for Protein-Protein Interaction Sites Prediction. Interdiscip Sci 2023; 15:55-87. [PMID: 36346583 DOI: 10.1007/s12539-022-00543-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/04/2022] [Revised: 10/15/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE The determination of which amino acid in a protein interacts with other proteins is important in understanding the functional mechanism of that protein. Although there are experimental methods to detect protein-protein interaction sites (PPISs), these are costly, time-consuming, and require expertise. Therefore, many computational methods have been proposed to accelerate this type of research, but they are generally insufficient to predict PPISs accurately. There is a need for development in this field. METHODS In this study, we introduce a new PPISs prediction method. This method is a sequence-based Stacking ENSemble Deep (SENSDeep) learning method that has an ensemble learning model including the models of RNN, CNN, GRU sequence to sequence (GRUs2s), GRU sequence to sequence with an attention layer (GRUs2satt) and a multilayer perceptron. Two embedded features, secondary structure, and protein sequence information are added to the training data set in addition to twelve existing features to improve the prediction performance of the method. RESULTS SENSDeep trained on the training data set without two extra features obtains a better performance on some of the independent testing data sets than that of the other methods in the literature, especially on scoring metrics of sensitivity, F1, MCC, and AUPRC, having increments up to 63.5%, 19.3%, 18.5%, 11.4%, respectively. It is shown that the added extra features improve the performance of the method by having almost the same performance with less data as the method trained on the data set without these added features. On the other hand, different sizes of the sliding window are tried on the data sets and an optimal sliding window size for SENSDeep is found. Moreover, SENSDeep has also been compared to structure-based methods. Some of these methods have been found to perform better. 
Using SENSDeep obtained by training with both training data sets, PPISs prediction examples of various proteins that are not in these training data sets are also presented. Furthermore, execution times for SENSDeep and its submodels are shown. AVAILABILITY AND IMPLEMENTATION https://github.com/enginaybey/SENSDeep.
Affiliation(s)
- Engin Aybey
- Department of Health Bioinformatics, Ege University, 35100, Bornova, Izmir, Turkey.
- Rectorate, Marmara University, 34722, Kadıköy, Istanbul, Turkey.
- Özgür Gümüş
- Department of Computer Engineering, Ege University, 35100, Bornova, Izmir, Turkey
|
41
|
Yuen HY, Jansson J. Normalized L3-based link prediction in protein-protein interaction networks. BMC Bioinformatics 2023; 24:59. [PMID: 36814208 PMCID: PMC9945744 DOI: 10.1186/s12859-023-05178-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2021] [Accepted: 02/08/2023] [Indexed: 02/24/2023] Open
Abstract
BACKGROUND Protein-protein interaction (PPI) data is an important type of data used in functional genomics. However, high-throughput experiments are often insufficient to complete the PPI interactome of different organisms. Computational techniques are thus used to infer missing data, with link prediction being one such approach that uses the structure of the network of PPIs known so far to identify non-edges whose addition to the network would make it more sound, according to some underlying assumptions. Recently, a new idea called the L3 principle introduced biological motivation into PPI link predictions, yielding predictors that are superior to general-purpose link predictors for complex networks. Interestingly, the L3 principle can be interpreted in another way, so that other signatures of PPI networks can also be characterized for PPI predictions. This alternative interpretation uncovers candidate PPIs that the current L3-based link predictors may not be able to fully capture, underutilizing the L3 principle. RESULTS In this article, we propose a formulation of link predictors that we call NormalizedL3 (L3N) which addresses certain missing elements within L3 predictors in the perspective of network modeling. Our computational validations show that the L3N predictors are able to find missing PPIs more accurately (in terms of true positives among the predicted PPIs) than the previously proposed methods on several datasets from the literature, including BioGRID, STRING, MINT, and HuRI, at the cost of using more computation time in some of the cases. In addition, we found that L3-based link predictors (including L3N) ranked a different pool of PPIs higher than the general-purpose link predictors did. This suggests that different types of PPIs can be predicted based on different topological assumptions, and that even better PPI link predictors may be obtained in the future by improved network modeling.
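As a rough illustration of the L3 principle referred to in the abstract above, the sketch below scores non-adjacent protein pairs by counting paths of length three, each down-weighted by the degrees of its two intermediate nodes. This is the commonly cited form of the original L3 predictor, not the NormalizedL3 (L3N) formulation proposed in the article, whose normalization is not stated here. The function name `l3_scores` and the toy edge list are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def l3_scores(edges):
    """Score non-adjacent node pairs by degree-normalized paths of length 3.

    Illustrative sketch only: plain L3 scoring, not the article's L3N.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already an edge, so not a candidate link
        total = 0.0
        # Enumerate paths x - u - v - y and weight each by 1/sqrt(k_u * k_v).
        for u in adj[x]:
            for v in adj[y]:
                if v in adj[u]:
                    total += 1.0 / sqrt(len(adj[u]) * len(adj[v]))
        if total > 0:
            scores[(x, y)] = total
    return scores


# Hypothetical toy network: a simple chain A - B - C - D.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
toy_scores = l3_scores(toy_edges)
```

In the toy chain, only the pair (A, D) is joined by a path of length three, so it is the only candidate that receives a score. Pairs that merely share a neighbor, such as (A, C), are not rewarded, which is the point on which L3-style predictors differ from common-neighbor predictors.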
Affiliation(s)
- Ho Yin Yuen
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong, China.
- Jesper Jansson
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan.
|
42
|
Zhang Y, Li Z. RF_phage virion: Classification of phage virion proteins with a random forest model. Front Genet 2023; 13:1103783. [PMID: 36846294 PMCID: PMC9945117 DOI: 10.3389/fgene.2022.1103783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 12/30/2022] [Indexed: 02/10/2023] Open
Abstract
Introduction: Phages play essential roles in biological processes, and the virion proteins encoded by the phage genome constitute critical elements of the assembled phage particle. Methods: This study uses machine learning methods to classify phage virion proteins. We proposed a novel approach, RF_phage virion, for the effective classification of virion and non-virion proteins. The model uses four protein sequence coding methods as features, and the random forest algorithm was employed to solve the classification problem. Results: The performance of the RF_phage virion model was analyzed by comparing it with that of classical machine learning methods. The proposed method achieved a specificity (Sp) of 93.37%, sensitivity (Sn) of 90.30%, accuracy (Acc) of 91.84%, Matthews correlation coefficient (MCC) of 0.8371, and an F1 score of 0.9196.
Affiliation(s)
- Yanqing Zhang
- School of Finance, Xuzhou University of Technology, Xuzhou, China
- Zhiyuan Li
- School of Artificial Intelligence and Software College, Jiangsu Normal University Kewen College, Xuzhou, China. Correspondence: Zhiyuan Li
|
43
|
Kang Y, Elofsson A, Jiang Y, Huang W, Yu M, Li Z. AFTGAN: prediction of multi-type PPI based on attention free transformer and graph attention network. Bioinformatics 2023; 39:7000335. [PMID: 36692145 PMCID: PMC9897180 DOI: 10.1093/bioinformatics/btad052] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/12/2022] [Revised: 01/01/2023] [Accepted: 01/24/2023] [Indexed: 01/25/2023] Open
Abstract
MOTIVATION Protein-protein interaction (PPI) networks and transcriptional regulatory networks are critical in regulating cells and their signaling. A thorough understanding of PPIs can provide more insights into cellular physiology in normal and disease states. Although numerous methods have been proposed to predict PPIs, predicting interactions between unknown proteins remains challenging. In this study, a novel neural network named AFTGAN was constructed to predict multi-type PPIs. Regarding feature input, the ESM-1b embedding, which contains rich biological information about proteins, was added as a protein sequence feature alongside amino acid co-occurrence similarity and one-hot coding. For the network framework, an ensemble network was constructed from a transformer encoder containing an AFT module (performing the weight operation on vital protein sequence feature information) and a graph attention network (extracting the relational features of protein pairs). RESULTS The experimental results showed that the Micro-F1 of AFTGAN under three partitioning schemes (BFS, DFS and the random mode) was 0.685, 0.711 and 0.867 on the SHS27K dataset and 0.745, 0.819 and 0.920 on the SHS148K dataset, respectively, all higher than those of other popular methods. In addition, the experimental comparisons confirmed the performance superiority of the proposed model for predicting PPIs of unknown proteins on the STRING dataset. AVAILABILITY AND IMPLEMENTATION The source code is publicly available at https://github.com/1075793472/AFTGAN. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
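The Micro-F1 figure reported in the abstract above is a multi-label measure, since one protein pair can carry several interaction types at once. A minimal sketch of the usual definition follows: true positives, false positives and false negatives are pooled over all pairs and labels before a single F1 is computed. The function name `micro_f1` and the label names are illustrative assumptions, not taken from the AFTGAN code.

```python
def micro_f1(true_labels, predicted_labels):
    """Micro-averaged F1 for multi-label predictions.

    Each argument is a list of label sets, one set per protein pair.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not true
        fn += len(truth - pred)   # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical interaction-type labels for two protein pairs.
truth = [{"binding", "activation"}, {"activation"}]
preds = [{"binding"}, {"activation", "catalysis"}]
score = micro_f1(truth, preds)
```

Because counts are pooled before averaging, frequent interaction types dominate Micro-F1; a macro average would instead weight every type equally.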
Affiliation(s)
- Yanlei Kang
- Zhejiang Province Key Laboratory of Smart Management & Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, Zhejiang 313000, China
- Arne Elofsson
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Solna 17121, Sweden
- Yunliang Jiang
- School of Computer Science and Technology, Zhejiang Normal University, Jinhua, Zhejiang 321004, China
- Weihong Huang
- College of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China
- Minzhe Yu
- College of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China
- Zhong Li
- Zhejiang Province Key Laboratory of Smart Management & Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, Zhejiang 313000, China.,Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Solna 17121, Sweden.,College of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China
|
44
|
Hosseini S, Ilie L. Predicting Protein Interaction Sites Using PITHIA. Methods Mol Biol 2023; 2690:375-383. [PMID: 37450160 DOI: 10.1007/978-1-0716-3327-4_29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
Several proteins work independently, but the majority work together to maintain the functions of the cell. Thus, it is crucial to know the interaction sites that facilitate protein-protein interactions. The development of effective computational methods is essential because experimental methods are expensive and time-consuming. This chapter is a guide to predicting protein interaction sites using the program "PITHIA." First, some installation guides are presented, followed by descriptions of input file formats. Afterward, PITHIA's commands and options are outlined with examples. Moreover, some notes are provided on how to extend PITHIA's installation and usage.
Affiliation(s)
- SeyedMohsen Hosseini
- Department of Computer Science, University of Western Ontario, London, ON, Canada
- Lucian Ilie
- Department of Computer Science, University of Western Ontario, London, ON, Canada.
|
45
|
Nambiar A, Liu S, Heflin M, Forsyth JM, Maslov S, Hopkins M, Ritz A. Transformer Neural Networks for Protein Family and Interaction Prediction Tasks. J Comput Biol 2023; 30:95-111. [PMID: 35950958 DOI: 10.1089/cmb.2022.0132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023] Open
Abstract
The scientific community is rapidly generating protein sequence information, but only a fraction of these proteins can be experimentally characterized. While promising deep learning approaches for protein prediction tasks have emerged, they have computational limitations or are designed to solve a specific task. We present a Transformer neural network that pre-trains task-agnostic sequence representations. This model is fine-tuned to solve two different protein prediction tasks: protein family classification and protein interaction prediction. Our method is comparable to existing state-of-the-art approaches for protein family classification while being much more general than other architectures. Further, our method outperforms other approaches for protein interaction prediction for two out of three different scenarios that we generated. These results offer a promising framework for fine-tuning the pre-trained sequence representations for other protein prediction tasks.
Affiliation(s)
- Ananthan Nambiar
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.,Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Simon Liu
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.,Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Maeve Heflin
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- John Malcolm Forsyth
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.,Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Sergei Maslov
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.,Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.,Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Mark Hopkins
- Department of Computer Science, Reed College, Portland, Oregon, USA
- Anna Ritz
- Department of Biology, Reed College, Portland, Oregon, USA
|
46
|
Lee SH, Hwang D, Goo TW, Yun EY. Prediction of intestinal stem cell regulatory genes from Drosophila gut damage model created using multiple inducers: Differential gene expression-based protein-protein interaction network analysis. DEVELOPMENTAL AND COMPARATIVE IMMUNOLOGY 2023; 138:104539. [PMID: 36087786 DOI: 10.1016/j.dci.2022.104539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 09/03/2022] [Accepted: 09/04/2022] [Indexed: 06/15/2023]
Abstract
Intestinal tissue functions in innate immunity to prevent the entry of harmful substances, and to maintain homeostasis through the constant proliferation of intestinal stem cells (ISC). To understand the mechanisms which regulate ISC in response to gut damage, we identified 81 differentially expressed genes (DEGs) through RNA-seq analysis after oral administration of three intestinal-damaging substances to Drosophila melanogaster. Through protein-protein interaction (PPI) and functional annotation studies, the top 22 DEGs ordered by the number of nodes in the PPI network were analyzed in relation to cell development. Through network topology analysis, we identified 12 essential seed genes. From this we confirmed that p53, RpL17, Fmr1, Stat92E, CG31343, Cnot4, CG9281, CG8184, Evi5, and to were essential for ISC proliferation during gut damage using knockdown RNAi Drosophila. This study presents a method for identifying candidate genes relating to intestinal damage that has scope for furthering our understanding of gut disease.
Affiliation(s)
- Seung Hun Lee
- Department of Integrative Biological Sciences and Industry, Sejong University, Seoul, 05006, South Korea
- Dooseon Hwang
- Department of Integrative Biological Sciences and Industry, Sejong University, Seoul, 05006, South Korea
- Tae-Won Goo
- Department of Biochemistry, College of Medicine, Dongguk University, Gyeongju, 38766, South Korea
- Eun-Young Yun
- Department of Integrative Biological Sciences and Industry, Sejong University, Seoul, 05006, South Korea.
|
47
|
Li K, Quan L, Jiang Y, Li Y, Zhou Y, Wu T, Lyu Q. ctP2ISP: Protein-Protein Interaction Sites Prediction Using Convolution and Transformer With Data Augmentation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:297-306. [PMID: 35213314 DOI: 10.1109/tcbb.2022.3154413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Protein-protein interactions are the basis of many cellular biological processes, such as cellular organization, signal transduction, and immune response. Identifying protein-protein interaction sites is essential for understanding the mechanisms of various biological processes, disease development, and drug design. However, it remains a challenging task to make accurate predictions, as the small amount of training data and severe class imbalance reduce the performance of computational methods. We design a deep learning method named ctP2ISP to improve the prediction of protein-protein interaction sites. ctP2ISP employs Convolution and Transformer to extract information and enhance information perception so that semantic features can be mined to identify protein-protein interaction sites. A weighting loss function with different sample weights is designed to suppress the preference of the model toward multi-category prediction. To efficiently reuse the information in the training set, a preprocessing of data augmentation with an improved sample-oriented sampling strategy is applied. The trained ctP2ISP was evaluated against current state-of-the-art methods on six public datasets. The results show that ctP2ISP outperforms all other competing methods on the balance metrics: F1, MCC, and AUPRC. In particular, our prediction on open tests related to viruses may also be consistent with biological insights. The source code and data can be obtained from https://github.com/lennylv/ctP2ISP.
|
48
|
Basar MA, Hosen MF, Kumar Paul B, Hasan MR, Shamim S, Bhuyian T. Identification of drug and protein-protein interaction network among stress and depression: A bioinformatics approach. INFORMATICS IN MEDICINE UNLOCKED 2023. [DOI: 10.1016/j.imu.2023.101174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023] Open
|
49
|
Paulussen FM, Grossmann TN. Peptide-based covalent inhibitors of protein-protein interactions. J Pept Sci 2023; 29:e3457. [PMID: 36239115 PMCID: PMC10077911 DOI: 10.1002/psc.3457] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 09/08/2022] [Revised: 10/10/2022] [Accepted: 10/11/2022] [Indexed: 12/13/2022]
Abstract
Protein-protein interactions (PPI) are involved in all cellular processes and many represent attractive therapeutic targets. However, the frequently rather flat and large interaction areas render the identification of small molecular PPI inhibitors very challenging. As an alternative, peptide interaction motifs derived from a PPI interface can serve as starting points for the development of inhibitors. However, certain proteins remain challenging targets when applying inhibitors with a competitive mode of action. For that reason, peptide-based ligands with an irreversible binding mode have gained attention in recent years. This review summarizes examples of covalent inhibitors that employ peptidic binders and have been tested in a biological context.
Affiliation(s)
- Felix M Paulussen
- Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.,Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.,Department of Molecular Microbiology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Tom N Grossmann
- Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.,Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
|
50
|
Kimothi D, Biyani P, Hogan JM, Davis MJ. Sequence Representations and Their Utility for Predicting Protein-Protein Interactions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:646-657. [PMID: 34941517 DOI: 10.1109/tcbb.2021.3137325] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Protein-Protein Interactions (PPIs) are a crucial mechanism underpinning the function of the cell. So far, a wide range of machine-learning based methods have been proposed for predicting these relationships. Their success is heavily dependent on the construction of the underlying feature vectors, with most using a set of physico-chemical properties derived from the sequence. Few work directly with the sequence itself. In this paper, we explore the utility of sequence embeddings for predicting protein-protein interactions. We construct a protein pair feature vector by concatenating the embeddings of their constituent sequences. These feature vectors are then used as input to a binary classifier to make predictions. To learn sequence embeddings, we use two established Word2Vec based methods - Seq2Vec and BioVec - and we also introduce a novel feature construction method called SuperVecNW. The embeddings generated through SuperVecNW capture some network information in addition to the contextual information present in the sequences. We test the efficacy of our proposed approach on human and yeast PPI datasets and on three well-known networks: CD9, the Ras-Raf-Mek-Erk-Elk-Srf pathway, and a Wnt-related network. We demonstrate that low dimensional sequence embeddings provide better results than most alternative representations based on physico-chemical properties while offering a far simpler approach to feature vector construction.
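The pair-feature construction described in the abstract above can be sketched in a few lines: each protein has a fixed-length embedding, and a candidate pair is represented by joining the two embeddings end to end. The embedding values, protein identifiers and helper names below are hypothetical, and how the embeddings are learned (Seq2Vec, BioVec or SuperVecNW) is outside the scope of this sketch.

```python
def pair_feature(embeddings, protein_a, protein_b):
    """Concatenate two protein embeddings into one pair feature vector."""
    return list(embeddings[protein_a]) + list(embeddings[protein_b])


def build_dataset(embeddings, labelled_pairs):
    """Turn (protein_a, protein_b, label) triples into features X and labels y."""
    X = [pair_feature(embeddings, a, b) for a, b, _ in labelled_pairs]
    y = [label for _, _, label in labelled_pairs]
    return X, y


# Hypothetical 2-dimensional embeddings and labelled pairs (1 = interacting).
toy_embeddings = {"P1": [0.1, 0.2], "P2": [0.3, 0.4], "P3": [0.5, 0.6]}
toy_pairs = [("P1", "P2", 1), ("P1", "P3", 0)]
X, y = build_dataset(toy_embeddings, toy_pairs)
```

One consequence of plain concatenation is that the vector for (P1, P2) differs from the vector for (P2, P1). If the downstream binary classifier should treat the two orders identically, both orders can be included in training, though whether the article does so is not stated here.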
|