1
Wright SN, Colton S, Schaffer LV, Pillich RT, Churas C, Pratt D, Ideker T. State of the interactomes: an evaluation of molecular networks for generating biological insights. Mol Syst Biol 2025; 21:1-29. [PMID: 39653848 PMCID: PMC11697402 DOI: 10.1038/s44320-024-00077-y] [Received: 09/18/2024] [Revised: 11/07/2024] [Accepted: 11/11/2024] [Indexed: 12/18/2024]
Abstract
Advancements in genomic and proteomic technologies have powered the creation of large gene and protein networks ("interactomes") for understanding biological systems. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 45 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP, Reactome, and SIGNOR demonstrate stronger performance in interaction prediction. Our study provides a benchmark for interactomes across diverse biological applications and clarifies factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
Affiliation(s)
- Sarah N Wright
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Scott Colton
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Leah V Schaffer
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Rudolf T Pillich
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Christopher Churas
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Dexter Pratt
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Trey Ideker
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
2
Nayar G, Altman RB. Heterogeneous network approaches to protein pathway prediction. Comput Struct Biotechnol J 2024; 23:2727-2739. [PMID: 39035835 PMCID: PMC11260399 DOI: 10.1016/j.csbj.2024.06.022] [Received: 03/01/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024]
Abstract
Understanding protein-protein interactions (PPIs) and the pathways they comprise is essential for comprehending cellular functions and their links to specific phenotypes. Despite the prevalence of molecular data generated by high-throughput sequencing technologies, a significant gap remains in translating this data into functional information regarding the series of interactions that underlie phenotypic differences. In this review, we present an in-depth analysis of heterogeneous network methodologies for modeling protein pathways, highlighting the critical role of integrating multifaceted biological data. It outlines the process of constructing these networks, from data representation to machine learning-driven predictions and evaluations. The work underscores the potential of heterogeneous networks in capturing the complexity of proteomic interactions, thereby offering enhanced accuracy in pathway prediction. This approach not only deepens our understanding of cellular processes but also opens up new possibilities in disease treatment and drug discovery by leveraging the predictive power of comprehensive proteomic data analysis.
Affiliation(s)
- Gowri Nayar
  - Department of Biomedical Data Science, Stanford University, United States
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, United States
  - Department of Genetics, Stanford University, United States
  - Department of Medicine, Stanford University, United States
  - Department of Bioengineering, Stanford University, United States
3
Zheng H, Xu L, Xie H, Xie J, Ma Y, Hu Y, Wu L, Chen J, Wang M, Yi Y, Huang Y, Wang D. RIscoper 2.0: A deep learning tool to extract RNA biomedical relation sentences from literature. Comput Struct Biotechnol J 2024; 23:1469-1476. [PMID: 38623560 PMCID: PMC11016866 DOI: 10.1016/j.csbj.2024.03.017] [Received: 01/24/2024] [Revised: 03/15/2024] [Accepted: 03/21/2024] [Indexed: 04/17/2024]
Abstract
RNA plays an extensive role in a multi-dimensional regulatory system, and its biomedical relationships are scattered across numerous biological studies. However, text mining works dedicated to the extraction of RNA biomedical relations remain limited. In this study, we established a comprehensive and reliable corpus of RNA biomedical relations, comprising over 30,000 sentences manually curated from more than 15,000 biomedical articles. We also updated RIscoper 2.0, a BERT-based deep learning tool to extract RNA biomedical relation sentences from literature. Benefiting from approximately 100,000 annotated named entities, we integrated the text classification and named entity recognition tasks in this tool. RIscoper 2.0 outperformed the original tool in both tasks and can discover new RNA biomedical relations. Additionally, we provided a user-friendly online search tool that enables rapid scanning of RNA biomedical relationships using local and online resources. Both the online tools and data resources of RIscoper 2.0 are available at http://www.rnainter.org/riscoper.
Affiliation(s)
- Hailong Zheng
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Linfu Xu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Hailong Xie
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Jiajing Xie
  - National Institute for Data Science in Health and Medicine, Xiamen University, 361102 Xiamen, China
- Yapeng Ma
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Yongfei Hu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Le Wu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Jia Chen
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Meiyi Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Ying Yi
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Yan Huang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Dong Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
  - Guangdong Province Key Laboratory of Molecular Tumor Pathology, 510515, Guangzhou, China
4
Chang M, Ahn J, Kang BG, Yoon S. Cross-modal embedding integrator for disease-gene/protein association prediction using a multi-head attention mechanism. Pharmacol Res Perspect 2024; 12:e70034. [PMID: 39560053 PMCID: PMC11574662 DOI: 10.1002/prp2.70034] [Received: 03/20/2024] [Revised: 09/07/2024] [Accepted: 10/28/2024] [Indexed: 11/20/2024]
Abstract
Knowledge graphs, powerful tools that explicitly transfer knowledge to machines, have significantly advanced new knowledge inferences. Discovering unknown relationships between diseases and genes/proteins in biomedical knowledge graphs can lead to the identification of disease development mechanisms and new treatment targets. Generating high-quality representations of biomedical entities is essential for successfully predicting disease-gene/protein associations. We developed a computational model that predicts disease-gene/protein associations using the Precision Medicine Knowledge Graph, a biomedical knowledge graph. Embeddings of biomedical entities were generated using two different methods: a large language model (LLM) and the knowledge graph embedding (KGE) algorithm. The LLM utilizes information obtained from massive amounts of text data, whereas the KGE algorithm relies on graph structures. We developed a disease-gene/protein association prediction model, "Cross-Modal Embedding Integrator (CMEI)," by integrating embeddings from different modalities using a multi-head attention mechanism. The area under the receiver operating characteristic curve of CMEI was 0.9662 (± 0.0002) in predicting disease-gene/protein associations. In conclusion, we developed a computational model that effectively predicts disease-gene/protein associations. CMEI may contribute to the identification of disease development mechanisms and new treatment targets.
Affiliation(s)
- Munyoung Chang
  - Education and Research Program for Future ICT Pioneers, Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
- Junyong Ahn
  - Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea
- Bong Gyun Kang
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea
- Sungroh Yoon
  - Education and Research Program for Future ICT Pioneers, Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea
  - Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
5
Akid H, Chennen K, Frey G, Thompson J, Ben Ayed M, Lachiche N. Graph-based machine learning model for weight prediction in protein-protein networks. BMC Bioinformatics 2024; 25:349. [PMID: 39511478 PMCID: PMC11546293 DOI: 10.1186/s12859-024-05973-6] [Received: 07/31/2024] [Accepted: 10/31/2024] [Indexed: 11/15/2024]
Abstract
Proteins interact with each other in complex ways to perform significant biological functions. These interactions, known as protein-protein interactions (PPIs), can be depicted as a graph where proteins are nodes and their interactions are edges. The development of high-throughput experimental technologies allows for the generation of numerous data which permits increasing the sophistication of PPI models. However, despite significant progress, current PPI networks remain incomplete. Discovering missing interactions through experimental techniques can be costly, time-consuming, and challenging. Therefore, computational approaches have emerged as valuable tools for predicting missing interactions. In PPI networks, a graph is usually used to model the interactions between proteins. An edge between two proteins indicates a known interaction, while the absence of an edge means the interaction is not known or missed. However, this binary representation overlooks the reliability of known interactions when predicting new ones. To address this challenge, we propose a novel approach for link prediction in weighted protein-protein networks, where interaction weights denote confidence scores. By leveraging data from the yeast Saccharomyces cerevisiae obtained from the STRING database, we introduce a new model that combines similarity-based algorithms and aggregated confidence score weights for accurate link prediction purposes. Our model significantly improves prediction accuracy, surpassing traditional approaches in terms of Mean Absolute Error, Mean Relative Absolute Error, and Root Mean Square Error. Our proposed approach holds the potential for improved accuracy in predicting PPIs, which is crucial for better understanding the underlying biological processes.
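As a rough illustration of the setting this abstract describes, the sketch below predicts a confidence weight for a missing edge in a small weighted network and then scores the predictions with Mean Absolute Error and Root Mean Square Error. The predictor (`predict_weight`, which averages the confidence of two-step paths through common neighbors), the toy graph, and the held-out values are all assumptions made for illustration; they are not the authors' model and not STRING data.

```python
import math

# Toy weighted PPI graph: hypothetical confidence scores in [0, 1].
GRAPH = {
    "A": {"C": 0.9, "D": 0.5},
    "B": {"C": 0.7, "D": 0.3},
    "C": {"A": 0.9, "B": 0.7},
    "D": {"A": 0.5, "B": 0.3},
}

def predict_weight(graph, u, v):
    """Hypothetical similarity-based predictor for the weight of edge (u, v).

    Each common neighbor z contributes the mean of the weights on u-z and v-z;
    the prediction is the average contribution over all common neighbors.
    """
    common = set(graph.get(u, {})) & set(graph.get(v, {}))
    if not common:
        return 0.0  # no shared neighbors: no evidence for an interaction
    path_scores = [(graph[u][z] + graph[v][z]) / 2 for z in common]
    return sum(path_scores) / len(path_scores)

def mae(predicted, actual):
    """Mean Absolute Error between predicted and actual weights."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Square Error between predicted and actual weights."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Predict two held-out edges and compare with hypothetical true weights.
predicted = [predict_weight(GRAPH, "A", "B"), predict_weight(GRAPH, "C", "D")]
held_out = [0.8, 0.6]
print("MAE:", mae(predicted, held_out), "RMSE:", rmse(predicted, held_out))
```

Treating the target as a weight rather than a yes/no edge is what makes regression-style errors such as MAE and RMSE applicable here.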
Affiliation(s)
- Hajer Akid
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
- Kirsley Chennen
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
- Gabriel Frey
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
- Julie Thompson
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
6
Bi Y, Jiao X, Lee YL, Zhou T. Inconsistency among evaluation metrics in link prediction. PNAS NEXUS 2024; 3:pgae498. [PMID: 39564572 PMCID: PMC11574622 DOI: 10.1093/pnasnexus/pgae498] [Received: 05/10/2024] [Accepted: 10/29/2024] [Indexed: 11/21/2024]
Abstract
Link prediction is a paradigmatic and challenging problem in network science, which aims to predict missing links, future links, and temporal links based on known topology. Along with the increasing number of link prediction algorithms, a critical yet previously ignored risk is that the evaluation metrics for algorithm performance are usually chosen at will. This paper implements extensive experiments on hundreds of real networks and 26 well-known algorithms, revealing significant inconsistency among evaluation metrics, namely different metrics probably produce remarkably different rankings of algorithms. Therefore, we conclude that any single metric cannot comprehensively or credibly evaluate algorithm performance. In terms of information content, we suggest the usage of at least two metrics: one is the area under the receiver operating characteristic curve, and the other is one of the following three candidates, say the area under the precision-recall curve, the area under the precision curve, and the normalized discounted cumulative gain. When the data are imbalanced, say the number of negative samples significantly outweighs the number of positive samples, the area under the generalized Receiver Operating Characteristic curve should also be used. In addition, as we have proved the essential equivalence of threshold-dependent metrics, if in a link prediction task, some specific thresholds are meaningful, we can consider any one threshold-dependent metric with those thresholds. This work completes a missing part in the landscape of link prediction, and provides a starting point toward a well-accepted criterion or standard to select proper evaluation metrics for link prediction.
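The inconsistency this abstract describes can be reproduced on a toy example. The sketch below is a minimal illustration under stated assumptions: the two score lists stand for hypothetical algorithms, average precision is used as a common estimate of the area under the precision-recall curve, and none of the data comes from the paper.

```python
def auc_roc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean precision at the rank of each positive (an estimate of AUPR)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Ten candidate links: the first two are true missing links, the rest are not.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical algorithm A ranks the positives 3rd and 4th.
scores_a = [0.8, 0.7, 1.0, 0.9, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
# Hypothetical algorithm B ranks the positives 1st and 8th.
scores_b = [1.0, 0.3, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.2, 0.1]

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, auc_roc(scores, labels), average_precision(scores, labels))
```

On this data, algorithm A has the higher AUC (0.75 against 0.625) while algorithm B has the higher average precision (0.625 against roughly 0.417), so the choice of metric alone decides which algorithm ranks first.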
Affiliation(s)
- Yilin Bi
  - CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xinshan Jiao
  - CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Yan-Li Lee
  - School of Computer and Software Engineering, Xihua University, Chengdu 610039, China
- Tao Zhou
  - CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
7
Ravan A, Procopio S, Chemla YR, Gruebele M. Temperature-jump microscopy and interaction of Hsp70 heat shock protein with a client protein in vivo. Methods 2024; 231:154-164. [PMID: 39362572 DOI: 10.1016/j.ymeth.2024.09.019] [Received: 07/05/2024] [Revised: 09/28/2024] [Accepted: 09/30/2024] [Indexed: 10/05/2024]
Abstract
Biomolecular processes such as protein-protein interactions can depend strongly on cell type and even vary within a single cell type. Here we develop a microscope with a Peltier-controlled temperature stage, a laser temperature jump to induce heat stress, and an autofocusing feature to mitigate temperature drift during experiments, to study a protein-protein interaction in a selected cell type within a live organism, the zebrafish larva. As an application of the instrument, we show that there is considerable cell-to-cell variation of the heat shock protein Hsp70 binding to one of its clients, phosphoglycerate kinase in vivo. We adapt a key feature from our previous folding study, rare transformation of cells within the larva, so that individual cells can be imaged and differentiated for cell-to-cell response. Our approach can be extended to other organisms and cell types than the ones demonstrated in this work.
Affiliation(s)
- Aniket Ravan
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana Champaign, Urbana 61801, USA
- Samuel Procopio
  - Department of Physics, University of Illinois Urbana Champaign, Urbana 61801, USA
- Yann R Chemla
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana Champaign, Urbana 61801, USA
  - Department of Physics, University of Illinois Urbana Champaign, Urbana 61801, USA
- Martin Gruebele
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana Champaign, Urbana 61801, USA
  - Department of Physics, University of Illinois Urbana Champaign, Urbana 61801, USA
  - Department of Chemistry and Carle-Illinois College of Medicine, University of Illinois Urbana Champaign, Urbana 61801, USA
8
Pan F, Wang CN, Yu ZH, Wu ZR, Wang Z, Lou S, Li WH, Liu GX, Li T, Zhao YZ, Tang Y. NADPHnet: a novel strategy to predict compounds for regulation of NADPH metabolism via network-based methods. Acta Pharmacol Sin 2024; 45:2199-2211. [PMID: 38902503 PMCID: PMC11420228 DOI: 10.1038/s41401-024-01324-6] [Received: 01/18/2024] [Accepted: 05/26/2024] [Indexed: 06/22/2024]
Abstract
Identification of compounds to modulate NADPH metabolism is crucial for understanding complex diseases and developing effective therapies. However, the complex nature of NADPH metabolism poses challenges in achieving this goal. In this study, we proposed a novel strategy named NADPHnet to predict key proteins and drug-target interactions related to NADPH metabolism via network-based methods. Different from traditional approaches only focusing on one single protein, NADPHnet could screen compounds to modulate NADPH metabolism from a comprehensive view. Specifically, NADPHnet identified key proteins involved in regulation of NADPH metabolism using network-based methods, and characterized the impact of natural products on NADPH metabolism using a combined score, NADPH-Score. NADPHnet demonstrated a broader applicability domain and improved accuracy in the external validation set. This approach was further employed along with molecular docking to identify 27 compounds from a natural product library, 6 of which exhibited concentration-dependent changes of cellular NADPH level within 100 μM, with Oxyberberine showing promising effects even at 10 μM. Mechanistic and pathological analyses of Oxyberberine suggest potential novel mechanisms to affect diabetes and cancer. Overall, NADPHnet offers a promising method for prediction of NADPH metabolism modulation and advances drug discovery for complex diseases.
Affiliation(s)
- Fei Pan
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Cheng-Nuo Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Zhuo-Hang Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Zeng-Rui Wu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Ze Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Shang Lou
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Wei-Hua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Gui-Xia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Ting Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Yu-Zheng Zhao
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
9
Soobben M, Sayed Y, Achilonu I. Exploring the evolutionary trajectory and functional landscape of cannabinoid receptors: A comprehensive bioinformatic analysis. Comput Biol Chem 2024; 112:108138. [PMID: 38943725 DOI: 10.1016/j.compbiolchem.2024.108138] [Received: 03/07/2024] [Revised: 05/24/2024] [Accepted: 06/20/2024] [Indexed: 07/01/2024]
Abstract
The bioinformatic analysis of cannabinoid receptors (CBRs) CB1 and CB2 reveals a detailed picture of their structure, evolution, and physiological significance within the endocannabinoid system (ECS). The study highlights the evolutionary conservation of these receptors evidenced by sequence alignments across diverse species including humans, amphibians, and fish. Both CBRs share a structural hallmark of seven transmembrane (TM) helices, characteristic of class A G-protein-coupled receptors (GPCRs), which are critical for their signalling functions. The study reports a similarity of 44.58 % between both CBR sequences, which suggests that while their evolutionary paths and physiological roles may differ, there is considerable conservation in their structures. Pathway databases like KEGG, Reactome, and WikiPathways were employed to determine the involvement of the receptors in various signalling pathways. The pathway analyses integrated within this study offer a detailed view of the CBRs interactions within a complex network of cannabinoid-related signalling pathways. High-resolution crystal structures (PDB ID: 5U09 for CB1 and 5ZTY for CB2) provided accurate structural information, showing the binding pocket volume and surface area of the receptors, essential for ligand interaction. The comparison between these receptors' natural sequences and their engineered pseudo-CBRs (p-CBRs) showed a high degree of sequence identity, confirming the validity of using p-CBRs in receptor-ligand interaction studies. This comprehensive analysis enhances the understanding of the structural and functional dynamics of cannabinoid receptors, highlighting their physiological roles and their potential as therapeutic targets within the ECS.
MESH Headings
- Computational Biology
- Humans
- Amino Acid Sequence
- Receptor, Cannabinoid, CB2/metabolism
- Receptor, Cannabinoid, CB2/chemistry
- Receptor, Cannabinoid, CB2/genetics
- Receptors, Cannabinoid/metabolism
- Receptors, Cannabinoid/chemistry
- Receptor, Cannabinoid, CB1/metabolism
- Receptor, Cannabinoid, CB1/chemistry
- Receptor, Cannabinoid, CB1/genetics
- Evolution, Molecular
- Animals
- Sequence Alignment
Affiliation(s)
- Marushka Soobben
  - Protein Structure-Function Research Unit, School of Molecular and Cell Biology, University of the Witwatersrand, Johannesburg 2050, South Africa
- Yasien Sayed
  - Protein Structure-Function Research Unit, School of Molecular and Cell Biology, University of the Witwatersrand, Johannesburg 2050, South Africa
- Ikechukwu Achilonu
  - Protein Structure-Function Research Unit, School of Molecular and Cell Biology, University of the Witwatersrand, Johannesburg 2050, South Africa
10
Rueter J, Rimbach G, Bilke S, Tholey A, Huebbe P. Readdressing the Localization of Apolipoprotein E (APOE) in Mitochondria-Associated Endoplasmic Reticulum (ER) Membranes (MAMs): An Investigation of the Hepatic Protein-Protein Interactions of APOE with the Mitochondrial Proteins Lon Protease (LONP1), Mitochondrial Import Receptor Subunit TOM40 (TOMM40) and Voltage-Dependent Anion-Selective Channel 1 (VDAC1). Int J Mol Sci 2024; 25:10597. [PMID: 39408926 PMCID: PMC11476584 DOI: 10.3390/ijms251910597] [Received: 08/30/2024] [Revised: 09/27/2024] [Accepted: 09/29/2024] [Indexed: 10/20/2024]
Abstract
As a component of circulating lipoproteins, APOE binds to cell surface receptors mediating lipoprotein metabolism and cholesterol transport. A growing body of evidence, including the identification of a broad variety of cellular proteins interacting with APOE, suggests additional independent functions. Investigating cellular localization and protein-protein interactions in cultured human hepatocytes, we aimed to contribute to the elucidation of hitherto unnoted cellular functions of APOE. We observed a strong accumulation of APOE in MAMs, equally evident for the two major isoforms APOE3 and APOE4. Using mass spectrometry proteome analyses, novel and previously noted APOE interactors were identified, including the mitochondrial proteins TOMM40, LONP1 and VDAC1. All three interactors were present in MAM fractions, which we think initially facilitates interactions with APOE. LONP1 is a protease with chaperone activity, which migrated to MAMs in response to ER stress, displaying a reinforced interaction with APOE. We therefore hypothesize that APOE may help in the unfolded protein response (UPR) by acting as a co-chaperone in cooperation with LONP1 at the interface of mitochondria and ER membranes. The interaction of APOE with the integral proteins TOMM40 and VDAC1 may point to the formation of bridging complexes connecting mitochondria with other organelles.
Affiliation(s)
- Johanna Rueter
  - Institute of Human Nutrition and Food Science, University of Kiel, Hermann-Rodewald-Strasse 6, 24118 Kiel, Germany
- Gerald Rimbach
  - Institute of Human Nutrition and Food Science, University of Kiel, Hermann-Rodewald-Strasse 6, 24118 Kiel, Germany
- Stephanie Bilke
  - Institute of Experimental Medicine, University of Kiel, Niemannsweg 11, 24105 Kiel, Germany
- Andreas Tholey
  - Institute of Experimental Medicine, University of Kiel, Niemannsweg 11, 24105 Kiel, Germany
- Patricia Huebbe
  - Institute of Human Nutrition and Food Science, University of Kiel, Hermann-Rodewald-Strasse 6, 24118 Kiel, Germany
11
Perdomo-Quinteiro P, Belmonte-Hernández A. Knowledge Graphs for drug repurposing: a review of databases and methods. Brief Bioinform 2024; 25:bbae461. [PMID: 39325460 PMCID: PMC11426166 DOI: 10.1093/bib/bbae461] [Received: 05/22/2024] [Revised: 08/07/2024] [Accepted: 09/11/2024] [Indexed: 09/27/2024]
Abstract
Drug repurposing has emerged as an effective and efficient strategy to identify new treatments for a variety of diseases. One of the most effective approaches for discovering potential new drug candidates involves the utilization of Knowledge Graphs (KGs). This review comprehensively explores some of the most prominent KGs, detailing their structure, data sources, and how they facilitate the repurposing of drugs. In addition to KGs, this paper delves into various artificial intelligence techniques that enhance the process of drug repurposing. These methods not only accelerate the identification of viable drug candidates but also improve the precision of predictions by leveraging complex datasets and advanced algorithms. Furthermore, the importance of explainability in drug repurposing is emphasized. Explainability methods are crucial as they provide insights into the reasoning behind AI-generated predictions, thereby increasing the trustworthiness and transparency of the repurposing process. We will discuss several techniques that can be employed to validate these predictions, ensuring that they are both reliable and understandable.
Affiliation(s)
- Pablo Perdomo-Quinteiro
  - Grupo de Aplicación de Telecomunicaciones Visuales, Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avenida Complutense 30, 28040 Madrid, Spain
- Alberto Belmonte-Hernández
  - Grupo de Aplicación de Telecomunicaciones Visuales, Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avenida Complutense 30, 28040 Madrid, Spain
12
Nguyen QH, Nguyen H, Oh EC, Nguyen T. Current approaches and outstanding challenges of functional annotation of metabolites: a comprehensive review. Brief Bioinform 2024; 25:bbae498. [PMID: 39397425 PMCID: PMC11471905 DOI: 10.1093/bib/bbae498] [Received: 05/22/2024] [Revised: 09/03/2024] [Accepted: 10/02/2024] [Indexed: 10/15/2024]
Abstract
Metabolite profiling is a powerful approach for the clinical diagnosis of complex diseases, ranging from cardiometabolic diseases, cancer, and cognitive disorders to respiratory pathologies and conditions that involve dysregulated metabolism. Because of the importance of systems-level interpretation, many methods have been developed to identify biologically significant pathways using metabolomics data. In this review, we first describe a complete metabolomics workflow (sample preparation, data acquisition, pre-processing, downstream analysis, etc.). We then comprehensively review 24 approaches capable of performing functional analysis, including those that combine metabolomics data with other types of data to investigate the disease-relevant changes at multiple omics layers. We discuss their availability, implementation, capability for pre-processing and quality control, supported omics types, embedded databases, pathway analysis methodologies, and integration techniques. We also provide a rating and evaluation of each software, focusing on their key technique, software accessibility, documentation, and user-friendliness. Following our guideline, life scientists can easily choose a suitable method depending on method rating, available data, input format, and method category. More importantly, we highlight outstanding challenges and potential solutions that need to be addressed by future research. To further assist users in executing the reviewed methods, we provide wrappers of the software packages at https://github.com/tinnlab/metabolite-pathway-review-docker.
Affiliation(s)
- Quang-Huy Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, United States
| | - Ha Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, United States
| | - Edwin C Oh
- Department of Internal Medicine, UNLV School of Medicine, University of Nevada, Las Vegas, NV 89154, United States
| | - Tin Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, United States
| |
|
13
|
Yang S, Cheng P, Liu Y, Feng D, Wang S. Exploring the Knowledge of an Outstanding Protein to Protein Interaction Transformer. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1287-1298. [PMID: 38536676 DOI: 10.1109/tcbb.2024.3381825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/10/2024]
Abstract
Protein-to-protein interaction (PPI) prediction aims to predict whether two given proteins interact or not. Compared with traditional experimental methods of high cost and low efficiency, the current deep-learning-based approach makes it possible to discover massive potential PPIs from large-scale databases. However, deep PPI prediction models perform poorly on unseen species, as their proteins are not in the training set. To address this issue, the paper first proposes PPITrans, a Transformer-based PPI prediction model that exploits a language model pre-trained on proteins to conduct binary PPI prediction. To validate the effectiveness on unseen species, PPITrans is trained with Human PPIs and tested on PPIs of other species. Experimental results show that PPITrans significantly outperforms the previous state-of-the-art on various metrics, especially on PPIs of unseen species. For example, the AUPR improves by an absolute 0.339 on Fly PPIs. Aiming to explore the knowledge learned by PPITrans from PPI data, this paper also designs a series of probes belonging to three categories. Their results reveal several interesting findings, for example that although PPITrans cannot capture the spatial structure of proteins, it can obtain knowledge of PPI type and binding affinity, learning more than binary PPI.
|
14
|
Pang H, Wei S, Du Z, Zhao Y, Cai S, Zhao Y. Graph Representation Learning Based on Specific Subgraphs for Biomedical Interaction Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1552-1564. [PMID: 38767994 DOI: 10.1109/tcbb.2024.3402741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2024]
Abstract
Discovering the novel associations of biomedical entities is of great significance and can facilitate not only the identification of network biomarkers of disease but also the search for putative drug targets. Graph representation learning (GRL) has incredible potential to efficiently predict the interactions from biomedical networks by modeling the robust representation for each node. However, the current GRL-based methods learn the representation of nodes by aggregating the features of their neighbors with equal weights. Furthermore, they also fail to identify which features of higher-order neighbors are integrated into the representation of the central node. In this work, we propose a novel graph representation learning framework: a multi-order graph neural network based on reconstructed specific subgraphs (MGRS) for biomedical interaction prediction. In the MGRS, we apply the multi-order graph aggregation module (MOGA) to learn the wide-view representation by integrating the multi-hop neighbor features. Besides, we propose a subgraph selection module (SGSM) to reconstruct the specific subgraph with adaptive edge weights for each node. SGSM can clearly explore the dependency of the node representation on the neighbor features and learn the subgraph-based representation based on the reconstructed weighted subgraphs. Extensive experimental results on four public biomedical networks demonstrate that the MGRS performs better and is more robust than the latest baselines.
|
15
|
Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. BIOINFORMATICS ADVANCES 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology. Availability and implementation: Not applicable.
Affiliation(s)
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
| | - Michelle M Li
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
| | - Aydin Wells
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
| | - Deisy Morselli Gysi
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
- Department of Physics, Northeastern University, Boston, MA 02115, United States
| | - Arjun Krishnan
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
| | - T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
| | - Sushmita Roy
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Wisconsin Institute for Discovery, Madison, WI 53715, United States
| | - Anaïs Baudot
- Aix Marseille Université, INSERM, MMG, Marseille, France
| | - Serdar Bozdag
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- Department of Mathematics, University of North Texas, Denton, TX 76203, United States
| | - Danny Z Chen
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Lenore Cowen
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Kapil Devkota
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Morgridge Institute for Research, Madison, WI 53715, United States
| | - Sara J C Gosline
- Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
| | - Pengfei Gu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Pietro H Guzzi
- Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
| | - Heng Huang
- Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
| | - Meng Jiang
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Ziynet Nesibe Kesimoglu
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
| | - Mehmet Koyuturk
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
| | - Jian Ma
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
| | - Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
| | - Nataša Pržulj
- Department of Computer Science, University College London, London, WC1E 6BT, England
- ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
| | - Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
| | - Anna Ritz
- Department of Biology, Reed College, Portland, OR 97202, United States
| | - Roded Sharan
- School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
| | - Mona Singh
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
| | - Donna K Slonim
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Hanghang Tong
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
| | - Xinan Holly Yang
- Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
| | - Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
| | - Haiyuan Yu
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
| | - Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
| |
|
16
|
Cousins HC, Nayar G, Altman RB. Computational Approaches to Drug Repurposing: Methods, Challenges, and Opportunities. Annu Rev Biomed Data Sci 2024; 7:15-29. [PMID: 38598857 DOI: 10.1146/annurev-biodatasci-110123-025333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/12/2024]
Abstract
Drug repurposing refers to the inference of therapeutic relationships between a clinical indication and existing compounds. As an emerging paradigm in drug development, drug repurposing enables more efficient treatment of rare diseases, stratified patient populations, and urgent threats to public health. However, prioritizing well-suited drug candidates from among a nearly infinite number of repurposing options continues to represent a significant challenge in drug development. Over the past decade, advances in genomic profiling, database curation, and machine learning techniques have enabled more accurate identification of drug repurposing candidates for subsequent clinical evaluation. This review outlines the major methodologic classes that these approaches comprise, which rely on (a) protein structure, (b) genomic signatures, (c) biological networks, and (d) real-world clinical data. We propose that realizing the full impact of drug repurposing methodologies requires a multidisciplinary understanding of each method's advantages and limitations with respect to clinical practice.
Affiliation(s)
- Henry C Cousins
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA;
| | - Gowri Nayar
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA;
| | - Russ B Altman
- Departments of Genetics, Medicine, and Bioengineering, Stanford University, Stanford, California, USA
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA;
| |
|
17
|
Cai W, Liu P, Wang Z, Jiang H, Liu C, Fei Z, Yang Z. Link prediction in protein-protein interaction network: A similarity multiplied similarity algorithm with paths of length three. J Theor Biol 2024; 589:111850. [PMID: 38740126 DOI: 10.1016/j.jtbi.2024.111850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2023] [Revised: 03/26/2024] [Accepted: 05/03/2024] [Indexed: 05/16/2024]
Abstract
Protein-protein interactions (PPIs) are crucial for various biological processes, and predicting PPIs is a major challenge. To solve this issue, the most common method is link prediction. Currently, the link prediction methods based on network Paths of Length Three (L3) have been proven to be highly effective. In this paper, we propose a novel link prediction algorithm, named SMS, which is based on L3 and protein similarities. We first design a mixed similarity that combines the topological structure and attribute features of nodes. Then, we compute the predicted value by summing the product of all similarities along the L3. Furthermore, we propose the Max Similarity Multiplied Similarity (maxSMS) algorithm from the perspective of maximum impact. Our computational prediction results show that on six datasets, including S. cerevisiae, H. sapiens, and others, the maxSMS algorithm improves the precision of the top 500, area under the precision-recall curve, and normalized discounted cumulative gain by an average of 26.99%, 53.67%, and 6.7%, respectively, compared to other optimal methods.
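The path-based scoring idea in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' SMS or maxSMS implementation: the helper names `build_adjacency` and `l3_similarity_score` are hypothetical, and the `sim` argument stands in for the paper's mixed similarity, whose exact definition the abstract does not give. With the default constant similarity of 1.0, the score reduces to a plain count of paths of length three between the two candidate nodes.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map (node -> set of neighbors)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def l3_similarity_score(adj, x, y, sim=lambda a, b: 1.0):
    """Sum, over every simple path x-u-v-y, the product of the three
    pairwise similarities along the path.

    `sim` is a placeholder for any node-pair similarity (assumption);
    the default of 1.0 turns the score into an L3 path count.
    """
    score = 0.0
    for u in adj[x]:
        if u == y:
            continue
        for v in adj[u]:
            # Keep only simple paths whose last hop v-y is an existing edge.
            if v in (x, y, u) or y not in adj[v]:
                continue
            score += sim(x, u) * sim(u, v) * sim(v, y)
    return score


# Toy network: two length-three paths connect "a" and "d".
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "c")]
adj = build_adjacency(edges)
path_count = l3_similarity_score(adj, "a", "d")
half_weighted = l3_similarity_score(adj, "a", "d", sim=lambda a, b: 0.5)
```

In the toy network, `path_count` counts the paths a-b-c-d and a-e-c-d, and `half_weighted` multiplies three similarities of 0.5 along each of them. Candidate pairs would then be ranked by this score, with higher values suggesting a more likely interaction.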
Affiliation(s)
- Wangmin Cai
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
| | - Peiqiang Liu
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China.
| | - Zunfang Wang
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
| | - Hong Jiang
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
| | - Chang Liu
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
| | - Zhaojie Fei
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
| | - Zhuang Yang
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
| |
|
18
|
Li MM, Huang Y, Sumathipala M, Liang MQ, Valdeolivas A, Ananthakrishnan AN, Liao K, Marbach D, Zitnik M. Contextual AI models for single-cell protein biology. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.07.18.549602. [PMID: 37503080 PMCID: PMC10370131 DOI: 10.1101/2023.07.18.549602] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/29/2023]
Abstract
Understanding protein function and developing molecular therapies require deciphering the cell types in which proteins act as well as the interactions between proteins. However, modeling protein interactions across biological contexts remains challenging for existing algorithms. Here, we introduce Pinnacle, a geometric deep learning approach that generates context-aware protein representations. Leveraging a multi-organ single-cell atlas, Pinnacle learns on contextualized protein interaction networks to produce 394,760 protein representations from 156 cell type contexts across 24 tissues. Pinnacle's embedding space reflects cellular and tissue organization, enabling zero-shot retrieval of the tissue hierarchy. Pretrained protein representations can be adapted for downstream tasks: enhancing 3D structure-based representations for resolving immuno-oncological protein interactions, and investigating drugs' effects across cell types. Pinnacle outperforms state-of-the-art models in nominating therapeutic targets for rheumatoid arthritis and inflammatory bowel diseases, and pinpoints cell type contexts with higher predictive capability than context-free models. Pinnacle's ability to adjust its outputs based on the context in which it operates paves the way for large-scale context-specific predictions in biology.
Affiliation(s)
- Michelle M. Li
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Yepeng Huang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Marissa Sumathipala
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Man Qing Liang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Alberto Valdeolivas
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Ashwin N. Ananthakrishnan
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital, Boston, MA, USA
| | - Katherine Liao
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Brigham and Women’s Hospital, Boston, MA, USA
| | - Daniel Marbach
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
| |
|
19
|
Jia X, Luo W, Li J, Xing J, Sun H, Wu S, Su X. A deep learning framework for predicting disease-gene associations with functional modules and graph augmentation. BMC Bioinformatics 2024; 25:214. [PMID: 38877401 PMCID: PMC11549817 DOI: 10.1186/s12859-024-05841-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2024] [Accepted: 06/12/2024] [Indexed: 06/16/2024] Open
Abstract
BACKGROUND: The exploration of gene-disease associations is crucial for understanding the mechanisms underlying disease onset and progression, with significant implications for prevention and treatment strategies. Advances in high-throughput biotechnology have generated a wealth of data linking diseases to specific genes. While graph representation learning has recently introduced groundbreaking approaches for predicting novel associations, existing studies have overlooked the cumulative impact of functional modules such as protein complexes and the incompleteness of some important data such as protein interactions, which limits the detection performance. RESULTS: Addressing these limitations, here we introduce a deep learning framework called ModulePred for predicting disease-gene associations. ModulePred performs graph augmentation on the protein interaction network using L3 link prediction algorithms. It builds a heterogeneous module network by integrating disease-gene associations, protein complexes and augmented protein interactions, and develops a novel graph embedding for the heterogeneous module network. Subsequently, a graph neural network is constructed to learn node representations by collectively aggregating information from topological structure, and gene prioritization is carried out by the disease and gene embeddings obtained from the graph neural network. Experimental results underscore the superiority of ModulePred, showcasing the effectiveness of incorporating functional modules and graph augmentation in predicting disease-gene associations. This research introduces innovative ideas and directions, enhancing the understanding and prediction of gene-disease relationships.
Affiliation(s)
- Xianghu Jia
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China
| | - Weiwen Luo
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China
| | - Jiaqi Li
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China
| | - Jieqi Xing
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China
| | - Hongjie Sun
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China
| | - Shunyao Wu
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China.
| | - Xiaoquan Su
- College of Computer Science and Technology, Qingdao University, Qingdao, 266071, Shandong, China.
| |
|
20
|
Rao J, Xie J, Yuan Q, Liu D, Wang Z, Lu Y, Zheng S, Yang Y. A variational expectation-maximization framework for balanced multi-scale learning of protein and drug interactions. Nat Commun 2024; 15:4476. [PMID: 38796523 PMCID: PMC11530528 DOI: 10.1038/s41467-024-48801-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2023] [Accepted: 05/14/2024] [Indexed: 05/28/2024] Open
Abstract
Protein functions are characterized by interactions with proteins, drugs, and other biomolecules. Understanding these interactions is essential for deciphering the molecular mechanisms underlying biological processes and developing new therapeutic strategies. Current computational methods mostly predict interactions based on either molecular network or structural information, without integrating them within a unified multi-scale framework. While a few multi-view learning methods are devoted to fusing the multi-scale information, these methods tend to rely heavily on a single scale and underfit the others, likely owing to the imbalanced nature and inherent greediness of multi-scale learning. To alleviate the optimization imbalance, we present MUSE, a multi-scale representation learning framework based on a variant expectation maximization to optimize different scales in an alternating procedure over multiple iterations. This strategy efficiently fuses multi-scale information between atomic structure and molecular network scale through mutual supervision and iterative optimization. MUSE outperforms the current state-of-the-art models not only in molecular interaction (protein-protein, drug-protein, and drug-drug) tasks but also in protein interface prediction at the atomic structure scale. More importantly, the multi-scale learning framework shows potential for extension to other scales of computational drug discovery.
Affiliation(s)
- Jiahua Rao
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Jiancong Xie
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Qianmu Yuan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Deqin Liu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Zhen Wang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yutong Lu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
| | - Shuangjia Zheng
- Global Institute of Future Technology, Shanghai Jiao Tong University, Shanghai, China.
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
- Key Laboratory of Machine Intelligence and Advanced Computing (MOE), Sun Yat-sen University, Guangzhou, China.
- State Key Laboratory of Oncology in South China, Sun Yat-sen University, Guangzhou, China.
| |
|
21
|
Cao MY, Zainudin S, Daud KM. Protein features fusion using attributed network embedding for predicting protein-protein interaction. BMC Genomics 2024; 25:466. [PMID: 38741045 DOI: 10.1186/s12864-024-10361-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 04/29/2024] [Indexed: 05/16/2024] Open
Abstract
BACKGROUND: Protein-protein interactions (PPIs) hold significant importance in biology, with precise PPI prediction as a pivotal factor in comprehending cellular processes and facilitating drug design. However, experimental determination of PPIs is laborious, time-consuming, and often constrained by technical limitations. METHODS: We introduce a new node representation method based on initial information fusion, called FFANE, which amalgamates PPI networks and protein sequence data to enhance the precision of PPI prediction. A Gaussian kernel similarity matrix is initially established by leveraging protein structural resemblances. Concurrently, protein sequence similarities are gauged using the Levenshtein distance, enabling the capture of diverse protein attributes. Subsequently, to construct an initial information matrix, these two feature matrices are merged by employing weighted fusion to achieve an organic amalgamation of structural and sequence details. To gain a more profound understanding of the amalgamated features, a Stacked Autoencoder (SAE) is employed for encoding learning, thereby yielding more representative feature representations. Ultimately, classification models are trained to predict PPIs by using the well-learned fusion feature. RESULTS: In 5-fold cross-validation experiments with an SVM, our proposed method achieved average accuracies of 94.28%, 97.69%, and 84.05% on the Saccharomyces cerevisiae, Homo sapiens, and Helicobacter pylori datasets, respectively. CONCLUSION: Experimental findings across various authentic datasets validate the efficacy and superiority of this fusion feature representation approach, underscoring its potential value in bioinformatics.
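Two of the ingredients named in the abstract above, a Levenshtein-based sequence similarity and a weighted fusion of two similarity matrices, can be sketched as follows. This is a rough illustration and not the FFANE code: the normalization in `sequence_similarity`, the fusion weight `alpha`, and all function names are assumptions introduced here, and the Gaussian kernel and stacked autoencoder steps are left out.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sequence_similarity(a, b):
    """Map edit distance to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def fuse(structural, sequence, alpha=0.5):
    """Element-wise weighted fusion of two equally shaped matrices.

    `alpha` is a hypothetical weight on the structural matrix.
    """
    return [[alpha * s + (1.0 - alpha) * q for s, q in zip(row_s, row_q)]
            for row_s, row_q in zip(structural, sequence)]


# Toy protein fragments; the structural matrix is made-up illustrative data.
seqs = ["MKTAYIAK", "MKTAYLAK", "GSHMLEDP"]
seq_sim = [[sequence_similarity(p, q) for q in seqs] for p in seqs]
struct_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
fused = fuse(struct_sim, seq_sim, alpha=0.5)
```

Each row of `fused` could then serve as the initial feature vector for one protein before any further encoding.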
Affiliation(s)
- Mei-Yuan Cao
- Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia.
| | - Suhaila Zainudin
- Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
| | - Kauthar Mohd Daud
- Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
| |
|
22
|
Hao B, Kovács IA. Proper network randomization is key to assessing social balance. SCIENCE ADVANCES 2024; 10:eadj0104. [PMID: 38701217 PMCID: PMC11068007 DOI: 10.1126/sciadv.adj0104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/05/2023] [Accepted: 04/01/2024] [Indexed: 05/05/2024]
Abstract
Social ties, either positive or negative, lead to signed network patterns, the subject of balance theory. For example, strong balance introduces cycles with even numbers of negative edges. The statistical significance of such patterns is routinely assessed by comparisons to null models. Yet, results in signed networks remain controversial. Here, we show that even if a network exhibits strong balance by construction, current null models can fail to identify it. Our results indicate that matching the signed degree preferences of the nodes is a critical step and so is the preservation of network topology in the null model. As a solution, we propose the STP null model, which integrates both constraints within a maximum entropy framework. STP randomization leads to qualitatively different results, with most social networks consistently demonstrating strong balance in three- and four-node patterns. On the basis of our results, we present a potential wiring mechanism behind the observed signed patterns and outline further applications of STP randomization.
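The strong-balance condition mentioned in the abstract above (a cycle is balanced when it contains an even number of negative edges) can be checked for three-node patterns with a short Python sketch. This only counts balanced and unbalanced triangles; it does not implement the STP null model or any significance test, and the function name `triangle_balance` and the edge-sign encoding are assumptions made for illustration.

```python
from itertools import combinations


def triangle_balance(signs):
    """Count balanced and unbalanced triangles in a signed network.

    `signs` maps frozenset({u, v}) to +1 or -1. A triangle is balanced
    when the product of its edge signs is positive, which is the same
    as having an even number of negative edges.
    """
    nodes = sorted({n for edge in signs for n in edge})
    balanced = unbalanced = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset(pair) for pair in ((a, b), (b, c), (a, c))]
        if all(edge in signs for edge in edges):
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            if product > 0:
                balanced += 1
            else:
                unbalanced += 1
    return balanced, unbalanced


# Toy network: triangle A-B-C has two negative edges (balanced),
# triangle A-B-D has one negative edge (unbalanced).
signs = {
    frozenset(("A", "B")): +1,
    frozenset(("B", "C")): -1,
    frozenset(("A", "C")): -1,
    frozenset(("B", "D")): +1,
    frozenset(("A", "D")): -1,
}
counts = triangle_balance(signs)
```

Whether such observed counts are statistically meaningful depends on comparing them against a randomized ensemble, which is the step the abstract argues must preserve both signed degrees and topology.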
Affiliation(s)
- Bingjie Hao
- Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208, USA
| | - István A. Kovács
- Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208, USA
- Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208, USA
| |
|
23
|
Park JH, Cho YR. Computational drug repositioning with attention walking. Sci Rep 2024; 14:10072. [PMID: 38698208 PMCID: PMC11066070 DOI: 10.1038/s41598-024-60756-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 04/26/2024] [Indexed: 05/05/2024] Open
Abstract
Drug repositioning aims to identify new therapeutic indications for approved medications. Recently, the importance of computational drug repositioning has been highlighted because it can reduce the costs, development time, and risks compared to traditional drug discovery. Most approaches in this area use networks for systematic analysis. Inferring drug-disease associations is then defined as a link prediction problem in a heterogeneous network composed of drugs and diseases. In this article, we present a novel method of computational drug repositioning, named drug repositioning with attention walking (DRAW). DRAW proceeds as follows: first, a subgraph enclosing the target link for prediction is extracted. Second, a graph convolutional network captures the structural features of the labeled nodes in the subgraph. Third, the transition probabilities are computed using attention mechanisms and converted into random walk profiles. Finally, a multi-layer perceptron takes random walk profiles and predicts whether a target link exists. As an experiment, we constructed two heterogeneous networks with drug-drug similarities based on chemical structures and anatomical therapeutic chemical classification (ATC) codes. Using 10-fold cross-validation, DRAW achieved an area under the receiver operating characteristic (ROC) curve of 0.903 and outperformed state-of-the-art methods. Moreover, we demonstrated the results of case studies for selected drugs and diseases to further confirm the capability of DRAW to predict drug-disease associations.
Affiliation(s)
- Jong-Hoon Park
- Division of Software, Yonsei University Mirae Campus, Wonju-si, 26493, Gangwon-do, Korea
| | - Young-Rae Cho
- Division of Software, Yonsei University Mirae Campus, Wonju-si, 26493, Gangwon-do, Korea
- Division of Digital Healthcare, Yonsei University Mirae Campus, Wonju-si, 26493, Gangwon-do, Korea
| |
|
24
|
Liu JX, Zhang X, Huang YQ, Hao GF, Yang GF. Multi-level bioinformatics resources support drug target discovery of protein-protein interactions. Drug Discov Today 2024; 29:103979. [PMID: 38608830 DOI: 10.1016/j.drudis.2024.103979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/14/2024] [Accepted: 04/05/2024] [Indexed: 04/14/2024]
Abstract
Drug discovery often begins with a new target. Protein-protein interactions (PPIs) are crucial to multitudinous cellular processes and offer a promising avenue for drug-target discovery. PPIs are characterized by multi-level complexity: at the protein level, interaction networks can be used to identify potential targets, whereas at the residue level, the details of the interactions of individual PPIs can be used to examine a target's druggability. Much great progress has been made in target discovery through multi-level PPI-related computational approaches, but these resources have not been fully discussed. Here, we systematically survey bioinformatics tools for identifying and assessing potential drug targets, examining their characteristics, limitations and applications. This work will aid the integration of the broader protein-to-network context with the analysis of detailed binding mechanisms to support the discovery of drug targets.
Affiliation(s)
- Jia-Xin Liu
- National Key Laboratory of Green Pesticide, Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, PR China
| | - Xiao Zhang
- State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, PR China
| | - Yuan-Qin Huang
- State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, PR China
| | - Ge-Fei Hao
- National Key Laboratory of Green Pesticide, Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, PR China; State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, PR China
| | - Guang-Fu Yang
- National Key Laboratory of Green Pesticide, Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, PR China
| |
|
25
|
Wright SN, Colton S, Schaffer LV, Pillich RT, Churas C, Pratt D, Ideker T. State of the Interactomes: an evaluation of molecular networks for generating biological insights. bioRxiv 2024:2024.04.26.587073. [PMID: 38746239 PMCID: PMC11092493 DOI: 10.1101/2024.04.26.587073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP and SIGNOR demonstrate strong interaction prediction performance. These findings provide a benchmark for interactomes across diverse network biology applications and clarify factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
|
26
|
Grassmann G, Miotto M, Desantis F, Di Rienzo L, Tartaglia GG, Pastore A, Ruocco G, Monti M, Milanetti E. Computational Approaches to Predict Protein-Protein Interactions in Crowded Cellular Environments. Chem Rev 2024; 124:3932-3977. [PMID: 38535831 PMCID: PMC11009965 DOI: 10.1021/acs.chemrev.3c00550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 04/11/2024]
Abstract
Investigating protein-protein interactions is crucial for understanding cellular biological processes because proteins often function within molecular complexes rather than in isolation. While experimental and computational methods have provided valuable insights into these interactions, they often overlook a critical factor: the crowded cellular environment. This environment significantly impacts protein behavior, including structural stability, diffusion, and ultimately the nature of binding. In this review, we discuss theoretical and computational approaches that allow the modeling of biological systems to guide and complement experiments and can thus significantly advance the investigation, and possibly the predictions, of protein-protein interactions in the crowded environment of cell cytoplasm. We explore topics such as statistical mechanics for lattice simulations, hydrodynamic interactions, diffusion processes in high-viscosity environments, and several methods based on molecular dynamics simulations. By synergistically leveraging methods from biophysics and computational biology, we review the state of the art of computational methods to study the impact of molecular crowding on protein-protein interactions and discuss its potential revolutionizing effects on the characterization of the human interactome.
Affiliation(s)
- Greta Grassmann
- Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Sapienza University of Rome, Rome 00185, Italy
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
| | - Mattia Miotto
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
| | - Fausta Desantis
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Genoa 16163, Italy
| | - Lorenzo Di Rienzo
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
| | - Gian Gaetano Tartaglia
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Center for Human Technologies, Genoa 16152, Italy
| | - Annalisa Pastore
- Experiment Division, European Synchrotron Radiation Facility, Grenoble 38043, France
| | - Giancarlo Ruocco
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Department of Physics, Sapienza University, Rome 00185, Italy
| | - Michele Monti
- RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
| | - Edoardo Milanetti
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Department of Physics, Sapienza University, Rome 00185, Italy
| |
|
27
|
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024] Open
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Fuhao Zhang
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
| | - Chaojin Wu
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| |
|
28
|
Saha S, Chatterjee P, Basu S, Nasipuri M. EPI-SF: essential protein identification in protein interaction networks using sequence features. PeerJ 2024; 12:e17010. [PMID: 38495766 PMCID: PMC10944162 DOI: 10.7717/peerj.17010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Accepted: 02/05/2024] [Indexed: 03/19/2024] Open
Abstract
Proteins are considered indispensable for facilitating an organism's viability, reproductive capabilities, and other fundamental physiological functions. Conventional biological assays are characterized by prolonged duration, extensive labor requirements, and financial expenses in order to identify essential proteins. Therefore, it is widely accepted that employing computational methods is the most expeditious and effective approach to successfully discerning essential proteins. Despite being a popular choice in machine learning (ML) applications, the deep learning (DL) method is not suggested for this specific research work based on sequence features due to the restricted availability of high-quality training sets of positive and negative samples. However, some DL works on limited availability of data are also executed at recent times which will be our future scope of work. Conventional ML techniques are thus utilized in this work due to their superior performance compared to DL methodologies. In consideration of the aforementioned, a technique called EPI-SF is proposed here, which employs ML to identify essential proteins within the protein-protein interaction network (PPIN). The protein sequence is the primary determinant of protein structure and function. So, initially, relevant protein sequence features are extracted from the proteins within the PPIN. These features are subsequently utilized as input for various machine learning models, including XGB Boost Classifier, AdaBoost Classifier, logistic regression (LR), support vector classification (SVM), Decision Tree model (DT), Random Forest model (RF), and Naïve Bayes model (NB). The objective is to detect the essential proteins within the PPIN. The primary investigation conducted on yeast examined the performance of various ML models for yeast PPIN. 
Among these models, the RF model technique had the highest level of effectiveness, as indicated by its precision, recall, F1-score, and AUC values of 0.703, 0.720, 0.711, and 0.745, respectively. It is also found to be better in performance when compared to the other state-of-arts based on traditional centrality like betweenness centrality (BC), closeness centrality (CC), etc. and deep learning methods as well like DeepEP, as emphasized in the result section. As a result of its favorable performance, EPI-SF is later employed for the prediction of novel essential proteins inside the human PPIN. Due to the tendency of viruses to selectively target essential proteins involved in the transmission of diseases within human PPIN, investigations are conducted to assess the probable involvement of these proteins in COVID-19 and other related severe diseases.
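As a quick consistency check on the metrics quoted above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision (0.703) and recall (0.720); the function name is an assumption for illustration and nothing here reproduces the EPI-SF pipeline itself.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the RF model in the abstract.
print(round(f1_score(0.703, 0.720), 3))  # 0.711, matching the reported F1-score
```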
Affiliation(s)
- Sovan Saha
- Department of Computer Science & Engineering (Artificial Intelligence & Machine Learning), Techno Main Salt Lake, Kolkata, West Bengal, India
| | - Piyali Chatterjee
- Department of Computer Science & Engineering, Netaji Subhash Engineering College, Kolkata, West Bengal, India
| | - Subhadip Basu
- Department of Computer Science & Engineering, Jadavpur University, Kolkata, West Bengal, India
| | - Mita Nasipuri
- Department of Computer Science & Engineering, Jadavpur University, Kolkata, West Bengal, India
| |
|
29
|
Xian L, Wang Y. Advances in Computational Methods for Protein–Protein Interaction Prediction. Electronics 2024; 13:1059. [DOI: 10.3390/electronics13061059] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/03/2025]
Abstract
Protein–protein interactions (PPIs) are pivotal in various physiological processes inside biological entities. Accurate identification of PPIs holds paramount significance for comprehending biological processes, deciphering disease mechanisms, and advancing medical research. Given the costly and labor-intensive nature of experimental approaches, a multitude of computational methods have been devised to enable swift and large-scale PPI prediction. This review offers a thorough examination of recent strides in computational methodologies for PPI prediction, with a particular focus on the utilization of deep learning techniques within this domain. Alongside a systematic classification and discussion of relevant databases, feature extraction strategies, and prominent computational approaches, we conclude with a thorough analysis of current challenges and prospects for the future of this field.
Affiliation(s)
- Lei Xian
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Yansu Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
| |
|
30
|
Ran Y, Xu XK, Jia T. The maximum capability of a topological feature in link prediction. PNAS Nexus 2024; 3:pgae113. [PMID: 38528954 PMCID: PMC10962729 DOI: 10.1093/pnasnexus/pgae113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/21/2024] [Indexed: 03/27/2024]
Abstract
Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task that predicts links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite intensive utilization of the topological feature in this task, it is unclear to what extent a feature can be leveraged to infer missing links. Here, we aim to unveil the capability of a topological feature in link prediction by identifying its prediction performance upper bound. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which only depends on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from one single index. Furthermore, a feature's capability is lifted in the supervised prediction, which can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified by 550 structurally diverse networks. The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.
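To make the idea of a topological feature and an index built on it concrete, here is a minimal sketch of one widely used index, the common-neighbors count, applied to unconnected node pairs. This is an illustrative assumption, not the paper's theoretical framework or its upper-bound expression; the function name and the toy graph are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(adjacency):
    """Score every non-adjacent node pair by its number of common neighbors.

    adjacency maps each node to the set of its neighbors (undirected graph).
    """
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v not in adjacency[u]:
            scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores


# Hypothetical toy graph used only for illustration.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

# Rank candidate links from most to fewest common neighbors.
ranked = sorted(common_neighbor_scores(graph).items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

Other indexes based on the same feature, such as normalized variants of the common-neighbors count, would reuse the same set intersection with a different weighting.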
Affiliation(s)
- Yijun Ran
- College of Computer and Information Science, Southwest University, Chongqing 400715, P.R. China
- Center for Computational Communication Research, Beijing Normal University, Zhuhai 519087, P.R. China
- School of Journalism and Communication, Beijing Normal University, Beijing 100875, P.R. China
| | - Xiao-Ke Xu
- Center for Computational Communication Research, Beijing Normal University, Zhuhai 519087, P.R. China
- School of Journalism and Communication, Beijing Normal University, Beijing 100875, P.R. China
| | - Tao Jia
- College of Computer and Information Science, Southwest University, Chongqing 400715, P.R. China
| |
|
31
|
Moutinho JP, Magano D, Coutinho B. On the complexity of quantum link prediction in complex networks. Sci Rep 2024; 14:1026. [PMID: 38200071 PMCID: PMC10781705 DOI: 10.1038/s41598-023-49906-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Accepted: 12/13/2023] [Indexed: 01/12/2024] Open
Abstract
Link prediction methods use patterns in known network data to infer which connections may be missing. Previous work has shown that continuous-time quantum walks can be used to represent path-based link prediction, which we further study here to develop a more optimized quantum algorithm. Using a sampling framework for link prediction, we analyze the query access to the input network required to produce a certain number of prediction samples. Considering both well-known classical path-based algorithms using powers of the adjacency matrix as well as our proposed quantum algorithm for path-based link prediction, we argue that there is a polynomial quantum advantage on the dependence on N, the number of nodes in the network. We further argue that the complexity of our algorithm, although sub-linear in N, is limited by the complexity of performing a quantum simulation of the network's adjacency matrix, which may prove to be an important problem in the development of quantum algorithms for network science in general.
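The classical baseline mentioned in the abstract scores candidate links using powers of the adjacency matrix, since entry (i, j) of the k-th power counts walks of length k between nodes i and j. The sketch below shows only that classical idea in plain Python; it is not the proposed quantum algorithm, and the function names and toy graph are assumptions for illustration.

```python
def matmul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def path_scores(adj, length):
    """Score non-adjacent pairs by the number of walks of the given length.

    Entry (i, j) of adj raised to `length` counts walks of that length
    from node i to node j.
    """
    power = adj
    for _ in range(length - 1):
        power = matmul(power, adj)
    n = len(adj)
    return {(i, j): power[i][j]
            for i in range(n) for j in range(i + 1, n) if adj[i][j] == 0}


# Toy path graph 0 - 1 - 2 - 3 (assumed example).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(path_scores(adj, 2))  # {(0, 2): 1, (0, 3): 0, (1, 3): 1}
print(path_scores(adj, 3))  # {(0, 2): 0, (0, 3): 1, (1, 3): 0}
```

Each dense matrix product here costs on the order of N^3 operations for N nodes, which gives a sense of why the dependence on N is the quantity compared between classical and quantum approaches.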
Affiliation(s)
- João P Moutinho
- Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
- Instituto de Telecomunicações, Lisboa, Portugal
| | - Duarte Magano
- Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
- Instituto de Telecomunicações, Lisboa, Portugal
| | | |
|
32
|
Yu Z, Wu Z, Wang Z, Wang Y, Zhou M, Li W, Liu G, Tang Y. Network-Based Methods and Their Applications in Drug Discovery. J Chem Inf Model 2024; 64:57-75. [PMID: 38150548 DOI: 10.1021/acs.jcim.3c01613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
Drug discovery is time-consuming, expensive, and predominantly follows the "one drug → one target → one disease" paradigm. With the rapid development of systems biology and network pharmacology, a novel drug discovery paradigm, "multidrug → multitarget → multidisease", has emerged. This new holistic paradigm of drug discovery aligns well with the essence of networks, leading to the emergence of network-based methods in the field of drug discovery. In this Perspective, we initially introduce the concept and data sources of networks and highlight classical methodologies employed in network-based methods. Subsequently, we focus on the practical applications of network-based methods across various areas of drug discovery, such as target prediction, virtual screening, prediction of drug therapeutic effects or adverse drug events, and elucidation of molecular mechanisms. In addition, we provide representative web servers for researchers to use network-based methods in specific applications. Finally, we discuss several challenges of network-based methods and the directions for future development. In a word, network-based methods could serve as powerful tools to accelerate drug discovery.
Affiliation(s)
- Zhuohang Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Zengrui Wu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Ze Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Yimeng Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Moran Zhou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| |
|
33
|
Samad SS, Schwartz JM, Francavilla C. Functional selectivity of Receptor Tyrosine Kinases regulates distinct cellular outputs. Front Cell Dev Biol 2024; 11:1348056. [PMID: 38259512 PMCID: PMC10800419 DOI: 10.3389/fcell.2023.1348056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024] Open
Abstract
Functional selectivity refers to the activation of differential signalling and cellular outputs downstream of the same membrane-bound receptor when activated by two or more different ligands. Functional selectivity has been described and extensively studied for G-protein Coupled Receptors (GPCRs), leading to specific therapeutic options for dysregulated GPCR functions. However, studies regarding the functional selectivity of Receptor Tyrosine Kinases (RTKs) remain sparse. Here, we will summarize recent data about RTK functional selectivity focusing on how the nature and the amount of RTK ligands and the crosstalk of RTKs with other membrane proteins regulate the specificity of RTK signalling. In addition, we will discuss how structural changes in RTKs upon ligand binding affect selective signalling pathways. Much remains to be known about the integration of different signals affecting RTK signalling specificity to orchestrate long-term cellular outcomes. Recent advancements in omics, specifically quantitative phosphoproteomics, and in systems biology methods to study, model and integrate different types of large-scale omics data have increased our ability to compare several signals affecting RTK functional selectivity in a global, system-wide fashion. We will discuss how such methods facilitate the exploration of important signalling hubs and enable data-driven predictions aiming at improving the efficacy of therapeutics for diseases like cancer, where redundant RTK signalling pathways often compromise treatment efficacy.
Affiliation(s)
- Sakim S. Samad
- Division of Molecular and Cellular Functions, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Division of Evolution, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
| | - Jean-Marc Schwartz
- Division of Evolution, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
| | - Chiara Francavilla
- Division of Molecular and Cellular Functions, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Section of Protein Science and Biotherapeutics, Department of Bioengineering and Biomedicine, Danish Technical University, Lyngby, Denmark
| |
|
34
|
Lyu H, Kureh YH, Vendrow J, Porter MA. Learning low-rank latent mesoscale structures in networks. Nat Commun 2024; 15:224. [PMID: 38172092 PMCID: PMC10764844 DOI: 10.1038/s41467-023-42859-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Accepted: 10/24/2023] [Indexed: 01/05/2024] Open
Abstract
Researchers in many fields use networks to represent interactions between entities in complex systems. To study the large-scale behavior of complex systems, it is useful to examine mesoscale structures in networks as building blocks that influence such behavior. In this paper, we present an approach to describe low-rank mesoscale structures in networks. We find that many real-world networks possess a small set of latent motifs that effectively approximate most subgraphs at a fixed mesoscale. Such low-rank mesoscale structures allow one to reconstruct networks by approximating subgraphs of a network using combinations of latent motifs. Employing subgraph sampling and nonnegative matrix factorization enables the discovery of these latent motifs. The ability to encode and reconstruct networks using a small set of latent motifs has many applications in network analysis, including network comparison, network denoising, and edge inference.
Affiliation(s)
- Hanbaek Lyu
- Department of Mathematics, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Yacoub H Kureh
- Department of Mathematics, University of California, Los Angeles, CA, 90095, USA
| | - Joshua Vendrow
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Mason A Porter
- Department of Mathematics, University of California, Los Angeles, CA, 90095, USA
- Department of Sociology, University of California, Los Angeles, CA, 90095, USA
- Santa Fe Institute, Santa Fe, NM, 87501, USA
| |
|
35
|
Son J, Kim D. Applying network link prediction in drug discovery: an overview of the literature. Expert Opin Drug Discov 2024; 19:43-56. [PMID: 37794688 DOI: 10.1080/17460441.2023.2267020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/02/2023] [Indexed: 10/06/2023]
Abstract
INTRODUCTION Network representation can give a holistic view of relationships for biomedical entities through network topology. Link prediction estimates the probability of link formation between the pair of unconnected nodes. In the drug discovery process, the link prediction method not only enables the detection of connectivity patterns but also predicts the effects of one biomedical entity to multiple entities simultaneously and vice versa, which is useful for many applications. AREAS COVERED The authors provide a comprehensive overview of network link prediction in drug discovery. Link prediction methodologies such as similarity-based approaches, embedding-based approaches, probabilistic model-based approaches, and preprocessing methods are summarized with examples. In addition to describing their properties and limitations, the authors discuss the applications of link prediction in drug discovery based on the relationship between biomedical concepts. EXPERT OPINION Link prediction is a powerful method to infer the existence of novel relationships in drug discovery. However, link prediction has been hampered by the sparsity of data and the lack of negative links in biomedical networks. With preprocessing to balance positive and negative samples and the collection of more data, the authors believe it is possible to develop more reliable link prediction methods that can become invaluable tools for successful drug discovery.
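The preprocessing idea mentioned in the expert opinion, balancing positive and negative samples, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the node labels and the function name are hypothetical, and it assumes that any unobserved pair may serve as a negative, which in real biomedical networks is only an approximation because some unobserved links are simply undiscovered.

```python
import random


def sample_negative_edges(nodes, positive_edges, rng):
    """Draw as many unobserved node pairs as there are observed (positive) edges."""
    positives = {frozenset(edge) for edge in positive_edges}
    n = len(nodes)
    available = n * (n - 1) // 2 - len(positives)
    if available < len(positives):
        raise ValueError("not enough unobserved pairs to balance the classes")
    negatives = set()
    while len(negatives) < len(positives):
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        pair = frozenset((u, v))
        if pair not in positives:
            negatives.add(pair)
    # Return a deterministic, sorted list of (u, v) tuples.
    return [tuple(sorted(pair)) for pair in sorted(negatives, key=sorted)]


# Hypothetical toy network of drugs and genes.
NODES = ["drugA", "drugB", "geneX", "geneY", "geneZ"]
POSITIVES = [("drugA", "geneX"), ("drugA", "geneY"), ("drugB", "geneZ")]
NEGATIVES = sample_negative_edges(NODES, POSITIVES, random.Random(42))
```

A classifier trained on `POSITIVES` plus `NEGATIVES` then sees balanced classes; restricting the sampled pairs to a particular node-type combination (for example drug-gene only) would be a natural refinement.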
Affiliation(s)
- Jeongtae Son
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
| | - Dongsup Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
| |
|
36
|
Jang YH, Han J, Shim SK, Cheong S, Lee SH, Han JK, Hwang CS. Cross-Wired Memristive Crossbar Array for Effective Graph Data Analysis. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2023:e2311040. [PMID: 38145578 DOI: 10.1002/adma.202311040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/22/2023] [Revised: 12/06/2023] [Indexed: 12/27/2023]
Abstract
Graphs adequately represent the enormous interconnections among numerous entities in big data, incurring high computational costs in analyzing them with conventional hardware. Physical graph representation (PGR) is an approach that replicates the graph within a physical system, allowing for efficient analysis. This study introduces a cross-wired crossbar array (cwCBA), uniquely connecting diagonal and non-diagonal components in a CBA by a cross-wiring process. The cross-wired diagonal cells enable cwCBA to achieve precise PGR and dynamic node state control. For this purpose, a cwCBA is fabricated using a Pt/Ta2O5/HfO2/TiN (PTHT) memristor with high on/off and self-rectifying characteristics. The structural and device benefits of PTHT cwCBA for enhanced PGR precision are highlighted, and the practical efficacy is demonstrated for two applications. First, it executes a dynamic path-finding algorithm, identifying the shortest paths in a dynamic graph. PTHT cwCBA shows a more accurate inferred distance and ≈1/3800 lower processing complexity than the conventional method. Second, it analyzes the protein-protein interaction (PPI) networks containing self-interacting proteins, which possess intricate characteristics compared to typical graphs. The PPI prediction results exhibit an average of 30.5% and 21.3% improvement in area under the curve and F1-score, respectively, compared to existing algorithms.
Affiliation(s)
- Yoon Ho Jang
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| | - Janguk Han
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| | - Sung Keun Shim
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| | - Sunwoo Cheong
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| | - Soo Hyung Lee
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| | - Joon-Kyu Han
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| | - Cheol Seong Hwang
- Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea
| |
|
37
|
Lu H, Uddin S. Embedding-based link predictions to explore latent comorbidity of chronic diseases. Health Inf Sci Syst 2023; 11:2. [PMID: 36593862 PMCID: PMC9803807 DOI: 10.1007/s13755-022-00206-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 12/13/2022] [Indexed: 12/31/2022] Open
Abstract
Purpose Comorbidity is a term used to describe when a patient simultaneously has more than one chronic disease. Comorbidity is a significant health issue that affects people worldwide. This study aims to use machine learning and graph theory to predict the comorbidity of chronic diseases. Methods A patient-disease bipartite graph is constructed based on the administrative claim data. The bipartite graph projection approach was used to create the comorbidity network. For the link prediction task, three graph machine learning embedding-based models (node2vec, graph neural networks and hand-crafted approach) with different variants were used on the comorbidity network to compare their performance. This study also considered three commonly used similarity-based link prediction approaches (Jaccard coefficient, Adamic-Adar index and Resource allocation index) for performance comparison. Results The results showed that the embedding-based hand-crafted features technique achieved outstanding performance compared with the remaining similarity-based and embedding-based models. Especially, the hand-crafted technique with the extreme gradient boosting classifier achieved the highest accuracy (91.67%), followed by the same technique with the Logistic regression classifier (90.26%). For this shallow embedding method, the Jaccard coefficient and the degree centrality of the original chronic disease were the most important features for comorbidity prediction. Conclusion The proposed framework can be used to predict the comorbidity of chronic disease at an early stage of hospital admission. Thus, the prediction outcome could be valuable for medical practice, giving healthcare providers more control over their services and lowering expenses.
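The three similarity-based baselines named in this abstract (Jaccard coefficient, Adamic-Adar index, resource allocation index) have standard textbook definitions, sketched below on a hypothetical toy network. The node labels and function names are illustrative assumptions and are not taken from the study's data or code.

```python
import math


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over shared neighbors; rare neighbors count more."""
    # A shared neighbor is adjacent to both u and v, so its degree is at least 2.
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])


def resource_allocation(adj, u, v):
    """Sum of 1 / degree over shared neighbors."""
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])


# Hypothetical comorbidity-style network; labels are placeholders.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ADJ = build_adjacency(EDGES)

# Score the unconnected pair (a, d), which shares neighbors b and c.
SCORES = {
    "jaccard": jaccard(ADJ, "a", "d"),
    "adamic_adar": adamic_adar(ADJ, "a", "d"),
    "resource_allocation": resource_allocation(ADJ, "a", "d"),
}
```

Scores such as these can be used directly to rank candidate links, or, as the abstract describes for the hand-crafted approach, supplied as features to a classifier.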
Affiliation(s)
- Haohui Lu
- School of Project Management, Faculty of Engineering, The University of Sydney, Level 2, 21 Ross Street, Forest Lodge, NSW 2037 Australia
| | - Shahadat Uddin
- School of Project Management, Faculty of Engineering, The University of Sydney, Level 2, 21 Ross Street, Forest Lodge, NSW 2037 Australia
| |
|
38
|
Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. MEDICAL REVIEW (2021) 2023; 3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]
Abstract
Proteins function as integral actors in essential life processes, rendering the realm of protein research a fundamental domain that possesses the potential to propel advancements in pharmaceuticals and disease investigation. Within the context of protein research, an imperious demand arises to uncover protein functionalities and untangle intricate mechanistic underpinnings. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task associated with protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Affiliation(s)
- Jiaxiao Chen
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Zhonghui Gu
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| |
|
39
|
Mirzaei G. Constructing gene similarity networks using co-occurrence probabilities. BMC Genomics 2023; 24:697. [PMID: 37990157 PMCID: PMC10662556 DOI: 10.1186/s12864-023-09780-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Accepted: 11/01/2023] [Indexed: 11/23/2023] Open
Abstract
Gene similarity networks play an important role in unraveling the intricate associations within diverse cancer types. Conventionally, gauging the similarity between genes has been approached through experimental methodologies involving chemical and molecular analyses, or through the lens of mathematical techniques. However, in our work, we have pioneered a distinctive mathematical framework, one rooted in the co-occurrence of attribute values and single point mutations, thereby establishing a novel approach for quantifying the dissimilarity or similarity among genes. Central to our approach is the recognition of mutations as key players in the evolutionary trajectory of cancer. Anchored in this understanding, our methodology hinges on the consideration of two categorical attributes: mutation type and nucleotide change. These attributes are pivotal, as they encapsulate the critical variations that can precipitate substantial changes in gene behavior and ultimately influence disease progression. Our study takes on the challenge of formulating similarity measures that are intrinsic to genes' categorical data. Taking into account the co-occurrence probability of attribute values within single point mutations, our innovative mathematical approach surpasses the boundaries of conventional methods. We thereby provide a robust and comprehensive means to assess gene similarity and take a significant step forward in refining the tools available for uncovering the subtle yet impactful associations within the complex realm of gene interactions in cancer.
Affiliation(s)
- Golrokh Mirzaei
- Department of Computer Science and Engineering, The Ohio State University, Marion, USA.
| |
|
40
|
Morselli Gysi D, Barabási AL. Noncoding RNAs improve the predictive power of network medicine. Proc Natl Acad Sci U S A 2023; 120:e2301342120. [PMID: 37906646 PMCID: PMC10636370 DOI: 10.1073/pnas.2301342120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 09/09/2023] [Indexed: 11/02/2023] Open
Abstract
Network medicine has improved the mechanistic understanding of disease, offering quantitative insights into disease mechanisms, comorbidities, and novel diagnostic tools and therapeutic treatments. Yet, most network-based approaches rely on a comprehensive map of protein-protein interactions (PPI), ignoring interactions mediated by noncoding RNAs (ncRNAs). Here, we systematically combine experimentally confirmed binding interactions mediated by ncRNA with PPI, constructing a comprehensive network of all physical interactions in the human cell. We find that the inclusion of ncRNA expands the number of genes in the interactome by 46% and the number of interactions by 107%, significantly enhancing our ability to identify disease modules. Indeed, we find that 132 diseases lacked a statistically significant disease module in the protein-based interactome but have a statistically significant disease module after inclusion of ncRNA-mediated interactions, making these diseases accessible to the tools of network medicine. We show that the inclusion of ncRNAs helps unveil disease-disease relationships that were not detectable before and expands our ability to predict comorbidity patterns between diseases. Taken together, we find that including noncoding interactions improves both the breadth and the predictive accuracy of network medicine.
Affiliation(s)
- Deisy Morselli Gysi
- Network Science Institute, Northeastern University, Boston, MA 02115
- Department of Physics, Northeastern University, Boston, MA 02115
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115
- US Department of Veteran Affairs, Boston, MA 02130
| | - Albert-László Barabási
- Network Science Institute, Northeastern University, Boston, MA 02115
- Department of Physics, Northeastern University, Boston, MA 02115
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115
- US Department of Veteran Affairs, Boston, MA 02130
- Department of Network and Data Science, Central European University, Budapest 1051, Hungary
| |
|
41
|
Guang B, Gao X, Chen X, Li R, Ma L. Dissection of action mechanisms of Zuogui Pill in the treatment of liver cancer based on machine learning and network pharmacology: A review. Medicine (Baltimore) 2023; 102:e35628. [PMID: 37861529 PMCID: PMC10589513 DOI: 10.1097/md.0000000000035628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 09/22/2023] [Indexed: 10/21/2023] Open
Abstract
This study aimed to investigate the underlying mechanism of Zuogui Pill in its efficacy against liver cancer, employing a combination of data mining approaches and network pharmacology methods. A novel clustering analysis algorithm was proposed to identify the core gene modules of Zuogui Pill. This algorithm successfully identified 5 core modules, with the first large module comprised of twelve proteins forming a 12-clique, representing the strongest connections among them. Using the GEO platform, ten key target proteins were detected, including FOS, PTGS2, and MYC. According to the GO annotation and KEGG analysis, desired target proteins were significantly enriched in various biological processes (BP). The analysis showed that the ten key targets were strongly associated with signaling pathways mainly centered on the MAPK and PI3K-Akt pathways. Additionally, molecular docking revealed strong binding affinities between core active ingredients of Zuogui Pill and these key targets, and the best affinity modes were observed for PTGS2-Sesamin, PRKCA-Sesamin, and FOS-delta-Carotene. In order to establish the relationships between clinical symptoms and drug targets, a heterogeneous targets-related network was constructed. A total of 60 key target-symptom association pairs were detected, exemplified by the strong association between fever and PTGS2 through the intermediary of Shu Di Huang. In summary, symptom-target associations are valuable in uncovering the underlying molecular mechanisms of Zuogui Pill. Our work reinforced the notion that Zuogui Pill exhibits therapeutic potential on liver cancer through network targets, as well as synergistic effects of multi-component and multi-pathway. This study provided specific references for future experiments at the cost of less time.
Affiliation(s)
- Biao Guang
- College of Information Engineering, Hubei University of Chinese Medicine, Wuhan, China
| | - Xiang Gao
- Institute of Liver Disease, Hospital of Hubei University of Chinese Medicine, Wuhan, China
- Affiliated Hospital of Hubei University of Chinese Medicine, Wuhan, China
| | - Xiangrong Chen
- School of Foreign Language, Hubei University of Chinese Medicine, Wuhan, China
| | - RuiLing Li
- College of Information Engineering, Hubei University of Chinese Medicine, Wuhan, China
| | - Li Ma
- College of Information Engineering, Hubei University of Chinese Medicine, Wuhan, China
| |
|
42
|
Caprioli B, Eichler RAS, Silva RNO, Martucci LF, Reckziegel P, Ferro ES. Neurolysin Knockout Mice in a Diet-Induced Obesity Model. Int J Mol Sci 2023; 24:15190. [PMID: 37894869 PMCID: PMC10607720 DOI: 10.3390/ijms242015190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 10/06/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023] Open
Abstract
Neurolysin oligopeptidase (E.C.3.4.24.16; Nln), a member of the zinc metallopeptidase M3 family, was first identified in rat brain synaptic membranes hydrolyzing neurotensin at the Pro-Tyr peptide bond. The previous development of C57BL6/N mice with suppression of Nln gene expression (Nln-/-) demonstrated the biological relevance of this oligopeptidase for insulin signaling and glucose uptake. Here, several metabolic parameters were investigated in Nln-/- and wild-type C57BL6/N animals (WT; n = 5-8), male and female, fed either a standard (SD) or a hypercaloric diet (HD), for seven weeks. Higher food intake and body mass gain were observed for Nln-/- animals fed HD, compared to both male and female WT control animals fed HD. Leptin gene expression was higher in Nln-/- male and female animals fed HD, compared to WT controls. Both WT and Nln-/- females fed HD showed a similar increase in gene expression of dipeptidyl peptidase 4 (DPP4), a peptidase related to glucagon-like peptide-1 (GLP-1) metabolism. The present data suggest that Nln participates in the physiological mechanisms related to diet-induced obesity. Further studies will be necessary to better understand the molecular mechanism responsible for the higher body mass gain observed in Nln-/- animals fed HD.
Affiliation(s)
- Bruna Caprioli
- Pharmacology Department, Biomedical Sciences Institute (ICB), São Paulo 05508-000, SP, Brazil; (B.C.); (R.A.S.E.); (R.N.O.S.); (L.F.M.)
| | - Rosangela A. S. Eichler
- Pharmacology Department, Biomedical Sciences Institute (ICB), São Paulo 05508-000, SP, Brazil; (B.C.); (R.A.S.E.); (R.N.O.S.); (L.F.M.)
| | - Renée N. O. Silva
- Pharmacology Department, Biomedical Sciences Institute (ICB), São Paulo 05508-000, SP, Brazil; (B.C.); (R.A.S.E.); (R.N.O.S.); (L.F.M.)
| | - Luiz Felipe Martucci
- Pharmacology Department, Biomedical Sciences Institute (ICB), São Paulo 05508-000, SP, Brazil; (B.C.); (R.A.S.E.); (R.N.O.S.); (L.F.M.)
| | - Patricia Reckziegel
- Department of Clinical and Toxicological Analysis, Faculty of Pharmaceutical Sciences (FCF), University of São Paulo (USP), São Paulo 05508-000, SP, Brazil;
| | - Emer S. Ferro
- Pharmacology Department, Biomedical Sciences Institute (ICB), São Paulo 05508-000, SP, Brazil; (B.C.); (R.A.S.E.); (R.N.O.S.); (L.F.M.)
| |
|
43
|
Singh P, Kuder H, Ritz A. Identification of disease modules using higher-order network structure. BIOINFORMATICS ADVANCES 2023; 3:vbad140. [PMID: 37860106 PMCID: PMC10582521 DOI: 10.1093/bioadv/vbad140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/18/2023] [Accepted: 10/03/2023] [Indexed: 10/21/2023]
Abstract
Motivation Higher-order interaction patterns among proteins have the potential to reveal mechanisms behind molecular processes and diseases. While clustering methods are used to identify functional groups within molecular interaction networks, these methods largely focus on edge density and do not explicitly take into consideration higher-order interactions. Disease genes in these networks have been shown to exhibit rich higher-order structure in their vicinity, and considering these higher-order interaction patterns in network clustering has the potential to reveal new disease-associated modules. Results We propose a higher-order community detection method which identifies community structure in networks with respect to specific higher-order connectivity patterns beyond edges. Higher-order community detection on four different protein-protein interaction networks identifies biologically significant modules and disease modules that conventional edge-based clustering methods fail to discover. Higher-order clusters also identify disease modules from genome-wide association study data, including new modules that were not discovered by top-performing approaches in a Disease Module DREAM Challenge. Our approach provides a more comprehensive view of community structure that enables us to predict new disease-gene associations. Availability and implementation https://github.com/Reed-CompBio/graphlet-clustering.
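One common way to make clustering sensitive to connectivity patterns beyond edges is to reweight each edge by the number of motif instances it takes part in, and then cluster the reweighted graph. The sketch below uses triangles, the simplest such pattern, purely as an illustration. It is an assumption-laden stand-in, not the graphlet-based method in the linked repository, and all names and the toy network are hypothetical.

```python
def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_weights(edges):
    """Weight each edge by the number of triangles that contain it."""
    adj = build_adjacency(edges)
    weights = {}
    for u, v in edges:
        # Every common neighbor of u and v closes one triangle on edge (u, v).
        weights[frozenset((u, v))] = len(adj[u] & adj[v])
    return weights


# Hypothetical protein interaction toy network with triangles ABC, BCD, CDE.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
         ("D", "E"), ("C", "E"), ("B", "D")]
WEIGHTS = triangle_weights(EDGES)
```

Edges with weight 0 lie in no triangle and would be ignored by a triangle-aware clustering step, while edges shared by several triangles, such as B-C here, bind their endpoints more strongly.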
Affiliation(s)
- Pramesh Singh
- Biology Department, Reed College, Portland, OR 97202, United States
- Data Intensive Studies Center, Tufts University, Medford, MA 02155, United States
| | - Hannah Kuder
- Physics Department, Reed College, Portland, OR 97202, United States
| | - Anna Ritz
- Biology Department, Reed College, Portland, OR 97202, United States
| |
|
44
|
Peng R, Deng M. Mapping the protein-protein interactome in the tumor immune microenvironment. Antib Ther 2023; 6:311-321. [PMID: 38098892 PMCID: PMC10720949 DOI: 10.1093/abt/tbad026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 10/01/2023] [Accepted: 11/02/2023] [Indexed: 12/17/2023] Open
Abstract
Cell-to-cell communication primarily occurs through cell-surface and secreted proteins, which form a sophisticated network that coordinates systemic immune function. Uncovering these protein-protein interactions (PPIs) is indispensable for understanding the molecular mechanisms and elucidating immune system aberrations in disease. Traditional biological studies typically focus on a limited number of PPI pairs due to the relatively low throughput of commonly used techniques. Encouragingly, classical methods have advanced, and many new systems tailored for large-scale protein-protein screening have been developed and successfully utilized. These high-throughput PPI investigation techniques have already made considerable achievements in mapping the immune cell interactome, enriching PPI databases and analysis tools, and discovering therapeutic targets for cancer and other diseases, which will definitely bring unprecedented insight into this field.
Affiliation(s)
- Rui Peng
- Peking University International Cancer Institute, Health Science Center, Peking University, Beijing 100191, PR China
- School of Basic Medical Sciences, Health Science Center, Peking University, Beijing 100191, PR China
| | - Mi Deng
- Peking University International Cancer Institute, Health Science Center, Peking University, Beijing 100191, PR China
- School of Basic Medical Sciences, Health Science Center, Peking University, Beijing 100191, PR China
- Peking University Cancer Hospital and Institute, Peking University, Beijing 100142, PR China
| |
|
45
|
Pan Y, Li R, Li W, Lv L, Guan J, Zhou S. HPC-Atlas: Computationally Constructing A Comprehensive Atlas of Human Protein Complexes. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:976-990. [PMID: 37730114 PMCID: PMC10928439 DOI: 10.1016/j.gpb.2023.05.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 04/23/2023] [Accepted: 05/08/2023] [Indexed: 09/22/2023]
Abstract
A fundamental principle of biology is that proteins tend to form complexes to play important roles in the core functions of cells. For a complete understanding of human cellular functions, it is crucial to have a comprehensive atlas of human protein complexes. Unfortunately, we still lack such a comprehensive atlas of experimentally validated protein complexes, which prevents us from gaining a complete understanding of the compositions and functions of human protein complexes, as well as the underlying biological mechanisms. To fill this gap, we built the Human Protein Complexes Atlas (HPC-Atlas), which is, as far as we know, the most accurate and comprehensive atlas of human protein complexes available to date. We integrated two of the latest protein interaction networks and developed a novel computational method to identify nearly 9000 protein complexes, including many previously uncharacterized complexes. Compared with the existing methods, our method achieved outstanding performance on both testing and independent datasets. Furthermore, with HPC-Atlas we identified 751 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-affected human protein complexes, and 456 multifunctional proteins that contain many potential moonlighting proteins. These results suggest that HPC-Atlas can serve as not only a computing framework to effectively identify biologically meaningful protein complexes by integrating multiple protein data sources, but also a valuable resource for exploring new biological findings. The HPC-Atlas webserver is freely available at http://www.yulpan.top/HPC-Atlas.
Affiliation(s)
- Yuliang Pan
- Department of Computer Science and Technology, College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
| | - Ruiyi Li
- Translational Medical Center for Stem Cell Therapy, Shanghai East Hospital, School of Medicine, Tongji University, Shanghai 200120, China
| | - Wengen Li
- Department of Computer Science and Technology, College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
| | - Liuzhenghao Lv
- Department of Computer Science and Technology, College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
| | - Jihong Guan
- Department of Computer Science and Technology, College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China.
| | - Shuigeng Zhou
- Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China.
| |
|
46
|
Martucci LF, Eichler RA, Silva RN, Costa TJ, Tostes RC, Busatto GF, Seelaender MC, Duarte AJ, Souza HP, Ferro ES. Intracellular peptides in SARS-CoV-2-infected patients. iScience 2023; 26:107542. [PMID: 37636076 PMCID: PMC10448160 DOI: 10.1016/j.isci.2023.107542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 05/29/2023] [Accepted: 08/01/2023] [Indexed: 08/29/2023] Open
Abstract
Intracellular peptides (InPeps) generated by the orchestrated action of the proteasome and intracellular peptidases have biological and pharmacological significance. Here, human plasma relative concentration of specific InPeps was compared between 175 patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and 45 SARS-CoV-2 non-infected patients; 2,466 unique peptides were identified, of which 67% were InPeps. The results revealed differences of a specific group of peptides in human plasma comparing non-infected individuals to patients infected by SARS-CoV-2, following the results of the semi-quantitative analyses by isotope-labeled electrospray mass spectrometry. The protein-protein interactions networks enriched pathways, drawn by genes encoding the proteins from which the peptides originated, revealed the presence of the coronavirus disease/COVID-19 network solely in the group of patients fatally infected by SARS-CoV-2. Thus, modulation of the relative plasma levels of specific InPeps could be employed as a predictive tool for disease outcome.
Affiliation(s)
- Luiz Felipe Martucci
- Department of Pharmacology, Biomedical Sciences Institute, São Paulo 05508-000, Brazil
| | | | - Renée N.O. Silva
- Department of Pharmacology, Biomedical Sciences Institute, São Paulo 05508-000, Brazil
| | - Tiago J. Costa
- Department of Pharmacology, Ribeirao Preto Medical School, Ribeirão Preto 14049-900, Brazil
| | - Rita C. Tostes
- Department of Pharmacology, Ribeirao Preto Medical School, Ribeirão Preto 14049-900, Brazil
| | - Geraldo F. Busatto
- Department of Psychiatry, Medical School and Hospital das Clínicas, University of São Paulo, 01246-903 SP, Brazil
| | - Marilia C.L. Seelaender
- Department of Surgery, Medical School and Hospital das Clínicas, University of São Paulo, 01246-903 SP, Brazil
| | - Alberto J.S. Duarte
- Department of Pathology, Medical School and Hospital das Clínicas, University of São Paulo, 01246-903 SP, Brazil
| | - Heraldo P. Souza
- Department of Internal Medicine, Medical School and Hospital das Clínicas, University of São Paulo, 01246-903 SP, Brazil
| | - Emer S. Ferro
- Department of Pharmacology, Biomedical Sciences Institute, São Paulo 05508-000, Brazil
- Department of Pathology, Medical School and Hospital das Clínicas, University of São Paulo, 01246-903 SP, Brazil
- Department of Internal Medicine, Medical School and Hospital das Clínicas, University of São Paulo, 01246-903 SP, Brazil
| |
|
47
|
Aziz F, Slater LT, Bravo-Merodio L, Acharjee A, Gkoutos GV. Link prediction in complex network using information flow. Sci Rep 2023; 13:14660. [PMID: 37669983 PMCID: PMC10480459 DOI: 10.1038/s41598-023-41476-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 08/27/2023] [Indexed: 09/07/2023] Open
Abstract
Link prediction in complex networks has recently attracted a great deal of attention in diverse scientific domains, including the social and biological sciences. Given a snapshot of a network, the goal is to predict links that are missing in the network or that are likely to occur in the near future. This problem has both theoretical and practical significance; it not only helps us to identify missing links in a network more efficiently by avoiding expensive and time-consuming experimental processes, but also allows us to study the evolution of a network with time. To address the problem of link prediction, numerous attempts have been made in recent years that exploit the local and global topological properties of the network to predict missing links. In this paper, we use the parametrised matrix forest index (PMFI) to predict missing links in a network. We show that, for small parameter values, this index is linked to a heat diffusion process on a graph and therefore encodes geometric properties of the network. We then develop a framework that combines the PMFI with a local similarity index to predict missing links in the network. The framework is applied to numerous networks obtained from diverse domains, such as social, biological, and transport networks. The results show that the proposed method can predict missing links with higher accuracy when compared to other state-of-the-art link prediction methods.
Affiliation(s)
- Furqan Aziz
- School of Computing and Mathematical Sciences, University of Leicester, University Rd, Leicester, LE1 7RH, UK
- Centre for Health Data Science, Birmingham, B15 2WB, UK
| | - Luke T Slater
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
- Institute of Translational Medicine, University of Birmingham, Birmingham, B15 2TT, UK
- Centre for Health Data Science, Birmingham, B15 2WB, UK
| | - Laura Bravo-Merodio
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
- Institute of Translational Medicine, University of Birmingham, Birmingham, B15 2TT, UK
- Centre for Health Data Science, Birmingham, B15 2WB, UK
| | - Animesh Acharjee
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
- Institute of Translational Medicine, University of Birmingham, Birmingham, B15 2TT, UK
- MRC Health Data Research UK (HDR UK), London, UK
- Centre for Health Data Science, Birmingham, B15 2WB, UK
| | - Georgios V Gkoutos
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
- Institute of Translational Medicine, University of Birmingham, Birmingham, B15 2TT, UK
- NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospital Birmingham, Birmingham, B15 2WB, UK
- MRC Health Data Research UK (HDR UK), London, UK
- NIHR Experimental Cancer Medicine Centre, Birmingham, B15 2TT, UK
- Centre for Health Data Science, Birmingham, B15 2WB, UK
- Centre for Environmental Research & Advocacy, University of Birmingham, Birmingham, B15 2TT, UK
| |
|
48
|
Luo X, Wang L, Hu P, Hu L. Predicting Protein-Protein Interactions Using Sequence and Network Information via Variational Graph Autoencoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3182-3194. [PMID: 37155405 DOI: 10.1109/tcbb.2023.3273567] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/10/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in proteomics studies, and a variety of computational algorithms have been developed to predict PPIs. Though effective, their performance is constrained by the high false-positive and false-negative rates observed in PPI data. To overcome this problem, a novel PPI prediction algorithm, namely PASNVGA, is proposed in this work, combining the sequence and network information of proteins via a variational graph autoencoder. To do so, PASNVGA first applies different strategies to extract the features of proteins from their sequence and network information, and obtains a more compact form of these features using principal component analysis. In addition, PASNVGA designs a scoring function to measure the higher-order connectivity between proteins so as to obtain a higher-order adjacency matrix. With all these features and adjacency matrices, PASNVGA trains a variational graph autoencoder model to further learn the integrated embeddings of proteins. The prediction task is then completed using a simple feedforward neural network. Extensive experiments have been conducted on five PPI datasets collected from different species. Compared with several state-of-the-art algorithms, PASNVGA has been demonstrated to be a promising PPI prediction algorithm.
|
49
|
Fan P, Zeng L, Ding Y, Kofler J, Silverstein J, Krivinko J, Sweet RA, Wang L. Combination of antidepressants and antipsychotics as a novel treatment option for psychosis in Alzheimer's disease. CPT Pharmacometrics Syst Pharmacol 2023; 12:1119-1131. [PMID: 37128639 PMCID: PMC10431054 DOI: 10.1002/psp4.12979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Revised: 04/18/2023] [Accepted: 04/21/2023] [Indexed: 05/03/2023] Open
Abstract
Psychotic symptoms are reported as one of the most common complications in patients with Alzheimer's disease (AD), in whom they are associated with more rapid deterioration and increased mortality. Empiric treatments, namely first- and second-generation antipsychotics, confer modest efficacy in patients with AD with psychosis (AD+P) and themselves increase mortality. Recent studies suggested the use and beneficial effects of antidepressants among patients with AD+P. This motivates us to explore their potential as a novel combination therapy option among these patients. We included electronic medical records of 10,260 patients with AD in our study. Survival analysis was performed to assess the effects of the combination of antipsychotics and antidepressants on the mortality of these patients. A protein-protein interaction network representing AD+P was built, and network analysis methods were used to quantify the efficacy of these drugs on AD+P. A combined score was developed to measure the potential synergetic effect against AD+P. Our survival analyses showed that the co-administration of antidepressants with antipsychotics has a significant beneficial effect in reducing mortality. Our network analysis showed that the targets of antipsychotics and antidepressants are well-separated, and that antipsychotics and antidepressants have similar Signed Jaccard Index (SJI) scores to AD+P. Eight drug pairs, including some popular recommendations like aripiprazole/sertraline, showed higher-than-average scores, which suggests their potential in treating AD+P via strong synergetic effects. Our proposed combinations of antipsychotic and antidepressant therapy showed a strong superiority over current antipsychotic treatment for AD+P. The observed beneficial effects can be further strengthened by optimizing drug-pair selection based on our systems pharmacology analysis.
Affiliation(s)
- Peihao Fan
- Computational Chemical Genomics Screening Center, Department of Pharmaceutical Sciences/School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Lang Zeng
- Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Ying Ding
- Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Julia Kofler
- Division of Neuropathology, Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA
| | - Jonathan Silverstein
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Joshua Krivinko
- Department of Psychiatry, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Robert A. Sweet
- Department of Psychiatry, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Alzheimer Disease Research Center, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA
| | - Lirong Wang
- Computational Chemical Genomics Screening Center, Department of Pharmaceutical Sciences/School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| |
|
50
|
Budel G, Jin Y, Van Mieghem P, Kitsak M. Topological properties and organizing principles of semantic networks. Sci Rep 2023; 13:11728. [PMID: 37474614 PMCID: PMC10359341 DOI: 10.1038/s41598-023-37294-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 06/19/2023] [Indexed: 07/22/2023] Open
Abstract
Interpreting natural language is an increasingly important task in computer algorithms due to the growing availability of unstructured textual data. Natural Language Processing (NLP) applications rely on semantic networks for structured knowledge representation. The fundamental properties of semantic networks must be taken into account when designing NLP algorithms, yet they remain to be structurally investigated. We study the properties of semantic networks from ConceptNet, defined by 7 semantic relations from 11 different languages. We find that semantic networks have universal basic properties: they are sparse, highly clustered, and many exhibit power-law degree distributions. Our findings show that the majority of the considered networks are scale-free. Some networks exhibit language-specific properties determined by grammatical rules; for example, networks from highly inflected languages, such as Latin, German, French and Spanish, show peaks in the degree distribution that deviate from a power law. We find that, depending on the semantic relation type and the language, link formation in semantic networks is guided by different principles. In some networks the connections are similarity-based, while in others the connections are more complementarity-based. Finally, we demonstrate how knowledge of similarity and complementarity in semantic networks can improve NLP algorithms in missing link inference.
Affiliation(s)
- Gabriel Budel
- Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD, Delft, The Netherlands
| | - Ying Jin
- Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD, Delft, The Netherlands
| | - Piet Van Mieghem
- Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD, Delft, The Netherlands
| | - Maksim Kitsak
- Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD, Delft, The Netherlands
| |
|