51
|
Li L, Li H, Ishdorj TO, Zheng C, Su Y. MDNNSyn: A Multi-Modal Deep Learning Framework for Drug Synergy Prediction. IEEE J Biomed Health Inform 2024; 28:6225-6236. [PMID: 38954565 DOI: 10.1109/jbhi.2024.3421916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Synergistic drug combination prediction based on computational models has been widely studied and applied in the cancer field. However, most models only consider the interactions between drug pairs and specific cell lines, without taking into account the multiple drug-drug and cell line-cell line biological relationships that also largely affect synergistic mechanisms. To this end, we propose a multi-modal deep learning framework, termed MDNNSyn, which applies multi-source information and trains multi-modal features to infer potential synergistic drug combinations. MDNNSyn extracts topology modality features by applying a multi-layer hypergraph neural network to a drug synergy hypergraph and constructs semantic modality features through a similarity strategy. A multi-modal fusion network layer with a gated neural network is then employed for synergy score prediction. MDNNSyn is compared with five classic and state-of-the-art prediction methods on the DrugCombDB and Oncology-Screen datasets. The model achieves area under the curve (AUC) scores of 0.8682 and 0.9013 on the two datasets, improvements of 3.70% and 2.71% over the second-best model. A case study indicates that MDNNSyn is capable of detecting potential synergistic drug combinations.
|
52
|
Yao W, Wei A, Xiao Z, Zhao W, Shen X, Jiang X, He T. An Improved Framework for Drug-Side Effect Associations Prediction via Counterfactual Inference-Based Data Augmentation. IEEE Trans Nanobioscience 2024; 23:540-547. [PMID: 39141449 DOI: 10.1109/tnb.2024.3443244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024]
Abstract
Detecting side effects of drugs is a fundamental task in drug development. With the expansion of publicly available biomedical data, researchers have proposed many computational methods for predicting drug-side effect associations (DSAs), among which network-based methods have attracted wide attention in the biomedical field. However, data scarcity poses a great challenge for existing DSA prediction models. Although several data augmentation methods have been proposed to address this issue, most existing methods manipulate the original networks randomly, which ignores the causality behind the existence of DSAs and leads to poor performance on the DSA prediction task. In this paper, we propose a counterfactual inference-based data augmentation method to improve performance on this task. First, we construct a heterogeneous information network (HIN) by integrating multiple biomedical data sources. Based on community detection on the HIN, a counterfactual inference-based method is designed to derive augmented links, and an augmented HIN is obtained accordingly. Then, a meta-path-based graph neural network is applied to learn high-quality representations of drugs and side effects, from which the predicted DSAs are obtained. Finally, comprehensive experiments are conducted, and the results demonstrate the effectiveness of the proposed counterfactual inference-based data augmentation for DSA prediction.
|
53
|
Hu X, Yi H, Cheng H, Zhao Y, Zhang D, Li J, Ruan J, Zhang J, Lu X. Multiple Heterogeneous Networks Representation With Latent Space for Synthetic Lethality Prediction. IEEE Trans Nanobioscience 2024; 23:564-571. [PMID: 39150817 DOI: 10.1109/tnb.2024.3444922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2024]
Abstract
Computational synthetic lethality (SL) methods have become a promising strategy to identify SL gene pairs for targeted cancer therapy and cancer medicine development. Feature representation for integrating various biological networks is crucial to improving identification performance. However, previous feature representation methods, such as matrix factorization and graph neural networks, project gene features onto latent variables while preserving only a specific geometric metric. Models of the gene representational latent space that consider correlations across multiple dimensionalities and preserve latent geometric structures in both sample and feature spaces are lacking. Therefore, we propose a novel method to model the gene Latent Space using matrix Tri-Factorization (LSTF) to obtain gene representations with embedding variables resulting from the potential interpretation of synthetic lethality. Meanwhile, manifold subspace regularization is applied to the tri-factorization to capture the geometrical manifold structure in the latent space with gene PPI functional and GO semantic embeddings. Then, SL gene pairs are identified by reconstructing the associations with gene representations in the latent space. The experimental results illustrate that LSTF is superior to other state-of-the-art methods. A case study demonstrates the effectiveness of the predicted SL associations.
|
54
|
Gervas-Arruga J, Barba-Romero MÁ, Fernández-Martín JJ, Gómez-Cerezo JF, Segú-Vergés C, Ronzoni G, Cebolla JJ. In Silico Modeling of Fabry Disease Pathophysiology for the Identification of Early Cellular Damage Biomarker Candidates. Int J Mol Sci 2024; 25:10329. [PMID: 39408658 PMCID: PMC11477023 DOI: 10.3390/ijms251910329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Revised: 09/19/2024] [Accepted: 09/24/2024] [Indexed: 10/20/2024] Open
Abstract
Fabry disease (FD) is an X-linked lysosomal disease whose ultimate consequences are the accumulation of sphingolipids and subsequent inflammatory events, mainly at the endothelial level. The outcomes include different nervous system manifestations as well as multiple organ damage. Despite the availability of known biomarkers, early detection of FD remains a medical need. This study aimed to develop an in silico model based on machine learning to identify candidate vascular and nervous system proteins for early FD damage detection at the cellular level. A combined systems biology and machine learning approach was carried out considering molecular characteristics of FD to create a computational model of vascular and nervous system disease. A data science strategy was applied to identify risk classifiers by using 10 K-fold cross-validation. Further biological and clinical criteria were used to prioritize the most promising candidates, resulting in the identification of 36 biomarker candidates with classifier abilities, which are easily measurable in body fluids. Among them, we propose four candidates, CAMK2A, ILK, LMNA, and KHSRP, which have high classification capabilities according to our models (cross-validated accuracy ≥ 90%) and are related to the vascular and nervous systems. These biomarkers show promise as high-risk cellular and tissue damage indicators that are potentially applicable in clinical settings, although in vivo validation is still needed.
Affiliation(s)
- Miguel Ángel Barba-Romero
  - Department of Internal Medicine, Albacete University Hospital, 02006 Albacete, Spain
  - Albacete Medical School, Castilla-La Mancha University, 02006 Albacete, Spain
- Jorge Francisco Gómez-Cerezo
  - Department of Internal Medicine, Infanta Sofía University Hospital, 28702 Madrid, Spain
  - Faculty of Medicine, European University of Madrid, 28670 Madrid, Spain
|
55
|
Mazein I, Rougny A, Mazein A, Henkel R, Gütebier L, Michaelis L, Ostaszewski M, Schneider R, Satagopam V, Jensen LJ, Waltemath D, Wodke JAH, Balaur I. Graph databases in systems biology: a systematic review. Brief Bioinform 2024; 25:bbae561. [PMID: 39565895 PMCID: PMC11578065 DOI: 10.1093/bib/bbae561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/28/2024] [Accepted: 10/21/2024] [Indexed: 11/22/2024] Open
Abstract
Graph databases are becoming increasingly popular across scientific disciplines, being highly suitable for storing and connecting complex heterogeneous data. In systems biology, they are used as a backend solution for biological data repositories, ontologies, networks, pathways, and knowledge graph databases. In this review, we analyse all publications using or mentioning graph databases retrieved from PubMed and PubMed Central full-text search, focusing on the top 16 available graph databases. Publications are categorized according to their domain and application, focusing on pathway and network biology and relevant ontologies and tools. We detail different approaches and highlight the advantages of outstanding resources, such as UniProtKB, Disease Ontology, and Reactome, which provide graph-based solutions. We discuss ongoing efforts of the systems biology community to standardize and harmonize knowledge graph creation and the maintenance of integrated resources. Outlining prospects, including the use of graph databases as a way of communication between biological data repositories, we conclude that efficient design, querying, and maintenance of graph databases will be key for knowledge generation in systems biology and other research fields with heterogeneous data.
Affiliation(s)
- Ilya Mazein
  - Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
- Adrien Rougny
  - Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
- Alexander Mazein
  - Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
- Ron Henkel
  - Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
- Lea Gütebier
  - Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
- Lea Michaelis
  - Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
- Marek Ostaszewski
  - Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
- Reinhard Schneider
  - Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
- Venkata Satagopam
  - Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
- Lars Juhl Jensen
  - Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 15, 1870 Frederiksberg C, Denmark
- Dagmar Waltemath
  - Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
- Judith A H Wodke
  - Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
- Irina Balaur
  - Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
|
56
|
Zhao MX, Ding RF, Chen Q, Meng J, Li F, Fu S, Huang B, Liu Y, Ji ZL, Zhao Y. Nphos: Database and Predictor of Protein N-phosphorylation. Genomics, Proteomics & Bioinformatics 2024; 22:qzae032. [PMID: 39380205 PMCID: PMC12016571 DOI: 10.1093/gpbjnl/qzae032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 03/03/2024] [Accepted: 04/01/2024] [Indexed: 10/10/2024]
Abstract
Protein N-phosphorylation is widely present in nature and participates in various biological processes. However, current knowledge of N-phosphorylation is extremely limited compared with that of O-phosphorylation. In this study, we collected 11,710 experimentally verified N-phosphosites of 7,344 proteins from 39 species and subsequently constructed the database Nphos to share up-to-date information on protein N-phosphorylation. Based on these substantial data, we characterized the sequential and structural features of protein N-phosphorylation. Moreover, after comparing hundreds of learning models, we chose and optimized gradient boosting decision tree (GBDT) models to predict three types of human N-phosphorylation, achieving mean area under the receiver operating characteristic curve (AUC) values of 90.56%, 91.24%, and 92.01% for pHis, pLys, and pArg, respectively. Meanwhile, we discovered 488,825 distinct N-phosphosites in the human proteome. The models were also deployed in Nphos for interactive N-phosphosite prediction. In summary, this work provides new insights and points for both flexible and focused investigations of N-phosphorylation. It will also facilitate a deeper and more systematic understanding of protein N-phosphorylation modification by providing a data and technical foundation. Nphos is freely available at http://www.bio-add.org/Nphos/ and http://ppodd.org.cn/Nphos/.
Affiliation(s)
- Ming-Xiao Zhao
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
  - Department of Chemical Biology, Key Laboratory for Chemical Biology of Fujian Province, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Ruo-Fan Ding
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361102, China
- Qiang Chen
  - Zhejiang Key Laboratory of Pathophysiology, Department of Biochemistry and Molecular Biology, Health Science Center, Ningbo University, Ningbo 315211, China
- Junhua Meng
  - BGI Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Fulai Li
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Songsen Fu
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Biling Huang
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Yan Liu
  - Department of Chemical Biology, Key Laboratory for Chemical Biology of Fujian Province, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Zhi-Liang Ji
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361102, China
- Yufen Zhao
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
  - Department of Chemical Biology, Key Laboratory for Chemical Biology of Fujian Province, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Key Laboratory of Bioorganic Phosphorus Chemistry & Chemical Biology, Department of Chemistry, Tsinghua University, Beijing 100084, China
|
57
|
Cingiz MÖ. Ensemble decision of local similarity indices on the biological network for disease related gene prediction. PeerJ 2024; 12:e17975. [PMID: 39247551 PMCID: PMC11380840 DOI: 10.7717/peerj.17975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/05/2024] [Indexed: 09/10/2024] Open
Abstract
Link prediction (LP) is the task of identifying potential, missing and spurious links in complex networks. Protein-protein interaction (PPI) networks are important for understanding the underlying biological mechanisms of diseases. Many complex networks have been constructed using LP methods; however, only a limited number of studies focus on disease-related gene prediction and evaluate the predicted genes using various evaluation criteria. The main objective of this study is to investigate the effect of a simple ensemble method on disease-related gene prediction. Disease-related gene predictions based on local similarity indices (LSIs) were integrated by a simple ensemble decision method, simple majority voting (SMV), on the PPI network to detect disease-related genes accurately. A human PPI network was used to discover potential disease-related genes with four LSIs. The LSIs discovered potential links between disease-related genes, which were obtained from the OMIM database for gastric, colorectal, breast, prostate and lung cancers. LSI-based disease-related genes were ranked by their LSI scores in descending order to retrieve the top 10, 50 and 100 disease-related genes. SMV integrated the four LSI-based predictions to obtain the SMV-based top 10, 50 and 100 disease-related genes. The performance of the LSI-based and SMV-based genes was evaluated separately by overlap analyses, which were performed with the GeneCard disease-gene relation dataset and Gene Ontology (GO) terms. The GO terms were used for biological assessment of the gene lists inferred by the LSIs and SMV on all cancer types. The Adamic-Adar (AA), Resource Allocation Index (RAI), and SMV-based gene lists generally achieved good performance on all cancers in both overlap analyses. SMV also outperformed the LSIs on breast cancer data. Increasing the number of top-ranked disease-related genes selected also improved the performance of SMV.
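The ranking-and-voting procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The abstract names only Adamic-Adar and Resource Allocation among its four indices, so the common-neighbours and Jaccard indices, the seed-sum scoring in `rank_candidates`, the strict-majority threshold in `majority_vote`, and all function names are assumptions made for illustration.

```python
from collections import Counter
from math import log


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbour sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, u, v):
    """Number of neighbours shared by u and v (assumed third index)."""
    return float(len(adj[u] & adj[v]))


def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours (assumed fourth index)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Adamic-Adar: shared neighbours weighted by 1 / log(degree)."""
    return sum(1.0 / log(len(adj[z])) for z in adj[u] & adj[v] if len(adj[z]) > 1)


def resource_allocation(adj, u, v):
    """Resource Allocation: shared neighbours weighted by 1 / degree."""
    return sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])


def rank_candidates(adj, seeds, index, k):
    """Score each non-seed gene by its summed index value against the seed
    (known disease) genes and return the top-k names (assumed scoring rule)."""
    scores = {c: sum(index(adj, c, s) for s in seeds) for c in adj if c not in seeds}
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


def majority_vote(top_lists):
    """Keep genes that appear in more than half of the top-k lists."""
    votes = Counter(g for lst in top_lists for g in set(lst))
    need = len(top_lists) / 2
    kept = [g for g, n in votes.items() if n > need]
    return sorted(kept, key=lambda g: (-votes[g], g))


if __name__ == "__main__":
    # Toy network with hypothetical node names, not real PPI data.
    toy = build_adjacency([("s1", "a"), ("s2", "a"), ("a", "g1"), ("s1", "b"),
                           ("b", "g1"), ("s2", "g2"), ("g2", "b")])
    indices = [common_neighbours, jaccard, adamic_adar, resource_allocation]
    lists = [rank_candidates(toy, {"s1", "s2"}, idx, 2) for idx in indices]
    print(majority_vote(lists))
```

A strict majority (more than half of the lists) is used here so that, with four indices, a gene needs at least three votes; a different tie rule would change which genes survive the vote.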
Affiliation(s)
- Mustafa Özgür Cingiz
  - Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Bursa Technical University, Bursa, Turkey
|
58
|
Guo X, Song Y, Xu D, Jin X, Shang X. Genotype and Phenotype Association Analysis Based on Multi-omics Statistical Data. Curr Bioinform 2024; 19:933-942. [DOI: 10.2174/0115748936276861240109045208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/29/2023] [Accepted: 12/07/2023] [Indexed: 01/03/2025]
Abstract
Background:
When using clinical data for multi-omics analysis, there are issues such as the insufficient number of omics data types and relatively small sample size due to the protection of patients' privacy, the requirements of data management by various institutions, and the relatively large number of features of each omics data. This paper describes the analysis of multi-omics pathway relationships using statistical data in the absence of clinical data.
Methods:
We proposed a novel approach to exploit easily accessible statistics in public databases. This approach introduces phenotypic associations that are not included in the clinical data and uses these data to build a three-layer heterogeneous network. To simplify the analysis, we decomposed the three-layer network into double two-layer networks to predict the weights of the inter-layer associations. By adding a hyperparameter β, the weights of the two layers of the network were merged, and then k-fold cross-validation was used to evaluate the accuracy of this method. In calculating the weights of the two-layer networks, the RWR with fixed restart probability was combined with PBMDA and CIPHER to generate the PCRWR with biased weights and improved accuracy.
Results:
The area under the receiver operating characteristic curve was increased by approximately 7% in the case of the RWR with initial weights.
Conclusion:
Multi-omics statistical data were used to establish genotype and phenotype correlation networks for analysis, which was similar to the effect of clinical multi-omics analysis.
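For readers unfamiliar with the random walk with restart (RWR) that the Methods section builds on, the following is a minimal Python sketch of plain RWR with a fixed restart probability on a weighted network. It does not reproduce the PCRWR variant or its combination with PBMDA and CIPHER. The function name, the default `restart` value, the convergence tolerance, and the toy node names are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj   : dict mapping node -> {neighbour: edge weight}; every neighbour
            must also be a key of adj.
    seeds : set of nodes where the walker starts and restarts.
    """
    nodes = list(adj)
    # Initial distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart component.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk component: each node passes its probability mass to its
        # neighbours in proportion to the edge weights.
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                continue
            for v, w in adj[u].items():
                nxt[v] += (1.0 - restart) * p[u] * w / total
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    # Hypothetical three-node chain: gene - metabolite - phenotype.
    toy = {
        "gene": {"metabolite": 1.0},
        "metabolite": {"gene": 1.0, "phenotype": 1.0},
        "phenotype": {"metabolite": 1.0},
    }
    scores = random_walk_with_restart(toy, {"gene"})
    for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(node, round(score, 4))
```

The steady-state scores can be read as proximity to the seed nodes; biasing the edge weights or the initial distribution, as the abstract describes for PCRWR, would change `adj` or `p0` while leaving the iteration itself unchanged.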
Affiliation(s)
- Xinpeng Guo
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
  - School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, People’s Republic of China
- Yafei Song
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Dongyan Xu
  - Department of Basic Sciences, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Xueping Jin
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Xuequn Shang
  - School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, People’s Republic of China
|
59
|
Li Z, Zhang Y, Zhou P. Temporal Protein Complex Identification Based on Dynamic Heterogeneous Protein Information Network Representation Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1154-1164. [PMID: 38190662 DOI: 10.1109/tcbb.2024.3351078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2024]
Abstract
Protein complexes, as the fundamental units of cellular function and regulation, play a crucial role in understanding the normal physiological functions of cells. Existing methods for protein complex identification attempt to introduce other biological information on top of the protein-protein interaction (PPI) network to assist in evaluating the degree of association between proteins. However, these methods usually treat protein interaction networks as flat, homogeneous, static networks. They cannot distinguish the roles and importance of different types of biological information, nor can they reflect the dynamic changes of protein complexes. In recent years, heterogeneous network representation learning has achieved great success in processing complex heterogeneous information and mining deep semantics. We thus propose a temporal protein complex identification method based on Dynamic Heterogeneous Protein information network Representation Learning, DHPRL. DHPRL naturally integrates multiple types of heterogeneous biological information in the cellular temporal dimension. It simultaneously models the temporal dynamic properties of proteins and the heterogeneity of biological information to improve the understanding of protein interactions and the accuracy of complex prediction. First, we construct a Dynamic Heterogeneous Protein Information Network (DHPIN) by integrating temporal gene expression information and GO attribute information. Then we design a dual-view collaborative contrast mechanism. Specifically, we propose to learn protein representations from two views of the DHPIN (1-hop relation view and meta-path view) to model the consistency and specificity between nearest-neighbour biological information and deeper biological semantics. The dynamic PPI network is thereafter re-weighted based on the learned protein representations. Finally, we perform protein identification on the re-weighted dynamic PPI network. Extensive experimental results demonstrate that DHPRL can effectively model complicated biological information and achieve state-of-the-art performance in most cases.
|
60
|
Zhang B, Niu D, Zhang L, Zhang Q, Li Z. MSH-DTI: multi-graph convolution with self-supervised embedding and heterogeneous aggregation for drug-target interaction prediction. BMC Bioinformatics 2024; 25:275. [PMID: 39179993 PMCID: PMC11342675 DOI: 10.1186/s12859-024-05904-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 08/16/2024] [Indexed: 08/26/2024] Open
Abstract
BACKGROUND The rise of network pharmacology has led to the widespread use of network-based computational methods in predicting drug target interaction (DTI). However, existing DTI prediction models typically rely on a limited amount of data to extract drug and target features, potentially affecting the comprehensiveness and robustness of features. In addition, although multiple networks are used for DTI prediction, the integration of heterogeneous information often involves simplistic aggregation and attention mechanisms, which may impose certain limitations. RESULTS MSH-DTI, a deep learning model for predicting drug-target interactions, is proposed in this paper. The model uses self-supervised learning methods to obtain drug and target structure features. A Heterogeneous Interaction-enhanced Feature Fusion Module is designed for multi-graph construction, and the graph convolutional networks are used to extract node features. With the help of an attention mechanism, the model focuses on the important parts of different features for prediction. Experimental results show that the AUROC and AUPR of MSH-DTI are 0.9620 and 0.9605 respectively, outperforming other models on the DTINet dataset. CONCLUSION The proposed MSH-DTI is a helpful tool to discover drug-target interactions, which is also validated through case studies in predicting new DTIs.
Affiliation(s)
- Beiyi Zhang
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Dongjiang Niu
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Lianwei Zhang
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Qiang Zhang
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Zhen Li
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
|
61
|
Cebolla JJ, Giraldo P, Gómez J, Montoto C, Gervas-Arruga J. Machine Learning-Driven Biomarker Discovery for Skeletal Complications in Type 1 Gaucher Disease Patients. Int J Mol Sci 2024; 25:8586. [PMID: 39201273 PMCID: PMC11354847 DOI: 10.3390/ijms25168586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 09/02/2024] Open
Abstract
Type 1 Gaucher disease (GD1) is a rare, autosomal recessive disorder caused by glucocerebrosidase deficiency. Skeletal manifestations represent one of the most debilitating and potentially irreversible complications of GD1. Although imaging studies are the gold standard, early diagnostic/prognostic tools, such as molecular biomarkers, are needed for the rapid management of skeletal complications. This study aimed to identify potential protein biomarkers capable of predicting the early diagnosis of bone skeletal complications in GD1 patients using artificial intelligence. An in silico study was performed using the novel Therapeutic Performance Mapping System methodology to construct mathematical models of GD1-associated complications at the protein level. Pathophysiological characterization was performed before modeling, and a data science strategy was applied to the predicted protein activity for each protein in the models to identify classifiers. Statistical criteria were used to prioritize the most promising candidates, and 18 candidates were identified. Among them, PDGFB, IL1R2, PTH and CCL3 (MIP-1α) were highlighted due to their ease of measurement in blood. This study proposes a validated novel tool to discover new protein biomarkers to support clinician decision-making in an area where medical needs have not yet been met. However, confirming the results using in vitro and/or in vivo studies is necessary.
Affiliation(s)
- Pilar Giraldo
  - FEETEG, 50006 Zaragoza, Spain
  - Hospital QuirónSalud Zaragoza, 50012 Zaragoza, Spain
|
62
|
Gabriel GC, Ganapathiraju M, Lo CW. The Role of Cilia and the Complex Genetics of Congenital Heart Disease. Annu Rev Genomics Hum Genet 2024; 25:309-327. [PMID: 38724024 DOI: 10.1146/annurev-genom-121222-105345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2024]
Abstract
Congenital heart disease (CHD) can affect up to 1% of live births, and despite abundant evidence of a genetic etiology, the genetic landscape of CHD is still not well understood. A large-scale mouse chemical mutagenesis screen for mutations causing CHD yielded a preponderance of cilia-related genes, pointing to a central role for cilia in CHD pathogenesis. The genes uncovered by the screen included genes that regulate ciliogenesis and cilia-transduced cell signaling as well as many that mediate endocytic trafficking, a cell process critical for both ciliogenesis and cell signaling. The clinical relevance of these findings is supported by whole-exome sequencing analysis of CHD patients that showed enrichment for pathogenic variants in ciliome genes. Surprisingly, among the ciliome CHD genes recovered were many that encoded direct protein-protein interactors. Assembly of the CHD genes into a protein-protein interaction network yielded a tight interactome that suggested this protein-protein interaction may have functional importance and that its disruption could contribute to the pathogenesis of CHD. In light of these and other findings, we propose that an interactome enriched for ciliome genes may provide the genomic context for the complex genetics of CHD and its often-observed incomplete penetrance and variable expressivity.
Affiliation(s)
- George C Gabriel
  - Department of Developmental Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Madhavi Ganapathiraju
  - Carnegie Mellon University in Qatar, Doha, Qatar
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Cecilia W Lo
  - Department of Developmental Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
|
63
|
Teimouri H, Medvedeva A, Kolomeisky AB. Unraveling the role of physicochemical differences in predicting protein-protein interactions. J Chem Phys 2024; 161:045102. [PMID: 39051836 DOI: 10.1063/5.0219501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 07/09/2024] [Indexed: 07/27/2024] Open
Abstract
The ability to accurately predict protein-protein interactions is critically important for understanding major cellular processes. However, current experimental and computational approaches for identifying them are technically very challenging and still have limited success. We propose a new computational method for predicting protein-protein interactions using only primary sequence information. It utilizes the concept of physicochemical similarity to determine which interactions will most likely occur. In our approach, the physicochemical features of proteins are extracted using bioinformatics tools for different organisms. They are then used in a machine-learning method to identify successful protein-protein interactions via correlation analysis. It was found that the property that correlates most strongly with protein-protein interactions for all studied organisms is dipeptide amino acid composition (the frequency of specific amino acid pairs in a protein sequence). While current approaches often overlook the specificity of protein-protein interactions in different organisms, our method yields context-specific features that determine protein-protein interactions. The analysis is specifically applied to the bacterial two-component system that includes histidine kinase and transcriptional response regulators, as well as to the barnase-barstar complex, demonstrating the method's versatility across different biological systems. Our approach can be applied to predict protein-protein interactions in any biological system, providing an important tool for investigating the mechanisms of complex biological processes.
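The dipeptide amino acid composition highlighted in this abstract can be computed with a short Python sketch such as the one below. It is only an illustration of the feature as defined in the abstract (the frequency of amino acid pairs in a sequence). The function names, the restriction to the 20 standard amino acids, and the absolute-difference helper `feature_difference` are assumptions and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return the relative frequency of each of the 400 ordered residue pairs.

    Overlapping adjacent pairs are counted; pairs containing a character
    outside the 20 standard amino acids are skipped (assumed handling).
    """
    seq = sequence.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return {pair: (n / total if total else 0.0) for pair, n in counts.items()}


def feature_difference(seq_a, seq_b):
    """Hypothetical pair feature: absolute difference of the two
    dipeptide-composition vectors, one value per residue pair."""
    comp_a = dipeptide_composition(seq_a)
    comp_b = dipeptide_composition(seq_b)
    return {pair: abs(comp_a[pair] - comp_b[pair]) for pair in comp_a}


if __name__ == "__main__":
    # Made-up short peptides, not real protein sequences.
    diff = feature_difference("MKVLAAGIVK", "MKTLLAGIQK")
    largest = sorted(diff.items(), key=lambda kv: -kv[1])[:5]
    print(largest)
```

A vector like the one returned by `feature_difference` could serve as input to a classifier or a correlation analysis; how the paper combines the features of the two partner proteins is not specified in the abstract.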
Affiliation(s)
- Hamid Teimouri
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
- Angela Medvedeva
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
- Anatoly B Kolomeisky
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
|
64
|
Zhu Y, Ning C, Zhang N, Wang M, Zhang Y. GSRF-DTI: a framework for drug-target interaction prediction based on a drug-target pair network and representation learning on a large graph. BMC Biol 2024; 22:156. [PMID: 39020316 PMCID: PMC11256582 DOI: 10.1186/s12915-024-01949-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 07/01/2024] [Indexed: 07/19/2024] Open
Abstract
BACKGROUND Identification of potential drug-target interactions (DTIs) with high accuracy is a key step in drug discovery and repositioning, especially concerning specific drug targets. Traditional experimental methods for identifying the DTIs are arduous, time-intensive, and financially burdensome. In addition, robust computational methods have been developed for predicting the DTIs and are widely applied in drug discovery research. However, advancing more precise algorithms for predicting DTIs is essential to meet the stringent standards demanded by drug discovery. RESULTS We proposed a novel method called GSRF-DTI, which integrates networks with a deep learning algorithm to identify DTIs. Firstly, GSRF-DTI learned the embedding representation of drugs and targets by integrating multiple drug association information and target association information, respectively. Then, GSRF-DTI considered the influence of drug-target pair (DTP) association on DTI prediction to construct a drug-target pair network (DTP-NET). Next, we utilized GraphSAGE on DTP-NET to learn the potential features of the network and applied random forest (RF) to predict the DTIs. Furthermore, we conducted ablation experiments to validate the necessity of integrating different types of network features for identifying DTIs. It is worth noting that GSRF-DTI proposed three novel DTIs. CONCLUSIONS GSRF-DTI not only considered the influence of the interaction relationship between drug and target but also considered the impact of DTP association relationship on DTI prediction. We initially use GraphSAGE to aggregate the neighbor information of nodes for better identification. Experimental analysis on Luo's dataset and the newly constructed dataset revealed that the GSRF-DTI framework outperformed several state-of-the-art methods significantly.
Affiliation(s)
- Yongdi Zhu
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Chunhui Ning
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Mingyi Wang
  - Department of Central Lab, Weihai Municipal Hospital, Weihai, Shandong, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
|
65
|
Abu-Bakar A, Ismail M, Zulkifli MZI, Zaini NAS, Shukor NIA, Harun S, Inayat-Hussain SH. Mapping the influence of hydrocarbons mixture on molecular mechanisms, involved in breast and lung neoplasms: in silico toxicogenomic data-mining. Genes Environ 2024; 46:15. [PMID: 38982523 PMCID: PMC11232146 DOI: 10.1186/s41021-024-00310-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 06/07/2024] [Indexed: 07/11/2024] Open
Abstract
BACKGROUND Exposure to chemical mixtures inherent in air pollution has been shown to be associated with the risk of breast and lung cancers. However, studies on the molecular mechanisms of exposure to a mixture of these pollutants, such as hydrocarbons, in the development of breast and lung cancers are scarce. We utilized in silico toxicogenomic analysis to elucidate the molecular pathways linked to both cancers that are influenced by exposure to a mixture of selected hydrocarbons. The Comparative Toxicogenomics Database and Cytoscape software were used for data mining and visualization. RESULTS Twenty-five hydrocarbons, common in air pollution with carcinogenicity classification of 1A/B or 2 (known/presumed or suspected human carcinogen), were divided into three groups: alkanes and alkenes, halogenated hydrocarbons, and polyaromatic hydrocarbons. The in silico data mining revealed that 87 and 44 genes that commonly interacted with most of the investigated hydrocarbons are linked to breast and lung cancer, respectively. The dominant interactions among the common genes are co-expression, physical interaction, genetic interaction, co-localization, and interaction in shared protein domains. Among these genes, only 16 are common in the development of both cancers. Benzo(a)pyrene and tetrachlorodibenzodioxin interacted with all 16 genes. The molecular pathways potentially affected by the investigated hydrocarbons include aryl hydrocarbon receptor, chemical carcinogenesis, ferroptosis, fluid shear stress and atherosclerosis, interleukin 17 signaling pathway, lipid and atherosclerosis, NRF2 pathway, and oxidative stress response. CONCLUSIONS Within the inherent limitations of in silico toxicogenomics tools, we elucidated the molecular pathways associated with breast and lung cancer development potentially affected by the hydrocarbon mixture. Our findings indicate that adaptive responses to oxidative stress and inflammatory damage are instrumental in the development of both cancers.
Additionally, ferroptosis, a non-apoptotic programmed cell death driven by lipid peroxidation and iron homeostasis, was identified as a new player in these responses. Finally, the potential involvement of AHR in modulating IL-8, a critical gene that mediates breast cancer invasion and metastasis to the lungs, was also highlighted. A deeper understanding of the interplay between genes associated with these pathways, and other survival signaling pathways identified in this study, will provide invaluable knowledge for assessing the risk of inhalation exposure to hydrocarbon mixtures. The findings offer insights for future in vivo and in vitro laboratory investigations that focus on inhalation exposure to hydrocarbon mixtures.
Affiliation(s)
- A'edah Abu-Bakar
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- Maihani Ismail
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- M Zaqrul Ieman Zulkifli
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- Nur Aini Sofiyya Zaini
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- Nur Izzah Abd Shukor
  - Health, Safety and Environment (HSE), KLCC Urusharta, Kuala Lumpur, 50088, Malaysia
- Sarahani Harun
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Selangor, 43600 UKM, Malaysia
- Salmaan Hussain Inayat-Hussain
  - ESPPS, GHSE, PETRONAS, Kuala Lumpur, 50088, Malaysia
  - Department of Environmental Health Sciences, Yale School of Public Health, Yale University, 60 College St, New Haven, CT, 06250, USA
|
66
|
Piersma SR, Valles-Marti A, Rolfs F, Pham TV, Henneman AA, Jiménez CR. Inferring kinase activity from phosphoproteomic data: Tool comparison and recent applications. MASS SPECTROMETRY REVIEWS 2024; 43:725-751. [PMID: 36156810 DOI: 10.1002/mas.21808] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 06/16/2023]
Abstract
Aberrant cellular signaling pathways are a hallmark of cancer and other diseases. One of the most important signaling mechanisms involves protein phosphorylation/dephosphorylation. Protein phosphorylation is catalyzed by protein kinases, and over 530 protein kinases have been identified in the human genome. Aberrant kinase activity is one of the drivers of tumorigenesis and cancer progression and results in altered phosphorylation abundance of downstream substrates. Upstream kinase activity can be inferred from the global collection of phosphorylated substrates. Mass spectrometry-based phosphoproteomic experiments nowadays routinely allow identification and quantitation of >10k phosphosites per biological sample. This substrate phosphorylation footprint can be used to infer upstream kinase activities using tools like Kinase Substrate Enrichment Analysis (KSEA), Posttranslational Modification Substrate Enrichment Analysis (PTM-SEA), and Integrative Inferred Kinase Activity Analysis (INKA). Since the topic of kinase activity inference is very active, with many new approaches reported in the past 3 years, we would like to give an overview of the field. In this review, an inventory of kinase activity inference tools, their underlying algorithms, statistical frameworks, kinase-substrate databases, and user-friendliness is presented. The most widely used tools are compared in depth. Subsequently, recent applications of the tools are described, focusing on clinical tissues and hematological samples. Two main application areas for kinase activity inference tools can be discerned. (1) Maximal biological insights can be obtained from large data sets with group comparisons using multiple complementary tools (e.g., PTM-SEA and KSEA or INKA). (2) In the oncology context, where personalized treatment requires analysis of single samples, INKA, for example, has emerged as a tool that can prioritize actionable kinases for targeted inhibition.
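The idea of inferring upstream kinase activity from the substrate phosphorylation footprint can be sketched with a KSEA-style z-score. This is a simplified illustration, not the implementation of any tool named above: it assumes the commonly cited form z = (mean of substrate log2 fold changes − mean of all log2 fold changes) × √m / SD of all log2 fold changes, with m the number of quantified substrates, and it assumes the sample standard deviation. The phosphosite identifiers, kinase names, and values are hypothetical.

```python
import math
import statistics


def ksea_z_scores(site_log2fc, kinase_substrates):
    """Compute a KSEA-style z-score per kinase.

    site_log2fc: dict mapping phosphosite id -> log2 fold change.
    kinase_substrates: dict mapping kinase -> list of phosphosite ids.
    """
    all_values = list(site_log2fc.values())
    mean_all = statistics.mean(all_values)
    sd_all = statistics.stdev(all_values)  # assumption: sample standard deviation
    scores = {}
    for kinase, sites in kinase_substrates.items():
        # Use only substrates that were actually quantified in this data set.
        observed = [site_log2fc[s] for s in sites if s in site_log2fc]
        if not observed:
            continue  # no evidence for this kinase
        m = len(observed)
        mean_sub = statistics.mean(observed)
        scores[kinase] = (mean_sub - mean_all) * math.sqrt(m) / sd_all
    return scores


# Hypothetical toy data, for illustration only.
log2fc = {"s1": 2.0, "s2": 2.0, "s3": 0.0, "s4": 0.0, "s5": -2.0, "s6": -2.0}
substrates = {"K1": ["s1", "s2"], "K2": ["s5", "s6"], "K3": ["missing_site"]}
z = ksea_z_scores(log2fc, substrates)
```

A positive score suggests that a kinase's known substrates are, on average, more phosphorylated than the background; a kinase with no quantified substrates receives no score.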
Affiliation(s)
- Sander R Piersma
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Andrea Valles-Marti
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Frank Rolfs
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Thang V Pham
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Alex A Henneman
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Connie R Jiménez
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
|
67
|
Chaudhari JK, Pant S, Jha R, Pathak RK, Singh DB. Biological big-data sources, problems of storage, computational issues, and applications: a comprehensive review. Knowl Inf Syst 2024; 66:3159-3209. [DOI: 10.1007/s10115-023-02049-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 09/12/2023] [Accepted: 12/11/2023] [Indexed: 01/03/2025]
|
68
|
Menor-Flores M, Vega-Rodríguez MA. A protein-protein interaction network aligner study in the multi-objective domain. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108188. [PMID: 38657382 DOI: 10.1016/j.cmpb.2024.108188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 04/14/2024] [Accepted: 04/17/2024] [Indexed: 04/26/2024]
Abstract
BACKGROUND AND OBJECTIVE Protein-protein interaction (PPI) network alignment has proven to be an efficient technique in the diagnosis and prevention of certain diseases. However, the difficulty of simultaneously maximizing the two qualities that measure the goodness of alignments (topological and biological quality) has led aligners to produce very different alignments, which makes a comparative study among alignments of such different qualities a big challenge. Multi-objective optimization is a computational method that is very powerful in this kind of context because both conflicting qualities are considered together. Analysing the alignments of each PPI network aligner with multi-objective methodologies gives a bigger picture of the alignments and their qualities and leads to very interesting conclusions. This paper proposes a comprehensive PPI network aligner study in the multi-objective domain. METHODS Alignments from each aligner and all aligners together were studied and compared to each other via Pareto dominance methodologies. The best alignments produced by each aligner and all aligners together for five different alignment scenarios were displayed in Pareto front graphs. Later, the aligners were ranked according to the topological, biological, and combined quality of their alignments. Finally, the aligners were also ranked based on their average runtimes. RESULTS Regarding aligners constructing the best overall alignments, we found that SAlign, BEAMS, SANA, and HubAlign are the best options. Additionally, the alignments of best topological quality are produced by the SANA, SAlign, and HubAlign aligners. In contrast, the aligners returning the alignments of best biological quality are BEAMS, TAME, and WAVE. However, if there are time constraints, it is recommended to select SAlign to obtain high topological quality alignments and the PISwap or SAlign aligners for high biological quality alignments.
CONCLUSIONS The use of the SANA aligner is recommended for obtaining the best alignments of topological quality, BEAMS for alignments of the best biological quality, and SAlign for alignments of the best combined topological and biological quality. Simultaneously, SANA and BEAMS have above-average runtimes. Therefore, it is suggested, if necessary due to time restrictions, to choose other, faster aligners like SAlign or PISwap whose alignments are also of high quality.
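The Pareto dominance comparison described in the abstract above can be sketched as follows. This is a generic illustration of Pareto dominance for two objectives that are both maximized (topological and biological quality); the alignment labels and scores are hypothetical and are not results from the study.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b in every objective
    and strictly better in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(alignments):
    """Return the names of alignments not dominated by any other alignment.

    alignments: dict mapping name -> (topological_quality, biological_quality).
    """
    front = []
    for name, score in alignments.items():
        dominated = any(
            dominates(other_score, score)
            for other_name, other_score in alignments.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (topological, biological) quality scores, for illustration only.
scores = {
    "alignment_1": (0.9, 0.2),
    "alignment_2": (0.5, 0.5),
    "alignment_3": (0.4, 0.4),  # worse than alignment_2 in both objectives
    "alignment_4": (0.2, 0.9),
}
front = pareto_front(scores)
```

Alignments on the front represent different trade-offs between the two qualities, so none of them can be called strictly better than another without weighting the objectives.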
Affiliation(s)
- Manuel Menor-Flores
  - Escuela Politécnica, Universidad de Extremadura, Campus Universitario s/n, 10003 Cáceres, Spain
- Miguel A Vega-Rodríguez
  - Escuela Politécnica, Universidad de Extremadura, Campus Universitario s/n, 10003 Cáceres, Spain
|
69
|
Rout T, Mohapatra A, Kar M. A systematic review of graph-based explorations of PPI networks: methods, resources, and best practices. NETWORK MODELING ANALYSIS IN HEALTH INFORMATICS AND BIOINFORMATICS 2024; 13:29. [DOI: 10.1007/s13721-024-00467-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 04/09/2024] [Accepted: 05/16/2024] [Indexed: 01/03/2025]
|
70
|
Karunakaran KB, Ganapathiraju MK. Malignant peritoneal mesothelioma interactome with 417 novel protein-protein interactions. BJC REPORTS 2024; 2:42. [PMID: 39516360 PMCID: PMC11524009 DOI: 10.1038/s44276-024-00062-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 04/11/2024] [Accepted: 04/16/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND Malignant peritoneal mesothelioma (MPeM) is an aggressive cancer affecting the abdominal peritoneal lining and intra-abdominal organs, with a median survival of ~2.5 years. METHODS We constructed the protein interactome of 59 MPeM-associated genes with previously known protein-protein interactions (PPIs) as well as novel PPIs predicted using our previously developed HiPPIP computational model and analysed it for transcriptomic and functional associations and for repurposable drugs. RESULTS The MPeM interactome had over 400 computationally predicted PPIs and 4700 known PPIs. Transcriptomic evidence validated 75.6% of the genes in the interactome and 65% of the novel interactors. Some genes had tissue-specific expression in extramedullary hematopoietic sites and the expression of some genes could be correlated with unfavourable prognoses in various cancers. 39 out of 152 drugs that target the proteins in the interactome were identified as potentially repurposable for MPeM, with 29 having evidence from prior clinical trials, animal models or cell lines for effectiveness against peritoneal and pleural mesothelioma and primary peritoneal cancer. Functional modules related to chromosomal segregation, transcriptional dysregulation, IL-6 production and hematopoiesis were identified from the interactome. The MPeM interactome overlapped significantly with the malignant pleural mesothelioma interactome, revealing shared molecular pathways. CONCLUSIONS Our findings demonstrate the utility of the interactome in uncovering biological associations and in generating clinically translatable results.
Affiliation(s)
- Kalyani B Karunakaran
  - Supercomputer Education and Research Centre, Indian Institute of Science, Bengaluru, 560012, India
- Madhavi K Ganapathiraju
  - Department of Biomedical Informatics, School of Medicine, and Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, 5607 Baum Blvd, 5th Floor, Pittsburgh, PA, 15206, USA
  - Carnegie Mellon University in Qatar, Doha, Qatar
|
71
|
Rojas-Rodriguez F, Schmidt MK, Canisius S. Assessing the validity of driver gene identification tools for targeted genome sequencing data. BIOINFORMATICS ADVANCES 2024; 4:vbae073. [PMID: 38808071 PMCID: PMC11132814 DOI: 10.1093/bioadv/vbae073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 04/16/2024] [Accepted: 05/22/2024] [Indexed: 05/30/2024]
Abstract
Motivation Most cancer driver gene identification tools have been developed for whole-exome sequencing data. Targeted sequencing is a popular alternative to whole-exome sequencing for large cancer studies due to its greater depth at a lower cost per tumor. Unlike whole-exome sequencing, targeted sequencing only enables mutation calling for a selected subset of genes. Whether existing driver gene identification tools remain valid in that context has not previously been studied. Results We evaluated the validity of seven popular driver gene identification tools when applied to targeted sequencing data. Based on whole-exome data of 14 different cancer types from TCGA, we constructed matching targeted datasets by keeping only the mutations overlapping with the pan-cancer MSK-IMPACT panel and, in the case of breast cancer, also the breast-cancer-specific B-CAST panel. We then compared the driver gene predictions obtained on whole-exome and targeted mutation data for each of the seven tools. Differences in how the tools model background mutation rates were the most important determinant of their validity on targeted sequencing data. Based on our results, we recommend OncodriveFML, OncodriveCLUSTL, 20/20+, dNdSCv, and ActiveDriver for driver gene identification in targeted sequencing data, whereas MutSigCV and DriverML are best avoided in that context. Availability and implementation Code for the analyses is available at https://github.com/SchmidtGroupNKI/TGSdrivergene_validity.
Affiliation(s)
- Felipe Rojas-Rodriguez
  - Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
- Marjanka K Schmidt
  - Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
  - Department of Clinical Genetics, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
  - Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
- Sander Canisius
  - Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
  - Division of Molecular Carcinogenesis, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
|
72
|
Chereda H, Leha A, Beißbarth T. Stable feature selection utilizing Graph Convolutional Neural Network and Layer-wise Relevance Propagation for biomarker discovery in breast cancer. Artif Intell Med 2024; 151:102840. [PMID: 38658129 DOI: 10.1016/j.artmed.2024.102840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 04/26/2024]
Abstract
High-throughput technologies are becoming increasingly important in discovering prognostic biomarkers and in identifying novel drug targets. With Mammaprint, Oncotype DX, and many other prognostic molecular signatures, breast cancer is one of the paradigmatic examples of the utility of high-throughput data to deliver prognostic biomarkers that can be represented in the form of a rather short gene list. Such gene lists can be obtained as a set of features (genes) that are important for the decisions of a Machine Learning (ML) method applied to high-dimensional gene expression data. Several studies have identified predictive gene lists for patient prognosis in breast cancer, but these lists are unstable and have only a few genes in common. Instability of feature selection impedes biological interpretability: genes that are relevant for cancer pathology should be members of any predictive gene list obtained for the same clinical type of patients. Stability and interpretability of selected features can be improved by including information on molecular networks in ML methods. Graph Convolutional Neural Network (GCNN) is a contemporary deep learning approach applicable to gene expression data structured by a prior knowledge molecular network. Layer-wise Relevance Propagation (LRP) and SHapley Additive exPlanations (SHAP) are methods to explain individual decisions of deep learning models. We used both GCNN+LRP and GCNN+SHAP techniques to construct feature sets by aggregating individual explanations. We suggest a methodology to systematically and quantitatively analyze the stability, the impact on the classification performance, and the interpretability of the selected feature sets. We used this methodology to compare GCNN+LRP to GCNN+SHAP and to more classical ML-based feature selection approaches.
Utilizing a large breast cancer gene expression dataset we show that, while feature selection with SHAP is useful in applications where selected features have to be impactful for classification performance, among all studied methods GCNN+LRP delivers the most stable (reproducible) and interpretable gene lists.
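The notion of stable (reproducible) gene lists can be made concrete with a small sketch. The abstract does not state which stability measure the authors used; the average pairwise Jaccard index below is a common choice and is an assumption made purely for illustration. The gene lists are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene lists: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty lists are identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(gene_lists):
    """Average Jaccard index over all pairs of selected gene lists."""
    pairs = list(combinations(gene_lists, 2))
    if not pairs:
        raise ValueError("need at least two gene lists")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene lists selected on three resampled data sets.
selected = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_A", "GENE_B", "GENE_D"],
    ["GENE_A", "GENE_B", "GENE_C"],
]
stability = mean_pairwise_jaccard(selected)
```

A value near 1 means the feature selection returns almost the same genes on every resample, while a value near 0 reflects the instability described in the abstract.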
Affiliation(s)
- Hryhorii Chereda
  - Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
- Andreas Leha
  - Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany; Scientific Core Facility Medical Biometry and Statistical Bioinformatics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany
- Tim Beißbarth
  - Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Campus-Institute Data Science (CIDAS), University of Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
|
73
|
Priyanka P, Gopalakrishnan AP, Nisar M, Shivamurthy PB, George M, John L, Sanjeev D, Yandigeri T, Thomas SD, Rafi A, Dagamajalu S, Velikkakath AKG, Abhinand CS, Kanekar S, Prasad TSK, Balaya RDA, Raju R. A global phosphosite-correlated network map of Thousand And One Kinase 1 (TAOK1). Int J Biochem Cell Biol 2024; 170:106558. [PMID: 38479581 DOI: 10.1016/j.biocel.2024.106558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 02/19/2024] [Accepted: 03/09/2024] [Indexed: 03/25/2024]
Abstract
Thousand and one amino acid kinase 1 (TAOK1) is a sterile 20 family Serine/Threonine kinase linked to microtubule dynamics, checkpoint signaling, DNA damage response, and neurological functions. Molecular-level alterations of TAOK1 have been associated with neurodevelopmental disorders and cancers. Despite its known involvement in physiological and pathophysiological processes, and its role as a core member of the hippo signaling pathway, the phosphoregulatory network of TAOK1 has not been visualized. Aiming to explore this network, we first analyzed the predominantly detected and differentially regulated TAOK1 phosphosites in global phosphoproteome datasets across diverse experimental conditions. Based on 709 qualitative and 210 quantitative differential cellular phosphoproteome datasets that were systematically assembled, we identified that phosphorylation at Ser421, Ser9, Ser965, and Ser445 predominantly represented TAOK1 in almost 75% of these datasets. Surprisingly, the functional role of all these phosphosites in TAOK1 remains unexplored. Hence, we employed a robust strategy to extract the phosphosites in proteins that significantly correlated in expression with predominant TAOK1 phosphosites. This led to the first categorization of the phosphosites, including those in the currently known and predicted interactors, kinases, and substrates, that positively/negatively correlated with the expression status of each predominant TAOK1 phosphosite. Subsequently, we also analyzed the phosphosites in core proteins of the hippo signaling pathway. Based on the TAOK1 phosphoregulatory network analysis, we inferred the potential role of the predominant TAOK1 phosphosites. In particular, we propose pSer9 as an autophosphorylation and TAOK1 kinase activity-associated phosphosite and pSer421, the most frequently detected phosphosite in TAOK1, as a significant regulatory phosphosite involved in the maintenance of genome integrity.
Considering that the impact of all phosphosites that predominantly represent each kinase is essential for the efficient interpretation of global phosphoproteome datasets, we believe that the approach undertaken in this study is suitable to be extended to other kinases for accelerated research.
Affiliation(s)
- Pahal Priyanka
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Athira Perunelly Gopalakrishnan
  - Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Mahammad Nisar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Mejo George
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Levin John
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Diya Sanjeev
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Tanuja Yandigeri
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Sonet D Thomas
  - Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Ahmad Rafi
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Shobha Dagamajalu
  - Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Anoop Kumar G Velikkakath
  - Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Chandran S Abhinand
  - Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Saptami Kanekar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
- Rajesh Raju
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India
|
74
|
Wright SN, Colton S, Schaffer LV, Pillich RT, Churas C, Pratt D, Ideker T. State of the Interactomes: an evaluation of molecular networks for generating biological insights. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.26.587073. [PMID: 38746239 PMCID: PMC11092493 DOI: 10.1101/2024.04.26.587073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP and SIGNOR demonstrate strong interaction prediction performance. These findings provide a benchmark for interactomes across diverse network biology applications and clarify factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
|
75
|
Lai W, Xie R, Chen C, Lou W, Yang H, Deng L, Lu Q, Tang X. Integrated analysis of scRNA-seq and bulk RNA-seq identifies FBXO2 as a candidate biomarker associated with chemoresistance in HGSOC. Heliyon 2024; 10:e28490. [PMID: 38590858 PMCID: PMC10999934 DOI: 10.1016/j.heliyon.2024.e28490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/10/2024] Open
Abstract
Background High-grade serous ovarian carcinoma (HGSOC) is the most prevalent and aggressive histological subtype of epithelial ovarian cancer. Around 80% of individuals will experience a recurrence within five years because of resistance to chemotherapy, despite initially responding well to platinum-based treatment. Biomarkers associated with chemoresistance are desperately needed in clinical practice. Methods We jointly analyzed the transcriptomic profiles of single-cell and bulk datasets of HGSOC to identify cell types associated with chemoresistance. Copy number variation (CNV) inference was performed to identify malignant cells. We subsequently analyzed the expression of candidate biomarkers and their relationship with patients' prognosis. The enrichment analysis and potential biological function of candidate biomarkers were explored. Then, we validated the candidate biomarker using in vitro experiments. Results We identified 8871 malignant epithelial cells in a single-cell RNA sequencing dataset, of which 861 cells were associated with chemoresistance. Among these malignant epithelial cells, FBXO2 (F-box protein 2) is highly expressed in cells related to chemoresistance. Moreover, FBXO2 expression was found to be higher in epithelial cells from chemoresistance samples compared to those from chemosensitivity samples in a separate single-cell RNA sequencing dataset. Patients exhibiting elevated levels of FBXO2 experienced poorer outcomes in terms of both overall survival (OS) and progression-free survival (PFS). FBXO2 could impact chemoresistance by influencing the PI3K-Akt signaling pathway, focal adhesion, and ECM-receptor interactions and regulating tumorigenesis. The 50% maximum inhibitory concentration (IC50) of cisplatin decreased in A2780 and SKOV3 ovarian carcinoma cell lines with silenced FBXO2 during an in vitro experiment. 
Conclusions We determined that FBXO2 is a potential biomarker linked to chemoresistance in HGSOC by combining single-cell RNA-seq and bulk RNA-seq datasets. Our results suggest that FBXO2 could serve as a valuable prognostic marker and potential target for drug development in HGSOC.
Affiliation(s)
- Wenwen Lai
- Department of Organ Transplantation, The Second Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Ruixiang Xie
- School of Life Science, Nanchang University, Nanchang, China
| | - Chen Chen
- College of Basic Medical Science, Nanchang University, Nanchang, China
| | - Weiming Lou
- Academic Affairs Office, The Second Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
| | - Haiyan Yang
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Libin Deng
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Quqin Lu
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Xiaoli Tang
- College of Basic Medical Science, Nanchang University, Nanchang, China
| |
|
76
|
Saravanan KS, Satish KS, Saraswathy GR, Kuri U, Vastrad SJ, Giri R, Dsouza PL, Kumar AP, Nair G. Innovative target mining stratagems to navigate drug repurposing endeavours. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2024; 205:303-355. [PMID: 38789185 DOI: 10.1016/bs.pmbts.2024.03.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
The conventional theory linking a single gene with a particular disease and a specific drug contributes to the dwindling success rates of traditional drug discovery. This requires a substantial shift focussing on contemporary drug design or drug repurposing, which entails linking multiple genes to diverse physiological or pathological pathways and drugs. Lately, drug repurposing, the art of discovering new/unlabelled indications for existing drugs or candidates in clinical trials, is gaining attention owing to its success rates. The rate-limiting phase of this strategy lies in target identification, which is generally driven through disease-centric and/or drug-centric approaches. The disease-centric approach is based on exploration of crucial biomolecules such as genes or proteins underlying pathological cascades of the disease of interest. Investigating these pathological interplays aids in the identification of potential drug targets that can be leveraged for novel therapeutic interventions. The drug-centric approach involves various strategies such as exploring the mechanism of adverse drug reactions that can unearth potential targets, as these untoward reactions might be considered desirable therapeutic actions in other disease conditions. Currently, artificial intelligence is an emerging robust tool that can be used to translate the aforementioned intricate biological networks to render interpretable data for extracting precise molecular targets. Integration of multiple approaches, big data analytics, and clinical corroboration are essential for successful target mining. This chapter highlights the contemporary strategies steering target identification and diverse frameworks for drug repurposing. These strategies are illustrated through case studies curated from recent drug repurposing research inclined towards neurodegenerative diseases, cancer, infections, immunological, and cardiovascular disorders.
Affiliation(s)
- Kamatchi Sundara Saravanan
- Department of Pharmacognosy, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Kshreeraja S Satish
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Ganesan Rajalekshmi Saraswathy
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India.
| | - Ushnaa Kuri
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Soujanya J Vastrad
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Ritesh Giri
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Prizvan Lawrence Dsouza
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Adusumilli Pramod Kumar
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Gouri Nair
- Department of Pharmacology, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| |
|
77
|
Cao J, Chen Q, Qiu J, Wang Y, Lan W, Du X, Tan K. NGCN: Drug-target interaction prediction by integrating information and feature learning from heterogeneous network. J Cell Mol Med 2024; 28:e18224. [PMID: 38509739 PMCID: PMC10955156 DOI: 10.1111/jcmm.18224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 02/14/2024] [Accepted: 02/26/2024] [Indexed: 03/22/2024] Open
Abstract
Drug-target interaction (DTI) prediction is essential for new drug design and development. Constructing a heterogeneous network based on diverse information about drugs, proteins and diseases provides new opportunities for DTI prediction. However, the inherent complexity, high dimensionality and noise of such a network prevent us from taking full advantage of these network characteristics. This article proposes a novel method, NGCN, that predicts drug-target interactions from an integrated heterogeneous network, extracting relevant biological properties and association information while maintaining the topology information. Unlike traditional methods, it focuses on learning the low-dimensional topology representation of drugs and targets via a graph-based convolutional neural network to improve the performance of DTI prediction. NGCN achieves substantial performance improvements over other state-of-the-art methods, such as a nearly 1.0% increase in AUPR value. Moreover, we verify the robustness of NGCN through benchmark tests, and the experimental results demonstrate it is an extensible framework capable of combining heterogeneous information for DTI prediction.
Affiliation(s)
- Junyue Cao
- College of Life Science and Technology, Guangxi University, Nanning, China
| | - Qingfeng Chen
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Junlai Qiu
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Yiming Wang
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Wei Lan
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Xiaojing Du
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Kai Tan
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| |
|
78
|
Chen S, Li M, Semenov I. MFA-DTI: Drug-target interaction prediction based on multi-feature fusion adopted framework. Methods 2024; 224:79-92. [PMID: 38430967 DOI: 10.1016/j.ymeth.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 02/16/2024] [Accepted: 02/23/2024] [Indexed: 03/05/2024] Open
Abstract
The identification of drug-target interactions (DTI) is a valuable step in the drug discovery and repositioning process. However, traditional laboratory experiments are time-consuming and expensive. Computational methods have streamlined research to determine DTIs. The application of deep learning methods has significantly improved the prediction performance for DTIs. Modern deep learning methods can leverage multiple sources of information, including sequence data that contains biological structural information, and interaction data. While useful, these methods cannot be effectively applied to each type of information individually (e.g., chemical structure and interaction network) and do not take into account the specificity of DTI data such as low- or zero-interaction biological entities. To overcome these limitations, we propose a method called MFA-DTI (Multi-feature Fusion Adopted framework for DTI). MFA-DTI consists of three modules: an interaction graph learning module that processes the interaction network to generate interaction vectors, a chemical structure learning module that extracts features from the chemical structure, and a fusion module that combines these features for the final prediction. To validate the performance of MFA-DTI, we conducted experiments on six public datasets under different settings. The results indicate that the proposed method is highly effective in various settings and outperforms state-of-the-art methods.
Affiliation(s)
- Siqi Chen
- School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, 400074, China.
| | - Minghui Li
- Beidahuang Industry Group General Hospital, Harbin, 150006, China
| | - Ivan Semenov
- College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
| |
|
79
|
Idrees S, Paudel KR, Sadaf T, Hansbro PM. Uncovering domain motif interactions using high-throughput protein-protein interaction detection methods. FEBS Lett 2024; 598:725-742. [PMID: 38439692 DOI: 10.1002/1873-3468.14841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 02/18/2024] [Indexed: 03/06/2024]
Abstract
Protein-protein interactions (PPIs) are often mediated by short linear motifs (SLiMs) in one protein and a domain in another, known as domain-motif interactions (DMIs). During the past decade, SLiMs have been studied to find their role in cellular functions such as post-translational modifications, regulatory processes, protein scaffolding, cell cycle progression, cell adhesion, cell signalling and substrate selection for proteasomal degradation. This review provides a comprehensive overview of the current PPI detection techniques and resources, focusing on their relevance to capturing interactions mediated by SLiMs. We also address the challenges associated with capturing DMIs. Moreover, a case study analysing the BioGrid database as a source of DMI prediction revealed significant known DMI enrichment in different PPI detection methods. Overall, current high-throughput PPI detection methods can be a reliable source for predicting DMIs.
Affiliation(s)
- Sobia Idrees
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| | - Keshav Raj Paudel
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| | - Tayyaba Sadaf
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| | - Philip M Hansbro
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| |
|
80
|
Liu C, Xiao K, Yu C, Lei Y, Lyu K, Tian T, Zhao D, Zhou F, Tang H, Zeng J. A probabilistic knowledge graph for target identification. PLoS Comput Biol 2024; 20:e1011945. [PMID: 38578805 PMCID: PMC11034645 DOI: 10.1371/journal.pcbi.1011945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 04/22/2024] [Accepted: 02/24/2024] [Indexed: 04/07/2024] Open
Abstract
Early identification of safe and efficacious disease targets is crucial to alleviating the tremendous cost of drug discovery projects. However, existing experimental methods for identifying new targets are generally labor-intensive and failure-prone. On the other hand, computational approaches, especially machine learning-based frameworks, have shown remarkable application potential in drug discovery. In this work, we propose Progeni, a novel machine learning-based framework for target identification. In addition to fully exploiting the known heterogeneous biological networks from various sources, Progeni integrates literature evidence about the relations between biological entities to construct a probabilistic knowledge graph. Graph neural networks are then employed in Progeni to learn the feature embeddings of biological entities to facilitate the identification of biologically relevant target candidates. A comprehensive evaluation of Progeni demonstrated its superior predictive power over the baseline methods on the target identification task. In addition, our extensive tests showed that Progeni exhibited high robustness to the negative effect of exposure bias, a common phenomenon in recommendation systems, and effectively identified new targets that can be strongly supported by the literature. Moreover, our wet lab experiments successfully validated the biological significance of the top target candidates predicted by Progeni for melanoma and colorectal cancer. All these results suggested that Progeni can identify biologically effective targets and thus provide a powerful and useful tool for advancing the drug discovery process.
Affiliation(s)
- Chang Liu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Kaimin Xiao
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Joint Graduate Program of Peking-Tsinghua-NIBS, School of Life Sciences, Tsinghua University, Beijing, China
| | - Cuinan Yu
- Machine Learning Department, Silexon AI Technology Co., Ltd., Nanjing, Jiangsu Province, China
| | - Yipin Lei
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Kangbo Lyu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Tingzhong Tian
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Fengfeng Zhou
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, Jilin Province, China
| | - Haidong Tang
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
| | - Jianyang Zeng
- School of Engineering, Westlake University, Hangzhou, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
- Research Center for Industries of the Future and School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
| |
|
81
|
Kim KM, Lee KG, Lee S, Hong BK, Yun H, Park YJ, Yoo SA, Kim WU. The acute phase reactant orosomucoid-2 directly promotes rheumatoid inflammation. Exp Mol Med 2024; 56:890-903. [PMID: 38556552 PMCID: PMC11058272 DOI: 10.1038/s12276-024-01188-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 12/04/2023] [Accepted: 12/20/2023] [Indexed: 04/02/2024] Open
Abstract
Acute phase proteins involved in chronic inflammatory diseases have not been systematically analyzed. Here, global proteome profiling of serum and urine revealed that orosomucoid-2 (ORM2), an acute phase reactant, was differentially expressed in rheumatoid arthritis (RA) patients and showed the highest fold change. Therefore, we questioned the extent to which ORM2, which is produced mainly in the liver, actively participates in rheumatoid inflammation. Surprisingly, ORM2 expression was upregulated in the synovial fluids and synovial membranes of RA patients. The major cell types producing ORM2 were synovial macrophages and fibroblast-like synoviocytes (FLSs) from RA patients. Recombinant ORM2 robustly increased IL-6, TNF-α, CXCL8 (IL-8), and CCL2 production by RA macrophages and FLSs via the NF-κB and p38 MAPK pathways. Interestingly, glycophorin C, a membrane protein for determining erythrocyte shape, was the receptor for ORM2. Intra-articular injection of ORM2 increased the severity of arthritis in mice and accelerated the infiltration of macrophages into the affected joints. Moreover, circulating ORM2 levels correlated with RA activity and radiographic progression. In conclusion, the acute phase protein ORM2 can directly increase the production of proinflammatory mediators and promote chronic arthritis in mice, suggesting that ORM2 could be a new therapeutic target for RA.
Affiliation(s)
- Ki-Myo Kim
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Kang-Gu Lee
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Saseong Lee
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
| | - Bong-Ki Hong
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
| | - Heejae Yun
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Yune-Jung Park
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, The Catholic University of Korea, Suwon, South Korea
| | - Seung-Ah Yoo
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea.
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea.
| | - Wan-Uk Kim
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea.
- Department of Internal Medicine, The Catholic University of Korea, Seoul, South Korea.
| |
|
82
|
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024] Open
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Fuhao Zhang
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
| | - Chaojin Wu
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| |
|
83
|
Karunakaran KB, Ganapathiraju MK, Jain S, Brahmachari SK, Balakrishnan N. Drug contraindications in comorbid diseases: a protein interactome perspective. NETWORK MODELING ANALYSIS IN HEALTH INFORMATICS AND BIOINFORMATICS 2024; 13:10. [DOI: 10.1007/s13721-023-00440-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/17/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2025]
Abstract
Adverse drug reactions (ADRs) are leading causes of death and drug withdrawals and frequently co-occur with comorbidities. However, systematic studies on the effects of drugs on comorbidities are lacking. Drug interactions with the cellular protein–protein interaction (PPI) network give rise to ADRs. We selected 6 comorbid disease pairs, identified the drugs used in the treatment of the individual diseases ‘A’ and ‘B’ (44 drugs in anxiety and depression, 128 in asthma and hypertension, 48 in chronic obstructive pulmonary disease and heart failure, 58 in type 2 diabetes and obesity, 58 in Parkinson’s disease and schizophrenia, and 84 in rheumatoid arthritis and osteoporosis), and categorized them based on whether they aggravate the comorbid condition. We constructed drug target networks (DTNs) and examined their enrichment among genes in disease A/B PPI networks, expressed across 53 tissues and involved in ~1000 pathways. To characterize the biological features of the DTNs, we performed principal component analysis and computed the Euclidean distance between DTN component scores and feature loading values. DTNs of disease A drugs not contraindicated in B were affiliated with proteins common to A/B networks or uniquely found in the B network, similarly regulated common pathways, and disease B-specific pathways and tissues. DTNs of disease A drugs contraindicated in B were affiliated with common proteins or those uniquely found in the A network, differentially regulated common pathways, and disease A-specific pathways and tissues. Hence, DTN enrichment in pathways, tissues, and PPI networks of comorbid diseases will help identify drug contraindications in comorbidities.
|
84
|
Xian L, Wang Y. Advances in Computational Methods for Protein–Protein Interaction Prediction. ELECTRONICS 2024; 13:1059. [DOI: 10.3390/electronics13061059] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/03/2025]
Abstract
Protein–protein interactions (PPIs) are pivotal in various physiological processes inside biological entities. Accurate identification of PPIs holds paramount significance for comprehending biological processes, deciphering disease mechanisms, and advancing medical research. Given the costly and labor-intensive nature of experimental approaches, a multitude of computational methods have been devised to enable swift and large-scale PPI prediction. This review offers a thorough examination of recent strides in computational methodologies for PPI prediction, with a particular focus on the utilization of deep learning techniques within this domain. Alongside a systematic classification and discussion of relevant databases, feature extraction strategies, and prominent computational approaches, we conclude with a thorough analysis of current challenges and prospects for the future of this field.
Affiliation(s)
- Lei Xian
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Yansu Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
| |
|
85
|
Idrees S, Paudel KR. Proteome-wide assessment of human interactome as a source of capturing domain-motif and domain-domain interactions. J Cell Commun Signal 2024; 18:e12014. [PMID: 38545252 PMCID: PMC10964934 DOI: 10.1002/ccs3.12014] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/19/2023] [Accepted: 12/11/2023] [Indexed: 06/29/2024] Open
Abstract
Protein-protein interactions (PPIs) play a crucial role in various biological processes by establishing domain-motif (DMI) and domain-domain interactions (DDIs). While the existence of real DMIs/DDIs is generally assumed, it is rarely tested; therefore, this study extensively compared high-throughput methods and public PPI repositories as sources for DMI and DDI prediction based on the assumption that the human interactome provides sufficient data for the reliable identification of DMIs and DDIs. Different datasets from leading high-throughput methods (Yeast two-hybrid [Y2H], Affinity Purification coupled Mass Spectrometry [AP-MS], and Co-fractionation-coupled Mass Spectrometry) were assessed for their ability to capture DMIs and DDIs using known DMI/DDI information. High-throughput methods were not notably worse than PPI databases and, in some cases, appeared better. In conclusion, all PPI datasets demonstrated significant enrichment in DMIs and DDIs (p-value <0.001), establishing Y2H and AP-MS as reliable methods for predicting these interactions. This study provides valuable insights for biologists in selecting appropriate methods for predicting DMIs, ultimately aiding in SLiM discovery.
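The abstract reports significant enrichment (p-value <0.001) without naming the statistical test. As an illustration only, the sketch below uses a one-sided hypergeometric test, which is one common way to ask whether a PPI dataset contains more known DMIs/DDIs than chance would give. The function name, argument names, and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, known, sample, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: total candidate protein pairs (hypothetical background)
    known:      pairs in the background that are known DMIs/DDIs
    sample:     pairs reported by the PPI dataset under test
    overlap:    reported pairs that are also known DMIs/DDIs
    """
    upper = min(known, sample)
    total = comb(population, sample)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(known, k) * comb(population - known, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 50 known interactions among 1000 candidate pairs,
# 15 of which appear in a dataset of 100 reported pairs.
p_value = hypergeom_enrichment_p(population=1000, known=50, sample=100, overlap=15)
print(f"enrichment p-value: {p_value:.2e}")
```

A permutation-based test against randomised networks would be an equally reasonable choice; the hypergeometric form is shown here only because it is compact and needs no random sampling.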
Affiliation(s)
- Sobia Idrees
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
| | - Keshav Raj Paudel
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
| |
|
86
|
Zhu J, Tran AP, Deasy JO, Tannenbaum A. Multi-omic integrated curvature study on pan-cancer genomic data. MATHEMATICS OF CONTROL, SIGNALS, AND SYSTEMS 2024; 36:101-120. [DOI: 10.1007/s00498-023-00360-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 06/20/2023] [Indexed: 01/03/2025]
|
87
|
E Z, Qiao G, Wang G, Li Y. GSL-DTI: Graph structure learning network for Drug-Target interaction prediction. Methods 2024; 223:136-145. [PMID: 38360082 DOI: 10.1016/j.ymeth.2024.01.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 12/23/2023] [Accepted: 01/29/2024] [Indexed: 02/17/2024] Open
Abstract
MOTIVATION Drug-target interaction prediction is an important area of research to predict whether there is an interaction between a drug molecule and its target protein. It plays a critical role in drug discovery and development by facilitating the identification of potential drug candidates and expediting the overall process. Given the time-consuming, expensive, and high-risk nature of traditional drug discovery methods, the prediction of drug-target interactions has become an indispensable tool. Using machine learning and deep learning to tackle this class of problems has become a mainstream approach, and graph-based models have recently received much attention in this field. However, many current graph-based Drug-Target Interaction (DTI) prediction methods rely on manually defined rules to construct the Drug-Protein Pair (DPP) network during the DPP representation learning process. As a result, these methods fail to capture the true underlying relationships between drug molecules and target proteins. RESULTS We propose GSL-DTI, an automatic graph structure learning model used for predicting drug-target interactions (DTIs). Initially, we integrate large-scale heterogeneous networks using a graph convolution network based on meta-paths, effectively learning the representations of drugs and target proteins. Subsequently, we construct drug-protein pairs based on these representations. In contrast to previous studies that construct DPP networks based on manual rules, our method introduces an automatic graph structure learning approach. This approach utilizes a filter gate on the affinity scores of DPPs and relies on the classification loss of downstream tasks to guide the learning of the underlying DPP network structure. Based on the learned DPP network, we transform the prediction of drug-target interactions into a node classification problem. 
The comprehensive experiments conducted on three public datasets have shown the superiority of GSL-DTI in the tasks of DTI prediction. Additionally, GSL-DTI provides a fresh perspective for advancing research in graph structure learning for DTI prediction.
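The abstract does not say how the filter gate on DPP affinity scores is implemented. As a rough illustration only, the sketch below assumes the simplest possible gate: a fixed threshold that keeps an edge between two drug-protein pairs when their affinity score is high enough. The function name `filter_gate`, the `threshold` parameter, and the example scores are all hypothetical; in GSL-DTI the structure is described as being guided by the downstream classification loss, which this sketch does not model.

```python
def filter_gate(affinity, threshold=0.5):
    """Turn a square matrix of pairwise affinity scores into a sparse
    weighted adjacency matrix by zeroing self-loops and weak edges.

    Assumption: a fixed threshold stands in for whatever gate the
    original model learns; it is not the authors' implementation.
    """
    n = len(affinity)
    return [
        [
            affinity[i][j] if i != j and affinity[i][j] >= threshold else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical affinity scores between three drug-protein pairs.
scores = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.6],
    [0.2, 0.6, 1.0],
]
adjacency = filter_gate(scores, threshold=0.5)
```

The resulting adjacency matrix defines the DPP graph on which each pair is a node, so that interaction prediction becomes a node classification task as the abstract describes.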
Affiliation(s)
- Zixuan E
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
| | - Guanyu Qiao
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
| | - Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China.
| | - Yang Li
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China.
| |
|
88
|
Jin S, Zhang Y, Yu H, Lu M. SADR: Self-Supervised Graph Learning With Adaptive Denoising for Drug Repositioning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:265-277. [PMID: 38190661 DOI: 10.1109/tcbb.2024.3351079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2024]
Abstract
Traditional drug development is often high-risk and time-consuming. A promising alternative is to reuse or reposition approved drugs. Recently, some methods based on graph representation learning have started to be used for drug repositioning. These models learn low-dimensional embeddings of drug and disease nodes from the drug-disease interaction network to predict potential associations between drugs and diseases. However, these methods have strict requirements for the dataset, and if the dataset is sparse, their performance is severely affected. At the same time, these methods have poor robustness to noise in the dataset. In response to the above challenges, we propose a drug repositioning model based on self-supervised graph learning with adaptive denoising, called SADR. SADR uses data augmentation and contrastive learning strategies to learn feature representations of nodes, which can effectively solve the problems caused by sparse datasets. SADR includes an adaptive denoising training (ADT) component that can effectively identify noisy data during the training process and remove the impact of noise on the model. We have conducted comprehensive experiments on three datasets and have achieved better prediction accuracy compared to multiple baseline models. In addition, we report the top 10 newly predicted approved drugs for treating two diseases. This demonstrates the ability of our model to identify potential drug candidates for disease indications.
|
89
|
Liu Q, Li X, Li Y, Luo Q, Fan Q, Lu A, Guan D, Li J. A novel network pharmacology strategy to decode mechanism of Wuling Powder in treating liver cirrhosis. Chin Med 2024; 19:36. [PMID: 38429802 PMCID: PMC10905787 DOI: 10.1186/s13020-024-00896-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 01/26/2024] [Indexed: 03/03/2024] Open
Abstract
BACKGROUND Liver cirrhosis is a chronic liver disease with hepatocyte necrosis and lesion. As one of the TCM formulas, Wuling Powder (WLP) is widely used in the treatment of liver cirrhosis. However, its key functional components and mechanism of action remain unclear. We attempted to explore the Key Group of Effective Components (KGEC) of WLP in the treatment of liver cirrhosis through integrative pharmacology combined with experiments. METHODS The components and potential target genes of WLP were extracted from published databases. A novel node importance calculation model considering both node control force and node bridging force was designed to construct the Function Response Space (FRS) and obtain key effector proteins. The genetic knapsack algorithm was employed to select KGEC. The effectiveness and reliability of KGEC were evaluated at the functional level by using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Finally, the effectiveness and potential mechanism of KGEC were confirmed by CCK-8, qPCR and Western blot. RESULTS 940 effective proteins were obtained in FRS. KEGG pathway and GO term enrichment analysis suggested that effective proteins well reflect liver cirrhosis characteristics at the functional level. 29 components of WLP were defined as KGEC, which covered 100% of the targets of the effective proteins. Additionally, the pathways enriched for the KGEC targets accounted for 83.33% of the shared genes between the targets and the pathogenic genes enrichment pathways. Three components from KGEC (scopoletin, caryophyllene oxide, and hydroxyzinamic acid) were selected for in vivo verification. The qPCR results demonstrated that all three components significantly reduced the mRNA levels of COL1A1 in a TGF-β1-induced liver cirrhosis model. Furthermore, the Western blot assay indicated that these components acted synergistically to target the NF-κB, AMPK/p38, cAMP, and PI3K/AKT pathways, thus inhibiting the progression of liver cirrhosis. CONCLUSION In summary, we have developed a new model that reveals the key components and potential mechanisms of WLP for the treatment of liver cirrhosis. This model provides a reference for the secondary development of WLP and offers a methodological strategy for studying TCM formulas.
Affiliation(s)
- Qinwen Liu
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Xiaowei Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Yi Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Qian Luo
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Qiling Fan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Aiping Lu
- Institute of Integrated Bioinformedicine and Translational Science, Hong Kong Baptist University, Hong Kong, China.
- Guangdong-Hong Kong-Macau Joint Lab On Chinese Medicine and Immune Disease Research, Guangzhou, China.
| | - Daogang Guan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China.
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China.
| | - Jiahui Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China.
- Center for Genetics and Developmental Systems Biology, Department of Obstetrics and Gynecology, Nanfang Hospital, Southern Medical University, Guangzhou, China.
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China.
| |
|
90
|
Karunakaran KB, Jain S, Brahmachari SK, Balakrishnan N, Ganapathiraju MK. Parkinson's disease and schizophrenia interactomes contain temporally distinct gene clusters underlying comorbid mechanisms and unique disease processes. SCHIZOPHRENIA (HEIDELBERG, GERMANY) 2024; 10:26. [PMID: 38413605 PMCID: PMC10899210 DOI: 10.1038/s41537-024-00439-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 01/24/2024] [Indexed: 02/29/2024]
Abstract
Genome-wide association studies suggest significant overlaps in Parkinson's disease (PD) and schizophrenia (SZ) risks, but the underlying mechanisms remain elusive. The protein-protein interaction network ('interactome') plays a crucial role in PD and SZ and can incorporate their spatiotemporal specificities. Therefore, to study the linked biology of PD and SZ, we compiled PD- and SZ-associated genes from the DisGeNET database, and constructed their interactomes using BioGRID and HPRD. We examined the interactomes using clustering and enrichment analyses, in conjunction with the transcriptomic data of 26 brain regions spanning foetal stages to adulthood available in the BrainSpan Atlas. PD and SZ interactomes formed four gene clusters with distinct temporal identities (Disease Gene Networks or 'DGNs'1-4). DGN1 had unique SZ interactome genes highly expressed across developmental stages, corresponding to a neurodevelopmental SZ subtype. DGN2, containing unique SZ interactome genes expressed from early infancy to adulthood, correlated with an inflammation-driven SZ subtype and adult SZ risk. DGN3 contained unique PD interactome genes expressed in late infancy, early and late childhood, and adulthood, and involved in mitochondrial pathways. DGN4, containing prenatally-expressed genes common to both the interactomes, involved in stem cell pluripotency and overlapping with the interactome of 22q11 deletion syndrome (comorbid psychosis and Parkinsonism), potentially regulates neurodevelopmental mechanisms in PD-SZ comorbidity. Our findings suggest that disrupted neurodevelopment (regulated by DGN4) could expose risk windows in PD and SZ, later elevating disease risk through inflammation (DGN2). Alternatively, variant clustering in DGNs may produce disease subtypes, e.g., PD-SZ comorbidity with DGN4, and early/late-onset SZ with DGN1/DGN2.
Affiliation(s)
- Kalyani B Karunakaran
- Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India.
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan.
| | - Sanjeev Jain
- National Institute of Mental Health and Neuro-Sciences (NIMHANS), Bangalore, India.
| | | | - N Balakrishnan
- Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
| | - Madhavi K Ganapathiraju
- Department of Computer Science, Carnegie Mellon University Qatar, Doha, Qatar.
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
| |
|
91
|
Petrenko S, Hier DB, Bone MA, Obafemi-Ajayi T, Timpson EJ, Marsh WE, Speight M, Wunsch DC. Analyzing Biomedical Datasets with Symbolic Tree Adaptive Resonance Theory. INFORMATION 2024; 15:125. [DOI: 10.3390/info15030125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025] Open
Abstract
Biomedical datasets distill many mechanisms of human diseases, linking diseases to genes and phenotypes (signs and symptoms of disease), genetic mutations to altered protein structures, and altered proteins to changes in molecular functions and biological processes. It is desirable to gain new insights from these data, especially with regard to the uncovering of hierarchical structures relating disease variants. However, analysis to this end has proven difficult due to the complexity of the connections between multi-categorical symbolic data. This article proposes symbolic tree adaptive resonance theory (START), with additional supervised, dual-vigilance (DV-START), and distributed dual-vigilance (DDV-START) formulations, for the clustering of multi-categorical symbolic data from biomedical datasets by demonstrating its utility in clustering variants of Charcot–Marie–Tooth disease using genomic, phenotypic, and proteomic data.
Affiliation(s)
- Sasha Petrenko
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
| | - Daniel B. Hier
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL 60607, USA
| | - Mary A. Bone
- Department of Science and Industry Systems, University of Southeastern Norway, 3616 Kongsberg, Norway
| | - Tayo Obafemi-Ajayi
- Engineering Program, Missouri State University, Springfield, MO 65897, USA
| | - Erik J. Timpson
- Honeywell Federal Manufacturing & Technologies, Kansas City, MO 64147, USA
| | - William E. Marsh
- Honeywell Federal Manufacturing & Technologies, Kansas City, MO 64147, USA
| | - Michael Speight
- Honeywell Federal Manufacturing & Technologies, Kansas City, MO 64147, USA
| | - Donald C. Wunsch
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
| |
|
92
|
Liu M, Srivastava G, Ramanujam J, Brylinski M. SynerGNet: A Graph Neural Network Model to Predict Anticancer Drug Synergy. Biomolecules 2024; 14:253. [PMID: 38540674 PMCID: PMC10967862 DOI: 10.3390/biom14030253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2023] [Revised: 02/16/2024] [Accepted: 02/19/2024] [Indexed: 01/03/2025] Open
Abstract
Drug combination therapy shows promise in cancer treatment by addressing drug resistance, reducing toxicity, and enhancing therapeutic efficacy. However, the intricate and dynamic nature of biological systems makes identifying potential synergistic drugs a costly and time-consuming endeavor. To facilitate the development of combination therapy, techniques employing artificial intelligence have emerged as a transformative solution, providing a sophisticated avenue for advancing existing therapeutic approaches. In this study, we developed SynerGNet, a graph neural network model designed to accurately predict the synergistic effect of drug pairs against cancer cell lines. SynerGNet utilizes cancer-specific featured graphs created by integrating heterogeneous biological features into the human protein-protein interaction network, followed by a reduction process to enhance topological diversity. Leveraging synergy data provided by AZ-DREAM Challenges, the model yields a balanced accuracy of 0.68, significantly outperforming traditional machine learning. Encouragingly, augmenting the training data with carefully constructed synthetic instances improved the balanced accuracy of SynerGNet to 0.73. Finally, the results of an independent validation conducted against DrugCombDB demonstrated that it exhibits strong performance when applied to unseen data. SynerGNet shows great potential in detecting drug synergy, positioning itself as a valuable tool that could contribute to the advancement of combination therapy for cancer treatment.
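The abstract reports balanced accuracy rather than plain accuracy. As a reminder of why that matters on imbalanced synergy data, here is a minimal sketch of the standard binary definition (the mean of sensitivity and specificity). The function name and the toy labels are illustrative only and are not taken from the SynerGNet code or data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy example: 2 synergistic pairs among 10. A model that always predicts
# "not synergistic" reaches 0.8 plain accuracy but only 0.5 balanced accuracy.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
trivial_score = balanced_accuracy(labels, always_negative)
```

On this scale 0.5 is the score of a trivial classifier, which is the baseline against which values such as 0.68 and 0.73 should be read.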
Affiliation(s)
- Mengmeng Liu
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Gopal Srivastava
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - J. Ramanujam
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
| |
|
93
|
Xiong D, Qiu Y, Zhao J, Zhou Y, Lee D, Gupta S, Torres M, Lu W, Liang S, Kang JJ, Eng C, Loscalzo J, Cheng F, Yu H. Structurally-informed human interactome reveals proteome-wide perturbations by disease mutations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.04.24.538110. [PMID: 37162909 PMCID: PMC10168245 DOI: 10.1101/2023.04.24.538110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Human genome sequencing studies have identified numerous loci associated with complex diseases. However, translating human genetic and genomic findings to disease pathobiology and therapeutic discovery remains a major challenge at multiscale interactome network levels. Here, we present a deep-learning-based ensemble framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that accurately predicts protein binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms, generating comprehensive structurally-informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods. We further systematically validated PIONEER predictions experimentally through generating 2,395 mutations and testing their impact on 6,754 mutation-interaction pairs, confirming the high quality and validity of PIONEER predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces after mapping mutations from ~60,000 germline exomes and ~36,000 somatic genomes. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from pan-cancer analysis of ~11,000 tumor whole-exomes across 33 cancer types. We show that PIONEER-predicted oncoPPIs are significantly associated with patient survival and drug responses from both cancer cell lines and patient-derived xenograft mouse models. We identify a landscape of PPI-perturbing tumor alleles upon ubiquitination by E3 ligases, and we experimentally validate the tumorigenic KEAP1-NRF2 interface mutation p.Thr80Lys in non-small cell lung cancer. 
We show that PIONEER-predicted PPI-perturbing alleles alter protein abundance and correlate with drug responses and patient survival in colon and uterine cancers as demonstrated by proteogenomic data from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.
Affiliation(s)
- Dapeng Xiong
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| | - Yunguang Qiu
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Junfei Zhao
- Department of Systems Biology, Herbert Irving Comprehensive Center, Columbia University, New York, NY 10032, USA
| | - Yadi Zhou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Dongjin Lee
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
| | - Shobhita Gupta
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Biophysics Program, Cornell University, Ithaca, NY 14853, USA
| | - Mateo Torres
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| | - Weiqiang Lu
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
| | - Jin Joo Kang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| | - Charis Eng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Joseph Loscalzo
- Channing Division of Network Medicine, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| |
|
94
|
Dehghan A, Abbasi K, Razzaghi P, Banadkuki H, Gharaghani S. CCL-DTI: contributing the contrastive loss in drug-target interaction prediction. BMC Bioinformatics 2024; 25:48. [PMID: 38291364 PMCID: PMC11264960 DOI: 10.1186/s12859-024-05671-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 09/13/2023] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND Drug-Target Interaction (DTI) prediction uses a drug molecule and a protein sequence as inputs to predict the binding affinity value. In recent years, deep learning-based models have received more attention. These methods have two modules: the feature extraction module and the task prediction module. In most deep learning-based approaches, a simple task prediction loss (i.e., categorical cross entropy for the classification task and mean squared error for the regression task) is used to train the model. In machine learning, contrastive-based loss functions have been developed to learn a more discriminative feature space. In a deep learning-based model, extracting a more discriminative feature space leads to performance improvement for the task prediction module. RESULTS In this paper, we use multimodal knowledge as input and propose an attention-based fusion technique to combine this knowledge. Also, we investigate how utilizing a contrastive loss function alongside the task prediction loss could help the approach learn a more powerful model. Four contrastive loss functions are considered: (1) max-margin contrastive loss function, (2) triplet loss function, (3) Multi-class N-pair Loss Objective, and (4) NT-Xent loss function. The proposed model is evaluated using four well-known datasets: Wang et al. dataset, Luo's dataset, Davis, and KIBA datasets. CONCLUSIONS Accordingly, after reviewing the state-of-the-art methods, we developed a multimodal feature extraction network by combining protein sequences and drug molecules, along with protein-protein interaction networks and drug-drug interaction networks. The results show it performs significantly better than the comparable state-of-the-art approaches.
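Two of the four contrastive losses named in the abstract have short closed forms. The sketch below gives textbook versions of the max-margin contrastive loss and the triplet loss on plain Python vectors, to show what "a more discriminative feature space" is optimizing for. The function names, the margin value, and the toy embeddings are assumptions for illustration; the contrastive variant here omits the 1/2 scaling used in some formulations, and none of this is taken from the CCL-DTI implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def max_margin_contrastive(u, v, same_class, margin=1.0):
    """Pull same-class pairs together; push different-class pairs
    apart until they are at least `margin` away from each other."""
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is farther from the anchor than the
    positive by at least `margin`; positive otherwise."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (hypothetical): the positive is near the anchor,
# the negative is far away, so the triplet constraint is already met.
anchor, positive, negative = (0.0, 0.0), (0.0, 1.0), (3.0, 4.0)
satisfied = triplet_loss(anchor, positive, negative)
violated = triplet_loss(anchor, negative, positive)
```

In the setting the abstract describes, a term like this would be added to the task prediction loss, so the feature extractor is trained on both objectives at once.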
Affiliation(s)
- Alireza Dehghan
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, 1417614411, Iran
| | - Karim Abbasi
- Laboratory of System Biology, Bioinformatics and Artificial Intelligence in Medicine (LBB&AI), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, 1417614411, Iran
| | - Parvin Razzaghi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 4513766731, Iran.
| | - Hossein Banadkuki
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
| | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran.
| |
|
95
|
Duman ET, Tuna G, Ak E, Avsar G, Pir P. Optimized network based natural language processing approach to reveal disease comorbidities in COVID-19. Sci Rep 2024; 14:2325. [PMID: 38282038 PMCID: PMC10822845 DOI: 10.1038/s41598-024-52819-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 01/24/2024] [Indexed: 01/30/2024] Open
Abstract
A novel virus emerged from Wuhan, China, at the end of 2019 and quickly evolved into a pandemic, significantly impacting various industries, especially healthcare. One critical lesson from COVID-19 is the importance of understanding and predicting underlying comorbidities to better prioritize care and pharmacological therapies. Factors like age, race, and comorbidity history are crucial in determining disease mortality. While clinical data from hospitals and cohorts have led to the identification of these comorbidities, traditional approaches often lack a mechanistic understanding of the connections between them. In response, we utilized a deep learning approach to integrate COVID-19 data with data from other diseases, aiming to detect comorbidities with mechanistic insights. Our modified algorithm in the mpDisNet package, based on word-embedding deep learning techniques, incorporates miRNA expression profiles from SARS-CoV-2 infected cell lines and their target transcription factors. This approach is aligned with the emerging field of network medicine, which seeks to define diseases based on distinct pathomechanisms rather than just phenotypes. The main aim is discovery of possible unknown comorbidities by connecting the diseases by their miRNA mediated regulatory interactions. The algorithm can predict the majority of COVID-19's known comorbidities, as well as several diseases that have yet to be discovered to be comorbid with COVID-19. These potentially comorbid diseases should be investigated further to raise awareness and prevention, as well as informing the comorbidity research for the next possible outbreak.
Affiliation(s)
- Emre Taylan Duman
- Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey.
- NGS-Core Unit for Integrative Genomics, Institute of Pathology, University Medical Center Göttingen, Göttingen, Germany.
| | - Gizem Tuna
- Department of Molecular Biology, Gebze Technical University, Kocaeli, Turkey
| | - Enes Ak
- Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
| | - Gülben Avsar
- Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
| | - Pinar Pir
- Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
| |
|
96
|
Kong X, Diao L, Jiang P, Nie S, Guo S, Li D. DDK-Linker: a network-based strategy identifies disease signals by linking high-throughput omics datasets to disease knowledge. Brief Bioinform 2024; 25:bbae111. [PMID: 38517698 PMCID: PMC10959161 DOI: 10.1093/bib/bbae111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 02/26/2024] [Accepted: 02/27/2024] [Indexed: 03/24/2024] Open
Abstract
High-throughput genomic and proteomic scanning approaches allow investigators to quantify genome-wide genes (or gene products) for certain disease conditions, which plays an essential role in promoting the discovery of disease mechanisms. These approaches often generate a large list of genes of interest (GOIs), such as differentially expressed genes/proteins. However, researchers have to perform manual triage and validation to explore the most promising, biologically plausible linkages between the known disease genes and GOIs (disease signals) for further study. Here, to address this challenge, we proposed a network-based strategy, DDK-Linker, to facilitate the exploration of disease signals hidden in omics data by linking GOIs to known disease genes. Specifically, it reconstructed gene distances in the protein-protein interaction (PPI) network through six network methods (random walk with restart, Deepwalk, Node2Vec, LINE, HOPE, Laplacian) to discover disease signals in omics data that have shorter distances to disease genes. Furthermore, benefiting from the knowledge base we established, abundant bioinformatics annotations were provided for each candidate disease signal. To assist in omics data interpretation and facilitate usage, we have developed this strategy into an application that users can access through a website or download as an R package. We believe DDK-Linker will accelerate the exploration of disease genes and drug targets in a variety of omics data, such as genomics, transcriptomics and proteomics data, and provide clues for complex disease mechanism and pharmacological research. DDK-Linker is freely accessible at http://ddklinker.ncpsb.org.cn/.
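Of the six network methods listed, random walk with restart is the simplest to illustrate. The sketch below is a generic power-iteration version on a toy undirected graph, showing how nodes closer to the seed (known disease) genes receive higher steady-state scores. The function name, the restart probability of 0.3, and the four-node path graph with placeholder node names are assumptions for illustration; DDK-Linker itself is distributed as an R package and its implementation is not reproduced here.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that, at each step,
    jumps back to a seed node with probability `restart` and otherwise
    moves to a uniformly chosen neighbour.

    adj   : dict mapping each node to a list of its neighbours
    seeds : set of seed nodes (e.g. known disease genes)
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # Isolated node: its probability mass stays where it is.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path graph A - B - C - D with A as the only seed (placeholder names).
toy_ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
rwr_scores = random_walk_with_restart(toy_ppi, seeds={"A"})
ranking = sorted(rwr_scores, key=rwr_scores.get, reverse=True)
```

Candidate genes from an omics experiment could then be ranked by their score, with higher values indicating network proximity to the seed genes.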
Affiliation(s)
- Xiangren Kong
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Lihong Diao
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Peng Jiang
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Shiyan Nie
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Shuzhen Guo
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Dong Li
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| |
|
97
|
Hu C, Mi W, Li F, Zhu L, Ou Q, Li M, Li T, Ma Y, Zhang Y, Xu Y. Optimizing drug combination and mechanism analysis based on risk pathway crosstalk in pan cancer. Sci Data 2024; 11:74. [PMID: 38228620 PMCID: PMC10791624 DOI: 10.1038/s41597-024-02915-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 03/29/2023] [Accepted: 01/03/2024] [Indexed: 01/18/2024] Open
Abstract
Combination therapy can greatly improve the efficacy of cancer treatment, so identifying the most effective drug combinations and interactions can accelerate its development. Here we developed a computational network biology approach to identify effective drugs that inhibit risk pathway crosstalk in cancer, and then filtered and optimized drug combinations for cancer treatment. We integrated high-throughput pan-cancer and drug data to construct miRNA-mediated crosstalk networks among cancer pathways, and further constructed networks for therapeutic drugs. By screening with the drug combination method, we obtained 687 optimized drug combinations of 83 first-line anticancer drugs in pan-cancer. Next, we analyzed the mechanisms of the drug combinations and confirmed by survival analysis that the targets of the cancer-specific crosstalk networks in these combinations were closely related to cancer prognosis. Finally, we made all results available for query on a webpage (http://bio-bigdata.hrbmu.edu.cn/oDrugCP/). In conclusion, our study provides an effective method for screening precise drug combinations for various cancer treatments, which may have important scientific significance and clinical application value for tumor treatment.
Affiliation(s)
- Congxue Hu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Wanqi Mi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Feng Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Lun Zhu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Qi Ou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Maohao Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Tengyue Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yuheng Ma
  - Department of Pharmacy, Inner Mongolia Medical University, Jinshan Development Zone, Hohhot, 010100, China
- Yunpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yingqi Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
  - Department of Pharmacy, Inner Mongolia Medical University, Jinshan Development Zone, Hohhot, 010100, China
|
98
|
Elkin R, Oh JH, Dela Cruz F, Norton L, Deasy JO, Kung AL, Tannenbaum AR. Dynamic network curvature analysis of gene expression reveals novel potential therapeutic targets in sarcoma. Sci Rep 2024; 14:488. [PMID: 38177639 PMCID: PMC10766622 DOI: 10.1038/s41598-023-49930-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/12/2022] [Accepted: 12/13/2023] [Indexed: 01/06/2024] Open
Abstract
Network properties account for the complex relationship between genes, making it easier to identify complex patterns in their interactions. In this work, we leveraged these network properties for dual purposes. First, we clustered pediatric sarcoma tumors using network information flow as a similarity metric, computed by the Wasserstein distance. We demonstrate that this approach yields the best concordance with histological subtypes, validated against three state-of-the-art methods. Second, to identify molecular targets that would be missed by more conventional methods of analysis, we applied a novel unsupervised method to cluster gene interactomes represented as networks in pediatric sarcoma. RNA-Seq data were mapped to protein-level interactomes to construct weighted networks that were then subjected to a non-Euclidean, multi-scale geometric approach centered on a discrete notion of curvature. This provides a measure of the functional association among genes in the context of their connectivity. In confirmation of the validity of this method, hierarchical clustering revealed the characteristic EWSR1-FLI1 fusion in Ewing sarcoma. Furthermore, assessing the effects of in silico edge perturbations and simulated gene knockouts as quantified by changes in curvature, we found non-trivial gene associations not previously identified.
Affiliation(s)
- Rena Elkin
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, 10065, USA
- Jung Hun Oh
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, 10065, USA
- Filemon Dela Cruz
  - Department of Pediatrics, Memorial Sloan Kettering Cancer Center, New York, 10065, USA
- Larry Norton
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, 10065, USA
- Joseph O Deasy
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, 10065, USA
- Andrew L Kung
  - Department of Pediatrics, Memorial Sloan Kettering Cancer Center, New York, 10065, USA
- Allen R Tannenbaum
  - Departments of Computer Science and Applied Mathematics and Statistics, Stony Brook University, Stony Brook, 11794, USA
|
99
|
Rani P, Dutta K, Kumar V. Performance evaluation of drug synergy datasets using computational intelligence approaches. MULTIMEDIA TOOLS AND APPLICATIONS 2024; 83:8971-8997. [DOI: 10.1007/s11042-023-15723-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 09/26/2022] [Accepted: 04/18/2023] [Indexed: 01/03/2025]
|
100
|
Nithya C, Kiran M, Nagarajaram HA. Hubs and Bottlenecks in Protein-Protein Interaction Networks. Methods Mol Biol 2024; 2719:227-248. [PMID: 37803121 DOI: 10.1007/978-1-0716-3461-5_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2023]
Abstract
Protein-protein interaction networks (PPINs) represent the physical interactions among proteins in a cell. These interactions are critical in all cellular processes, including signal transduction, metabolic regulation, and gene expression. In PPINs, centrality measures are widely used to identify the most critical nodes. The two most commonly used centrality measures in networks are degree and betweenness centralities. Degree centrality is the number of connections a node has in the network, and betweenness centrality is the measure of the extent to which a node lies on the shortest paths between pairs of other nodes in the network. In PPINs, proteins with high degree and betweenness centrality are referred to as hubs and bottlenecks respectively. Hubs and bottlenecks are topologically and functionally essential proteins that play crucial roles in maintaining the network's structure and function. This article comprehensively reviews essential literature on hubs and bottlenecks, including their properties and functions.
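The two centrality measures defined above can be illustrated on a small example. The following is a minimal sketch, assuming an unweighted, undirected toy network with hypothetical protein labels; it is not taken from the reviewed chapter. Degree is counted directly, and betweenness is computed with Brandes' algorithm (breadth-first search plus dependency accumulation), reported as unnormalized counts of node pairs.

```python
from collections import deque


def degree_centrality(adj):
    """Number of connections of each node (unnormalized)."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    bc = {node: 0.0 for node in adj}
    for source in adj:
        # Breadth-first search from `source`, counting shortest paths.
        order = []                          # nodes in order of non-decreasing distance
        preds = {n: [] for n in adj}        # predecessors on shortest paths
        n_paths = {n: 0 for n in adj}       # number of shortest paths from source
        dist = {n: -1 for n in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:             # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {n: 0.0 for n in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                bc[w] += dependency[w]
    # Each unordered pair was counted once from each endpoint.
    return {node: value / 2.0 for node, value in bc.items()}


# Toy network (hypothetical labels): two triangles joined through protein X.
toy_ppin = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "X"],
    "X": ["C", "D"],
    "D": ["X", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}

degree = degree_centrality(toy_ppin)
betweenness = betweenness_centrality(toy_ppin)

top_degree = max(degree.values())
hub_like = sorted(n for n, d in degree.items() if d == top_degree)
bottleneck_like = max(betweenness, key=betweenness.get)
print(hub_like, bottleneck_like)
```

In this toy network the nodes with the most connections (C and D) differ from the node with the highest betweenness (X), which shows why the two measures can single out different proteins. Any cutoff for calling a protein a hub or a bottleneck would be a further assumption, so the sketch only reports the top-ranked nodes.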
Affiliation(s)
- Chandramohan Nithya
  - Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, India
- Manjari Kiran
  - Department of Systems and Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, India
|