1
Rao R, Gulfishan M, Kim MS, Kashyap MK. Deciphering Cancer Complexity: Integrative Proteogenomics and Proteomics Approaches for Biomarker Discovery. Methods Mol Biol 2025; 2859:211-237. [PMID: 39436604 DOI: 10.1007/978-1-0716-4152-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2024]
Abstract
Proteomics has revolutionized cancer biology: a large number of in vivo (SILAC) and in vitro (iTRAQ, ICAT, TMT, stable-isotope dimethyl, and ¹⁸O) labeling techniques, as well as label-free methods (spectral counting or peak intensities), coupled with mass spectrometry make it possible to profile and identify dysregulated proteins in diseases such as cancer. These proteome and genome studies have also raised many challenges, such as the lack of consistency or correlation between copy-number, RNA, and protein-level data. This review covers solely mass spectrometry-based approaches used for cancer biomarker discovery. It also touches on the emerging role of oncoproteogenomics, or proteogenomics, in cancer biomarker discovery and on how this new area integrates genomics and proteomics to address important questions about the biology and pathophysiology of different malignancies, making mass spectrometry-based studies more realistic and relevant to clinical settings.
Affiliation(s)
- Rashmi Rao
- School of Life and Allied Health Sciences, Glocal University, Saharanpur, UP, India
- Mohd Gulfishan
- School of Life and Allied Health Sciences, Glocal University, Saharanpur, UP, India
- Min-Sik Kim
- Department of New Biology, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu-42988, Republic of Korea
- Manoj Kumar Kashyap
- Amity Stem Cell Institute (ASCI), Amity Medical School (AMS), Amity University Haryana, Panchgaon (Manesar), Gurugram, Haryana, India.
2
Wright SN, Colton S, Schaffer LV, Pillich RT, Churas C, Pratt D, Ideker T. State of the interactomes: an evaluation of molecular networks for generating biological insights. Mol Syst Biol 2024:10.1038/s44320-024-00077-y. [PMID: 39653848 DOI: 10.1038/s44320-024-00077-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Revised: 11/07/2024] [Accepted: 11/11/2024] [Indexed: 12/18/2024] Open
Abstract
Advancements in genomic and proteomic technologies have powered the creation of large gene and protein networks ("interactomes") for understanding biological systems. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 45 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP, Reactome, and SIGNOR demonstrate stronger performance in interaction prediction. Our study provides a benchmark for interactomes across diverse biological applications and clarifies factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
Affiliation(s)
- Sarah N Wright
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Scott Colton
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Leah V Schaffer
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Rudolf T Pillich
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Christopher Churas
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Dexter Pratt
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Trey Ideker
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA.
3
Yin H, Duo H, Li S, Qin D, Xie L, Xiao Y, Sun J, Tao J, Zhang X, Li Y, Zou Y, Yang Q, Yang X, Hao Y, Li B. Unlocking biological insights from differentially expressed genes: Concepts, methods, and future perspectives. J Adv Res 2024:S2090-1232(24)00560-5. [PMID: 39647635 DOI: 10.1016/j.jare.2024.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2024] [Revised: 10/12/2024] [Accepted: 12/03/2024] [Indexed: 12/10/2024] Open
Abstract
BACKGROUND: Identifying differentially expressed genes (DEGs) is a core task of transcriptome analysis, as DEGs can reveal the molecular mechanisms underlying biological processes. However, interpreting the biological significance of large DEG lists is challenging. Currently, gene ontology, pathway enrichment and protein-protein interaction analysis are common strategies employed by biologists. Additionally, emerging analytical strategies/approaches (such as network module analysis, knowledge graph, drug repurposing, cell marker discovery, trajectory analysis, and cell communication analysis) have been proposed. Despite these advances, comprehensive guidelines for systematically and thoroughly mining the biological information within DEGs remain lacking.
AIM OF REVIEW: This review aims to provide an overview of essential concepts and methodologies for the biological interpretation of DEGs, enhancing the contextual understanding. It also addresses the current limitations and future perspectives of these approaches, highlighting their broad applications in deciphering the molecular mechanism of complex diseases and phenotypes. To assist users in extracting insights from extensive datasets, especially various DEG lists, we developed DEGMiner (https://www.ciblab.net/DEGMiner/), which integrates over 300 easily accessible databases and tools.
KEY SCIENTIFIC CONCEPTS OF REVIEW: This review offers strong support and guidance for exploring DEGs, and also will accelerate the discovery of hidden biological insights within genomes.
Affiliation(s)
- Huachun Yin
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China; Department of Neurosurgery, Xinqiao Hospital, The Army Medical University, Chongqing 400037, PR China; Department of Neurobiology, Chongqing Key Laboratory of Neurobiology, The Army Medical University, Chongqing 400038, PR China
- Hongrui Duo
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Song Li
- Department of Neurosurgery, Xinqiao Hospital, The Army Medical University, Chongqing 400037, PR China
- Dan Qin
- Department of Biology, College of Science, Northeastern University, Boston, MA 02115, USA
- Lingling Xie
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Yingxue Xiao
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Jing Sun
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Jingxin Tao
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Xiaoxi Zhang
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Yinghong Li
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, PR China
- Yue Zou
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Qingxia Yang
- Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, PR China
- Xian Yang
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China.
- Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China.
4
Sluzala ZB, Shan Y, Elghazi L, Cárdenas EL, Hamati A, Garner AL, Fort PE. Novel mTORC2/HSPB4 Interaction: Role and Regulation of HSPB4 T148 Phosphorylation. Cells 2024; 13:2000. [PMID: 39682748 DOI: 10.3390/cells13232000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2024] [Revised: 11/23/2024] [Accepted: 11/27/2024] [Indexed: 12/18/2024] Open
Abstract
HSPB4 and HSPB5 (α-crystallins) have shown increasing promise as neuroprotective agents, demonstrating several anti-apoptotic and protective roles in disorders such as multiple sclerosis and diabetic retinopathy. HSPs are highly regulated by post-translational modification, including deamidation, glycosylation, and phosphorylation. Among them, T148 phosphorylation has been shown to regulate the structural and functional characteristics of HSPB4 and underlie, in part, its neuroprotective capacity. We recently demonstrated that this phosphorylation is reduced in retinal tissues from patients with diabetic retinopathy, raising the question of its regulation during diseases. The kinase(s) responsible for regulating this phosphorylation, however, have yet to be identified. To this end, we employed a multi-tier strategy utilizing in vitro kinome profiling, bioinformatics, and chemoproteomics to predict and discover the kinases capable of phosphorylating T148. Several kinases were identified as being capable of specifically phosphorylating T148 in vitro, and further analysis highlighted mTORC2 as a particularly strong candidate. Altogether, our data demonstrate that the HSPB4-mTORC2 interaction is multi-faceted. Our data support the role of mTORC2 as a specific kinase phosphorylating HSPB4 at T148, but also provide evidence that the HSPB4 chaperone function further strengthens the interaction. This study addresses a critical gap in our understanding of the regulatory underpinnings of T148 phosphorylation-mediated neuroprotection.
Affiliation(s)
- Zachary B Sluzala
- Department of Ophthalmology & Visual Sciences, The University of Michigan, Ann Arbor, MI 48109, USA
- Yang Shan
- Department of Ophthalmology & Visual Sciences, The University of Michigan, Ann Arbor, MI 48109, USA
- Lynda Elghazi
- Department of Ophthalmology & Visual Sciences, The University of Michigan, Ann Arbor, MI 48109, USA
- Emilio L Cárdenas
- Interdepartmental Program in Medicinal Chemistry, The University of Michigan, Ann Arbor, MI 48109, USA
- Angelina Hamati
- Department of Ophthalmology & Visual Sciences, The University of Michigan, Ann Arbor, MI 48109, USA
- Amanda L Garner
- Interdepartmental Program in Medicinal Chemistry, The University of Michigan, Ann Arbor, MI 48109, USA
- Patrice E Fort
- Department of Ophthalmology & Visual Sciences, The University of Michigan, Ann Arbor, MI 48109, USA
- Department of Molecular & Integrative Physiology, The University of Michigan, Ann Arbor, MI 48109, USA
5
Nayar G, Altman RB. Heterogeneous network approaches to protein pathway prediction. Comput Struct Biotechnol J 2024; 23:2727-2739. [PMID: 39035835 PMCID: PMC11260399 DOI: 10.1016/j.csbj.2024.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024] Open
Abstract
Understanding protein-protein interactions (PPIs) and the pathways they comprise is essential for comprehending cellular functions and their links to specific phenotypes. Despite the prevalence of molecular data generated by high-throughput sequencing technologies, a significant gap remains in translating this data into functional information regarding the series of interactions that underlie phenotypic differences. In this review, we present an in-depth analysis of heterogeneous network methodologies for modeling protein pathways, highlighting the critical role of integrating multifaceted biological data. It outlines the process of constructing these networks, from data representation to machine learning-driven predictions and evaluations. The work underscores the potential of heterogeneous networks in capturing the complexity of proteomic interactions, thereby offering enhanced accuracy in pathway prediction. This approach not only deepens our understanding of cellular processes but also opens up new possibilities in disease treatment and drug discovery by leveraging the predictive power of comprehensive proteomic data analysis.
Affiliation(s)
- Gowri Nayar
- Department of Biomedical Data Science, Stanford University, United States
- Russ B. Altman
- Department of Biomedical Data Science, Stanford University, United States
- Department of Genetics, Stanford University, United States
- Department of Medicine, Stanford University, United States
- Department of Bioengineering, Stanford University, United States
6
Cheng X, Meng X, Chen R, Song Z, Li S, Wei S, Lv H, Zhang S, Tang H, Jiang Y, Zhang R. The molecular subtypes of autoimmune diseases. Comput Struct Biotechnol J 2024; 23:1348-1363. [PMID: 38596313 PMCID: PMC11001648 DOI: 10.1016/j.csbj.2024.03.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 03/27/2024] [Accepted: 03/27/2024] [Indexed: 04/11/2024] Open
Abstract
Autoimmune diseases (ADs) are characterized by their complexity and a wide range of clinical differences. Despite patients presenting with similar symptoms and disease patterns, their reactions to treatments may vary. The current approach of personalized medicine, which relies on molecular data, is seen as an effective method to address the variability in these diseases. This review examined the pathologic classification of ADs, such as multiple sclerosis and lupus nephritis, over time. Acknowledging the limitations inherent in pathologic classification, the focus shifted to molecular classification to achieve a deeper insight into disease heterogeneity. The study outlined the established methods and findings from the molecular classification of ADs, categorizing systemic lupus erythematosus (SLE) into four subtypes, inflammatory bowel disease (IBD) into two, rheumatoid arthritis (RA) into three, and multiple sclerosis (MS) into a single subtype. It was observed that the high inflammation subtype of IBD, the RA inflammation subtype, and the MS "inflammation & EGF" subtype share similarities. These subtypes all display a consistent pattern of inflammation that is primarily driven by the activation of the JAK-STAT pathway, with the effective drugs being those that target this signaling pathway. Additionally, by identifying markers that are uniquely associated with the various subtypes within the same disease, the study was able to describe the differences between subtypes in detail. The findings are expected to contribute to the development of personalized treatment plans for patients and establish a strong basis for tailored approaches to treating autoimmune diseases.
Affiliation(s)
- Zerun Song
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Shuai Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Siyu Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongchao Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Shuhao Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hao Tang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yongshuai Jiang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ruijie Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
7
Wiśniewski J, Więcek K, Ali H, Pyrc K, Kula-Păcurar A, Wagner M, Chen HC. Distinguishable topology of the task-evoked functional genome networks in HIV-1 reservoirs. iScience 2024; 27:111222. [PMID: 39559761 PMCID: PMC11570469 DOI: 10.1016/j.isci.2024.111222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 10/07/2024] [Accepted: 10/18/2024] [Indexed: 11/20/2024] Open
Abstract
HIV-1 reservoirs display a heterogeneous nature, lodging both intact and defective proviruses. To deepen our understanding of such heterogeneous HIV-1 reservoirs and their functional implications, we integrated basic concepts of graph theory to characterize the composition of HIV-1 reservoirs. Our analysis revealed noticeable topological properties in networks, featuring immunologic signatures enriched by genes harboring intact and defective proviruses, when comparing antiretroviral therapy (ART)-treated HIV-1-infected individuals and elite controllers. The key variable, the rich factor, played a pivotal role in classifying distinct topological properties in networks. The host gene expression strengthened the accuracy of classification between elite controllers and ART-treated patients. Markov chain modeling for the simulation of different graph networks demonstrated the presence of an intrinsic barrier between elite controllers and non-elite controllers. Overall, our work provides a prime example of leveraging genomic approaches alongside mathematical tools to unravel the complexities of HIV-1 reservoirs.
Affiliation(s)
- Janusz Wiśniewski
- Quantitative Virology Research Group, Population Diagnostics Center, Łukasiewicz Research Network – PORT Polish Center for Technology Development, Stabłowicka 147, 54-066 Wrocław, Poland
- Kamil Więcek
- Quantitative Virology Research Group, Population Diagnostics Center, Łukasiewicz Research Network – PORT Polish Center for Technology Development, Stabłowicka 147, 54-066 Wrocław, Poland
- Haider Ali
- Molecular Virology Group, Małopolska Centre of Biotechnology, Jagiellonian University, Gronostajowa 7A str, 30-387 Kraków, Poland
- Doctoral School of Exact and Natural Sciences, Jagiellonian University, Łojasiewicza 11, 30-348 Kraków, Poland
- Krzysztof Pyrc
- Virogenetics Laboratory of Virology, Małopolska Centre of Biotechnology, Jagiellonian University, Gronostajowa 7A str, 30-387 Kraków, Poland
- Anna Kula-Păcurar
- Molecular Virology Group, Małopolska Centre of Biotechnology, Jagiellonian University, Gronostajowa 7A str, 30-387 Kraków, Poland
- Marek Wagner
- Innate Immunity Research Group, Life Sciences and Biotechnology Center, Łukasiewicz Research Network – PORT Polish Center for Technology Development, Stabłowicka 147, 54-066 Wrocław, Poland
- Heng-Chang Chen
- Quantitative Virology Research Group, Population Diagnostics Center, Łukasiewicz Research Network – PORT Polish Center for Technology Development, Stabłowicka 147, 54-066 Wrocław, Poland
8
He C, Zhao Z, Wang X, Zheng H, Duan L, Zuo J. Exploring drug-target interaction prediction on cold-start scenarios via meta-learning-based graph transformer. Methods 2024; 234:10-20. [PMID: 39550022 DOI: 10.1016/j.ymeth.2024.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Revised: 11/07/2024] [Accepted: 11/12/2024] [Indexed: 11/18/2024] Open
Abstract
Predicting drug-target interaction (DTI) is of great importance for drug discovery and development. With the rapid development of biological and chemical technologies, computational methods for DTI prediction are becoming a promising approach. However, there are few solutions to the cold-start problem in DTI prediction scenarios, as these methods rely on existing interaction information to support their modeling. Consequently, they are unable to effectively predict DTIs for new drugs or targets with limited interaction data in the existing work. To this end, we propose a graph transformer method based on meta-learning named MGDTI (short for Meta-learning-based Graph Transformer for Drug-Target Interaction prediction) to fill this gap. Technically, we employ drug-drug similarity and target-target similarity as additional information to mitigate the scarcity of interactions. Besides, we trained MGDTI via meta-learning to be adaptive to cold-start tasks. Moreover, we employed graph transformer to prevent over-smoothing by capturing long-range dependencies. Extensive results on the benchmark dataset demonstrate that MGDTI is effective on DTI prediction under cold-start scenarios.
Affiliation(s)
- Chengxin He
- School of Computer Science, Sichuan University, Chengdu 610065, China; College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Zhenjiang Zhao
- School of Computer Science, Sichuan University, Chengdu 610065, China
- Xinye Wang
- School of Computer Science, Sichuan University, Chengdu 610065, China
- Huiru Zheng
- School of Computing, Ulster University, Belfast BT15 1ED, Northern Ireland, UK
- Lei Duan
- School of Computer Science, Sichuan University, Chengdu 610065, China
- Jie Zuo
- School of Computer Science, Sichuan University, Chengdu 610065, China.
9
Huang P, Gao W, Fu C, Wang M, Li Y, Chu B, He A, Li Y, Deng X, Zhang Y, Kong Q, Yuan J, Wang H, Shi Y, Gao D, Qin R, Hunter T, Tian R. Clinical functional proteomics of intercellular signalling in pancreatic cancer. Nature 2024:10.1038/s41586-024-08225-y. [PMID: 39537929 DOI: 10.1038/s41586-024-08225-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 10/15/2024] [Indexed: 11/16/2024]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) has an atypical, highly stromal tumour microenvironment (TME) that profoundly contributes to its poor prognosis¹. Here, to better understand the intercellular signalling between cancer and stromal cells directly in PDAC tumours, we developed a multidimensional proteomic strategy called TMEPro. We applied TMEPro to profile the glycosylated secreted and plasma membrane proteome of 100 human pancreatic tissue samples to a great depth, define cell type origins and identify potential paracrine cross-talk, especially that mediated through tyrosine phosphorylation. Temporal dynamics during pancreatic tumour progression were investigated in a genetically engineered PDAC mouse model. Functionally, we revealed reciprocal signalling between stromal cells and cancer cells mediated by the stromal PDGFR-PTPN11-FOS signalling axis. Furthermore, we examined the generic shedding mechanism of plasma membrane proteins in PDAC tumours and revealed that matrix-metalloprotease-mediated shedding of the AXL receptor tyrosine kinase ectodomain provides an additional dimension of intercellular signalling regulation in the PDAC TME. Importantly, the level of shed AXL has a potential correlation with lymph node metastasis, and inhibition of AXL shedding and its kinase activity showed a substantial synergistic effect in inhibiting cancer cell growth. In summary, we provide TMEPro, a generically applicable clinical functional proteomic strategy, and a comprehensive resource for better understanding the PDAC TME and facilitating the discovery of new diagnostic and therapeutic targets.
Affiliation(s)
- Peiwu Huang
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Weina Gao
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Changying Fu
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Min Wang
- Department of Biliary-Pancreatic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Yunguang Li
- Key Laboratory of Multi-Cell Systems, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Bizhu Chu
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- An He
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Yuan Li
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Xiaomei Deng
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Yehan Zhang
- Key Laboratory of Multi-Cell Systems, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Qian Kong
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China
- Jingxiong Yuan
- Department of Biliary-Pancreatic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hebin Wang
- Department of Biliary-Pancreatic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Yu Shi
- Molecular and Cell Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA.
- Bristol Myers Squibb, San Diego, CA, USA.
- Dong Gao
- Key Laboratory of Multi-Cell Systems, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China.
- Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, China.
- Renyi Qin
- Department of Biliary-Pancreatic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
- Tony Hunter
- Molecular and Cell Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
- Ruijun Tian
- State Key Laboratory of Medical Proteomics and Shenzhen Key Laboratory of Functional Proteomics, Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, School of Science and Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen, China.
10
Xu Y, Zhang Y, Song K, Liu J, Zhao R, Zhang X, Pei L, Li M, Chen Z, Zhang C, Wang P, Li F. ScDrugAct: a comprehensive database to dissect tumor microenvironment cell heterogeneity contributing to drug action and resistance across human cancers. Nucleic Acids Res 2024:gkae994. [PMID: 39526387 DOI: 10.1093/nar/gkae994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 09/27/2024] [Accepted: 10/15/2024] [Indexed: 11/16/2024] Open
Abstract
The transcriptional heterogeneity of tumor microenvironment (TME) cells is a crucial factor driving the diversity of cellular response to drug treatment and resistance. Therefore, characterizing the cells associated with drug treatment and resistance will help us understand therapeutic mechanisms, discover new therapeutic targets and facilitate precision medicine. Here, we describe a database, scDrugAct (http://bio-bigdata.hrbmu.edu.cn/scDrugAct/), which aims to establish connections among drugs, genes and cells and dissect the impact of TME cellular heterogeneity on drug action and resistance at single-cell resolution. ScDrugAct is curated with drug-cell connections between 3 838 223 cells across 34 cancer types and 13 857 drugs and identifies 17 274 drug perturbation/resistance-related genes and 276 559 associations between >10 000 drugs and 53 cell types. ScDrugAct also provides multiple flexible tools to retrieve and analyze connections among drugs, genes and cells; the distribution and developmental trajectories of drug-associated cells within the TME; functional features affecting the heterogeneity of cellular responses to drug perturbation and drug resistance; the cell-specific drug-related gene network; and drug-drug similarities. ScDrugAct serves as an important resource for investigating the impact of the cellular heterogeneity of the TME on drug therapies and can help researchers understand the mechanisms of action and resistance of drugs, as well as discover therapeutic targets.
Affiliation(s)
- Yanjun Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
- Yifang Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
- Kaiyue Song
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
- Jiaqi Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
- Rui Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
- Xiaomeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
| | - Liying Pei
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
| | - Mengyue Li
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
| | - Zhe Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
| | - Chunlong Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
| | - Peng Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
| | - Feng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin 150081, China
|
11
|
Lin X, Chang X, Zhang Y, Gao Z, Chi X. Automatic construction of Petri net models for computational simulations of molecular interaction network. NPJ Syst Biol Appl 2024; 10:131. [PMID: 39521772 PMCID: PMC11550427 DOI: 10.1038/s41540-024-00464-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Accepted: 10/30/2024] [Indexed: 11/16/2024] Open
Abstract
Petri nets are commonly applied in modeling biological systems. However, constructing a Petri net model for a complex biological system is often time-consuming and requires expertise in the research area, which limits their application. To address this challenge, we developed GINtoSPN, an R package that automates the conversion of multi-omics molecular interaction networks extracted from the Global Integrative Network (GIN) into Petri nets in GraphML format. These GraphML files can be directly used for Signaling Petri Net (SPN) simulation. To demonstrate the utility of this tool, we built a Petri net model for neurofibromatosis type I. Simulation of NF1 gene knockout, compared to normal skin fibroblast cells, revealed persistent accumulation of Ras-GTPs as expected. Additionally, we identified several other genes substantially affected by the loss of NF1's function, exhibiting individual-specific variability. These results highlight the effectiveness of GINtoSPN in streamlining the modeling and simulation of complex biological systems.
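To illustrate the kind of token-based simulation a Petri net supports, here is a minimal Python sketch. It is not GINtoSPN and not the Signaling Petri Net algorithm; the place names, the two transitions, the token counts, and the firing rule (visit transitions in random order each round and fire those that are enabled) are all assumptions made for illustration. NF1 is modeled as a catalyst by listing it as both an input and an output of the hydrolysis transition.

```python
import random

def enabled(marking, inputs):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking[place] >= 1 for place in inputs)

def fire(marking, inputs, outputs):
    """Consume one token per input place and produce one token per output place."""
    for place in inputs:
        marking[place] -= 1
    for place in outputs:
        marking[place] += 1

def simulate(marking, transitions, steps, seed=0):
    """Run `steps` rounds on a copy of `marking`; each round visits the
    transitions in random order and fires the ones that are enabled."""
    rng = random.Random(seed)
    marking = dict(marking)  # leave the caller's marking untouched
    for _ in range(steps):
        order = list(transitions)
        rng.shuffle(order)
        for name in order:
            inputs, outputs = transitions[name]
            if enabled(marking, inputs):
                fire(marking, inputs, outputs)
    return marking

# Hypothetical toy network: Ras cycles between GDP- and GTP-bound forms,
# and hydrolysis back to Ras-GDP requires an NF1 token.
transitions = {
    "activate": (["Ras_GDP"], ["Ras_GTP"]),
    "hydrolyze": (["Ras_GTP", "NF1"], ["Ras_GDP", "NF1"]),
}
wild_type = {"Ras_GDP": 5, "Ras_GTP": 0, "NF1": 1}
knockout = {"Ras_GDP": 5, "Ras_GTP": 0, "NF1": 0}

print(simulate(wild_type, transitions, steps=10))
print(simulate(knockout, transitions, steps=10))
```

In this toy setup the knockout marking can never enable the hydrolysis transition, so tokens pile up in `Ras_GTP`, which qualitatively mirrors the accumulation described in the abstract.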
Affiliation(s)
- Xuefei Lin
- Department of Dermatology and Venereal Disease, Xuan Wu Hospital, Beijing, China
| | - Xiao Chang
- Department of Dermatology and Venereal Disease, Xuan Wu Hospital, Beijing, China
| | - Yizheng Zhang
- China National Center for Bioinformation, Beijing, China
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhanyu Gao
- China National Center for Bioinformation, Beijing, China
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- HKU Li Ka Shing Faculty of Medicine, Hong Kong, China
| | - Xu Chi
- China National Center for Bioinformation, Beijing, China.
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
|
12
|
E U, T M, A V G, D P. A comprehensive survey of drug-target interaction analysis in allopathy and siddha medicine. Artif Intell Med 2024; 157:102986. [PMID: 39326289 DOI: 10.1016/j.artmed.2024.102986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 08/13/2024] [Accepted: 09/18/2024] [Indexed: 09/28/2024]
Abstract
Effective drug delivery is the cornerstone of modern healthcare, ensuring therapeutic compounds reach their intended targets efficiently. This paper explores the potential of personalized and holistic healthcare, driven by the synergy between traditional and allopathic medicine systems, with a specific focus on the vast reservoir of medicinal compounds found in plants rooted in the historical legacy of traditional medicine. Motivated by the desire to unlock the therapeutic potential of medicinal plants and bridge the gap between traditional and allopathic medicine, this survey delves into in-silico computational approaches for studying Drug-Target Interactions (DTI) within the contexts of allopathy and siddha medicine. The contributions of this survey are multifaceted: it offers a comprehensive overview of in-silico methods for DTI analysis in both systems, identifies common challenges in DTI studies, provides insights into future directions to advance DTI analysis, and includes a comparative analysis of DTI in allopathy and siddha medicine. The findings of this survey highlight the pivotal role of in-silico computational approaches in advancing drug research and development in both allopathy and siddha medicine, emphasizing the importance of integrating these methods to drive the future of personalized healthcare.
Affiliation(s)
- Uma E
- Department of Information Science and Technology, College of Engineering Guindy, Chennai, India.
| | - Mala T
- Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
| | - Geetha A V
- Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
| | - Priyanka D
- Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
|
13
|
Ayalvari S, Kaedi M, Sehhati M. A modified multiple-criteria decision-making approach based on a protein-protein interaction network to diagnose latent tuberculosis. BMC Med Inform Decis Mak 2024; 24:319. [PMID: 39478591 PMCID: PMC11523813 DOI: 10.1186/s12911-024-02668-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 09/05/2024] [Indexed: 11/02/2024] Open
Abstract
BACKGROUND: DNA microarrays provide informative data for transcriptional profiling and identifying gene expression signatures to help prevent progression of latent tuberculosis infection (LTBI) to active disease. However, constructing a prognostic model for distinguishing LTBI from active tuberculosis (ATB) is very challenging due to the noisy nature of the data and the lack of a generally stable analysis approach. METHODS: In the present study, we proposed an accurate predictive model with the help of data fusion at the decision level. In this regard, results of filter feature selection and wrapper feature selection techniques were combined with multiple-criteria decision-making (MCDM) methods to select, from six microarray datasets, the 10 genes that are most discriminative for diagnosing tuberculosis cases. As the main contribution of this study, the final ranking function was constructed by combining a protein-protein interaction (PPI) network with an MCDM method (called Decision-making Trial and Evaluation Laboratory or DEMATEL) to improve the feature ranking approach. RESULTS: By applying data fusion at the decision level on the 10 introduced genes, fusing random forest (RF) and k-nearest neighbors (KNN) classifiers according to Yager's theory, the proposed algorithm reached a sensitivity of 0.97, specificity of 0.90, and accuracy of 0.95. Finally, with the help of cumulative clustering, the genes involved in the diagnosis of latent and activated tuberculosis have been introduced. CONCLUSIONS: The combination of MCDM methods and PPI networks can significantly improve the diagnosis of different states of tuberculosis. CLINICAL TRIAL NUMBER: Not applicable.
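The sensitivity, specificity, and accuracy quoted in the RESULTS follow from confusion-matrix counts. The short Python sketch below shows the standard definitions only; the counts in the example are invented for illustration and are not taken from the study, and which class is treated as "positive" is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly
    tn/fp: negative cases classified correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)           # true positive rate
    specificity = tn / (tn + fp)           # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, acc = diagnostic_metrics(tp=8, fn=2, tn=9, fp=1)
print(sens, spec, acc)
```

Accuracy depends on the class balance of the test set, which is why it can sit between sensitivity and specificity rather than equal their average.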
Affiliation(s)
- Somayeh Ayalvari
- Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran
| | - Marjan Kaedi
- Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.
| | - Mohammadreza Sehhati
- Department of Biomedical Engineering, School of Advanced Medical Technology, Isfahan University of Medical Sciences, Isfahan, Iran
|
14
|
Xiong D, Qiu Y, Zhao J, Zhou Y, Lee D, Gupta S, Torres M, Lu W, Liang S, Kang JJ, Eng C, Loscalzo J, Cheng F, Yu H. A structurally informed human protein-protein interactome reveals proteome-wide perturbations caused by disease mutations. Nat Biotechnol 2024:10.1038/s41587-024-02428-4. [PMID: 39448882 DOI: 10.1038/s41587-024-02428-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 09/11/2024] [Indexed: 10/26/2024]
Abstract
To assist the translation of genetic findings to disease pathobiology and therapeutics discovery, we present an ensemble deep learning framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that predicts protein-binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms to generate comprehensive structurally informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods and experimentally validate its predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces and explore their impact on disease prognosis and drug responses. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from analysis of approximately 11,000 whole exomes across 33 cancer types and show significant associations of oncoPPIs with patient survival and drug responses. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.
Grants
- R01GM124559 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01GM125639 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01GM130885 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- RM1GM139738 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01DK115398 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- U01HG007691 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- R01HL155107 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01HL155096 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01HL166137 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U54HL119145 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- AHA957729 American Heart Association (American Heart Association, Inc.)
- 24MERIT1185447 American Heart Association (American Heart Association, Inc.)
- R01AG084250 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- R56AG074001 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- U01AG073323 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- R01AG066707 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- R01AG076448 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- R01AG082118 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- RF1AG082211 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- R21AG083003 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- RF1NS133812 U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke (NINDS)
Affiliation(s)
- Dapeng Xiong
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
| | - Yunguang Qiu
- Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Junfei Zhao
- Department of Systems Biology, Herbert Irving Comprehensive Center, Columbia University, New York, NY, USA
| | - Yadi Zhou
- Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Dongjin Lee
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Shobhita Gupta
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
- Biophysics Program, Cornell University, Ithaca, NY, USA
| | - Mateo Torres
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
| | - Weiqiang Lu
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Jin Joo Kang
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
| | - Charis Eng
- Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | - Joseph Loscalzo
- Channing Division of Network Medicine, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Feixiong Cheng
- Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA.
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA.
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA.
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY, USA.
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA.
- Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA.
|
15
|
Li L, Li H, Ishdorj TO, Zheng C, Su Y. MDNNSyn: A Multi-Modal Deep Learning Framework for Drug Synergy Prediction. IEEE J Biomed Health Inform 2024; 28:6225-6236. [PMID: 38954565 DOI: 10.1109/jbhi.2024.3421916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Synergistic drug combination prediction based on computational models has been widely studied and applied in the cancer field. However, most models only consider the interactions between drug pairs and specific cell lines, without taking into account the multiple drug-drug and cell line-cell line biological relationships that also largely affect synergistic mechanisms. To this end, we propose a multi-modal deep learning framework, termed MDNNSyn, which applies multi-source information and trains multi-modal features to infer potential synergistic drug combinations. MDNNSyn extracts topology modality features by applying a multi-layer hypergraph neural network to a drug synergy hypergraph and constructs semantic modality features through a similarity strategy. A multi-modal fusion network layer with a gated neural network is then employed for synergy score prediction. MDNNSyn is compared with five classic and state-of-the-art prediction methods on the DrugCombDB and Oncology-Screen datasets. The model achieves area under the curve (AUC) scores of 0.8682 and 0.9013 on the two datasets, an improvement of 3.70% and 2.71% over the second-best model. A case study indicates that MDNNSyn is capable of detecting potential synergistic drug combinations.
|
16
|
Yao W, Wei A, Xiao Z, Zhao W, Shen X, Jiang X, He T. An Improved Framework for Drug-Side Effect Associations Prediction via Counterfactual Inference-Based Data Augmentation. IEEE Trans Nanobioscience 2024; 23:540-547. [PMID: 39141449 DOI: 10.1109/tnb.2024.3443244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024]
Abstract
Detecting side effects of drugs is a fundamental task in drug development. With the expansion of publicly available biomedical data, researchers have proposed many computational methods for predicting drug-side effect associations (DSAs), among which network-based methods attract wide attention in the biomedical field. However, data scarcity poses a great challenge for existing DSA prediction models. Although several data augmentation methods have been proposed to address this issue, most existing methods manipulate the original networks randomly, which ignores the causality underlying the existence of DSAs and leads to poor performance on the DSA prediction task. In this paper, we propose a counterfactual inference-based data augmentation method to improve performance on this task. First, we construct a heterogeneous information network (HIN) by integrating multiple biomedical data sources. Based on community detection on the HIN, a counterfactual inference-based method is designed to derive augmented links, and an augmented HIN is obtained accordingly. Then, a meta-path-based graph neural network is applied to learn high-quality representations of drugs and side effects, from which the predicted DSAs are obtained. Finally, comprehensive experiments are conducted, and the results demonstrate the effectiveness of the proposed counterfactual inference-based data augmentation for DSA prediction.
|
17
|
Hu X, Yi H, Cheng H, Zhao Y, Zhang D, Li J, Ruan J, Zhang J, Lu X. Multiple Heterogeneous Networks Representation With Latent Space for Synthetic Lethality Prediction. IEEE Trans Nanobioscience 2024; 23:564-571. [PMID: 39150817 DOI: 10.1109/tnb.2024.3444922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2024]
Abstract
Computational synthetic lethality (SL) methods have become a promising strategy to identify SL gene pairs for targeted cancer therapy and cancer medicine development. Feature representation for integrating various biological networks is crucial to improve the identification performance. However, previous feature representation methods, such as matrix factorization and graph neural networks, project gene features onto latent variables while preserving a specific geometric metric. There is a lack of models of the gene representational latent space that consider correlations across multiple dimensionalities and preserve latent geometric structures in both sample and feature spaces. Therefore, we propose a novel method to model the gene Latent Space using matrix Tri-Factorization (LSTF) to obtain gene representations with embedding variables resulting from the potential interpretation of synthetic lethality. Meanwhile, manifold subspace regularization is applied to the tri-factorization to capture the geometrical manifold structure in the latent space with gene PPI functional and GO semantic embeddings. Then, SL gene pairs are identified by reconstructing the associations from the gene representations in the latent space. The experimental results illustrate that LSTF is superior to other state-of-the-art methods. A case study demonstrates the effectiveness of the predicted SL associations.
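For orientation, matrix tri-factorization with manifold (graph Laplacian) regularization is commonly written in the generic form below. This is a textbook-style sketch and not the LSTF objective, which the abstract does not state. The symbols are all assumptions introduced for illustration: $X$ is a symmetric gene-by-gene matrix of known SL associations, $F$ holds the gene latent factors, $S$ is a latent interaction matrix, $L_{\mathrm{PPI}}$ and $L_{\mathrm{GO}}$ are Laplacians of similarity graphs built from PPI and GO information, and $\alpha, \beta$ are regularization weights. The non-negativity constraints are also an assumption.

```latex
\min_{F \ge 0,\; S \ge 0}\;
\left\lVert X - F S F^{\top} \right\rVert_F^{2}
+ \alpha \,\operatorname{tr}\!\left(F^{\top} L_{\mathrm{PPI}} F\right)
+ \beta \,\operatorname{tr}\!\left(F^{\top} L_{\mathrm{GO}} F\right),
\qquad
\hat{X} = F S F^{\top}
```

Under this reading, a candidate SL pair $(i, j)$ is scored by the reconstructed entry $\hat{X}_{ij}$, and each trace term penalizes latent vectors that differ between genes that are close in the corresponding similarity graph.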
|
18
|
Gervas-Arruga J, Barba-Romero MÁ, Fernández-Martín JJ, Gómez-Cerezo JF, Segú-Vergés C, Ronzoni G, Cebolla JJ. In Silico Modeling of Fabry Disease Pathophysiology for the Identification of Early Cellular Damage Biomarker Candidates. Int J Mol Sci 2024; 25:10329. [PMID: 39408658 PMCID: PMC11477023 DOI: 10.3390/ijms251910329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Revised: 09/19/2024] [Accepted: 09/24/2024] [Indexed: 10/20/2024] Open
Abstract
Fabry disease (FD) is an X-linked lysosomal disease whose ultimate consequences are the accumulation of sphingolipids and subsequent inflammatory events, mainly at the endothelial level. The outcomes include different nervous system manifestations as well as multiple organ damage. Despite the availability of known biomarkers, early detection of FD remains a medical need. This study aimed to develop an in silico model based on machine learning to identify candidate vascular and nervous system proteins for early FD damage detection at the cellular level. A combined systems biology and machine learning approach was carried out considering molecular characteristics of FD to create a computational model of vascular and nervous system disease. A data science strategy was applied to identify risk classifiers by using 10 K-fold cross-validation. Further biological and clinical criteria were used to prioritize the most promising candidates, resulting in the identification of 36 biomarker candidates with classifier abilities, which are easily measurable in body fluids. Among them, we propose four candidates, CAMK2A, ILK, LMNA, and KHSRP, which have high classification capabilities according to our models (cross-validated accuracy ≥ 90%) and are related to the vascular and nervous systems. These biomarkers show promise as high-risk cellular and tissue damage indicators that are potentially applicable in clinical settings, although in vivo validation is still needed.
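The cross-validated accuracy mentioned above comes from a K-fold procedure, read here as K = 10, which is an interpretation of the abstract's wording. The Python sketch below shows the splitting and averaging logic only. The majority-class "classifier", the toy labels, and the function names are hypothetical stand-ins, and the folds are contiguous with no shuffling or stratification, which a real analysis would normally add.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validated_accuracy(features, labels, fit, predict, k=10):
    """Average the held-out accuracy over k folds."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# Hypothetical baseline: always predict the most frequent training label.
def fit_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(model, feature):
    return model

features = list(range(20))      # placeholder features, unused by the baseline
labels = [1] * 15 + [0] * 5     # toy labels, not data from the study
mean_accuracy = cross_validated_accuracy(
    features, labels, fit_majority, predict_majority, k=10
)
print(mean_accuracy)
```

Each sample is held out exactly once, so the averaged score estimates performance on data the model has not seen during fitting.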
Affiliation(s)
| | - Miguel Ángel Barba-Romero
- Department of Internal Medicine, Albacete University Hospital, 02006 Albacete, Spain;
- Albacete Medical School, Castilla-La Mancha University, 02006 Albacete, Spain
| | | | - Jorge Francisco Gómez-Cerezo
- Department of Internal Medicine, Infanta Sofía University Hospital, 28702 Madrid, Spain;
- Faculty of Medicine, European University of Madrid, 28670 Madrid, Spain
|
19
|
Mazein I, Rougny A, Mazein A, Henkel R, Gütebier L, Michaelis L, Ostaszewski M, Schneider R, Satagopam V, Jensen LJ, Waltemath D, Wodke JAH, Balaur I. Graph databases in systems biology: a systematic review. Brief Bioinform 2024; 25:bbae561. [PMID: 39565895 PMCID: PMC11578065 DOI: 10.1093/bib/bbae561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/28/2024] [Accepted: 10/21/2024] [Indexed: 11/22/2024] Open
Abstract
Graph databases are becoming increasingly popular across scientific disciplines, being highly suitable for storing and connecting complex heterogeneous data. In systems biology, they are used as a backend solution for biological data repositories, ontologies, networks, pathways, and knowledge graph databases. In this review, we analyse all publications using or mentioning graph databases retrieved from PubMed and PubMed Central full-text search, focusing on the top 16 available graph databases. Publications are categorized according to their domain and application, focusing on pathway and network biology and relevant ontologies and tools. We detail different approaches and highlight the advantages of outstanding resources, such as UniProtKB, Disease Ontology, and Reactome, which provide graph-based solutions. We discuss ongoing efforts of the systems biology community to standardize and harmonize knowledge graph creation and the maintenance of integrated resources. Outlining prospects, including the use of graph databases as a way of communication between biological data repositories, we conclude that efficient design, querying, and maintenance of graph databases will be key for knowledge generation in systems biology and other research fields with heterogeneous data.
Affiliation(s)
- Ilya Mazein
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Adrien Rougny
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Alexander Mazein
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Ron Henkel
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Lea Gütebier
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Lea Michaelis
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Marek Ostaszewski
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Venkata Satagopam
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Lars Juhl Jensen
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 15, 1870 Frederiksberg C, Denmark
| | - Dagmar Waltemath
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Judith A H Wodke
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Irina Balaur
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
|
20
|
Zhao MX, Ding RF, Chen Q, Meng J, Li F, Fu S, Huang B, Liu Y, Ji ZL, Zhao Y. Nphos: Database and Predictor of Protein N-phosphorylation. Genomics, Proteomics & Bioinformatics 2024; 22:qzae032. [PMID: 39380205 DOI: 10.1093/gpbjnl/qzae032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 03/03/2024] [Accepted: 04/01/2024] [Indexed: 10/10/2024]
Abstract
Protein N-phosphorylation is widely present in nature and participates in various biological processes. However, current knowledge of N-phosphorylation is extremely limited compared to that of O-phosphorylation. In this study, we collected 11,710 experimentally verified N-phosphosites of 7344 proteins from 39 species and subsequently constructed the database Nphos to share up-to-date information on protein N-phosphorylation. Based on these substantial data, we characterized the sequential and structural features of protein N-phosphorylation. Moreover, after comparing hundreds of learning models, we chose and optimized gradient boosting decision tree (GBDT) models to predict three types of human N-phosphorylation, achieving mean area under the receiver operating characteristic curve (AUC) values of 90.56%, 91.24%, and 92.01% for pHis, pLys, and pArg, respectively. Meanwhile, we discovered 488,825 distinct N-phosphosites in the human proteome. The models were also deployed in Nphos for interactive N-phosphosite prediction. In summary, this work provides new insights and points for both flexible and focused investigations of N-phosphorylation. It will also facilitate a deeper and more systematic understanding of protein N-phosphorylation modification by providing a data and technical foundation. Nphos is freely available at http://www.bio-add.org/Nphos/ and http://ppodd.org.cn/Nphos/.
Affiliation(s)
- Ming-Xiao Zhao
- Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Department of Chemical Biology, Key Laboratory for Chemical Biology of Fujian Province, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Ruo-Fan Ding
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361102, China
| | - Qiang Chen
- Zhejiang Key Laboratory of Pathophysiology, Department of Biochemistry and Molecular Biology, Health Science Center, Ningbo University, Ningbo 315211, China
| | - Junhua Meng
- BGI Genomics, BGI-Shenzhen, Shenzhen 518083, China
| | - Fulai Li
- Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
| | - Songsen Fu
- Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
| | - Biling Huang
- Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
| | - Yan Liu
- Department of Chemical Biology, Key Laboratory for Chemical Biology of Fujian Province, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Zhi-Liang Ji
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361102, China
| | - Yufen Zhao
- Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Department of Chemical Biology, Key Laboratory for Chemical Biology of Fujian Province, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Key Laboratory of Bioorganic Phosphorus Chemistry & Chemical Biology, Department of Chemistry, Tsinghua University, Beijing 100084, China
|
21
|
Cingiz MÖ. Ensemble decision of local similarity indices on the biological network for disease related gene prediction. PeerJ 2024; 12:e17975. [PMID: 39247551 PMCID: PMC11380840 DOI: 10.7717/peerj.17975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/05/2024] [Indexed: 09/10/2024] Open
Abstract
Link prediction (LP) is the task of identifying potential, missing, and spurious links in complex networks. Protein-protein interaction (PPI) networks are important for understanding the underlying biological mechanisms of diseases. Many complex networks have been constructed using LP methods; however, only a limited number of studies focus on disease-related gene prediction and evaluate the predicted genes using various evaluation criteria. The main objective of the study is to investigate the effect of a simple ensemble method on disease-related gene prediction. Disease-related gene predictions based on local similarity indices (LSIs) were integrated by a simple ensemble decision method, simple majority voting (SMV), on the PPI network to detect disease-related genes accurately. A human PPI network was used to discover potential disease-related genes with four LSIs. The LSIs discovered potential links between disease-related genes, which were obtained from the OMIM database for gastric, colorectal, breast, prostate, and lung cancers. LSI-based disease-related genes were ranked by their LSI scores in descending order to retrieve the top 10, 50, and 100 disease-related genes. SMV integrated the four LSI-based predictions to obtain the SMV-based top 10, 50, and 100 disease-related genes. The performance of the LSI-based and SMV-based genes was evaluated separately through overlap analyses, which were performed with the GeneCard disease-gene relation dataset and Gene Ontology (GO) terms. The GO terms were used for the biological assessment of the gene lists inferred by the LSIs and SMV on all cancer types. The Adamic-Adar (AA), Resource Allocation Index (RAI), and SMV-based gene lists generally achieved good performance on all cancers in both overlap analyses. SMV also outperformed the individual LSIs on breast cancer data. Increasing the number of top-ranked disease-related genes selected also improved the performance of SMV.
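The pipeline described in this abstract (score candidate genes with several local similarity indices, take the top k from each, then keep the genes that a majority of indices agree on) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract names only Adamic-Adar and the Resource Allocation Index, so common neighbours and Jaccard stand in for the other two indices; scoring a candidate by summing its similarity to all seed genes is one plausible choice, not necessarily the paper's; and the toy network and gene names are hypothetical.

```python
import math
from collections import Counter


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    return float(len(adj[u] & adj[v]))


def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    # Shared neighbours with few links of their own count more.
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[u] & adj[v] if len(adj[z]) > 1)


def resource_allocation(adj, u, v):
    return sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])


INDICES = (common_neighbors, jaccard, adamic_adar, resource_allocation)


def top_k_candidates(adj, seeds, index, k):
    """Rank non-seed nodes by summed similarity to the seed genes (assumed scoring)."""
    scores = {n: sum(index(adj, s, n) for s in seeds)
              for n in adj if n not in seeds}
    ranked = sorted((n for n in scores if scores[n] > 0),
                    key=lambda n: (-scores[n], n))
    return ranked[:k]


def ensemble_top_k(adj, seeds, k, indices=INDICES):
    """Simple majority voting: keep genes listed by more than half of the indices."""
    votes = Counter()
    for index in indices:
        votes.update(top_k_candidates(adj, seeds, index, k))
    kept = [g for g, v in votes.items() if v > len(indices) / 2]
    return sorted(kept, key=lambda g: (-votes[g], g))


# Hypothetical toy network; A and B play the role of known disease genes.
edges = [("A", "X"), ("B", "X"), ("A", "Y"), ("B", "Y"),
         ("Y", "C"), ("A", "Z"), ("Z", "D"), ("C", "D")]
adj = build_adjacency(edges)
seeds = {"A", "B"}
print(ensemble_top_k(adj, seeds, k=2))
```

On real data the ensemble list would then be compared against an independent disease-gene resource, as the overlap analyses in the abstract do.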
Affiliation(s)
- Mustafa Özgür Cingiz
  - Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Bursa Technical University, Bursa, Turkey
|
22
|
Li Z, Zhang Y, Zhou P. Temporal Protein Complex Identification Based on Dynamic Heterogeneous Protein Information Network Representation Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1154-1164. [PMID: 38190662 DOI: 10.1109/tcbb.2024.3351078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2024]
Abstract
Protein complexes, as the fundamental units of cellular function and regulation, play a crucial role in understanding the normal physiological functions of cells. Existing methods for protein complex identification attempt to introduce other biological information on top of the protein-protein interaction (PPI) network to assist in evaluating the degree of association between proteins. However, these methods usually treat protein interaction networks as flat, homogeneous, static networks. They cannot distinguish the roles and importance of different types of biological information, nor can they reflect the dynamic changes of protein complexes. In recent years, heterogeneous network representation learning has achieved great success in processing complex heterogeneous information and mining deep semantics. We thus propose a temporal protein complex identification method based on Dynamic Heterogeneous Protein information network Representation Learning, DHPRL. DHPRL naturally integrates multiple types of heterogeneous biological information in the cellular temporal dimension. It simultaneously models the temporal dynamic properties of proteins and the heterogeneity of biological information to improve the understanding of protein interactions and the accuracy of complex prediction. Firstly, we construct a Dynamic Heterogeneous Protein Information Network (DHPIN) by integrating temporal gene expression information and GO attribute information. Then we design a dual-view collaborative contrast mechanism. Specifically, we propose to learn protein representations from two views of DHPIN (the 1-hop relation view and the meta-path view) to model the consistency and specificity between nearest-neighbour biological information and deeper biological semantics. The dynamic PPI network is thereafter re-weighted based on the learned protein representations. Finally, we perform protein complex identification on the re-weighted dynamic PPI network. Extensive experimental results demonstrate that DHPRL can effectively model complicated biological information and achieve state-of-the-art performance in most cases.
|
23
|
Zhang B, Niu D, Zhang L, Zhang Q, Li Z. MSH-DTI: multi-graph convolution with self-supervised embedding and heterogeneous aggregation for drug-target interaction prediction. BMC Bioinformatics 2024; 25:275. [PMID: 39179993 PMCID: PMC11342675 DOI: 10.1186/s12859-024-05904-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 08/16/2024] [Indexed: 08/26/2024] Open
Abstract
BACKGROUND The rise of network pharmacology has led to the widespread use of network-based computational methods in predicting drug target interaction (DTI). However, existing DTI prediction models typically rely on a limited amount of data to extract drug and target features, potentially affecting the comprehensiveness and robustness of features. In addition, although multiple networks are used for DTI prediction, the integration of heterogeneous information often involves simplistic aggregation and attention mechanisms, which may impose certain limitations. RESULTS MSH-DTI, a deep learning model for predicting drug-target interactions, is proposed in this paper. The model uses self-supervised learning methods to obtain drug and target structure features. A Heterogeneous Interaction-enhanced Feature Fusion Module is designed for multi-graph construction, and the graph convolutional networks are used to extract node features. With the help of an attention mechanism, the model focuses on the important parts of different features for prediction. Experimental results show that the AUROC and AUPR of MSH-DTI are 0.9620 and 0.9605 respectively, outperforming other models on the DTINet dataset. CONCLUSION The proposed MSH-DTI is a helpful tool to discover drug-target interactions, which is also validated through case studies in predicting new DTIs.
Affiliation(s)
- Beiyi Zhang
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Dongjiang Niu
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Lianwei Zhang
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Qiang Zhang
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
- Zhen Li
  - College of Computer Science and Technology, Qingdao University, Ningxia Road, Qingdao, 266071, Shandong, China
|
24
|
Cebolla JJ, Giraldo P, Gómez J, Montoto C, Gervas-Arruga J. Machine Learning-Driven Biomarker Discovery for Skeletal Complications in Type 1 Gaucher Disease Patients. Int J Mol Sci 2024; 25:8586. [PMID: 39201273 PMCID: PMC11354847 DOI: 10.3390/ijms25168586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 09/02/2024] Open
Abstract
Type 1 Gaucher disease (GD1) is a rare, autosomal recessive disorder caused by glucocerebrosidase deficiency. Skeletal manifestations represent one of the most debilitating and potentially irreversible complications of GD1. Although imaging studies are the gold standard, early diagnostic/prognostic tools, such as molecular biomarkers, are needed for the rapid management of skeletal complications. This study aimed to identify potential protein biomarkers capable of predicting the early diagnosis of bone skeletal complications in GD1 patients using artificial intelligence. An in silico study was performed using the novel Therapeutic Performance Mapping System methodology to construct mathematical models of GD1-associated complications at the protein level. Pathophysiological characterization was performed before modeling, and a data science strategy was applied to the predicted protein activity for each protein in the models to identify classifiers. Statistical criteria were used to prioritize the most promising candidates, and 18 candidates were identified. Among them, PDGFB, IL1R2, PTH and CCL3 (MIP-1α) were highlighted due to their ease of measurement in blood. This study proposes a validated novel tool to discover new protein biomarkers to support clinician decision-making in an area where medical needs have not yet been met. However, confirming the results using in vitro and/or in vivo studies is necessary.
Affiliation(s)
- Pilar Giraldo
  - FEETEG, 50006 Zaragoza, Spain
  - Hospital QuirónSalud Zaragoza, 50012 Zaragoza, Spain
|
25
|
Gabriel GC, Ganapathiraju M, Lo CW. The Role of Cilia and the Complex Genetics of Congenital Heart Disease. Annu Rev Genomics Hum Genet 2024; 25:309-327. [PMID: 38724024 DOI: 10.1146/annurev-genom-121222-105345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2024]
Abstract
Congenital heart disease (CHD) can affect up to 1% of live births, and despite abundant evidence of a genetic etiology, the genetic landscape of CHD is still not well understood. A large-scale mouse chemical mutagenesis screen for mutations causing CHD yielded a preponderance of cilia-related genes, pointing to a central role for cilia in CHD pathogenesis. The genes uncovered by the screen included genes that regulate ciliogenesis and cilia-transduced cell signaling as well as many that mediate endocytic trafficking, a cell process critical for both ciliogenesis and cell signaling. The clinical relevance of these findings is supported by whole-exome sequencing analysis of CHD patients that showed enrichment for pathogenic variants in ciliome genes. Surprisingly, among the ciliome CHD genes recovered were many that encoded direct protein-protein interactors. Assembly of the CHD genes into a protein-protein interaction network yielded a tight interactome that suggested this protein-protein interaction may have functional importance and that its disruption could contribute to the pathogenesis of CHD. In light of these and other findings, we propose that an interactome enriched for ciliome genes may provide the genomic context for the complex genetics of CHD and its often-observed incomplete penetrance and variable expressivity.
Affiliation(s)
- George C Gabriel
  - Department of Developmental Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Madhavi Ganapathiraju
  - Carnegie Mellon University in Qatar, Doha, Qatar
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Cecilia W Lo
  - Department of Developmental Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
|
26
|
Teimouri H, Medvedeva A, Kolomeisky AB. Unraveling the role of physicochemical differences in predicting protein-protein interactions. J Chem Phys 2024; 161:045102. [PMID: 39051836 DOI: 10.1063/5.0219501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 07/09/2024] [Indexed: 07/27/2024] Open
Abstract
The ability to accurately predict protein-protein interactions is critically important for understanding major cellular processes. However, current experimental and computational approaches for identifying them are technically very challenging and still have limited success. We propose a new computational method for predicting protein-protein interactions using only primary sequence information. It utilizes the concept of physicochemical similarity to determine which interactions will most likely occur. In our approach, the physicochemical features of proteins are extracted using bioinformatics tools for different organisms. Then they are utilized in a machine-learning method to identify successful protein-protein interactions via correlation analysis. It was found that the most important property that correlates most with the protein-protein interactions for all studied organisms is dipeptide amino acid composition (the frequency of specific amino acid pairs in a protein sequence). While current approaches often overlook the specificity of protein-protein interactions with different organisms, our method yields context-specific features that determine protein-protein interactions. The analysis is specifically applied to the bacterial two-component system that includes histidine kinase and transcriptional response regulators, as well as to the barnase-barstar complex, demonstrating the method's versatility across different biological systems. Our approach can be applied to predict protein-protein interactions in any biological system, providing an important tool for investigating complex biological processes' mechanisms.
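The dipeptide amino acid composition that this abstract singles out is defined there as the frequency of specific amino acid pairs in a protein sequence. A minimal sketch of that feature follows; the function name, the restriction to the 20 standard residues, and the choice to normalise by the number of valid adjacent pairs are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def dipeptide_composition(sequence):
    """Return a 400-entry dict mapping each residue pair to its relative frequency.

    Adjacent pairs containing a non-standard character are skipped (assumption).
    """
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    return {a + b: (counts[a + b] / total if total else 0.0)
            for a, b in product(AMINO_ACIDS, repeat=2)}


# Hypothetical toy sequence: overlapping pairs are AC, CA, AC.
composition = dipeptide_composition("ACAC")
print(composition["AC"], composition["CA"])
```

Each protein then becomes a fixed-length vector of 400 frequencies, which is the kind of input a correlation analysis or classifier can consume regardless of sequence length.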
Affiliation(s)
- Hamid Teimouri
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
- Angela Medvedeva
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
- Anatoly B Kolomeisky
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
|
27
|
Zhu Y, Ning C, Zhang N, Wang M, Zhang Y. GSRF-DTI: a framework for drug-target interaction prediction based on a drug-target pair network and representation learning on a large graph. BMC Biol 2024; 22:156. [PMID: 39020316 PMCID: PMC11256582 DOI: 10.1186/s12915-024-01949-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 07/01/2024] [Indexed: 07/19/2024] Open
Abstract
BACKGROUND Identification of potential drug-target interactions (DTIs) with high accuracy is a key step in drug discovery and repositioning, especially concerning specific drug targets. Traditional experimental methods for identifying the DTIs are arduous, time-intensive, and financially burdensome. In addition, robust computational methods have been developed for predicting the DTIs and are widely applied in drug discovery research. However, advancing more precise algorithms for predicting DTIs is essential to meet the stringent standards demanded by drug discovery. RESULTS We proposed a novel method called GSRF-DTI, which integrates networks with a deep learning algorithm to identify DTIs. Firstly, GSRF-DTI learned the embedding representation of drugs and targets by integrating multiple drug association information and target association information, respectively. Then, GSRF-DTI considered the influence of drug-target pair (DTP) association on DTI prediction to construct a drug-target pair network (DTP-NET). Next, we utilized GraphSAGE on DTP-NET to learn the potential features of the network and applied random forest (RF) to predict the DTIs. Furthermore, we conducted ablation experiments to validate the necessity of integrating different types of network features for identifying DTIs. It is worth noting that GSRF-DTI proposed three novel DTIs. CONCLUSIONS GSRF-DTI not only considered the influence of the interaction relationship between drug and target but also considered the impact of DTP association relationship on DTI prediction. We initially use GraphSAGE to aggregate the neighbor information of nodes for better identification. Experimental analysis on Luo's dataset and the newly constructed dataset revealed that the GSRF-DTI framework outperformed several state-of-the-art methods significantly.
Affiliation(s)
- Yongdi Zhu
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Chunhui Ning
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Mingyi Wang
  - Department of Central Lab, Weihai Municipal Hospital, Weihai, Shandong, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
|
28
|
Abu-Bakar A, Ismail M, Zulkifli MZI, Zaini NAS, Shukor NIA, Harun S, Inayat-Hussain SH. Mapping the influence of hydrocarbons mixture on molecular mechanisms, involved in breast and lung neoplasms: in silico toxicogenomic data-mining. Genes Environ 2024; 46:15. [PMID: 38982523 PMCID: PMC11232146 DOI: 10.1186/s41021-024-00310-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 06/07/2024] [Indexed: 07/11/2024] Open
Abstract
BACKGROUND Exposure to chemical mixtures inherent in air pollution has been shown to be associated with the risk of breast and lung cancers. However, studies on the molecular mechanisms of exposure to a mixture of these pollutants, such as hydrocarbons, in the development of breast and lung cancers are scarce. We utilized in silico toxicogenomic analysis to elucidate the molecular pathways linked to both cancers that are influenced by exposure to a mixture of selected hydrocarbons. The Comparative Toxicogenomics Database and Cytoscape software were used for data mining and visualization. RESULTS Twenty-five hydrocarbons, common in air pollution with a carcinogenicity classification of 1A/B or 2 (known/presumed or suspected human carcinogen), were divided into three groups: alkanes and alkenes, halogenated hydrocarbons, and polyaromatic hydrocarbons. The in silico data mining revealed that 87 and 44 genes that commonly interacted with most of the investigated hydrocarbons are linked to breast and lung cancer, respectively. The dominant interactions among the common genes are co-expression, physical interaction, genetic interaction, co-localization, and interaction in shared protein domains. Among these genes, only 16 are common in the development of both cancers. Benzo(a)pyrene and tetrachlorodibenzodioxin interacted with all 16 genes. The molecular pathways potentially affected by the investigated hydrocarbons include aryl hydrocarbon receptor, chemical carcinogenesis, ferroptosis, fluid shear stress and atherosclerosis, interleukin 17 signaling pathway, lipid and atherosclerosis, NRF2 pathway, and oxidative stress response. CONCLUSIONS Within the inherent limitations of in silico toxicogenomics tools, we elucidated the molecular pathways associated with breast and lung cancer development that are potentially affected by a hydrocarbon mixture. Our findings indicate that adaptive responses to oxidative stress and inflammatory damage are instrumental in the development of both cancers. Additionally, ferroptosis, a non-apoptotic programmed cell death driven by lipid peroxidation and iron homeostasis, was identified as a new player in these responses. Finally, the potential involvement of AHR in modulating IL-8, a critical gene that mediates breast cancer invasion and metastasis to the lungs, was also highlighted. A deeper understanding of the interplay between genes associated with these pathways, and other survival signaling pathways identified in this study, will provide invaluable knowledge for assessing the risk of inhalation exposure to hydrocarbon mixtures. The findings offer insights for future in vivo and in vitro laboratory investigations that focus on inhalation exposure to hydrocarbon mixtures.
Affiliation(s)
- A'edah Abu-Bakar
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- Maihani Ismail
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- M Zaqrul Ieman Zulkifli
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- Nur Aini Sofiyya Zaini
  - Product Stewardship and Toxicology, Environment, Social Performance & Product Stewardship (ESPPS), Group Health, Safety and Environment (GHSE), Petroliam Nasional Berhad (PETRONAS), Kuala Lumpur, 50088, Malaysia
- Nur Izzah Abd Shukor
  - Health, Safety and Environment (HSE), KLCC Urusharta, Kuala Lumpur, 50088, Malaysia
- Sarahani Harun
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Selangor, 43600 UKM, Malaysia
- Salmaan Hussain Inayat-Hussain
  - ESPPS, GHSE, PETRONAS, Kuala Lumpur, 50088, Malaysia
  - Department of Environmental Health Sciences, Yale School of Public Health, Yale University, 60 College St, New Haven, CT, 06250, USA
|
29
|
Rawal O, Turhan B, Peradejordi IF, Chandrasekar S, Kalayci S, Gnjatic S, Johnson J, Bouhaddou M, Gümüş ZH. PhosNetVis: a web-based tool for fast kinase-substrate enrichment analysis and interactive 2D/3D network visualizations of phosphoproteomics data. arXiv 2024:arXiv:2402.05016v3. [PMID: 39010877 PMCID: PMC11247916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
Protein phosphorylation involves the reversible modification of a protein (substrate) residue by another protein (kinase). Liquid chromatography-mass spectrometry studies are rapidly generating massive protein phosphorylation datasets across multiple conditions. Researchers must then infer the kinases responsible for changes in the phosphosites of each substrate. However, tools that infer kinase-substrate interactions (KSIs) are not optimized to interactively explore the resulting large and complex networks, significant phosphosites, and states. There is thus an unmet need for a tool that facilitates user-friendly analysis, interactive exploration, visualization, and communication of phosphoproteomics datasets. We present PhosNetVis, a web-based tool for researchers of all computational skill levels to easily infer, generate and interactively explore KSI networks in 2D or 3D by streamlining phosphoproteomics data analysis steps within a single tool. PhosNetVis lowers barriers for researchers in rapidly generating high-quality visualizations to gain biological insights from their phosphoproteomics datasets. It is available at: https://gumuslab.github.io/PhosNetVis/.
Affiliation(s)
- Osho Rawal
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Berk Turhan
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Türkiye
- Irene Font Peradejordi
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Cornell Tech, Cornell University, New York, NY, USA
- Shreya Chandrasekar
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Cornell Tech, Cornell University, New York, NY, USA
- Selim Kalayci
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Sacha Gnjatic
  - Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Immunology and Immunotherapy, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jeffrey Johnson
  - Department of Immunology and Immunotherapy, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mehdi Bouhaddou
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
  - Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles; Los Angeles, CA, USA
- Zeynep H. Gümüş
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
30
|
Piersma SR, Valles-Marti A, Rolfs F, Pham TV, Henneman AA, Jiménez CR. Inferring kinase activity from phosphoproteomic data: Tool comparison and recent applications. Mass Spectrometry Reviews 2024; 43:725-751. [PMID: 36156810 DOI: 10.1002/mas.21808] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/16/2023]
Abstract
Aberrant cellular signaling pathways are a hallmark of cancer and other diseases. One of the most important signaling mechanisms involves protein phosphorylation/dephosphorylation. Protein phosphorylation is catalyzed by protein kinases, and over 530 protein kinases have been identified in the human genome. Aberrant kinase activity is one of the drivers of tumorigenesis and cancer progression and results in altered phosphorylation abundance of downstream substrates. Upstream kinase activity can be inferred from the global collection of phosphorylated substrates. Mass spectrometry-based phosphoproteomic experiments nowadays routinely allow identification and quantitation of >10k phosphosites per biological sample. This substrate phosphorylation footprint can be used to infer upstream kinase activities using tools like Kinase Substrate Enrichment Analysis (KSEA), Posttranslational Modification Substrate Enrichment Analysis (PTM-SEA), and Integrative Inferred Kinase Activity Analysis (INKA). Since the topic of kinase activity inference is very active, with many new approaches reported in the past 3 years, we would like to give an overview of the field. In this review, an inventory of kinase activity inference tools, their underlying algorithms, statistical frameworks, kinase-substrate databases, and user-friendliness is presented. The most widely used tools are compared in depth. Subsequently, recent applications of the tools are described, focusing on clinical tissues and hematological samples. Two main application areas for kinase activity inference tools can be discerned. (1) Maximal biological insights can be obtained from large data sets with group comparisons using multiple complementary tools (e.g., PTM-SEA and KSEA or INKA). (2) In the oncology context, where personalized treatment requires analysis of single samples, INKA, for example, has emerged as a tool that can prioritize actionable kinases for targeted inhibition.
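The idea of inferring upstream kinase activity from a substrate phosphorylation footprint can be sketched with a KSEA-style z-score. The sketch assumes one commonly described form of that score, z = (mean log2 fold change of a kinase's substrates minus mean of all phosphosites) times the square root of the substrate count, divided by the standard deviation over all phosphosites. The use of the sample standard deviation, the minimum-substrate cutoff, and all site and kinase names are hypothetical choices for illustration and are not taken from the review.

```python
import math
import statistics


def ksea_z_scores(log2fc, kinase_substrates, min_substrates=2):
    """KSEA-style z-score per kinase from phosphosite log2 fold changes.

    log2fc: dict mapping phosphosite id -> log2 fold change.
    kinase_substrates: dict mapping kinase -> list of phosphosite ids (assumed annotation).
    """
    all_values = list(log2fc.values())
    background_mean = statistics.mean(all_values)
    background_sd = statistics.stdev(all_values)  # sample SD (assumption)
    if background_sd == 0:
        return {}
    scores = {}
    for kinase, sites in kinase_substrates.items():
        values = [log2fc[s] for s in sites if s in log2fc]
        if len(values) < min_substrates:
            continue  # too few quantified substrates to score
        shift = statistics.mean(values) - background_mean
        scores[kinase] = shift * math.sqrt(len(values)) / background_sd
    return scores


# Hypothetical toy data: six phosphosites and three kinases.
log2fc = {"s1": 2.0, "s2": 2.0, "s3": 0.0, "s4": 0.0, "s5": -2.0, "s6": -2.0}
kinase_substrates = {"K1": ["s1", "s2"], "K2": ["s5", "s6"], "K3": ["s3"]}
z = ksea_z_scores(log2fc, kinase_substrates)
print(z)
```

A positive score suggests that a kinase's known substrates are more phosphorylated than the background, and a negative score the opposite; the quality of the kinase-substrate annotation largely determines how far such scores can be trusted.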
Affiliation(s)
- Sander R Piersma
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Andrea Valles-Marti
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Frank Rolfs
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Thang V Pham
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Alex A Henneman
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Connie R Jiménez
  - OncoProteomics Laboratory Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
|
31
|
Menor-Flores M, Vega-Rodríguez MA. A protein-protein interaction network aligner study in the multi-objective domain. Computer Methods and Programs in Biomedicine 2024; 250:108188. [PMID: 38657382 DOI: 10.1016/j.cmpb.2024.108188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 04/14/2024] [Accepted: 04/17/2024] [Indexed: 04/26/2024]
Abstract
BACKGROUND AND OBJECTIVE Protein-protein interaction (PPI) network alignment has proven to be an efficient technique in the diagnosis and prevention of certain diseases. However, the difficulty of simultaneously maximizing the two qualities that measure the goodness of alignments (topological and biological quality) has led aligners to produce very different alignments, which makes a comparative study among alignments of such different qualities a big challenge. Multi-objective optimization is a computational method that is very powerful in this kind of context because both conflicting qualities are considered together. Analysing the alignments of each PPI network aligner with multi-objective methodologies gives a bigger picture of the alignments and their qualities and leads to very interesting conclusions. This paper proposes a comprehensive PPI network aligner study in the multi-objective domain. METHODS Alignments from each aligner and from all aligners together were studied and compared to each other via Pareto dominance methodologies. The best alignments produced by each aligner and by all aligners together for five different alignment scenarios were displayed in Pareto front graphs. Later, the aligners were ranked according to the topological, biological, and combined quality of their alignments. Finally, the aligners were also ranked based on their average runtimes. RESULTS Regarding aligners constructing the best overall alignments, we found that SAlign, BEAMS, SANA, and HubAlign are the best options. Additionally, the alignments of best topological quality are produced by the SANA, SAlign, and HubAlign aligners. In contrast, the aligners returning the alignments of best biological quality are BEAMS, TAME, and WAVE. However, if there are time constraints, it is recommended to select SAlign to obtain alignments of high topological quality and the PISwap or SAlign aligners for alignments of high biological quality. CONCLUSIONS The use of the SANA aligner is recommended for obtaining the alignments of best topological quality, BEAMS for alignments of the best biological quality, and SAlign for alignments of the best combined topological and biological quality. However, SANA and BEAMS have above-average runtimes. Therefore, if time restrictions make it necessary, it is suggested to choose other, faster aligners such as SAlign or PISwap, whose alignments are also of high quality.
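The Pareto dominance comparison described in the METHODS can be sketched as follows, assuming each alignment is summarised by a (topological quality, biological quality) pair in which higher is better for both. The function names, the alignment labels, and the scores are hypothetical and carry no relation to the results reported in the study.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(alignments):
    """Return the names of alignments that no other alignment dominates.

    alignments: dict mapping name -> (topological_quality, biological_quality).
    """
    front = []
    for name, quality in alignments.items():
        dominated = any(dominates(other, quality)
                        for other_name, other in alignments.items()
                        if other_name != name)
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical scores for four alignments produced by unnamed aligners.
alignments = {
    "alignment_1": (0.9, 0.2),
    "alignment_2": (0.5, 0.5),
    "alignment_3": (0.2, 0.9),
    "alignment_4": (0.4, 0.4),  # worse than alignment_2 on both qualities
}
print(pareto_front(alignments))
```

Alignments on the front represent different trade-offs between the two qualities, which is why a multi-objective view can rank aligners without collapsing both qualities into a single number.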
Affiliation(s)
- Manuel Menor-Flores
- Escuela Politécnica, Universidad de Extremadura, Campus Universitario s/n, 10003 Cáceres, Spain.
| | - Miguel A Vega-Rodríguez
- Escuela Politécnica, Universidad de Extremadura, Campus Universitario s/n, 10003 Cáceres, Spain.
| |
|
32
|
Karunakaran KB, Ganapathiraju MK. Malignant peritoneal mesothelioma interactome with 417 novel protein-protein interactions. BJC REPORTS 2024; 2:42. [PMID: 39516360 PMCID: PMC11524009 DOI: 10.1038/s44276-024-00062-w] [Received: 06/28/2023] [Revised: 04/11/2024] [Accepted: 04/16/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND Malignant peritoneal mesothelioma (MPeM) is an aggressive cancer affecting the abdominal peritoneal lining and intra-abdominal organs, with a median survival of ~2.5 years. METHODS We constructed the protein interactome of 59 MPeM-associated genes with previously known protein-protein interactions (PPIs) as well as novel PPIs predicted using our previously developed HiPPIP computational model and analysed it for transcriptomic and functional associations and for repurposable drugs. RESULTS The MPeM interactome had over 400 computationally predicted PPIs and 4700 known PPIs. Transcriptomic evidence validated 75.6% of the genes in the interactome and 65% of the novel interactors. Some genes had tissue-specific expression in extramedullary hematopoietic sites and the expression of some genes could be correlated with unfavourable prognoses in various cancers. 39 out of 152 drugs that target the proteins in the interactome were identified as potentially repurposable for MPeM, with 29 having evidence from prior clinical trials, animal models or cell lines for effectiveness against peritoneal and pleural mesothelioma and primary peritoneal cancer. Functional modules related to chromosomal segregation, transcriptional dysregulation, IL-6 production and hematopoiesis were identified from the interactome. The MPeM interactome overlapped significantly with the malignant pleural mesothelioma interactome, revealing shared molecular pathways. CONCLUSIONS Our findings demonstrate the utility of the interactome in uncovering biological associations and in generating clinically translatable results.
Affiliation(s)
- Kalyani B Karunakaran
- Supercomputer Education and Research Centre, Indian Institute of Science, Bengaluru, 560012, India.
| | - Madhavi K Ganapathiraju
- Department of Biomedical Informatics, School of Medicine, and Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, 5607 Baum Blvd, 5th Floor, Pittsburgh, PA, 15206, USA.
- Carnegie Mellon University in Qatar, Doha, Qatar.
| |
|
33
|
Rojas-Rodriguez F, Schmidt MK, Canisius S. Assessing the validity of driver gene identification tools for targeted genome sequencing data. BIOINFORMATICS ADVANCES 2024; 4:vbae073. [PMID: 38808071 PMCID: PMC11132814 DOI: 10.1093/bioadv/vbae073] [Received: 10/09/2023] [Revised: 04/16/2024] [Accepted: 05/22/2024] [Indexed: 05/30/2024]
Abstract
Motivation Most cancer driver gene identification tools have been developed for whole-exome sequencing data. Targeted sequencing is a popular alternative to whole-exome sequencing for large cancer studies due to its greater depth at a lower cost per tumor. Unlike whole-exome sequencing, targeted sequencing only enables mutation calling for a selected subset of genes. Whether existing driver gene identification tools remain valid in that context has not previously been studied. Results We evaluated the validity of seven popular driver gene identification tools when applied to targeted sequencing data. Based on whole-exome data of 14 different cancer types from TCGA, we constructed matching targeted datasets by keeping only the mutations overlapping with the pan-cancer MSK-IMPACT panel and, in the case of breast cancer, also the breast-cancer-specific B-CAST panel. We then compared the driver gene predictions obtained on whole-exome and targeted mutation data for each of the seven tools. Differences in how the tools model background mutation rates were the most important determinant of their validity on targeted sequencing data. Based on our results, we recommend OncodriveFML, OncodriveCLUSTL, 20/20+, dNdSCv, and ActiveDriver for driver gene identification in targeted sequencing data, whereas MutSigCV and DriverML are best avoided in that context. Availability and implementation Code for the analyses is available at https://github.com/SchmidtGroupNKI/TGSdrivergene_validity.
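The construction of a matching targeted dataset from whole-exome mutation calls can be sketched as a simple region filter. This is only an illustration under stated assumptions: the record types, the 1-based inclusive coordinate convention, and the toy coordinates and gene names are hypothetical, and the authors' actual pipeline (available at the repository above) may differ.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A panel target region (hypothetical format, 1-based inclusive)."""
    chrom: str
    start: int
    end: int


class Mutation(NamedTuple):
    """A single called mutation (hypothetical minimal record)."""
    sample: str
    chrom: str
    pos: int
    gene: str


def in_panel(mutation: Mutation, panel: List[Region]) -> bool:
    """True if the mutation position falls inside any panel region."""
    return any(mutation.chrom == r.chrom and r.start <= mutation.pos <= r.end
               for r in panel)


def restrict_to_panel(mutations: List[Mutation],
                      panel: List[Region]) -> List[Mutation]:
    """Simulate targeted sequencing by discarding off-panel mutations."""
    return [m for m in mutations if in_panel(m, panel)]


# Made-up coordinates and gene names, for illustration only.
panel = [Region("chr1", 100, 200), Region("chr2", 50, 80)]
exome_calls = [
    Mutation("tumor_1", "chr1", 150, "GENE_A"),  # inside first region
    Mutation("tumor_1", "chr1", 250, "GENE_B"),  # off-panel
    Mutation("tumor_2", "chr2", 80, "GENE_C"),   # on region boundary (kept)
    Mutation("tumor_2", "chr3", 60, "GENE_D"),   # chromosome not on panel
]

targeted_calls = restrict_to_panel(exome_calls, panel)
print([m.gene for m in targeted_calls])
```

A linear scan over regions is adequate for a toy example; for a real panel with thousands of regions, an interval index would be the more usual choice.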
Affiliation(s)
- Felipe Rojas-Rodriguez
- Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
| | - Marjanka K Schmidt
- Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
- Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
| | - Sander Canisius
- Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
- Division of Molecular Carcinogenesis, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
| |
|
34
|
Chereda H, Leha A, Beißbarth T. Stable feature selection utilizing Graph Convolutional Neural Network and Layer-wise Relevance Propagation for biomarker discovery in breast cancer. Artif Intell Med 2024; 151:102840. [PMID: 38658129 DOI: 10.1016/j.artmed.2024.102840] [Received: 06/15/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 04/26/2024]
Abstract
High-throughput technologies are becoming increasingly important in discovering prognostic biomarkers and in identifying novel drug targets. With Mammaprint, Oncotype DX, and many other prognostic molecular signatures, breast cancer is one of the paradigmatic examples of the utility of high-throughput data to deliver prognostic biomarkers that can be represented in the form of a rather short gene list. Such gene lists can be obtained as a set of features (genes) that are important for the decisions of a Machine Learning (ML) method applied to high-dimensional gene expression data. Several studies have identified predictive gene lists for patient prognosis in breast cancer, but these lists are unstable and have only a few genes in common. Instability of feature selection impedes biological interpretability: genes that are relevant for cancer pathology should be members of any predictive gene list obtained for the same clinical type of patients. Stability and interpretability of selected features can be improved by including information on molecular networks in ML methods. Graph Convolutional Neural Network (GCNN) is a contemporary deep learning approach applicable to gene expression data structured by a prior knowledge molecular network. Layer-wise Relevance Propagation (LRP) and SHapley Additive exPlanations (SHAP) are methods to explain individual decisions of deep learning models. We used both GCNN+LRP and GCNN+SHAP techniques to construct feature sets by aggregating individual explanations. We suggest a methodology to systematically and quantitatively analyze the stability, the impact on the classification performance, and the interpretability of the selected feature sets. We used this methodology to compare GCNN+LRP to GCNN+SHAP and to more classical ML-based feature selection approaches.
Utilizing a large breast cancer gene expression dataset we show that, while feature selection with SHAP is useful in applications where selected features have to be impactful for classification performance, among all studied methods GCNN+LRP delivers the most stable (reproducible) and interpretable gene lists.
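One common way to quantify how reproducible a set of gene lists is, for example lists selected on different resamplings of the same data, is the mean pairwise Jaccard index. The sketch below shows that generic measure only; it is an assumption for illustration, not necessarily the stability measure used in the paper, and the gene identifiers are placeholders.

```python
from itertools import combinations
from typing import Iterable, List


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def mean_pairwise_jaccard(gene_lists: List[List[str]]) -> float:
    """Average Jaccard index over all unordered pairs of gene lists.

    Values near 1 indicate that repeated runs select nearly the same genes;
    values near 0 indicate unstable selection.
    """
    pairs = list(combinations(gene_lists, 2))
    if not pairs:
        raise ValueError("need at least two gene lists to assess stability")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Placeholder gene identifiers from three hypothetical resampling runs.
runs = [
    ["g1", "g2", "g3"],
    ["g2", "g3", "g4"],
    ["g1", "g2", "g3"],
]

print(round(mean_pairwise_jaccard(runs), 3))
```

For the toy input the three pairwise indices are 0.5, 1.0, and 0.5, so the mean is 2/3.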
Affiliation(s)
- Hryhorii Chereda
- Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
| | - Andreas Leha
- Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany; Scientific Core Facility Medical Biometry and Statistical Bioinformatics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany
| | - Tim Beißbarth
- Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Campus-Institute Data Science (CIDAS), University of Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany.
| |
|
35
|
Priyanka P, Gopalakrishnan AP, Nisar M, Shivamurthy PB, George M, John L, Sanjeev D, Yandigeri T, Thomas SD, Rafi A, Dagamajalu S, Velikkakath AKG, Abhinand CS, Kanekar S, Prasad TSK, Balaya RDA, Raju R. A global phosphosite-correlated network map of Thousand And One Kinase 1 (TAOK1). Int J Biochem Cell Biol 2024; 170:106558. [PMID: 38479581 DOI: 10.1016/j.biocel.2024.106558] [Received: 08/12/2023] [Revised: 02/19/2024] [Accepted: 03/09/2024] [Indexed: 03/25/2024]
Abstract
Thousand and one amino acid kinase 1 (TAOK1) is a sterile 20 family serine/threonine kinase linked to microtubule dynamics, checkpoint signaling, DNA damage response, and neurological functions. Molecular-level alterations of TAOK1 have been associated with neurodevelopmental disorders and cancers. Despite its known involvement in physiological and pathophysiological processes and its role as a core member of the hippo signaling pathway, the phosphoregulatory network of TAOK1 has not been visualized. To explore this network, we first analyzed the predominantly detected and differentially regulated TAOK1 phosphosites in global phosphoproteome datasets across diverse experimental conditions. Based on 709 qualitative and 210 quantitative differential cellular phosphoproteome datasets that were systematically assembled, we identified that phosphorylation at Ser421, Ser9, Ser965, and Ser445 predominantly represented TAOK1 in almost 75% of these datasets. Surprisingly, the functional role of all these phosphosites in TAOK1 remains unexplored. Hence, we employed a robust strategy to extract the phosphosites in proteins that significantly correlated in expression with the predominant TAOK1 phosphosites. This led to the first categorization of the phosphosites, including those in the currently known and predicted interactors, kinases, and substrates, that positively or negatively correlated with the expression status of each predominant TAOK1 phosphosite. Subsequently, we also analyzed the phosphosites in core proteins of the hippo signaling pathway. Based on the TAOK1 phosphoregulatory network analysis, we inferred the potential role of the predominant TAOK1 phosphosites. In particular, we propose pSer9 as an autophosphorylation and TAOK1 kinase activity-associated phosphosite and pSer421, the most frequently detected phosphosite in TAOK1, as a significant regulatory phosphosite involved in the maintenance of genome integrity.
Considering that the impact of all phosphosites that predominantly represent each kinase is essential for the efficient interpretation of global phosphoproteome datasets, we believe that the approach undertaken in this study is suitable to be extended to other kinases for accelerated research.
Affiliation(s)
- Pahal Priyanka
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Athira Perunelly Gopalakrishnan
- Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Mahammad Nisar
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | | | - Mejo George
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Levin John
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Diya Sanjeev
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Tanuja Yandigeri
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Sonet D Thomas
- Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Ahmad Rafi
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Shobha Dagamajalu
- Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Anoop Kumar G Velikkakath
- Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Chandran S Abhinand
- Center for Systems Biology and Molecular Medicine (CSBMM), Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India.
| | - Saptami Kanekar
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| | | | | | - Rajesh Raju
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore 575018, India.
| |
|
36
|
Wright SN, Colton S, Schaffer LV, Pillich RT, Churas C, Pratt D, Ideker T. State of the Interactomes: an evaluation of molecular networks for generating biological insights. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.26.587073. [PMID: 38746239 PMCID: PMC11092493 DOI: 10.1101/2024.04.26.587073] [Indexed: 05/16/2024]
Abstract
Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP and SIGNOR demonstrate strong interaction prediction performance. These findings provide a benchmark for interactomes across diverse network biology applications and clarify factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
|
37
|
Lai W, Xie R, Chen C, Lou W, Yang H, Deng L, Lu Q, Tang X. Integrated analysis of scRNA-seq and bulk RNA-seq identifies FBXO2 as a candidate biomarker associated with chemoresistance in HGSOC. Heliyon 2024; 10:e28490. [PMID: 38590858 PMCID: PMC10999934 DOI: 10.1016/j.heliyon.2024.e28490] [Received: 11/19/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/10/2024]
Abstract
Background High-grade serous ovarian carcinoma (HGSOC) is the most prevalent and aggressive histological subtype of epithelial ovarian cancer. Around 80% of individuals will experience a recurrence within five years because of resistance to chemotherapy, despite initially responding well to platinum-based treatment. Biomarkers associated with chemoresistance are urgently needed in clinical practice. Methods We jointly analyzed the transcriptomic profiles of single-cell and bulk datasets of HGSOC to identify cell types associated with chemoresistance. Copy number variation (CNV) inference was performed to identify malignant cells. We subsequently analyzed the expression of candidate biomarkers and their relationship with patients' prognosis. The enrichment analysis and potential biological function of candidate biomarkers were explored. Then, we validated the candidate biomarker using in vitro experiments. Results We identified 8871 malignant epithelial cells in a single-cell RNA sequencing dataset, of which 861 cells were associated with chemoresistance. Among these malignant epithelial cells, FBXO2 (F-box protein 2) is highly expressed in cells related to chemoresistance. Moreover, FBXO2 expression was found to be higher in epithelial cells from chemoresistant samples compared to those from chemosensitive samples in a separate single-cell RNA sequencing dataset. Patients exhibiting elevated levels of FBXO2 experienced poorer outcomes in terms of both overall survival (OS) and progression-free survival (PFS). FBXO2 could impact chemoresistance by influencing the PI3K-Akt signaling pathway, focal adhesion, and ECM-receptor interactions and regulating tumorigenesis. In an in vitro experiment, the 50% maximum inhibitory concentration (IC50) of cisplatin decreased in A2780 and SKOV3 ovarian carcinoma cell lines with silenced FBXO2.
Conclusions We determined that FBXO2 is a potential biomarker linked to chemoresistance in HGSOC by combining single-cell RNA-seq and bulk RNA-seq datasets. Our results suggest that FBXO2 could serve as a valuable prognostic marker and potential target for drug development in HGSOC.
Affiliation(s)
- Wenwen Lai
- Department of Organ Transplantation, The Second Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Ruixiang Xie
- School of Life Science, Nanchang University, Nanchang, China
| | - Chen Chen
- College of Basic Medical Science, Nanchang University, Nanchang, China
| | - Weiming Lou
- Academic Affairs Office, The Second Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
| | - Haiyan Yang
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Libin Deng
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Quqin Lu
- Jiangxi Provincial Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi, China
| | - Xiaoli Tang
- College of Basic Medical Science, Nanchang University, Nanchang, China
| |
|
38
|
Saravanan KS, Satish KS, Saraswathy GR, Kuri U, Vastrad SJ, Giri R, Dsouza PL, Kumar AP, Nair G. Innovative target mining stratagems to navigate drug repurposing endeavours. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2024; 205:303-355. [PMID: 38789185 DOI: 10.1016/bs.pmbts.2024.03.025] [Indexed: 05/26/2024]
Abstract
The conventional theory linking a single gene with a particular disease and a specific drug contributes to the dwindling success rates of traditional drug discovery. This requires a substantial shift focussing on contemporary drug design or drug repurposing, which entails linking multiple genes to diverse physiological or pathological pathways and drugs. Lately, drug repurposing, the art of discovering new/unlabelled indications for existing drugs or candidates in clinical trials, is gaining attention owing to its success rates. The rate-limiting phase of this strategy lies in target identification, which is generally driven through disease-centric and/or drug-centric approaches. The disease-centric approach is based on exploration of crucial biomolecules such as genes or proteins underlying pathological cascades of the disease of interest. Investigating these pathological interplays aids in the identification of potential drug targets that can be leveraged for novel therapeutic interventions. The drug-centric approach involves various strategies such as exploring the mechanism of adverse drug reactions that can unearth potential targets, as these untoward reactions might be considered desirable therapeutic actions in other disease conditions. Currently, artificial intelligence is an emerging robust tool that can be used to translate the aforementioned intricate biological networks to render interpretable data for extracting precise molecular targets. Integration of multiple approaches, big data analytics, and clinical corroboration are essential for successful target mining. This chapter highlights the contemporary strategies steering target identification and diverse frameworks for drug repurposing. These strategies are illustrated through case studies curated from recent drug repurposing research inclined towards neurodegenerative diseases, cancer, infections, immunological, and cardiovascular disorders.
Affiliation(s)
- Kamatchi Sundara Saravanan
- Department of Pharmacognosy, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Kshreeraja S Satish
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Ganesan Rajalekshmi Saraswathy
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India.
| | - Ushnaa Kuri
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Soujanya J Vastrad
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Ritesh Giri
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Prizvan Lawrence Dsouza
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Adusumilli Pramod Kumar
- Department of Pharmacy Practice, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| | - Gouri Nair
- Department of Pharmacology, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, Bangalore, Karnataka, India
| |
|
39
|
Cao J, Chen Q, Qiu J, Wang Y, Lan W, Du X, Tan K. NGCN: Drug-target interaction prediction by integrating information and feature learning from heterogeneous network. J Cell Mol Med 2024; 28:e18224. [PMID: 38509739 PMCID: PMC10955156 DOI: 10.1111/jcmm.18224] [Received: 12/07/2023] [Revised: 02/14/2024] [Accepted: 02/26/2024] [Indexed: 03/22/2024]
Abstract
Drug-target interaction (DTI) prediction is essential for new drug design and development. Constructing a heterogeneous network based on diverse information about drugs, proteins and diseases provides new opportunities for DTI prediction. However, the inherent complexity, high dimensionality and noise of such a network prevent us from taking full advantage of these network characteristics. This article proposes a novel method, NGCN, to predict drug-target interactions from an integrated heterogeneous network, from which it extracts relevant biological properties and association information while maintaining the topology information. It focuses on learning the topology representation of drugs and targets to improve the performance of DTI prediction. Unlike traditional methods, it focuses on learning the low-dimensional topology representation of drugs and targets via a graph-based convolutional neural network. NGCN achieves substantial performance improvements over other state-of-the-art methods, such as a nearly 1.0% increase in AUPR value. Moreover, we verify the robustness of NGCN through benchmark tests, and the experimental results demonstrate that it is an extensible framework capable of combining heterogeneous information for DTI prediction.
Affiliation(s)
- Junyue Cao
- College of Life Science and Technology, Guangxi University, Nanning, China
| | - Qingfeng Chen
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Junlai Qiu
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Yiming Wang
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Wei Lan
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Xiaojing Du
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| | - Kai Tan
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
| |
|
40
|
Chen S, Li M, Semenov I. MFA-DTI: Drug-target interaction prediction based on multi-feature fusion adopted framework. Methods 2024; 224:79-92. [PMID: 38430967 DOI: 10.1016/j.ymeth.2024.02.008] [Received: 12/31/2023] [Revised: 02/16/2024] [Accepted: 02/23/2024] [Indexed: 03/05/2024]
Abstract
The identification of drug-target interactions (DTI) is a valuable step in the drug discovery and repositioning process. However, traditional laboratory experiments are time-consuming and expensive. Computational methods have streamlined research to determine DTIs. The application of deep learning methods has significantly improved the prediction performance for DTIs. Modern deep learning methods can leverage multiple sources of information, including sequence data that contains biological structural information, and interaction data. While useful, these methods cannot be effectively applied to each type of information individually (e.g., chemical structure and interaction network) and do not take into account the specificity of DTI data such as low- or zero-interaction biological entities. To overcome these limitations, we propose a method called MFA-DTI (Multi-feature Fusion Adopted framework for DTI). MFA-DTI consists of three modules: an interaction graph learning module that processes the interaction network to generate interaction vectors, a chemical structure learning module that extracts features from the chemical structure, and a fusion module that combines these features for the final prediction. To validate the performance of MFA-DTI, we conducted experiments on six public datasets under different settings. The results indicate that the proposed method is highly effective in various settings and outperforms state-of-the-art methods.
Affiliation(s)
- Siqi Chen
- School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, 400074, China.
| | - Minghui Li
- Beidahuang Industry Group General Hospital, Harbin, 150006, China
| | - Ivan Semenov
- College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
| |
|
41
|
Idrees S, Paudel KR, Sadaf T, Hansbro PM. Uncovering domain motif interactions using high-throughput protein-protein interaction detection methods. FEBS Lett 2024; 598:725-742. [PMID: 38439692 DOI: 10.1002/1873-3468.14841] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 02/18/2024] [Indexed: 03/06/2024]
Abstract
Protein-protein interactions (PPIs) are often mediated by short linear motifs (SLiMs) in one protein and a domain in another, known as domain-motif interactions (DMIs). During the past decade, SLiMs have been studied to find their role in cellular functions such as post-translational modifications, regulatory processes, protein scaffolding, cell cycle progression, cell adhesion, cell signalling and substrate selection for proteasomal degradation. This review provides a comprehensive overview of the current PPI detection techniques and resources, focusing on their relevance to capturing interactions mediated by SLiMs. We also address the challenges associated with capturing DMIs. Moreover, a case study analysing the BioGrid database as a source of DMI prediction revealed significant enrichment of known DMIs across different PPI detection methods. Overall, current high-throughput PPI detection methods can be a reliable source for predicting DMIs.
Affiliation(s)
- Sobia Idrees
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| | - Keshav Raj Paudel
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| | - Tayyaba Sadaf
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| | - Philip M Hansbro
- Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
| |
|
42
|
Liu C, Xiao K, Yu C, Lei Y, Lyu K, Tian T, Zhao D, Zhou F, Tang H, Zeng J. A probabilistic knowledge graph for target identification. PLoS Comput Biol 2024; 20:e1011945. [PMID: 38578805 PMCID: PMC11034645 DOI: 10.1371/journal.pcbi.1011945] [Received: 05/20/2023] [Revised: 04/22/2024] [Accepted: 02/24/2024] [Indexed: 04/07/2024]
Abstract
Early identification of safe and efficacious disease targets is crucial to alleviating the tremendous cost of drug discovery projects. However, existing experimental methods for identifying new targets are generally labor-intensive and failure-prone. On the other hand, computational approaches, especially machine learning-based frameworks, have shown remarkable application potential in drug discovery. In this work, we propose Progeni, a novel machine learning-based framework for target identification. In addition to fully exploiting the known heterogeneous biological networks from various sources, Progeni integrates literature evidence about the relations between biological entities to construct a probabilistic knowledge graph. Graph neural networks are then employed in Progeni to learn the feature embeddings of biological entities to facilitate the identification of biologically relevant target candidates. A comprehensive evaluation of Progeni demonstrated its superior predictive power over the baseline methods on the target identification task. In addition, our extensive tests showed that Progeni exhibited high robustness to the negative effect of exposure bias, a common phenomenon in recommendation systems, and effectively identified new targets that can be strongly supported by the literature. Moreover, our wet lab experiments successfully validated the biological significance of the top target candidates predicted by Progeni for melanoma and colorectal cancer. All these results suggested that Progeni can identify biologically effective targets and thus provide a powerful and useful tool for advancing the drug discovery process.
Affiliation(s)
- Chang Liu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Kaimin Xiao
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Joint Graduate Program of Peking-Tsinghua-NIBS, School of Life Sciences, Tsinghua University, Beijing, China
| | - Cuinan Yu
- Machine Learning Department, Silexon AI Technology Co., Ltd., Nanjing, Jiangsu Province, China
| | - Yipin Lei
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Kangbo Lyu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Tingzhong Tian
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Fengfeng Zhou
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, Jilin Province, China
| | - Haidong Tang
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
| | - Jianyang Zeng
- School of Engineering, Westlake University, Hangzhou, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
- Research Center for Industries of the Future and School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
|
43
|
Kim KM, Lee KG, Lee S, Hong BK, Yun H, Park YJ, Yoo SA, Kim WU. The acute phase reactant orosomucoid-2 directly promotes rheumatoid inflammation. Exp Mol Med 2024; 56:890-903. [PMID: 38556552 PMCID: PMC11058272 DOI: 10.1038/s12276-024-01188-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 12/04/2023] [Accepted: 12/20/2023] [Indexed: 04/02/2024] Open
Abstract
Acute phase proteins involved in chronic inflammatory diseases have not been systematically analyzed. Here, global proteome profiling of serum and urine revealed that orosomucoid-2 (ORM2), an acute phase reactant, was differentially expressed in rheumatoid arthritis (RA) patients and showed the highest fold change. Therefore, we questioned the extent to which ORM2, which is produced mainly in the liver, actively participates in rheumatoid inflammation. Surprisingly, ORM2 expression was upregulated in the synovial fluids and synovial membranes of RA patients. The major cell types producing ORM2 were synovial macrophages and fibroblast-like synoviocytes (FLSs) from RA patients. Recombinant ORM2 robustly increased IL-6, TNF-α, CXCL8 (IL-8), and CCL2 production by RA macrophages and FLSs via the NF-κB and p38 MAPK pathways. Interestingly, glycophorin C, a membrane protein for determining erythrocyte shape, was the receptor for ORM2. Intra-articular injection of ORM2 increased the severity of arthritis in mice and accelerated the infiltration of macrophages into the affected joints. Moreover, circulating ORM2 levels correlated with RA activity and radiographic progression. In conclusion, the acute phase protein ORM2 can directly increase the production of proinflammatory mediators and promote chronic arthritis in mice, suggesting that ORM2 could be a new therapeutic target for RA.
Affiliation(s)
- Ki-Myo Kim
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Kang-Gu Lee
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Saseong Lee
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
| | - Bong-Ki Hong
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
| | - Heejae Yun
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Yune-Jung Park
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea
- Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, The Catholic University of Korea, Suwon, South Korea
| | - Seung-Ah Yoo
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea.
- Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul, South Korea.
| | - Wan-Uk Kim
- Center for Integrative Rheumatoid Transcriptomics and Dynamics, The Catholic University of Korea, Seoul, South Korea.
- Department of Internal Medicine, The Catholic University of Korea, Seoul, South Korea.
|
44
|
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024] Open
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Fuhao Zhang
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
| | - Chaojin Wu
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
|
45
|
Idrees S, Paudel KR. Proteome-wide assessment of human interactome as a source of capturing domain-motif and domain-domain interactions. J Cell Commun Signal 2024; 18:e12014. [PMID: 38545252 PMCID: PMC10964934 DOI: 10.1002/ccs3.12014] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/19/2023] [Accepted: 12/11/2023] [Indexed: 06/29/2024] Open
Abstract
Protein-protein interactions (PPIs) play a crucial role in various biological processes by establishing domain-motif (DMI) and domain-domain interactions (DDIs). While the existence of real DMIs/DDIs is generally assumed, it is rarely tested; therefore, this study extensively compared high-throughput methods and public PPI repositories as sources for DMI and DDI prediction based on the assumption that the human interactome provides sufficient data for the reliable identification of DMIs and DDIs. Different datasets from leading high-throughput methods (Yeast two-hybrid [Y2H], Affinity Purification coupled Mass Spectrometry [AP-MS], and Co-fractionation-coupled Mass Spectrometry) were assessed for their ability to capture DMIs and DDIs using known DMI/DDI information. High-throughput methods were not notably worse than PPI databases and, in some cases, appeared better. In conclusion, all PPI datasets demonstrated significant enrichment in DMIs and DDIs (p-value <0.001), establishing Y2H and AP-MS as reliable methods for predicting these interactions. This study provides valuable insights for biologists in selecting appropriate methods for predicting DMIs, ultimately aiding in SLiM discovery.
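The abstract above reports significant enrichment (p-value <0.001) but does not say which statistical test was used. As a rough illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for this kind of overlap question. The function name `hypergeom_enrichment_p` and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population: int, known: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of candidate protein pairs considered
    known:      how many of those pairs are known DMIs/DDIs
    sample:     number of pairs reported in the PPI dataset under test
    hits:       how many of the reported pairs are known DMIs/DDIs
    """
    upper = min(known, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(known, k) * comb(population - known, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study).
p_value = hypergeom_enrichment_p(population=1000, known=50, sample=100, hits=20)
print(f"enrichment p-value: {p_value:.3g}")
```

With these made-up counts the dataset contains about four times as many known interactions as expected by chance, so the tail probability falls well below the 0.001 threshold quoted in the abstract. A permutation test over shuffled networks would be an equally plausible reading of the abstract.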
Affiliation(s)
- Sobia Idrees
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
| | - Keshav Raj Paudel
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
|
46
|
E Z, Qiao G, Wang G, Li Y. GSL-DTI: Graph structure learning network for Drug-Target interaction prediction. Methods 2024; 223:136-145. [PMID: 38360082 DOI: 10.1016/j.ymeth.2024.01.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 12/23/2023] [Accepted: 01/29/2024] [Indexed: 02/17/2024] Open
Abstract
MOTIVATION Drug-target interaction prediction is an important area of research that aims to predict whether a drug molecule interacts with its target protein. It plays a critical role in drug discovery and development by facilitating the identification of potential drug candidates and expediting the overall process. Given the time-consuming, expensive, and high-risk nature of traditional drug discovery methods, the prediction of drug-target interactions has become an indispensable tool. Using machine learning and deep learning to tackle this class of problems has become a mainstream approach, and graph-based models have recently received much attention in this field. However, many current graph-based Drug-Target Interaction (DTI) prediction methods rely on manually defined rules to construct the Drug-Protein Pair (DPP) network during the DPP representation learning process, and therefore fail to capture the true underlying relationships between drug molecules and target proteins. RESULTS We propose GSL-DTI, an automatic graph structure learning model for predicting drug-target interactions (DTIs). Initially, we integrate large-scale heterogeneous networks using a graph convolution network based on meta-paths, effectively learning the representations of drugs and target proteins. Subsequently, we construct drug-protein pairs based on these representations. In contrast to previous studies that construct DPP networks based on manual rules, our method introduces an automatic graph structure learning approach. This approach utilizes a filter gate on the affinity scores of DPPs and relies on the classification loss of downstream tasks to guide the learning of the underlying DPP network structure. Based on the learned DPP network, we transform the prediction of drug-target interactions into a node classification problem. Comprehensive experiments conducted on three public datasets have shown the superiority of GSL-DTI in DTI prediction tasks. Additionally, GSL-DTI provides a fresh perspective for advancing research in graph structure learning for DTI prediction.
Affiliation(s)
- Zixuan E
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
| | - Guanyu Qiao
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
| | - Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China.
| | - Yang Li
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China.
|
47
|
Jin S, Zhang Y, Yu H, Lu M. SADR: Self-Supervised Graph Learning With Adaptive Denoising for Drug Repositioning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:265-277. [PMID: 38190661 DOI: 10.1109/tcbb.2024.3351079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2024]
Abstract
Traditional drug development is often high-risk and time-consuming. A promising alternative is to reuse or reposition approved drugs. Recently, some methods based on graph representation learning have started to be used for drug repositioning. These models learn the low-dimensional embeddings of drug and disease nodes from the drug-disease interaction network to predict the potential association between drugs and diseases. However, these methods have strict requirements for the dataset, and if the dataset is sparse, their performance will be severely affected. At the same time, these methods have poor robustness to noise in the dataset. In response to the above challenges, we propose a drug repositioning model based on self-supervised graph learning with adaptive denoising, called SADR. SADR uses data augmentation and contrastive learning strategies to learn feature representations of nodes, which can effectively solve the problems caused by sparse datasets. SADR includes an adaptive denoising training (ADT) component that can effectively identify noisy data during the training process and remove the impact of noise on the model. We have conducted comprehensive experiments on three datasets and have achieved better prediction accuracy compared to multiple baseline models. At the same time, we propose the top 10 newly predicted approved drugs for treating two diseases. This demonstrates the ability of our model to identify potential drug candidates for disease indications.
|
48
|
Liu Q, Li X, Li Y, Luo Q, Fan Q, Lu A, Guan D, Li J. A novel network pharmacology strategy to decode mechanism of Wuling Powder in treating liver cirrhosis. Chin Med 2024; 19:36. [PMID: 38429802 PMCID: PMC10905787 DOI: 10.1186/s13020-024-00896-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 01/26/2024] [Indexed: 03/03/2024] Open
Abstract
BACKGROUND Liver cirrhosis is a chronic liver disease with hepatocyte necrosis and lesion. Wuling Powder (WLP), a traditional Chinese medicine (TCM) formula, is widely used in the treatment of liver cirrhosis. However, its key functional components and mechanism of action remain unclear. We attempted to explore the Key Group of Effective Components (KGEC) of WLP in the treatment of liver cirrhosis through integrative pharmacology combined with experiments. METHODS The components and potential target genes of WLP were extracted from published databases. A novel node importance calculation model considering both node control force and node bridging force was designed to construct the Function Response Space (FRS) and obtain key effector proteins. The genetic knapsack algorithm was employed to select KGEC. The effectiveness and reliability of KGEC were evaluated at the functional level by using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Finally, the effectiveness and potential mechanism of KGEC were confirmed by CCK-8, qPCR and Western blot. RESULTS A total of 940 effective proteins were obtained in FRS. KEGG pathway and GO term enrichment analyses suggested that the effective proteins reflect liver cirrhosis characteristics well at the functional level. Twenty-nine components of WLP were defined as KGEC, which covered 100% of the targets of the effective proteins. Additionally, the pathways enriched for the KGEC targets accounted for 83.33% of the shared genes between the targets and the pathogenic genes enrichment pathways. Three components from KGEC (scopoletin, caryophyllene oxide, and hydroxyzinamic acid) were selected for in vivo verification. The qPCR results demonstrated that all three components significantly reduced the mRNA levels of COL1A1 in a TGF-β1-induced liver cirrhosis model. Furthermore, the Western blot assay indicated that these components acted synergistically to target the NF-κB, AMPK/p38, cAMP, and PI3K/AKT pathways, thus inhibiting the progression of liver cirrhosis. CONCLUSION In summary, we have developed a new model that reveals the key components and potential mechanisms of WLP for the treatment of liver cirrhosis. This model provides a reference for the secondary development of WLP and offers a methodological strategy for studying TCM formulas.
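The abstract above states that the selected components covered 100% of the targets of the effective proteins. The sketch below illustrates how such a coverage figure can be computed, and how a small covering set might be chosen. It deliberately substitutes a simple greedy set-cover heuristic for the genetic knapsack algorithm named in the abstract, because the abstract gives no details of that algorithm. The function names, component names, and protein identifiers are all hypothetical.

```python
def coverage(selected, component_targets, effective_proteins):
    """Fraction of effective proteins targeted by at least one selected component."""
    covered = set()
    for component in selected:
        covered |= component_targets[component] & effective_proteins
    return len(covered) / len(effective_proteins)


def greedy_select(component_targets, effective_proteins):
    """Greedy set cover: repeatedly pick the component that adds the most
    not-yet-covered effective proteins (a stand-in, NOT the genetic knapsack)."""
    remaining = set(effective_proteins)
    selected = []
    while remaining:
        # Sorting the names makes tie-breaking deterministic.
        best = max(
            sorted(component_targets),
            key=lambda c: len(component_targets[c] & remaining),
        )
        gain = component_targets[best] & remaining
        if not gain:
            break  # the remaining proteins cannot be covered by any component
        selected.append(best)
        remaining -= gain
    return selected


# Hypothetical toy data, for illustration only.
component_targets = {
    "comp_A": {"P1", "P2", "P3"},
    "comp_B": {"P3", "P4"},
    "comp_C": {"P4"},
}
effective_proteins = {"P1", "P2", "P3", "P4"}

selected = greedy_select(component_targets, effective_proteins)
print(selected, coverage(selected, component_targets, effective_proteins))
```

A greedy cover only approximates the smallest covering set. A knapsack-style formulation, as the authors describe, could additionally weigh each component by a cost or importance score.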
Affiliation(s)
- Qinwen Liu
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Xiaowei Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Yi Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Qian Luo
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Qiling Fan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China
| | - Aiping Lu
- Institute of Integrated Bioinformedicine and Translational Science, Hong Kong Baptist University, Hong Kong, China.
- Guangdong-Hong Kong-Macau Joint Lab On Chinese Medicine and Immune Disease Research, Guangzhou, China.
| | - Daogang Guan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China.
- Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou, China.
| | - Jiahui Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China.
- Center for Genetics and Developmental Systems Biology, Department of Obstetrics and Gynecology, Nanfang Hospital, Southern Medical University, Guangzhou, China.
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China.
|
49
|
Karunakaran KB, Jain S, Brahmachari SK, Balakrishnan N, Ganapathiraju MK. Parkinson's disease and schizophrenia interactomes contain temporally distinct gene clusters underlying comorbid mechanisms and unique disease processes. Schizophrenia (Heidelberg, Germany) 2024; 10:26. [PMID: 38413605 PMCID: PMC10899210 DOI: 10.1038/s41537-024-00439-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 01/24/2024] [Indexed: 02/29/2024]
Abstract
Genome-wide association studies suggest significant overlaps in Parkinson's disease (PD) and schizophrenia (SZ) risks, but the underlying mechanisms remain elusive. The protein-protein interaction network ('interactome') plays a crucial role in PD and SZ and can incorporate their spatiotemporal specificities. Therefore, to study the linked biology of PD and SZ, we compiled PD- and SZ-associated genes from the DisGeNET database, and constructed their interactomes using BioGRID and HPRD. We examined the interactomes using clustering and enrichment analyses, in conjunction with the transcriptomic data of 26 brain regions spanning foetal stages to adulthood available in the BrainSpan Atlas. PD and SZ interactomes formed four gene clusters with distinct temporal identities (Disease Gene Networks or 'DGNs'1-4). DGN1 had unique SZ interactome genes highly expressed across developmental stages, corresponding to a neurodevelopmental SZ subtype. DGN2, containing unique SZ interactome genes expressed from early infancy to adulthood, correlated with an inflammation-driven SZ subtype and adult SZ risk. DGN3 contained unique PD interactome genes expressed in late infancy, early and late childhood, and adulthood, and involved in mitochondrial pathways. DGN4, containing prenatally-expressed genes common to both the interactomes, involved in stem cell pluripotency and overlapping with the interactome of 22q11 deletion syndrome (comorbid psychosis and Parkinsonism), potentially regulates neurodevelopmental mechanisms in PD-SZ comorbidity. Our findings suggest that disrupted neurodevelopment (regulated by DGN4) could expose risk windows in PD and SZ, later elevating disease risk through inflammation (DGN2). Alternatively, variant clustering in DGNs may produce disease subtypes, e.g., PD-SZ comorbidity with DGN4, and early/late-onset SZ with DGN1/DGN2.
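The interactome construction described in the abstract above can be sketched as follows. This is a minimal illustration that assumes an interactome consists of every interaction with at least one disease-associated partner. The abstract does not state the exact inclusion rule, and the function name, gene labels, and edges here are hypothetical placeholders rather than real DisGeNET, BioGRID, or HPRD records.

```python
def build_interactome(disease_genes, ppi_edges):
    """Return (nodes, edges) of the sub-network formed by interactions in
    which at least one partner is a disease-associated gene."""
    genes = set(disease_genes)
    # frozenset makes each interaction undirected and removes duplicates.
    edges = {
        frozenset((a, b))
        for a, b in ppi_edges
        if a != b and (a in genes or b in genes)
    }
    nodes = set().union(*edges)
    return nodes, edges


# Hypothetical placeholder data, for illustration only.
ppi_edges = [("G1", "G2"), ("G2", "G3"), ("G4", "G2"), ("G4", "G5")]
pd_nodes, _ = build_interactome({"G1"}, ppi_edges)
sz_nodes, _ = build_interactome({"G4"}, ppi_edges)

shared = pd_nodes & sz_nodes    # genes common to both interactomes
pd_only = pd_nodes - sz_nodes   # genes unique to the PD interactome
sz_only = sz_nodes - pd_nodes   # genes unique to the SZ interactome
print(sorted(shared), sorted(pd_only), sorted(sz_only))
```

Splitting the node sets into shared and unique genes mirrors the distinction the abstract draws between clusters unique to one disease interactome and the cluster common to both. The clustering and expression analysis steps are not shown.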
Affiliation(s)
- Kalyani B Karunakaran
- Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India.
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan.
| | - Sanjeev Jain
- National Institute of Mental Health and Neuro-Sciences (NIMHANS), Bangalore, India.
| | | | - N Balakrishnan
- Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
| | - Madhavi K Ganapathiraju
- Department of Computer Science, Carnegie Mellon University Qatar, Doha, Qatar.
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
|
50
|
Xiong D, Qiu Y, Zhao J, Zhou Y, Lee D, Gupta S, Torres M, Lu W, Liang S, Kang JJ, Eng C, Loscalzo J, Cheng F, Yu H. Structurally-informed human interactome reveals proteome-wide perturbations by disease mutations. bioRxiv: the preprint server for biology 2024:2023.04.24.538110. [PMID: 37162909 PMCID: PMC10168245 DOI: 10.1101/2023.04.24.538110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Human genome sequencing studies have identified numerous loci associated with complex diseases. However, translating human genetic and genomic findings to disease pathobiology and therapeutic discovery remains a major challenge at multiscale interactome network levels. Here, we present a deep-learning-based ensemble framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that accurately predicts protein binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms, generating comprehensive structurally-informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods. We further systematically validated PIONEER predictions experimentally through generating 2,395 mutations and testing their impact on 6,754 mutation-interaction pairs, confirming the high quality and validity of PIONEER predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces after mapping mutations from ~60,000 germline exomes and ~36,000 somatic genomes. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from pan-cancer analysis of ~11,000 tumor whole-exomes across 33 cancer types. We show that PIONEER-predicted oncoPPIs are significantly associated with patient survival and drug responses from both cancer cell lines and patient-derived xenograft mouse models. We identify a landscape of PPI-perturbing tumor alleles upon ubiquitination by E3 ligases, and we experimentally validate the tumorigenic KEAP1-NRF2 interface mutation p.Thr80Lys in non-small cell lung cancer. We show that PIONEER-predicted PPI-perturbing alleles alter protein abundance and correlate with drug responses and patient survival in colon and uterine cancers as demonstrated by proteogenomic data from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.
Affiliation(s)
- Dapeng Xiong
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| | - Yunguang Qiu
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Junfei Zhao
- Department of Systems Biology, Herbert Irving Comprehensive Center, Columbia University, New York, NY 10032, USA
| | - Yadi Zhou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Dongjin Lee
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
| | - Shobhita Gupta
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Biophysics Program, Cornell University, Ithaca, NY 14853, USA
| | - Mateo Torres
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| | - Weiqiang Lu
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
| | - Jin Joo Kang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
| | - Charis Eng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Joseph Loscalzo
- Channing Division of Network Medicine, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
|