1
Giudice G, Chen H, Koutsandreas T, Petsalaki E. phuEGO: A Network-Based Method to Reconstruct Active Signaling Pathways From Phosphoproteomics Datasets. Mol Cell Proteomics 2024; 23:100771. [PMID: 38642805 PMCID: PMC11134849 DOI: 10.1016/j.mcpro.2024.100771] [Received: 10/17/2023] [Revised: 04/08/2024] [Accepted: 04/17/2024] [Indexed: 04/22/2024] Open
Abstract
Signaling networks are critical for virtually all cell functions. Our current knowledge of cell signaling has been summarized in signaling pathway databases, which, while useful, are highly biased toward well-studied processes and do not capture context-specific network wiring or pathway cross-talk. Mass spectrometry-based phosphoproteomics data can provide a more unbiased view of active cell signaling processes in a given context; however, such data suffer from a low signal-to-noise ratio and poor reproducibility across experiments. While progress has been made in methods to extract active signaling signatures from such data, there are still limitations with respect to balancing bias and interpretability. Here we present phuEGO, which combines up-to-three-layer network propagation with ego network decomposition to provide small networks comprising active functional signaling modules. PhuEGO boosts the signal-to-noise ratio from global phosphoproteomics datasets, enriches the resulting networks for functional phosphosites, and allows improved comparison and integration across datasets. We applied phuEGO to five phosphoproteomics datasets from cell lines collected upon infection with SARS-CoV-2. PhuEGO was better able to identify common active functions across datasets and to point to a subnetwork enriched for known COVID-19 targets. Overall, phuEGO provides a flexible tool to the community for the improved functional interpretation of global phosphoproteomics datasets.
Affiliation(s)
- Girolamo Giudice
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom
- Haoqi Chen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom
- Thodoris Koutsandreas
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom
- Evangelia Petsalaki
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom.
2
Oba GM, Nakato R. Clover: An unbiased method for prioritizing differentially expressed genes using a data-driven approach. Genes Cells 2024; 29:456-470. [PMID: 38602264 PMCID: PMC11163938 DOI: 10.1111/gtc.13119] [Received: 01/05/2024] [Revised: 03/12/2024] [Accepted: 03/20/2024] [Indexed: 04/12/2024]
Abstract
Identifying key genes from a list of differentially expressed genes (DEGs) is a critical step in transcriptome analysis. However, current methods, including Gene Ontology analysis and manual annotation, essentially rely on existing knowledge, which is highly biased depending on the extent of the literature. As a result, understudied genes, some of which may be associated with important molecular mechanisms, are often ignored or remain obscure. To address this problem, we propose Clover, a data-driven scoring method to specifically highlight understudied genes. Clover aims to prioritize genes associated with important molecular mechanisms by integrating three metrics: the likelihood of appearing in the DEG list, tissue specificity, and number of publications. We applied Clover to Alzheimer's disease data and confirmed that it successfully detected known associated genes. Moreover, Clover effectively prioritized understudied but potentially druggable genes. Overall, our method offers a novel approach to gene characterization and has the potential to expand our understanding of gene functions. Clover is an open-source software written in Python3 and available on GitHub at https://github.com/G708/Clover.
Affiliation(s)
- Gina Miku Oba
- Laboratory of Computational Genomics, Institute for Quantitative Biosciences, University of Tokyo, Tokyo, Japan
- Department of Computational Biology and Medical Science, Graduate School of Frontier Science, University of Tokyo, Tokyo, Japan
- Ryuichiro Nakato
- Laboratory of Computational Genomics, Institute for Quantitative Biosciences, University of Tokyo, Tokyo, Japan
- Department of Computational Biology and Medical Science, Graduate School of Frontier Science, University of Tokyo, Tokyo, Japan
3
Saranya KR, Vimina ER, Pinto FR. TransNeT-CGP: A cluster-based comorbid gene prioritization by integrating transcriptomics and network-topological features. Comput Biol Chem 2024; 110:108038. [PMID: 38461796 DOI: 10.1016/j.compbiolchem.2024.108038] [Received: 10/06/2023] [Revised: 01/11/2024] [Accepted: 02/25/2024] [Indexed: 03/12/2024]
Abstract
The local disruptions caused by the genes of one disease can influence the pathways associated with other diseases, resulting in comorbidity. For gene therapies, it is necessary to prioritize the key genes that regulate common biological mechanisms to tackle the issues caused by overlapping diseases. This work proposes a clustering-based computational approach for prioritizing the comorbid genes within overlapping disease modules by analyzing protein-protein interaction networks. For this, a sub-network with gene interactions of the disease pair was extracted from the interactome. The edge weights are assigned by combining the pairwise gene expression correlation and betweenness centrality scores. Further, a weighted graph clustering algorithm is applied, and dominant nodes of high-density clusters are ranked based on clustering coefficients and neighborhood connectivity. Case studies based on neurodegenerative diseases, such as the Amyotrophic Lateral Sclerosis-Spinal Muscular Atrophy (ALS-SMA) pair, and cancers, such as the Ovarian Carcinoma-Invasive Ductal Breast Carcinoma (OC-IDBC) pair, were conducted to examine the efficacy of the proposed method. To identify the mechanistic role of top-ranked genes, we used functional and pathway enrichment analysis, connectivity analysis with the leave-one-out (LOO) method, analysis of associated disease-related protein complexes, and prioritization tools such as TOPPGENE and Heml2.0. From pathway analysis, it was observed that the top 10 genes obtained using the proposed method were associated with 10 pathways in ALS-SMA comorbidity and 15 in the case of OC-IDBC, whereas the corresponding counts for the similar methods SAPDSB and S2B were 4 and 6, respectively, for ALS-SMA and 9 and 10, respectively, for OC-IDBC. In both case studies, 70% of the disease-specific benchmark protein complexes were linked to top-ranked genes of the proposed method, while those of SAPDSB and S2B were 55% and 60%, respectively. Additionally, it was found that the removal of the top 10 genes disconnects the network into 14 distinct components in the case of ALS-SMA and 9 in the case of OC-IDBC. The experimental results show that the proposed method can be effectively used for identifying key genes in comorbidity and can offer insights into the intricate molecular relationships driving comorbid diseases.
Affiliation(s)
- K R Saranya
- Department of Computer Science & IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
- E R Vimina
- Department of Computer Science & IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
- F R Pinto
- Chemistry and Biochemistry Department, Faculty of Sciences, University of Lisbon, Portugal.
4
Li L, Huang F, Zhang YH, Cai YD. Identifying allergic-rhinitis-associated genes with random-walk-based method in PPI network. Comput Biol Med 2024; 175:108495. [PMID: 38697003 DOI: 10.1016/j.compbiomed.2024.108495] [Received: 01/19/2024] [Revised: 03/21/2024] [Accepted: 04/21/2024] [Indexed: 05/04/2024]
Abstract
Allergic rhinitis is a common allergic disease with a complex pathogenesis and many unresolved issues. Studies have shown that the incidence of allergic rhinitis is closely related to genetic factors, and research on the related genes could help further understand its pathogenesis and develop new treatment methods. In this study, 446 allergic rhinitis-related genes were obtained from the DisGeNET database. The protein-protein interaction network was searched using the random-walk-with-restart algorithm with these 446 genes as seed nodes to assess the linkages between other genes and allergic rhinitis. This result was then further examined by three screening tests (permutation, interaction, and enrichment tests), which aimed to select genes that have strong and specific associations with allergic rhinitis. Finally, 52 novel genes were obtained. The functional enrichment test confirmed their relationships to the biological processes and pathways related to allergic rhinitis. Furthermore, some genes were extensively analyzed to uncover their special or latent associations with allergic rhinitis, including IRAK2 and MAPK, which are involved in the pathogenesis of allergic rhinitis and the inhibition of allergic inflammation via the p38-MAPK pathway, respectively. The newly found genes may help subsequent investigations to understand the underlying molecular mechanisms of allergic rhinitis and to develop effective treatments.
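The random-walk-with-restart step mentioned in this abstract can be illustrated with a minimal Python sketch. The toy network, the node names, the restart probability of 0.3, and the function name `random_walk_with_restart` are assumptions made for illustration only; none of them come from the paper, and the permutation, interaction, and enrichment tests are not reproduced here.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Return the stationary visiting probability of every node.

    adjacency: dict mapping each node to a list of its neighbours
               (undirected; every node is assumed to have at least one neighbour).
    seeds:     non-empty set of seed nodes the walker restarts from.
    """
    nodes = sorted(adjacency)
    # Restart vector: uniform over the seed nodes, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: A is the only seed ("known disease gene").
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = random_walk_with_restart(graph, seeds={"A"})

# Rank non-seed nodes by their proximity score to the seed set.
for node in sorted((n for n in scores if n != "A"), key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

In a setting like the one the abstract describes, non-seed nodes with high scores would be the candidates handed to the later screening tests; this sketch stops at the scoring step.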
Affiliation(s)
- Lin Li
- Department of Otolaryngology and Head & Neck, The Affiliated Wuxi People's Hospital of Nanjing Medical University, Wuxi Medical Center, Nanjing Medical University, Wuxi, 214023, China; Department of Otolaryngology and Head & Neck, China-Japan Union Hospital, Jilin University, Changchun, 130033, China.
- FeiMing Huang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China.
- Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.
- Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, 200444, China.
5
Shi W, Dong J, Zhong B, Hu X, Zhao C. Predicting the Prognosis of Bladder Cancer Patients Through Integrated Multi-omics Exploration of Chemotherapy-Related Hypoxia Genes. Mol Biotechnol 2024:10.1007/s12033-024-01203-9. [PMID: 38806990 DOI: 10.1007/s12033-024-01203-9] [Received: 01/10/2024] [Accepted: 05/14/2024] [Indexed: 05/30/2024]
Abstract
Bladder cancer is a prevalent malignancy with high mortality rates worldwide. Hypoxia is a critical factor in the development and progression of cancers. However, whether and how hypoxia-related genes (HRGs) could affect the development and the chemotherapy response of bladder cancer is still largely unexplored. This study comprehensively explored the complex molecular landscape associated with hypoxia in bladder cancer by analyzing 260 hypoxia genes based on transcriptomic and genomic data in 411 samples. Employing the 109 dysregulated hypoxia genes for consensus clustering, we delineated two distinct bladder cancer clusters characterized by disparate survival outcomes and distinct oncogenic roles. We defined a HPscore that was correlated with a variety of clinical features, including TNM stages and pathologic grades. Tumor immune landscape analysis identified three immune clusters and close interactions between hypoxia genes and the various immune cells. Utilizing a network-based method, we defined 129 HRGs exerting influence on apoptotic processes and critical signaling pathways in cancer. Further analysis of chemotherapy drug sensitivity identified potential drug-target HRGs. We developed a Risk Score model that was related to the overall survival of bladder cancer patients based on doxorubicin-target HRGs: ACTG2, MYC, PDGFRB, DHRS2, and KLRD1. This study not only enhanced our understanding of bladder cancer at the molecular level but also provided promising avenues for the development of targeted therapies, representing a significant step toward the identification of effective treatments and addressing the urgent need for advancements in bladder cancer management.
Affiliation(s)
- Wensheng Shi
- Hunan Key Laboratory of Skin Cancer and Psoriasis, Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- National Engineering Research Center of Personalized Diagnostic and Therapeutic Technology, Central South University, Changsha, 410008, Hunan, China
- Furong Laboratory, Changsha, 410008, Hunan, China
- Department of Urology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- Jiaming Dong
- Department of Radiation, Cangzhou Central Hospital, Hebei, 061000, China
- Bowen Zhong
- Hunan Key Laboratory of Skin Cancer and Psoriasis, Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- National Engineering Research Center of Personalized Diagnostic and Therapeutic Technology, Central South University, Changsha, 410008, Hunan, China
- Furong Laboratory, Changsha, 410008, Hunan, China
- Department of Urology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- Xiheng Hu
- Hunan Key Laboratory of Skin Cancer and Psoriasis, Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- National Engineering Research Center of Personalized Diagnostic and Therapeutic Technology, Central South University, Changsha, 410008, Hunan, China
- Furong Laboratory, Changsha, 410008, Hunan, China
- Department of Urology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- Chunguang Zhao
- Department of Critical Care Medicine, National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, 410008, China.
6
Wu J, Zhao X, He Y, Pan B, Lai J, Ji M, Li S, Huang J, Han J. IDMIR: identification of dysregulated miRNAs associated with disease based on a miRNA-miRNA interaction network constructed through gene expression data. Brief Bioinform 2024; 25:bbae258. [PMID: 38801703 PMCID: PMC11129766 DOI: 10.1093/bib/bbae258] [Received: 03/05/2024] [Revised: 05/10/2024] [Accepted: 05/15/2024] [Indexed: 05/29/2024] Open
Abstract
Micro ribonucleic acids (miRNAs) play a pivotal role in governing the human transcriptome in various biological phenomena. Hence, the accumulation of miRNA expression dysregulation frequently assumes a noteworthy role in the initiation and progression of complex diseases. However, accurate identification of dysregulated miRNAs still faces challenges at the current stage. Several bioinformatics tools have recently emerged for forecasting the associations between miRNAs and diseases. Nonetheless, the existing reference tools mainly identify the miRNA-disease associations in a general state and fall short of pinpointing dysregulated miRNAs within a specific disease state. Additionally, no studies adequately consider miRNA-miRNA interactions (MMIs) when analyzing the miRNA-disease associations. Here, we introduced a systematic approach, called IDMIR, which enabled the identification of expression dysregulated miRNAs through an MMI network under the gene expression context, where the network's architecture was designed to implicitly connect miRNAs based on their shared biological functions within a particular disease context. The advantage of IDMIR is that it uses gene expression data for the identification of dysregulated miRNAs by analyzing variations in MMIs. We illustrated the excellent predictive power for dysregulated miRNAs of the IDMIR approach through data analysis on breast cancer and bladder urothelial cancer. IDMIR could surpass several existing miRNA-disease association prediction approaches through comparison. We believe the approach complements the deficiencies in predicting miRNA-disease association and may provide new insights and possibilities for diagnosing and treating diseases. The IDMIR approach is now available as a free R package on CRAN (https://CRAN.R-project.org/package=IDMIR).
Affiliation(s)
- Jiashuo Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Xilong Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Yalan He
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Bingyue Pan
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Jiyin Lai
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Miao Ji
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Siyuan Li
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Junling Huang
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
7
Qi T, Song L, Guo Y, Chen C, Yang J. From genetic associations to genes: methods, applications, and challenges. Trends Genet 2024:S0168-9525(24)00095-7. [PMID: 38734482 DOI: 10.1016/j.tig.2024.04.008] [Received: 11/08/2023] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 05/13/2024]
Abstract
Genome-wide association studies (GWASs) have identified numerous genetic loci associated with human traits and diseases. However, pinpointing the causal genes remains a challenge, which impedes the translation of GWAS findings into biological insights and medical applications. In this review, we provide an in-depth overview of the methods and technologies used for prioritizing genes from GWAS loci, including gene-based association tests, integrative analysis of GWAS and molecular quantitative trait loci (xQTL) data, linking GWAS variants to target genes through enhancer-gene connection maps, and network-based prioritization. We also outline strategies for generating context-dependent xQTL data and their applications in gene prioritization. We further highlight the potential of gene prioritization in drug repurposing. Lastly, we discuss future challenges and opportunities in this field.
Affiliation(s)
- Ting Qi
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
- Liyang Song
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
- Yazhou Guo
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
- Chang Chen
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
- Jian Yang
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
8
Hu J, Szymczak S. Evaluation of network-guided random forest for disease gene discovery. BioData Min 2024; 17:10. [PMID: 38627770 PMCID: PMC11020917 DOI: 10.1186/s13040-024-00361-5] [Received: 02/15/2024] [Accepted: 04/09/2024] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND: Gene network information is believed to be beneficial for disease module and pathway identification, but has not been explicitly utilized in the standard random forest (RF) algorithm for gene expression data analysis. We investigate the performance of a network-guided RF in which the network information is summarized into a sampling probability of predictor variables, which is then used in the construction of the RF.
RESULTS: Our simulation results suggest that network-guided RF does not provide better disease prediction than the standard RF. In terms of disease gene discovery, if disease genes form module(s), network-guided RF identifies them more accurately. In addition, when disease status is independent of the genes in the given network, spurious gene selection results can occur when using network information, especially for hub genes. Our empirical analysis of two balanced microarray and RNA-Seq breast cancer datasets from The Cancer Genome Atlas (TCGA) for classification of progesterone receptor (PR) status also demonstrates that network-guided RF can identify genes from PGR-related pathways, which leads to a better connected module of identified genes.
CONCLUSIONS: Gene networks can provide additional information to aid gene expression analysis for disease module and pathway identification. However, they need to be used with caution, and validation of the results needs to be carried out to guard against spurious gene selection. More robust approaches to incorporating such information into RF construction also warrant further study.
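To make the idea of a network-derived sampling probability concrete, here is a small Python sketch. The abstract does not say how the probabilities are computed, so the degree-based weighting below, the toy network, and the function names `sampling_probabilities` and `sample_candidate_genes` are all assumptions made for illustration and are not the authors' procedure.

```python
import random


def sampling_probabilities(adjacency):
    """Turn a gene network into per-gene sampling probabilities.

    Assumption for illustration: probability is proportional to degree + 1,
    so that genes without neighbours can still be drawn.
    """
    weights = {gene: len(neighbours) + 1 for gene, neighbours in adjacency.items()}
    total = sum(weights.values())
    return {gene: w / total for gene, w in weights.items()}


def sample_candidate_genes(probs, mtry, rng):
    """Draw `mtry` distinct genes, weighted by `probs`, as split candidates."""
    pool = dict(probs)
    chosen = []
    for _ in range(min(mtry, len(pool))):
        genes = list(pool)
        pick = rng.choices(genes, weights=[pool[g] for g in genes], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sampling without replacement
    return chosen


# Hypothetical toy network with one hub gene.
network = {
    "HUB": ["G1", "G2", "G3", "G4"],
    "G1": ["HUB", "G2"],
    "G2": ["HUB", "G1"],
    "G3": ["HUB"],
    "G4": ["HUB"],
}
probs = sampling_probabilities(network)
picked = sample_candidate_genes(probs, mtry=2, rng=random.Random(0))
print(probs)
print(picked)
```

Under this assumed weighting, the hub gene is offered as a split candidate far more often than peripheral genes, which gives one intuition for why the abstract warns about spurious selection of hub genes when disease status is unrelated to the network.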
Affiliation(s)
- Jianchang Hu
- Institute of Medical Biometry and Statistics, University of Lübeck, Ratzeburger Allee 160, Lübeck, 23562, Germany
- Silke Szymczak
- Institute of Medical Biometry and Statistics, University of Lübeck, Ratzeburger Allee 160, Lübeck, 23562, Germany.
9
Liu C, Xiao K, Yu C, Lei Y, Lyu K, Tian T, Zhao D, Zhou F, Tang H, Zeng J. A probabilistic knowledge graph for target identification. PLoS Comput Biol 2024; 20:e1011945. [PMID: 38578805 PMCID: PMC11034645 DOI: 10.1371/journal.pcbi.1011945] [Received: 05/20/2023] [Revised: 04/22/2024] [Accepted: 02/24/2024] [Indexed: 04/07/2024] Open
Abstract
Early identification of safe and efficacious disease targets is crucial to alleviating the tremendous cost of drug discovery projects. However, existing experimental methods for identifying new targets are generally labor-intensive and failure-prone. On the other hand, computational approaches, especially machine learning-based frameworks, have shown remarkable application potential in drug discovery. In this work, we propose Progeni, a novel machine learning-based framework for target identification. In addition to fully exploiting the known heterogeneous biological networks from various sources, Progeni integrates literature evidence about the relations between biological entities to construct a probabilistic knowledge graph. Graph neural networks are then employed in Progeni to learn the feature embeddings of biological entities to facilitate the identification of biologically relevant target candidates. A comprehensive evaluation of Progeni demonstrated its superior predictive power over the baseline methods on the target identification task. In addition, our extensive tests showed that Progeni exhibited high robustness to the negative effect of exposure bias, a common phenomenon in recommendation systems, and effectively identified new targets that can be strongly supported by the literature. Moreover, our wet lab experiments successfully validated the biological significance of the top target candidates predicted by Progeni for melanoma and colorectal cancer. All these results suggested that Progeni can identify biologically effective targets and thus provide a powerful and useful tool for advancing the drug discovery process.
Affiliation(s)
- Chang Liu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Kaimin Xiao
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Joint Graduate Program of Peking-Tsinghua-NIBS, School of Life Sciences, Tsinghua University, Beijing, China
- Cuinan Yu
- Machine Learning Department, Silexon AI Technology Co., Ltd., Nanjing, Jiangsu Province, China
- Yipin Lei
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Kangbo Lyu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Tingzhong Tian
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Fengfeng Zhou
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, Jilin Province, China
- Haidong Tang
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Jianyang Zeng
- School of Engineering, Westlake University, Hangzhou, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
- Research Center for Industries of the Future and School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
10
Gravel B, Renaux A, Papadimitriou S, Smits G, Nowé A, Lenaerts T. Prioritization of oligogenic variant combinations in whole exomes. Bioinformatics 2024; 40:btae184. [PMID: 38603604 PMCID: PMC11037482 DOI: 10.1093/bioinformatics/btae184] [Received: 07/13/2023] [Revised: 01/29/2024] [Accepted: 04/10/2024] [Indexed: 04/13/2024] Open
Abstract
MOTIVATION: Whole exome sequencing (WES) has emerged as a powerful tool for genetic research, enabling the collection of a tremendous amount of data about human genetic variation. However, properly identifying which variants are causative of a genetic disease remains an important challenge, often due to the number of variants that need to be screened. Expanding the screening to combinations of variants in two or more genes, as would be required under the oligogenic inheritance model, simply blows this problem out of proportion.
RESULTS: We present here the High-throughput oligogenic prioritizer (Hop), a novel prioritization method that uses direct oligogenic information at the variant, gene and gene pair level to detect digenic variant combinations in WES data. This method leverages information from a knowledge graph, together with specialized pathogenicity predictions in order to effectively rank variant combinations based on how likely they are to explain the patient's phenotype. The performance of Hop is evaluated in cross-validation on 36 120 synthetic exomes for training and 14 280 additional synthetic exomes for independent testing. Whereas the known pathogenic variant combinations are found in the top 20 in approximately 60% of the cross-validation exomes, 71% are found in the same ranking range when considering the independent set. These results provide a significant improvement over alternative approaches that depend simply on a monogenic assessment of pathogenicity, including early attempts for digenic ranking using monogenic pathogenicity scores.
AVAILABILITY AND IMPLEMENTATION: Hop is available at https://github.com/oligogenic/HOP.
Affiliation(s)
- Barbara Gravel
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
- Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Alexandre Renaux
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
- Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Sofia Papadimitriou
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
- Brussels Interuniversity Genomics High Throughput core (BRIGHTcore), UZ Brussel, Vrije Universiteit Brussel (VUB) - Université Libre de Bruxelles (ULB), 1090 Brussels, Belgium
- Guillaume Smits
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Center of Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, 1070 Brussels, Belgium
- Ann Nowé
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Tom Lenaerts
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
- Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
11
Kuang H, Zhang Z, Zeng B, Liu X, Zuo H, Xu X, Wang L. A novel microbe-drug association prediction model based on graph attention networks and bilayer random forest. BMC Bioinformatics 2024; 25:78. [PMID: 38378437 PMCID: PMC10877932 DOI: 10.1186/s12859-024-05687-9] [Received: 10/07/2023] [Accepted: 01/31/2024] [Indexed: 02/22/2024] Open
Abstract
BACKGROUND In recent years, the extensive use of drugs and antibiotics has led to increasing microbial resistance, so it has become crucial to explore the deep connections between drugs and microbes. However, traditional biological experiments are very expensive and time-consuming. It is therefore worthwhile to develop efficient computational models to forecast potential microbe-drug associations. RESULTS In this manuscript, we propose a novel prediction model called GARFMDA, which combines graph attention networks and a bilayer random forest to infer probable microbe-drug correlations. In GARFMDA, we first construct two different microbe-drug networks by integrating different microbe-drug-disease correlation indices. Then, based on multiple measures of similarity, we construct a unique feature matrix for drugs and for microbes. Next, we feed these newly obtained microbe-drug networks, together with the feature matrices, into the graph attention network to extract low-dimensional feature representations for drugs and microbes separately. These low-dimensional feature representations, along with the feature matrices, are then input into the first layer of the bilayer random forest model to obtain the contribution values of all features. After features with low contribution values are removed, these contribution values are fed into the second layer of the bilayer random forest to detect potential links between microbes and drugs. CONCLUSIONS Experimental results and case studies show that GARFMDA can achieve better prediction performance than state-of-the-art approaches, which suggests that GARFMDA may be a useful tool for microbe-drug association prediction. The source code of GARFMDA is available at https://github.com/KuangHaiYue/GARFMDA.git.
Affiliation(s)
- Haiyue Kuang
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
| | - Zhen Zhang
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China.
| | - Bin Zeng
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China.
| | - Xin Liu
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China.
| | - Hao Zuo
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
| | - Xingye Xu
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
| | - Lei Wang
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China.
| |
|
12
|
Baptista A, Brière G, Baudot A. Random walk with restart on multilayer networks: from node prioritisation to supervised link prediction and beyond. BMC Bioinformatics 2024; 25:70. [PMID: 38355439 PMCID: PMC10865648 DOI: 10.1186/s12859-024-05683-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 01/29/2024] [Indexed: 02/16/2024] Open
Abstract
BACKGROUND Biological networks have proven invaluable for representing biological knowledge. Multilayer networks, which gather different types of nodes and edges in multiplex, heterogeneous and bipartite networks, provide a natural way to integrate diverse and multi-scale data sources into a common framework. Recently, we developed MultiXrank, a Random Walk with Restart algorithm able to explore such multilayer networks. MultiXrank outputs scores reflecting the proximity between an initial set of seed node(s) and all the other nodes in the multilayer network. We illustrate here the versatility of bioinformatics tasks that can be performed using MultiXrank. RESULTS We first show that MultiXrank can be used to prioritise genes and drugs of interest by exploring multilayer networks containing interactions between genes, drugs, and diseases. In a second study, we illustrate how MultiXrank scores can also be used in a supervised strategy to train a binary classifier to predict gene-disease associations. The classifier's performance is validated using outdated and novel gene-disease associations for training and evaluation, respectively. Finally, we show that MultiXrank scores can be used to compute diffusion profiles and use them as disease signatures. We computed the diffusion profiles of more than 100 immune diseases using a multilayer network that includes cell-type specific genomic information. The clustering of the immune disease diffusion profiles reveals shared phenotypic characteristics. CONCLUSION Overall, we illustrate here diverse applications of MultiXrank to showcase its versatility. We expect that this can lead to further and broader bioinformatics applications.
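The proximity scores described in this abstract come from a Random Walk with Restart. The following is a minimal single-layer sketch of that general technique, not MultiXrank itself, which operates on multilayer networks. The function name `random_walk_with_restart`, the dict-of-dicts graph format, the restart probability of 0.7, and the toy three-node graph are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> {neighbour: weight}; assumed symmetric
               (undirected) with every node having at least one edge.
    seeds:     iterable of seed nodes; the walker restarts uniformly among them.
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    seeds = set(seeds)
    if not seeds or not seeds.issubset(nodes):
        raise ValueError("seeds must be a non-empty subset of the graph nodes")

    # Total edge weight leaving each node, used to turn weights into
    # transition probabilities.
    strength = {u: sum(adjacency[u].values()) for u in nodes}
    if any(s <= 0 for s in strength.values()):
        raise ValueError("every node needs at least one positively weighted edge")

    # Restart distribution: uniform over the seeds, zero elsewhere.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        # Restart term, then add the mass that diffuses along the edges.
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / strength[u]
            for v, w in adjacency[u].items():
                nxt[v] += share * w
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Toy path graph a - b - c with unit weights, seeded at "a".
    graph = {
        "a": {"b": 1.0},
        "b": {"a": 1.0, "c": 1.0},
        "c": {"b": 1.0},
    }
    for node, score in random_walk_with_restart(graph, {"a"}).items():
        print(node, round(score, 4))
```

In this toy graph the scores decrease with distance from the seed, which is the property that allows such scores to rank candidate nodes. A multilayer variant would additionally need transition rules between layers, which this sketch does not attempt.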
Affiliation(s)
- Anthony Baptista
- School of Mathematical Sciences, Queen Mary University of London, London, UK.
- The Alan Turing Institute, London, UK.
| | | | - Anaïs Baudot
- INSERM, MMG, Turing Center for Living Systems, Aix-Marseille Univ, Marseille, France.
- Barcelona Supercomputing Center, Barcelona, Spain.
| |
|
13
|
Zhang P, Zhang W, Sun W, Xu J, Hu H, Wang L, Wong L. Identification of gene biomarkers for brain diseases via multi-network topological semantics extraction and graph convolutional network. BMC Genomics 2024; 25:175. [PMID: 38350848 PMCID: PMC10865627 DOI: 10.1186/s12864-024-09967-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 01/03/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND Brain diseases pose a significant threat to human health, and various network-based methods have been proposed for identifying gene biomarkers associated with these diseases. However, the brain is a complex system, and extracting topological semantics from different brain networks is necessary yet challenging to identify pathogenic genes for brain diseases. RESULTS In this study, we present a multi-network representation learning framework called M-GBBD for the identification of gene biomarkers in brain diseases. Specifically, we collected multi-omics data to construct eleven networks from different perspectives. M-GBBD extracts the spatial distributions of features from these networks and iteratively optimizes them using Kullback-Leibler divergence to fuse the networks into a common semantic space that represents the gene network for the brain. Subsequently, a graph consisting of both gene and large-scale disease proximity networks learns representations through graph convolution techniques and predicts whether a gene is associated with brain diseases while providing association scores. Experimental results demonstrate that M-GBBD outperforms several baseline methods. Furthermore, our bioinformatics-supported analysis revealed CAMP as a gene significantly associated with Alzheimer's disease identified by M-GBBD. CONCLUSION Collectively, M-GBBD provides valuable insights into identifying gene biomarkers for brain diseases and serves as a promising framework for brain network representation learning.
Affiliation(s)
- Ping Zhang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, Shandong, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Weihan Zhang
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design, Chinese Academy of Sciences, Hubei Hongshan Laboratory, Wuhan, 430074, China
| | - Weicheng Sun
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Jinsheng Xu
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Hua Hu
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, Shandong, China.
| | - Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, Shandong, China.
- Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Guangxi Academy of Sciences, Nanning, 530007, China.
| | - Leon Wong
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, 518118, China.
| |
|
14
|
Li Z, Liu G, Yang X, Shu M, Jin W, Tong Y, Liu X, Wang Y, Yuan J, Yang Y. An atlas of cell-type-specific interactome networks across 44 human tumor types. Genome Med 2024; 16:30. [PMID: 38347596 PMCID: PMC10860273 DOI: 10.1186/s13073-024-01303-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 02/06/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND Biological processes are controlled by groups of genes acting in concert. Investigating gene-gene interactions within different cell types can help researchers understand the regulatory mechanisms behind human complex diseases, such as tumors. METHODS We collected extensive single-cell RNA-seq data from tumors, involving 563 patients with 44 different tumor types. Through our analysis, we identified various cell types in tumors and created an atlas of different immune cell subsets across different tumor types. Using the SCINET method, we reconstructed interactome networks specific to different cell types. Diverse functional data were then integrated to gain biological insights into the networks, including somatic mutation patterns and gene functional annotation. Additionally, genes with prognostic relevance within the networks were identified. We also examined cell-cell communications to investigate how gene interactions modulate cell-cell interactions. RESULTS We developed a data portal called CellNetdb for researchers to study cell-type-specific interactome networks. Our findings indicate that these networks can be used to identify genes with topological specificity in different cell types. We also found that prognostic genes can be deconvolved into cell types by analyzing network connectivity. Additionally, we identified commonalities and differences in cell-type-specific networks across different tumor types. Our results suggest that these networks can be used to prioritize risk genes. CONCLUSIONS This study presented CellNetdb, a comprehensive repository featuring an atlas of cell-type-specific interactome networks across 44 human tumor types. The findings underscore the utility of these networks in delineating the intricacies of tumor microenvironments and advancing the understanding of molecular mechanisms underpinning human tumors.
Affiliation(s)
- Zekun Li
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Gerui Liu
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Xiaoxiao Yang
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Meng Shu
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Wen Jin
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Yang Tong
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Xiaochuan Liu
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Yuting Wang
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
| | - Jiapei Yuan
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, 300020, China.
- Tianjin Institutes of Health Science, Tianjin, 301600, China.
| | - Yang Yang
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China.
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China.
| |
|
15
|
Kong X, Diao L, Jiang P, Nie S, Guo S, Li D. DDK-Linker: a network-based strategy identifies disease signals by linking high-throughput omics datasets to disease knowledge. Brief Bioinform 2024; 25:bbae111. [PMID: 38517698 PMCID: PMC10959161 DOI: 10.1093/bib/bbae111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 02/26/2024] [Accepted: 02/27/2024] [Indexed: 03/24/2024] Open
Abstract
High-throughput genomic and proteomic scanning approaches allow investigators to quantify genes (or gene products) genome-wide for certain disease conditions, which plays an essential role in promoting the discovery of disease mechanisms. These approaches often generate a large list of genes of interest (GOIs), such as differentially expressed genes/proteins. However, researchers have to perform manual triage and validation to explore the most promising, biologically plausible linkages between the known disease genes and GOIs (disease signals) for further study. Here, to address this challenge, we propose a network-based strategy, DDK-Linker, to facilitate the exploration of disease signals hidden in omics data by linking GOIs to known disease genes. Specifically, it reconstructs gene distances in the protein-protein interaction (PPI) network through six network methods (random walk with restart, Deepwalk, Node2Vec, LINE, HOPE, Laplacian) to discover disease signals in omics data that have shorter distances to disease genes. Furthermore, drawing on a knowledge base we established, abundant bioinformatics annotations are provided for each candidate disease signal. To assist in omics data interpretation and facilitate usage, we have developed this strategy into an application that users can access through a website or download as an R package. We believe DDK-Linker will accelerate the exploration of disease genes and drug targets in a variety of omics data, such as genomics, transcriptomics and proteomics data, and provide clues for complex disease mechanism and pharmacological research. DDK-Linker is freely accessible at http://ddklinker.ncpsb.org.cn/.
Affiliation(s)
- Xiangren Kong
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Lihong Diao
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Peng Jiang
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Shiyan Nie
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Shuzhen Guo
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Dong Li
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| |
|
16
|
Yu Z, Wu Z, Wang Z, Wang Y, Zhou M, Li W, Liu G, Tang Y. Network-Based Methods and Their Applications in Drug Discovery. J Chem Inf Model 2024; 64:57-75. [PMID: 38150548 DOI: 10.1021/acs.jcim.3c01613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
Drug discovery is time-consuming, expensive, and predominantly follows the "one drug → one target → one disease" paradigm. With the rapid development of systems biology and network pharmacology, a novel drug discovery paradigm, "multidrug → multitarget → multidisease", has emerged. This new holistic paradigm of drug discovery aligns well with the essence of networks, leading to the emergence of network-based methods in the field of drug discovery. In this Perspective, we first introduce the concept and data sources of networks and highlight classical methodologies employed in network-based methods. Subsequently, we focus on the practical applications of network-based methods across various areas of drug discovery, such as target prediction, virtual screening, prediction of drug therapeutic effects or adverse drug events, and elucidation of molecular mechanisms. In addition, we provide representative web servers for researchers to use network-based methods in specific applications. Finally, we discuss several challenges of network-based methods and the directions for future development. In short, network-based methods could serve as powerful tools to accelerate drug discovery.
Affiliation(s)
- Zhuohang Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Zengrui Wu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Ze Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Yimeng Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Moran Zhou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| |
|
17
|
Singh V, Singh V. Inferring Interaction Networks from Transcriptomic Data: Methods and Applications. Methods Mol Biol 2024; 2812:11-37. [PMID: 39068355 DOI: 10.1007/978-1-0716-3886-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Transcriptomic data is a treasure trove in modern molecular biology, as it offers a comprehensive viewpoint into the intricate nuances of gene expression dynamics underlying biological systems. This genetic information must be utilized to infer biomolecular interaction networks that can provide insights into the complex regulatory mechanisms underpinning the dynamic cellular processes. Gene regulatory networks and protein-protein interaction networks are two major classes of such networks. This chapter thoroughly investigates the wide range of methodologies used for distilling insights from transcriptomic data, including association-based methods (based on correlation among expression vectors), probabilistic models (using Bayesian and Gaussian models), and interologous methods. We review different approaches for evaluating the significance of interactions based on the network topology and biological functions of the interacting molecules and discuss various strategies for the identification of functional modules. The chapter concludes by highlighting network-based techniques for prioritizing key genes, outlining the centrality-based, diffusion-based, and subgraph-based methods. The chapter provides a meticulous framework for investigating transcriptomic data to uncover the assembly of complex molecular networks for their adaptable analyses across a broad spectrum of biological domains.
Affiliation(s)
- Vikram Singh
- Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, India
| | - Vikram Singh
- Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, India.
| |
|
18
|
Li G, Zeng F, Luo J, Liang C, Xiao Q. MNCLCDA: predicting circRNA-drug sensitivity associations by using mixed neighbourhood information and contrastive learning. BMC Med Inform Decis Mak 2023; 23:291. [PMID: 38110886 PMCID: PMC10729363 DOI: 10.1186/s12911-023-02384-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 12/01/2023] [Indexed: 12/20/2023] Open
Abstract
BACKGROUND circRNAs play an important role in drug resistance and cancer development. Recently, many studies have shown that the expressions of circRNAs in human cells can affect the sensitivity of cells to therapeutic drugs, thus significantly influencing the therapeutic effects of these drugs. Traditional biomedical experiments required to verify this sensitivity relationship are not only time-consuming but also expensive. Hence, the development of an efficient computational approach that can accurately predict the novel associations between drug sensitivities and circRNAs is a crucial and pressing need. METHODS In this research, we present a novel computational framework called MNCLCDA, which aims to predict the potential associations between drug sensitivities and circRNAs to assist with medical research. First, MNCLCDA quantifies the similarity between the given drug and circRNA using drug structure information, circRNA gene sequence information, and GIP kernel information. Due to the existence of noise in similarity information, we employ a preprocessing approach based on random walk with restart for similarity networks to efficiently capture the useful features of circRNAs and drugs. Second, we use a mixed neighbourhood graph convolutional network to obtain the neighbourhood information of nodes. Then, a graph-based contrastive learning method is used to enhance the robustness of the model, and finally, a double Laplace-regularized least-squares method is used to predict potential circRNA-drug associations through the kernel matrices in the circRNA and drug spaces. RESULTS Numerous experimental results show that MNCLCDA outperforms six other advanced methods. In addition, the excellent performance of our proposed model in case studies illustrates that MNCLCDA also has the ability to predict the associations between drug sensitivity and circRNA in practical situations. 
CONCLUSIONS A large number of experiments illustrate that MNCLCDA is an efficient tool for predicting the potential associations between drug sensitivities and circRNAs, and can thereby provide some guidance for clinical trials.
Affiliation(s)
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China.
| | - Feifan Zeng
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| |
Collapse
|
19
|
Su D, Xiong Y, Wang S, Wei H, Ke J, Li H, Wang T, Zuo Y, Yang L. Structural deep clustering network for stratification of breast cancer patients through integration of somatic mutation profiles. Comput Methods Programs Biomed 2023; 242:107808. [PMID: 37716222 DOI: 10.1016/j.cmpb.2023.107808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/15/2023] [Accepted: 09/10/2023] [Indexed: 09/18/2023]
Abstract
BACKGROUND AND OBJECTIVE Breast cancer is among the most malignant tumors occurring in women and is one of the leading causes of death from gynecologic malignancy worldwide. The high degree of heterogeneity that characterizes breast cancer makes it challenging to devise effective therapeutic strategies. Accumulating evidence highlights the crucial role of stratifying breast cancer patients into clinically significant subtypes to achieve better prognoses and treatments. The structural deep clustering network is a graph convolutional network-based clustering algorithm that integrates structural information and has achieved state-of-the-art performance in various applications. METHODS In this study, we employed a structural deep clustering network to integrate somatic mutation profiles for stratifying 2526 breast cancer patients from the Memorial Sloan Kettering Cancer Center into two clinically differentiable subtypes. RESULTS Breast cancer patients in cluster 1 exhibited better prognosis than breast cancer patients in cluster 2, and the difference between them was statistically significant. The immunogenomic landscape further demonstrated that cluster 1 was associated with remarkable infiltration of tumor-infiltrating lymphocytes. The clustering subtype could be used to evaluate the therapeutic benefit of immunotherapy and chemotherapy in breast cancer patients. Furthermore, our approach effectively classified patients from eight different cancer types, demonstrating its generalizability. CONCLUSIONS Our study represents a step towards a generic methodology for classifying cancer patients using only somatic mutation data and structural deep clustering network approaches. Employing a structural deep clustering network to identify breast cancer subtypes is promising and can inform the development of more accurate and personalized therapies.
Affiliation(s)
- Dongqing Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Yuqiang Xiong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Haodong Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Jiawei Ke
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Honghao Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Tao Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Yongchun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China; Digital College, Inner Mongolia Intelligent Union Big Data Academy, Inner Mongolia Wesure Date Technology Co., Ltd. Hohhot, 010010, China; Inner Mongolia International Mongolian Hospital, Hohhot 010065, China
| | - Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
| |
|
20
|
Zou M, Li H, Su D, Xiong Y, Wei H, Wang S, Sun H, Wang T, Xi Q, Zuo Y, Yang L. Integrating somatic mutation profiles with structural deep clustering network for metabolic stratification in pancreatic cancer: a comprehensive analysis of prognostic and genomic landscapes. Brief Bioinform 2023; 25:bbad430. [PMID: 38040491 PMCID: PMC10783866 DOI: 10.1093/bib/bbad430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 09/29/2023] [Accepted: 11/05/2023] [Indexed: 12/03/2023] Open
Abstract
Pancreatic cancer is a globally recognized highly aggressive malignancy, posing a significant threat to human health and characterized by pronounced heterogeneity. In recent years, researchers have uncovered that the development and progression of cancer are often attributed to the accumulation of somatic mutations within cells. However, cancer somatic mutation data exhibit characteristics such as high dimensionality and sparsity, which pose new challenges in utilizing these data effectively. In this study, we propagated the discrete somatic mutation data of pancreatic cancer through a network propagation model based on protein-protein interaction networks. This resulted in smoothed somatic mutation profile data that incorporate protein network information. Based on this smoothed mutation profile data, we obtained the activity levels of different metabolic pathways in pancreatic cancer patients. Subsequently, using the activity levels of various metabolic pathways in cancer patients, we employed a deep clustering algorithm to establish biologically and clinically relevant metabolic subtypes of pancreatic cancer. Our study holds scientific significance in classifying pancreatic cancer based on somatic mutation data and may provide a crucial theoretical basis for the diagnosis and immunotherapy of pancreatic cancer patients.
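The smoothing step described in this abstract can be illustrated with a generic network propagation (random walk with restart) update, F ← αWF + (1 − α)F₀. The following is a minimal sketch under stated assumptions, not the authors' implementation: the toy four-node path graph, the restart weight `alpha = 0.5`, and the function name `propagate` are all hypothetical, and the abstract does not say which normalisation or parameters were used.

```python
def propagate(adjacency, seed, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W @ F + (1 - alpha) * seed until convergence.

    W is the column-normalised adjacency matrix, so each node spreads its
    current score evenly among its neighbours (an assumed normalisation).
    """
    n = len(adjacency)
    degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    w = [[adjacency[i][j] / degree[j] if degree[j] else 0.0 for j in range(n)]
         for i in range(n)]
    scores = list(seed)
    for _ in range(max_iter):
        updated = [
            alpha * sum(w[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * seed[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Toy protein-protein interaction network: a path A - B - C - D (hypothetical).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Binary mutation profile for one patient: only gene A is mutated.
seed = [1.0, 0.0, 0.0, 0.0]

scores = propagate(adjacency, seed)
for gene, score in zip("ABCD", scores):
    print(gene, round(score, 4))
```

The sparse binary profile becomes a dense vector in which unmutated genes receive a score that decays with network distance from the mutated gene, which is the sense in which the profile is "smoothed".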
Affiliation(s)
- Min Zou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Honghao Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Dongqing Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Yuqiang Xiong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Haodong Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Hongmei Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Tao Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Qilemuge Xi
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
| | - Yongchun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Digital College, Inner Mongolia Intelligent Union Big Data Academy, Inner Mongolia Wesure Date Technology Co., Ltd. Hohhot 010010, China
- Inner Mongolia International Mongolian Hospital, Hohhot 010065, China
| | - Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| |
|
21
|
Chen L, Zhao X. PCDA-HNMP: Predicting circRNA-disease association using heterogeneous network and meta-path. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:20553-20575. [PMID: 38124565 DOI: 10.3934/mbe.2023909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Increasing amounts of experimental studies have shown that circular RNAs (circRNAs) play important regulatory roles in human diseases through interactions with related microRNAs (miRNAs). CircRNAs have become new potential disease biomarkers and therapeutic targets. Predicting circRNA-disease association (CDA) is of great significance for exploring the pathogenesis of complex diseases, which can improve the diagnosis level of diseases and promote the targeted therapy of diseases. However, determination of CDAs through traditional clinical trials is usually time-consuming and expensive. Computational methods are now alternative ways to predict CDAs. In this study, a new computational method, named PCDA-HNMP, was designed. For obtaining informative features of circRNAs and diseases, a heterogeneous network was first constructed, which defined circRNAs, mRNAs, miRNAs and diseases as nodes and associations between them as edges. Then, a deep analysis was conducted on the heterogeneous network by extracting meta-paths connecting to circRNAs (diseases), thereby mining hidden associations between various circRNAs (diseases). These associations constituted the meta-path-induced networks for circRNAs and diseases. The features of circRNAs and diseases were derived from the aforementioned networks via mashup. On the other hand, miRNA-disease associations (mDAs) were employed to improve the model's performance. miRNA features were yielded from the meta-path-induced networks on miRNAs and circRNAs, which were constructed from the meta-paths connecting miRNAs and circRNAs in the heterogeneous network. A concatenation operation was adopted to build the features of CDAs and mDAs. Such representations of CDAs and mDAs were fed into XGBoost to set up the model. The five-fold cross-validation yielded an area under the curve (AUC) of 0.9846, which was better than those of some existing state-of-the-art methods. 
The use of mDAs enhanced the model's performance, and the importance analysis of the meta-path-induced networks showed that networks produced by meta-paths containing validated CDAs made the most important contributions.
Affiliation(s)
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Xiaoyu Zhao
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| |
|
22
|
Ratajczak F, Joblin M, Hildebrandt M, Ringsquandl M, Falter-Braun P, Heinig M. Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases. Nat Commun 2023; 14:7206. [PMID: 37938585 PMCID: PMC10632370 DOI: 10.1038/s41467-023-42975-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 10/27/2023] [Indexed: 11/09/2023] Open
Abstract
Understanding phenotype-to-genotype relationships is a grand challenge of 21st century biology with translational implications. The recently proposed "omnigenic" model postulates that effects of genetic variation on traits are mediated by core-genes and -proteins whose activities mechanistically influence the phenotype, whereas peripheral genes encode a regulatory network that indirectly affects phenotypes via core gene products. Here, we develop a positive-unlabeled graph representation-learning ensemble-approach based on a nested cross-validation to predict core-like genes for diverse diseases using Mendelian disorder genes for training. Employing mouse knockout phenotypes for external validations, we demonstrate that core-like genes display several key properties of core genes: Mouse knockouts of genes corresponding to our most confident predictions give rise to relevant mouse phenotypes at rates on par with the Mendelian disorder genes, and all candidates exhibit core gene properties like transcriptional deregulation in disease and loss-of-function intolerance. Moreover, as predicted for core genes, our candidates are enriched for drug targets and druggable proteins. In contrast to Mendelian disorder genes the new core-like genes are enriched for druggable yet untargeted gene products, which are therefore attractive targets for drug development. Interpretation of the underlying deep learning model suggests plausible explanations for our core gene predictions in form of molecular mechanisms and physical interactions. Our results demonstrate the potential of graph representation learning for the interpretation of biological complexity and pave the way for studying core gene properties and future drug development.
Affiliation(s)
- Florin Ratajczak
- Institute of Network Biology (INET), Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich, Neuherberg, Germany
| | | | | | | | - Pascal Falter-Braun
- Institute of Network Biology (INET), Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich, Neuherberg, Germany.
- Microbe-Host Interactions, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany.
| | - Matthias Heinig
- Institute of Computational Biology (ICB), Helmholtz Munich, Neuherberg, Germany.
- Department of Computer Science, TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- German Centre for Cardiovascular Research (DZHK), Munich Heart Association, Partner Site Munich, Berlin, Germany.
| |
|
23
|
Xu J, Pang B, Lan Y, Dou R, Wang S, Kang S, Zhang W, Liu Y, Zhang Y, Ping Y. Identifying the personalized driver gene sets maximally contributing to abnormality of transcriptome phenotype in glioblastoma multiforme individuals. Mol Oncol 2023; 17:2472-2490. [PMID: 37491836 PMCID: PMC10620122 DOI: 10.1002/1878-0261.13499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 06/21/2023] [Accepted: 07/24/2023] [Indexed: 07/27/2023] Open
Abstract
High heterogeneity in the genomes and phenotypes of cancer populations makes it difficult to apply population-based common driver genes to the diagnosis and treatment of individual cancer patients. Characterizing and identifying the personalized driver mechanisms of individuals with glioblastoma multiforme (GBM) is pivotal for the realization of precision medicine. We proposed an integrative method to identify personalized driver gene sets by integrating the profiles of gene expression and genetic alterations in cancer individuals. This method coupled a genetic algorithm with a random walk to identify the optimal gene sets that could explain abnormality of the transcriptome phenotype to the maximum extent. The personalized driver gene sets were identified for 99 GBM individuals using our method. We found that genomic alterations in between one and seven driver genes could maximally and cumulatively explain the dysfunction of cancer hallmarks across GBM individuals. The driver gene sets were distinct even in GBM individuals with significantly similar transcriptomic phenotypes. Our method identified MCM4, which carries rare genetic alterations, as a previously unknown oncogene whose high expression was significantly associated with poor GBM prognosis. The functional experiments confirmed that knockdown of MCM4 could significantly inhibit proliferation, invasion, migration, and clone formation of the GBM cell lines U251 and U118MG, and overexpression of MCM4 significantly promoted the proliferation, invasion, migration, and clone formation of the GBM cell line U87MG. Our method could dissect the personalized driver genetic alteration sets that are pivotal for developing targeted therapy strategies and precision medicine. Our method could be extended to identify key drivers from other levels and could be applied to more cancer types.
Affiliation(s)
- Jinyuan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Bo Pang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yujia Lan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Renjie Dou
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Shuai Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Shaobo Kang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Wanmei Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yuanyuan Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yijing Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yanyan Ping
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| |
|
24
|
Han X, Wang B, Situ C, Qi Y, Zhu H, Li Y, Guo X. scapGNN: A graph neural network-based framework for active pathway and gene module inference from single-cell multi-omics data. PLoS Biol 2023; 21:e3002369. [PMID: 37956172 PMCID: PMC10681325 DOI: 10.1371/journal.pbio.3002369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 11/27/2023] [Accepted: 10/07/2023] [Indexed: 11/15/2023] Open
Abstract
Although advances in single-cell technologies have enabled the characterization of multiple omics profiles in individual cells, extracting functional and mechanistic insights from such information remains a major challenge. Here, we present scapGNN, a graph neural network (GNN)-based framework that creatively transforms sparse single-cell profile data into the stable gene-cell association network for inferring single-cell pathway activity scores and identifying cell phenotype-associated gene modules from single-cell multi-omics data. Systematic benchmarking demonstrated that scapGNN was more accurate, robust, and scalable than state-of-the-art methods in various downstream single-cell analyses such as cell denoising, batch effect removal, cell clustering, cell trajectory inference, and pathway or gene module identification. scapGNN was developed as a systematic R package that can be flexibly extended and enhanced for existing analysis processes. It provides a new analytical platform for studying single cells at the pathway and network levels.
Affiliation(s)
- Xudong Han
- State Key Laboratory of Reproductive Medicine and Offspring Health, School of Medicine, Southeast University, Nanjing, China
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Bing Wang
- State Key Laboratory of Reproductive Medicine and Offspring Health, School of Medicine, Southeast University, Nanjing, China
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Chenghao Situ
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Yaling Qi
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Hui Zhu
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Yan Li
- Department of Clinical Laboratory, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
| | - Xuejiang Guo
- State Key Laboratory of Reproductive Medicine and Offspring Health, School of Medicine, Southeast University, Nanjing, China
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| |
|
25
|
Singh V, Singh V. Characterizing the circadian connectome of Ocimum tenuiflorum using an integrated network theoretic framework. Sci Rep 2023; 13:13108. [PMID: 37567911 PMCID: PMC10421869 DOI: 10.1038/s41598-023-40212-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 08/07/2023] [Indexed: 08/13/2023] Open
Abstract
Across the three domains of life, the circadian clock is known to regulate vital physiological processes such as growth, development, and defence by anticipating environmental cues. In this work, we report an integrated network theoretic methodology comprising random walk with restart and graphlet degree vectors to characterize genome-wide core circadian clock and clock-associated raw candidate proteins in a plant for which protein interaction information is available. As a case study, we have implemented this framework in Ocimum tenuiflorum (Tulsi), one of the most valuable medicinal plants, which has been utilized since ancient times in the management of a large number of diseases. For that, 24 core clock (CC) proteins were mined in 56 template plant genomes to build their hidden Markov models (HMMs). These HMMs were then used to identify 24 core clock proteins in O. tenuiflorum. The local topology of the interologous Tulsi protein interaction network was explored to predict the CC-associated raw candidate proteins. Statistical and biological significance of the raw candidates was determined using permutation and enrichment tests. A total of 66 putative CC-associated proteins were identified and their functional annotation was performed.
Affiliation(s)
- Vikram Singh
- Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, 176206, India
| | - Vikram Singh
- Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, 176206, India.
| |
|
26
|
Li X, Yuan H, Wu X, Wang C, Wu M, Shi H, Lv Y. MultiDS-MDA: Integrating multiple data sources into heterogeneous network for predicting novel metabolite-drug associations. Comput Biol Med 2023; 162:107067. [PMID: 37276756 DOI: 10.1016/j.compbiomed.2023.107067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/07/2023]
Abstract
Metabolic processes in the human body play an important role in maintaining normal life activities, and the abnormal concentration of metabolites is closely related to the occurrence and development of diseases. The use of drugs is considered to have a major impact on metabolism, and drug metabolites can contribute to efficacy, drug toxicity and drug-drug interaction. However, our understanding of metabolite-drug associations is far from complete, and individual data source tends to be incomplete and noisy. Therefore, the integration of various types of data sources for inferring reliable metabolite-drug associations is urgently needed. In this study, we proposed a computational framework, MultiDS-MDA, for identifying metabolite-drug associations by integrating multiple data sources, including chemical structure information of metabolites and drugs, the relationships of metabolite-gene, metabolite-disease, drug-gene and drug-disease, the data of gene ontology (GO) and disease ontology (DO) and known metabolite-drug connections. The performance of MultiDS-MDA was evaluated by 5-fold cross-validation, which achieved an area under the ROC curve (AUROC) of 0.911 and an area under the precision-recall curve (AUPRC) of 0.907. Additionally, MultiDS-MDA showed outstanding performance compared with similar approaches. Case studies for three metabolites (cholesterol, thromboxane B2 and coenzyme Q10) and three drugs (simvastatin, pravastatin and morphine) also demonstrated the reliability and efficiency of MultiDS-MDA, and it is anticipated that MultiDS-MDA will serve as a powerful tool for future exploration of metabolite-drug interactions and contribute to drug development and drug combination.
Affiliation(s)
- Xiuhong Li
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Hao Yuan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Xiaoliang Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Chengyi Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Meitao Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Hongbo Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, China.
| | - Yingli Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, China.
| |
|
27
|
Mastropietro A, De Carlo G, Anagnostopoulos A. XGDAG: explainable gene-disease associations via graph neural networks. Bioinformatics 2023; 39:btad482. [PMID: 37531293 PMCID: PMC10421968 DOI: 10.1093/bioinformatics/btad482] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/11/2023] [Revised: 06/27/2023] [Accepted: 08/01/2023] [Indexed: 08/04/2023] Open
Abstract
MOTIVATION Disease gene prioritization consists in identifying genes that are likely to be involved in the mechanisms of a given disease, providing a ranking of such genes. Recently, the research community has used computational methods to uncover unknown gene-disease associations; these methods range from combinatorial to machine learning-based approaches. In particular, during the last years, approaches based on deep learning have provided superior results compared to more traditional ones. Yet, the problem with these is their inherent black-box structure, which prevents interpretability. RESULTS We propose a new methodology for disease gene discovery, which leverages graph-structured data using graph neural networks (GNNs) along with an explainability phase for determining the ranking of candidate genes and understanding the model's output. Our approach is based on a positive-unlabeled learning strategy, which outperforms existing gene discovery methods by exploiting GNNs in a non-black-box fashion. Our methodology is effective even in scenarios where a large number of associated genes need to be retrieved, in which gene prioritization methods often tend to lose their reliability. AVAILABILITY AND IMPLEMENTATION The source code of XGDAG is available on GitHub at: https://github.com/GiDeCarlo/XGDAG. The data underlying this article are available at: https://www.disgenet.org/, https://thebiogrid.org/, https://doi.org/10.1371/journal.pcbi.1004120.s003, and https://doi.org/10.1371/journal.pcbi.1004120.s004.
Affiliation(s)
- Andrea Mastropietro
- Department of Computer, Control and Management Engineering “Antonio Ruberti”, Sapienza University of Rome, Rome 00185, Italy
| | - Gianluca De Carlo
- Department of Computer, Control and Management Engineering “Antonio Ruberti”, Sapienza University of Rome, Rome 00185, Italy
| | - Aris Anagnostopoulos
- Department of Computer, Control and Management Engineering “Antonio Ruberti”, Sapienza University of Rome, Rome 00185, Italy
| |
|
28
|
Liu X, Gao L, Peng Y, Fang Z, Wang J. PheSom: a term frequency-based method for measuring human phenotype similarity on the basis of MeSH vocabulary. Front Genet 2023; 14:1185790. [PMID: 37496714 PMCID: PMC10366691 DOI: 10.3389/fgene.2023.1185790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 06/21/2023] [Indexed: 07/28/2023] Open
Abstract
Background: Phenotype similarity calculation should be used to help improve drug repurposing. In this study, based on the MeSH terms describing the phenotypes deposited in OMIM, we proposed a method, namely, PheSom (Phenotype Similarity On MeSH), to measure the similarity between phenotypes. PheSom counted the number of overlapping MeSH terms between two phenotypes and then took the weight of every MeSH term within each phenotype into account according to the term frequency-inverse document frequency (FIDC). Phenotype-related genes were used for the evaluation of our method. Results: A 7,739 × 7,739 similarity score matrix was finally obtained and the number of phenotype pairs was dramatically decreased with the increase of similarity score. Besides, the overlapping rates of phenotype-related genes were remarkably increased with the increase of similarity score between phenotypes, which supports the reliability of our method. Conclusion: We anticipate our method can be applied to identifying novel therapeutic methods for complex diseases.
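The weighting idea in this abstract, where shared MeSH terms count for more when they are rare across phenotypes, can be sketched with standard term frequency-inverse document frequency (TF-IDF) weights. This is a minimal illustration under stated assumptions, not the PheSom formula: the abstract does not give the exact scoring function, so cosine similarity of TF-IDF vectors is swapped in here, and the three phenotypes and their term lists are made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each phenotype to a {term: tf * idf} dictionary.

    Assumes every phenotype has at least one term.
    """
    n_docs = len(docs)
    # Document frequency: in how many phenotypes does each term occur?
    doc_freq = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        counts = Counter(terms)
        vectors[name] = {
            term: (count / len(terms)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical phenotypes annotated with MeSH-like terms.
phenotypes = {
    "P1": ["Seizures", "Ataxia", "Humans"],
    "P2": ["Seizures", "Intellectual Disability", "Humans"],
    "P3": ["Cardiomyopathies", "Arrhythmias, Cardiac", "Humans"],
}

vectors = tfidf_vectors(phenotypes)
sim_12 = cosine(vectors["P1"], vectors["P2"])
sim_13 = cosine(vectors["P1"], vectors["P3"])
print("P1 vs P2:", round(sim_12, 4))
print("P1 vs P3:", round(sim_13, 4))
```

In this toy data "Humans" occurs in every phenotype, so its inverse document frequency is zero and it contributes nothing; P1 and P3 share only that term and therefore score 0, while P1 and P2 share the more specific term "Seizures" and score above 0.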
Affiliation(s)
- Xinhua Liu
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China
| | - Ling Gao
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
| | - Yonglin Peng
- Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
| | - Zhonghai Fang
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China
| | - Ju Wang
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China
| |
|
29
|
Chen L, Chen K, Zhou B. Inferring drug-disease associations by a deep analysis on drug and disease networks. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:14136-14157. [PMID: 37679129 DOI: 10.3934/mbe.2023632] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 09/09/2023]
Abstract
Drugs, which treat various diseases, are essential for human health. However, developing new drugs is quite laborious, time-consuming, and expensive. Although investments into drug development have greatly increased over the years, the number of drug approvals each year remains quite low. Drug repositioning is deemed an effective means to accelerate the procedures of drug development because it can discover novel effects of existing drugs. Numerous computational methods have been proposed in drug repositioning, some of which were designed as binary classifiers that can predict drug-disease associations (DDAs). The negative sample selection was a common defect of this method. In this study, a novel reliable negative sample selection scheme, named RNSS, is presented, which can screen out reliable pairs of drugs and diseases with low probabilities of being actual DDAs. This scheme considered information from k-neighbors of one drug in a drug network, including their associations to diseases and the drug. Then, a scoring system was set up to evaluate pairs of drugs and diseases. To test the utility of the RNSS, three classic classification algorithms (random forest, Bayes network and nearest neighbor algorithm) were employed to build classifiers using negative samples selected by the RNSS. The cross-validation results suggested that such classifiers provided a nearly perfect performance and were significantly superior to those using some traditional and previous negative sample selection schemes.
Affiliation(s)
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Kaiyu Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Bo Zhou
- Shanghai University of Medicine & Health Sciences, Shanghai 201318, China
| |
|
30
|
Xu Z, Marchionni L, Wang S. MultiNEP: a multi-omics network enhancement framework for prioritizing disease genes and metabolites simultaneously. Bioinformatics 2023; 39:btad333. [PMID: 37216914 PMCID: PMC10250081 DOI: 10.1093/bioinformatics/btad333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 04/28/2023] [Accepted: 05/19/2023] [Indexed: 05/24/2023] Open
Abstract
MOTIVATION Many studies have successfully used network information to prioritize candidate omics profiles associated with diseases. The metabolome, as the link between genotypes and phenotypes, has accumulated growing attention. Using a "multi-omics" network constructed with a gene-gene network, a metabolite-metabolite network, and a gene-metabolite network to simultaneously prioritize candidate disease-associated metabolites and gene expressions could further utilize gene-metabolite interactions that are not used when prioritizing them separately. However, the number of metabolites is usually 100 times fewer than that of genes. Without accounting for this imbalance issue, we cannot effectively use gene-metabolite interactions when simultaneously prioritizing disease-associated metabolites and genes. RESULTS Here, we developed a Multi-omics Network Enhancement Prioritization (MultiNEP) framework with a weighting scheme to reweight contributions of different sub-networks in a multi-omics network to effectively prioritize candidate disease-associated metabolites and genes simultaneously. In simulation studies, MultiNEP outperforms competing methods that do not address network imbalances and identifies more true signal genes and metabolites simultaneously when we down-weight relative contributions of the gene-gene network and up-weight that of the metabolite-metabolite network to the gene-metabolite network. Applications to two human cancer cohorts show that MultiNEP prioritizes more cancer-related genes by effectively using both within- and between-omics interactions after handling network imbalance. AVAILABILITY AND IMPLEMENTATION The developed MultiNEP framework is implemented in an R package and available at: https://github.com/Karenxzr/MultiNep.
Affiliation(s)
- Zhuoran Xu
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10065, United States
| | - Luigi Marchionni
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10065, United States
| | - Shuang Wang
- Department of Biostatistics, Columbia University, New York, NY 10032, United States
| |
|
31
|
Stein CM. Genetic epidemiology of resistance to M. tuberculosis Infection: importance of study design and recent findings. Genes Immun 2023; 24:117-123. [PMID: 37085579 PMCID: PMC10121418 DOI: 10.1038/s41435-023-00204-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 04/23/2023]
Abstract
Resistance to M. tuberculosis, often referred to as "RSTR" in the literature, is being increasingly studied because of its potential relevance as a clinical outcome in vaccine studies. This review starts by addressing the importance of epidemiological characterization of this phenotype, and ongoing challenges in that characterization. Then, this review summarizes the extant genetic and genomic studies of this phenotype, including heritability studies, candidate gene studies, and genome-wide association studies, as well as whole transcriptome studies. Findings from recent studies that used longitudinal characterization of the RSTR phenotype are compared to those using a cross-sectional definition, and the challenges of using tuberculin skin test and interferon-gamma release assay are discussed. Finally, future directions are proposed. Since this is a rapidly evolving area of public health significance, this review will help frame future research questions and study designs.
Affiliation(s)
- Catherine M Stein
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
  - Division of Infectious Diseases and HIV Medicine, Department of Medicine, Case Western Reserve University, Cleveland, OH, USA
|
32
|
Fan L, Wang L, Zhu X. A novel microbe-drug association prediction model based on stacked autoencoder with multi-head attention mechanism. Sci Rep 2023; 13:7396. [PMID: 37149692 PMCID: PMC10164153 DOI: 10.1038/s41598-023-34438-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 04/29/2023] [Indexed: 05/08/2023] Open
Abstract
Microbes are intimately tied to the occurrence of various diseases that cause serious hazards to human health, and play an essential role in drug discovery, clinical application, and drug quality control. In this manuscript, we put forward a novel prediction model named MDASAE, based on a stacked autoencoder (SAE) with a multi-head attention mechanism, to infer potential microbe-drug associations. In MDASAE, we first constructed three kinds of microbe-related and drug-related similarity matrices based on known microbe-disease-drug associations. We then fed two kinds of microbe-related and drug-related similarity matrices into the SAE to learn node attribute features, and introduced a multi-head attention mechanism into the output layer of the SAE to enhance feature extraction. Next, we used the remaining microbe and drug similarity matrices to derive inter-node features with the random walk with restart algorithm. The node attribute features and inter-node features of microbes and drugs were then fused to predict scores of possible associations between microbes and drugs. Finally, intensive comparison experiments and case studies based on well-known public databases, under 5-fold and 10-fold cross-validation, showed that MDASAE can effectively predict potential microbe-drug associations.
Affiliation(s)
- Liu Fan
  - College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421010, China
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China
- Lei Wang
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Xianyou Zhu
  - College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421010, China
|
33
|
Meng P, Wang G, Guo H, Jiang T. Identifying cancer driver genes using a two-stage random walk with restart on a gene interaction network. Comput Biol Med 2023; 158:106810. [PMID: 37011433 DOI: 10.1016/j.compbiomed.2023.106810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 03/08/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023]
Abstract
Cancer development and progression are significantly influenced by cancer driver genes. Understanding cancer driver genes and their mechanisms of action is essential for developing effective cancer treatments. As a result, identifying driver genes is important for drug development, cancer diagnosis, and treatment. Here, we present an algorithm to discover driver genes based on a two-stage random walk with restart (RWR) and a modified method for calculating the transition probability matrix in the random walk algorithm. First, we performed the first stage of RWR on the whole gene interaction network, employing the new method for calculating the transition probability matrix, and extracted a subnetwork based on nodes that had a high correlation with the seed nodes. The subnetwork was then used in the second stage of RWR, in which the nodes were re-ranked. Our approach outperformed existing methods in identifying driver genes. We also compared the effects of three gene interaction networks, of the two rounds of random walk, and of the sensitivity to seed nodes. In addition, we identified several potential driver genes, some of which are involved in driving cancer development. Overall, our method is efficient in various cancer types, significantly outperforms existing methods, and can identify possible driver genes.
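The two-stage idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the standard degree-normalized transition probabilities rather than the modified transition matrix the abstract mentions (whose definition is not given here), and the function names, the `keep` cutoff, and the restart probability are all assumptions made for the example.

```python
def rwr(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart on an undirected graph {node: set(neighbors)}.

    Assumption: standard degree-normalized transitions, not the modified
    transition matrix from the paper. Every seed must be a node of `adj`.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            mass = (1.0 - restart) * p[u]
            if adj[u]:
                share = mass / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            else:
                nxt[u] += mass  # isolated node: keep its mass (self-loop)
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def two_stage_rwr(adj, seeds, keep=4, restart=0.5):
    """Stage 1 on the full graph, stage 2 on the top-`keep` subnetwork."""
    stage1 = rwr(adj, seeds, restart)
    top = set(sorted(stage1, key=stage1.get, reverse=True)[:keep]) | set(seeds)
    sub = {n: adj[n] & top for n in top}  # induced subnetwork
    stage2 = rwr(sub, seeds, restart)
    return sorted(stage2, key=stage2.get, reverse=True)


# Hypothetical toy network: a chain a-b-c-d-e-f with seed gene "a".
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
       "d": {"c", "e"}, "e": {"d", "f"}, "f": {"e"}}
ranking = two_stage_rwr(toy, {"a"}, keep=4)
```

Selecting the subnetwork by a fixed top-`keep` cutoff is only one simple way to express "nodes that had a high correlation with the seed nodes"; a score threshold would work equally well in this sketch.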
|
34
|
Su Y, Wu J, Li X, Li J, Zhao X, Pan B, Huang J, Kong Q, Han J. DTSEA: A network-based drug target set enrichment analysis method for drug repurposing against COVID-19. Comput Biol Med 2023; 159:106969. [PMID: 37105108 PMCID: PMC10121077 DOI: 10.1016/j.compbiomed.2023.106969] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/13/2022] [Revised: 03/27/2023] [Accepted: 04/19/2023] [Indexed: 04/29/2023]
Abstract
The Coronavirus Disease 2019 (COVID-19) pandemic is still wreaking havoc worldwide. Therefore, the urgent need for efficient treatments pushes researchers and clinicians into screening effective drugs. Drug repurposing may be a promising and time-saving strategy to identify potential drugs against this disease. Here, we developed a novel computational approach, named Drug Target Set Enrichment Analysis (DTSEA), to identify potent drugs against COVID-19. DTSEA first mapped the disease-related genes into a gene functional interaction network, and then it used a network propagation algorithm to rank all genes in the network by calculating the network proximity of genes to disease-related genes. Finally, an enrichment analysis was performed on drug target sets to prioritize disease-candidate drugs. It was shown that the top three drugs predicted by DTSEA, including Ataluren, Carfilzomib, and Aripiprazole, were significantly enriched in the immune response pathways indicating the potential for use as promising COVID-19 inhibitors. In addition to these drugs, DTSEA also identified several drugs (such as Remdesivir and Olumiant), which have obtained emergency use authorization (EUA) for COVID-19. These results indicated that DTSEA could effectively identify the candidate drugs for COVID-19, which will help to accelerate the development of drugs for COVID-19. We then performed several validations to ensure the reliability and validity of DTSEA, including topological analysis, robustness analysis, and prediction consistency. Collectively, DTSEA successfully predicted candidate drugs against COVID-19 with high accuracy and reliability, thus making it a formidable tool to identify potential drugs for a specific disease and facilitate further investigation.
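The final step described above, an enrichment analysis of drug target sets over a network-ranked gene list, can be illustrated with a running-sum statistic. The abstract does not specify DTSEA's exact statistic, so the sketch below assumes an unweighted Kolmogorov-Smirnov-style running sum; the gene and drug names are hypothetical placeholders.

```python
def enrichment_score(ranked_genes, target_set):
    """Unweighted running-sum enrichment score of `target_set` in a ranking.

    Walk down the ranked list, stepping up on a target gene and down
    otherwise; return the running-sum value with the largest magnitude.
    Assumption: this stands in for DTSEA's statistic, which may differ.
    """
    hits = [g in target_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking produced by network propagation (best first).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
drug_targets = {"drug_A": {"g1", "g2"}, "drug_B": {"g5", "g6"}}
scores = {d: enrichment_score(ranked, t) for d, t in drug_targets.items()}
prioritized = sorted(scores, key=scores.get, reverse=True)
```

A positive score means the drug's targets cluster near the top of the ranking, that is, close to the disease-related genes in the network; in practice a permutation test would be needed to attach significance to the score.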
Affiliation(s)
- Yinchun Su
  - Department of Neurobiology, Harbin Medical University, Harbin, 150081, PR China
- Jiashuo Wu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
- Xiangmei Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
- Ji Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
- Xilong Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
- Bingyue Pan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
- Junling Huang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
- Qingfei Kong
  - Department of Neurobiology, Harbin Medical University, Harbin, 150081, PR China
- Junwei Han
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
|
35
|
Kumar N, Mukhtar MS. Ranking Plant Network Nodes Based on Their Centrality Measures. Entropy (Basel) 2023; 25:e25040676. [PMID: 37190464 PMCID: PMC10137616 DOI: 10.3390/e25040676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 04/14/2023] [Accepted: 04/16/2023] [Indexed: 05/17/2023]
Abstract
Biological networks are often large and complex, making it difficult to accurately identify the most important nodes. Node prioritization algorithms are used to identify the most influential nodes in a biological network by considering their relationships with other nodes. These algorithms can help us understand the functioning of the network and the role of individual nodes. We developed CentralityCosDist, an algorithm that ranks nodes based on a combination of centrality measures and seed nodes. We applied this and four other algorithms to protein-protein interactions and co-expression patterns in Arabidopsis thaliana using pathogen effector targets as seed nodes. The accuracy of the algorithms was evaluated through functional enrichment analysis of the top 10 nodes identified by each algorithm. Most enriched terms were similar across algorithms, except for DIAMOnD. CentralityCosDist identified more plant-pathogen interactions and related functions and pathways compared to the other algorithms.
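The abstract does not spell out how CentralityCosDist combines centrality measures with seed nodes. One plausible reading of the name, sketched below purely as an assumption, is to describe each node by a vector of centralities and rank candidates by cosine distance to the average seed vector. The choice of degree and closeness centrality, the function names, and the toy network are all hypothetical.

```python
from collections import deque
from math import sqrt


def centrality_vectors(adj):
    """Return {node: (degree centrality, closeness centrality)}.

    `adj` is an undirected graph {node: set(neighbors)} with >= 2 nodes.
    """
    n = len(adj)
    vectors = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search for shortest path lengths
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        degree = len(adj[src]) / (n - 1)
        closeness = (len(dist) - 1) / total if total else 0.0
        vectors[src] = (degree, closeness)
    return vectors


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def rank_by_seed_similarity(adj, seeds):
    """Rank non-seed nodes by cosine distance to the mean seed vector."""
    vectors = centrality_vectors(adj)
    dims = range(len(next(iter(vectors.values()))))
    centroid = tuple(sum(vectors[s][i] for s in seeds) / len(seeds) for i in dims)
    candidates = [n for n in adj if n not in seeds]
    return sorted(candidates, key=lambda n: cosine_distance(vectors[n], centroid))


# Hypothetical network: hub "h" with leaves a, b, c and a branch d-e.
toy = {"h": {"a", "b", "c", "d"}, "a": {"h"}, "b": {"h"}, "c": {"h"},
       "d": {"h", "e"}, "e": {"d"}}
ranking = rank_by_seed_similarity(toy, {"h"})
```

Because cosine distance compares the direction of the centrality vectors rather than their magnitude, nodes with a similar centrality profile to the seeds rank first even if their absolute centralities are smaller.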
Affiliation(s)
- Nilesh Kumar
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- M Shahid Mukhtar
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
|
36
|
Pandey AK, Loscalzo J. Network medicine: an approach to complex kidney disease phenotypes. Nat Rev Nephrol 2023:10.1038/s41581-023-00705-0. [PMID: 37041415 DOI: 10.1038/s41581-023-00705-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 03/13/2023] [Indexed: 04/13/2023]
Abstract
Scientific reductionism has been the basis of disease classification and understanding for more than a century. However, the reductionist approach of characterizing diseases from a limited set of clinical observations and laboratory evaluations has proven insufficient in the face of an exponential growth in data generated from transcriptomics, proteomics, metabolomics and deep phenotyping. A new systematic method is necessary to organize these datasets and build new definitions of what constitutes a disease that incorporates both biological and environmental factors to more precisely describe the ever-growing complexity of phenotypes and their underlying molecular determinants. Network medicine provides such a conceptual framework to bridge these vast quantities of data while providing an individualized understanding of disease. The modern application of network medicine principles is yielding new insights into the pathobiology of chronic kidney diseases and renovascular disorders by expanding the understanding of pathogenic mediators, novel biomarkers and new options for renal therapeutics. These efforts affirm network medicine as a robust paradigm for elucidating new advances in the diagnosis and treatment of kidney disorders.
Affiliation(s)
- Arvind K Pandey
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
- Joseph Loscalzo
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
|
37
|
Han S, Hong J, Yun SJ, Koo HJ, Kim TY. PWN: enhanced random walk on a warped network for disease target prioritization. BMC Bioinformatics 2023; 24:105. [PMID: 36944912 PMCID: PMC10031933 DOI: 10.1186/s12859-023-05227-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 03/13/2023] [Indexed: 03/23/2023] Open
Abstract
BACKGROUND: Extracting meaningful information from unbiased high-throughput data has been a challenge in diverse areas. Specifically, in the early stages of drug discovery, a considerable amount of data was generated to understand disease biology when identifying disease targets. Several random walk-based approaches have been applied to solve this problem, but they still have limitations. Therefore, we suggest a new method that enhances the effectiveness of high-throughput data analysis with random walks. RESULTS: We developed a new random walk-based algorithm named prioritization with a warped network (PWN), which employs a warped network to achieve enhanced performance. Network warping is based on both internal and external features: graph curvature and prior knowledge. CONCLUSIONS: We showed that these compositive features synergistically increased the resulting performance when applied to random walk algorithms, which led to PWN consistently achieving the best performance among several other known methods. Furthermore, we performed subsequent experiments to analyze the characteristics of PWN.
Affiliation(s)
- Seokjin Han
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
- Jinhee Hong
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
- So Jeong Yun
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
- Hee Jung Koo
  - Standigm UK Co., Ltd, 50-60 Station Road, Cambridge, CB1 2JH, UK
- Tae Yong Kim
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
|
38
|
Altuntas V. Diffusion Alignment Coefficient (DAC): A Novel Similarity Metric for Protein-Protein Interaction Network. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:894-903. [PMID: 35737632 DOI: 10.1109/tcbb.2022.3185406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Interaction networks can be used to predict the functions of unknown proteins using known interactions and proteins with known functions. Many graph theory or diffusion-based methods have been proposed, using the assumption that the topological properties of a protein in a network are related to its biological function. Here we seek to improve function prediction by finding more similar neighbors with a new diffusion-based alignment technique to overcome the topological information loss of the node. In this study, we introduce the Diffusion Alignment Coefficient (DAC) algorithm, which combines diffusion, longest common subsequence, and longest common substring techniques to measure the similarity of two nodes in protein interaction networks. As a proof of concept, our experiments, conducted on real PPI networks of S. cerevisiae and Homo sapiens, demonstrated that our method obtained better results than competitors for MIPS and MSigDB Collections hallmark gene set functional categories. This is the first study to develop a measure of node function similarity using alignment to consider the positions of nodes in protein-protein interaction networks. According to the experimental results, the use of spatial information belonging to the nodes in the network has a positive effect on the detection of more functionally similar neighboring nodes.
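The two alignment primitives named in this abstract, longest common subsequence and longest common substring, can be sketched on ordered node lists. How DAC derives those lists from diffusion and how it weights the two measures is not stated here, so the equal-weight combination and the example sequences below are assumptions for illustration only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence (order kept, gaps allowed)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def longest_common_substring_length(a, b):
    """Length of the longest contiguous run shared by both sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else 0)
            best = max(best, cur[j])
        prev = cur
    return best


def alignment_similarity(a, b):
    """Equal-weight mix of both measures, normalized to [0, 1].

    Assumption: DAC's actual combination rule may differ.
    """
    if not a or not b:
        return 0.0
    longest = max(len(a), len(b))
    return 0.5 * (lcs_length(a, b) + longest_common_substring_length(a, b)) / longest


# Hypothetical neighbor orderings for two proteins, e.g. sorted by diffusion score.
seq_u = ["P1", "P2", "P3", "P4", "P5"]
seq_v = ["P1", "P3", "P4", "P6", "P5"]
similarity = alignment_similarity(seq_u, seq_v)
```

In this example the longest common subsequence is P1, P3, P4, P5 and the longest common substring is P3, P4, so the two measures capture overall ordering agreement and local contiguous agreement respectively.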
|
39
|
Jagodnik KM, Shvili Y, Bartal A. HetIG-PreDiG: A Heterogeneous Integrated Graph Model for Predicting Human Disease Genes based on gene expression. PLoS One 2023; 18:e0280839. [PMID: 36791052 PMCID: PMC9931161 DOI: 10.1371/journal.pone.0280839] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2022] [Accepted: 01/10/2023] [Indexed: 02/16/2023] Open
Abstract
Graph analytical approaches permit identifying novel genes involved in complex diseases, but are limited by (i) inferring structural network similarity of connected gene nodes, ignoring potentially relevant unconnected nodes; (ii) using homogeneous graphs, missing gene-disease associations' complexity; (iii) relying on disease/gene-phenotype associations' similarities, involving highly incomplete data; (iv) using binary classification, with gene-disease edges as positive training samples, and non-associated gene and disease nodes as negative samples that may include currently unknown disease genes; or (v) reporting predicted novel associations without systematically evaluating their accuracy. Addressing these limitations, we develop the Heterogeneous Integrated Graph for Predicting Disease Genes (HetIG-PreDiG) model that includes gene-gene, gene-disease, and gene-tissue associations. We predict novel disease genes using low-dimensional representation of nodes accounting for network structure, and extending beyond network structure using the developed Gene-Disease Prioritization Score (GDPS) reflecting the degree of gene-disease association via gene co-expression data. For negative training samples, we select non-associated gene and disease nodes with lower GDPS that are less likely to be affiliated. We evaluate the developed model's success in predicting novel disease genes by analyzing the prediction probabilities of gene-disease associations. HetIG-PreDiG successfully predicts (Micro-F1 = 0.95) gene-disease associations, outperforming baseline models, and is validated using published literature, thus advancing our understanding of complex genetic diseases.
Affiliation(s)
- Kathleen M. Jagodnik
  - The School of Business Administration, Bar-Ilan University, Ramat Gan, Israel
  - Department of Psychiatry, Harvard Medical School, Boston, MA, United States of America
  - Department of Psychiatry, Massachusetts General Hospital, Boston, MA, United States of America
- Yael Shvili
  - Department of Surgery A, Meir Medical Center, Kfar Sava, Israel
- Alon Bartal (corresponding author)
  - The School of Business Administration, Bar-Ilan University, Ramat Gan, Israel
|
40
|
Wei MM, Yu CQ, Li LP, You ZH, Ren ZH, Guan YJ, Wang XF, Li YC. LPIH2V: LncRNA-protein interactions prediction using HIN2Vec based on heterogeneous networks model. Front Genet 2023; 14:1122909. [PMID: 36845392 PMCID: PMC9950107 DOI: 10.3389/fgene.2023.1122909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Accepted: 01/30/2023] [Indexed: 02/12/2023] Open
Abstract
LncRNA-protein interactions play an important role in the development and treatment of many human diseases. Because experimental approaches to determine lncRNA-protein interactions are expensive and time-consuming, and few computational methods exist, it is urgent to develop efficient and accurate methods to predict lncRNA-protein interactions. In this work, a model for heterogeneous network embedding based on meta-paths, namely LPIH2V, is proposed. The heterogeneous network is composed of lncRNA similarity networks, protein similarity networks, and known lncRNA-protein interaction networks. Behavioral features are extracted from the heterogeneous network using the HIN2Vec network embedding method. The results showed that LPIH2V obtains an AUC of 0.97 and an ACC of 0.95 in the 5-fold cross-validation test. The model showed superiority and good generalization ability. Compared to other models, LPIH2V not only extracts attribute characteristics by similarity, but also acquires behavior properties by meta-path walks in heterogeneous networks. LPIH2V would be beneficial in forecasting interactions between lncRNAs and proteins.
Affiliation(s)
- Meng-Meng Wei
  - School of Information Engineering, Xijing University, Xi’an, China
- Chang-Qing Yu (corresponding author)
  - School of Information Engineering, Xijing University, Xi’an, China
- Li-Ping Li (corresponding author)
  - School of Information Engineering, Xijing University, Xi’an, China
  - College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- Zhong-Hao Ren
  - School of Information Engineering, Xijing University, Xi’an, China
- Yong-Jian Guan
  - School of Information Engineering, Xijing University, Xi’an, China
|
41
|
Alfano C, Farina L, Petti M. Networks as Biomarkers: Uses and Purposes. Genes (Basel) 2023; 14:429. [PMID: 36833356 PMCID: PMC9956930 DOI: 10.3390/genes14020429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 02/03/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023] Open
Abstract
Network-based approaches are often used to analyze gene expression data or protein-protein interactions but are not usually applied to study the relationships between different biomarkers. Given the clinical need for more comprehensive and integrative biomarkers that can help to identify personalized therapies, the integration of biomarkers of different natures is an emerging trend in the literature. Network analysis can be used to analyze the relationships between different features of a disease; nodes can be disease-related phenotypes, gene expression, mutational events, protein quantification, imaging-derived features and more. Since different biomarkers can exert causal effects on one another, describing such interrelationships can help to better understand the underlying mechanisms of complex diseases. Networks as biomarkers are not yet commonly used, despite having been shown to lead to interesting results. Here, we discuss the ways in which they have been used to provide novel insights into disease susceptibility, disease development and severity.
Affiliation(s)
- Caterina Alfano
  - Department of Experimental Medicine, Sapienza University of Rome, Viale Regina Elena, 324, 00161 Rome, Italy
- Lorenzo Farina
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Via Ariosto, 25, 00185 Rome, Italy
- Manuela Petti
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Via Ariosto, 25, 00185 Rome, Italy
|
42
|
Mancuso CA, Liu R, Krishnan A. PyGenePlexus: a Python package for gene discovery using network-based machine learning. Bioinformatics 2023; 39:7017525. [PMID: 36721325 PMCID: PMC9900208 DOI: 10.1093/bioinformatics/btad064] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/10/2022] [Revised: 11/29/2022] [Accepted: 01/30/2023] [Indexed: 02/02/2023] Open
Abstract
SUMMARY: PyGenePlexus is a Python package that enables a user to gain insight into any gene set of interest through a molecular interaction network informed supervised machine learning model. PyGenePlexus provides predictions of how associated every gene in the network is to the input gene set, offers interpretability by comparing the model trained on the input gene set to models trained on thousands of known gene sets, and returns the network connectivity of the top predicted genes. AVAILABILITY AND IMPLEMENTATION: https://pypi.org/project/geneplexus/ and https://github.com/krishnanlab/PyGenePlexus. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christopher A Mancuso
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO 80045, USA
- Renming Liu
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Arjun Krishnan
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biomedical Informatics, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO 80045, USA
|
43
|
Stolfi P, Mastropietro A, Pasculli G, Tieri P, Vergni D. NIAPU: network-informed adaptive positive-unlabeled learning for disease gene identification. Bioinformatics 2023; 39:7023926. [PMID: 36727493 PMCID: PMC9933847 DOI: 10.1093/bioinformatics/btac848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 12/23/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Gene-disease associations are fundamental for understanding disease etiology and developing effective interventions and treatments. Identifying genes not yet associated with a disease due to a lack of studies is a challenging task in which prioritization based on prior knowledge is an important element. The computational search for new candidate disease genes may be eased by positive-unlabeled learning, the machine learning (ML) setting in which only a subset of instances are labeled as positive while the rest of the dataset is unlabeled. In this work, we propose a set of effective network-based features to be used in a novel Markov diffusion-based multi-class labeling strategy for putative disease gene discovery. RESULTS: The performances of the new labeling algorithm and the effectiveness of the proposed features have been tested on 10 different disease datasets using three ML algorithms. The new features have been compared against classical topological and functional/ontological features and a set of network- and biological-derived features already used in gene discovery tasks. The predictive power of the integrated methodology in searching for new disease genes has been found to be competitive against state-of-the-art algorithms. AVAILABILITY AND IMPLEMENTATION: The source code of NIAPU can be accessed at https://github.com/AndMastro/NIAPU. The source data used in this study are available online on the respective websites. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paola Stolfi
  - Institute for Applied Computing (IAC) 'Mauro Picone', National Research Council of Italy (CNR), Rome 00185, Italy
- Andrea Mastropietro
  - Department of Computer, Control and Management Engineering (DIAG) 'Antonio Ruberti', Sapienza University of Rome, Rome 00185, Italy
- Giuseppe Pasculli
  - Department of Computer, Control and Management Engineering (DIAG) 'Antonio Ruberti', Sapienza University of Rome, Rome 00185, Italy
- Paolo Tieri
  - Institute for Applied Computing (IAC) 'Mauro Picone', National Research Council of Italy (CNR), Rome 00185, Italy
- Davide Vergni
  - Institute for Applied Computing (IAC) 'Mauro Picone', National Research Council of Italy (CNR), Rome 00185, Italy
|
44
|
Zhang Y, Xiang J, Tang L, Yang J, Li J. PGAGP: Predicting pathogenic genes based on adaptive network embedding algorithm. Front Genet 2023; 13:1087784. [PMID: 36744177 PMCID: PMC9895109 DOI: 10.3389/fgene.2022.1087784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 12/09/2022] [Indexed: 01/21/2023] Open
Abstract
The study of disease-gene associations is an important topic in the field of computational biology. The accumulation of massive amounts of biomedical data provides new possibilities for exploring potential relations between diseases and genes through computational strategies, but extracting valuable information from the data to predict pathogenic genes accurately and rapidly remains a challenging and meaningful task. Therefore, we present a novel computational method called PGAGP for inferring potential pathogenic genes based on an adaptive network embedding algorithm. The PGAGP algorithm first extracts initial features of nodes from a heterogeneous network of diseases and genes efficiently and effectively by Gaussian random projection, and then optimizes the features of nodes by an adaptive refining process. These low-dimensional features are used to improve the disease-gene heterogeneous network, and we apply network propagation to the improved heterogeneous network to predict pathogenic genes more effectively. Through a series of experiments, we study the effect of PGAGP's parameters and integrated strategies on predictive performance and confirm that PGAGP is better than the state-of-the-art algorithms. Case studies show that many of the predicted candidate genes for specific diseases have been implicated in these diseases by literature verification and enrichment analysis, which further verifies the effectiveness of PGAGP. Overall, this work provides a useful solution for mining disease-gene heterogeneous networks to predict pathogenic genes more effectively.
Affiliation(s)
- Yan Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - School of Information Science and Engineering, Changsha Medical University, Changsha, China
  - Academician Workstation, Changsha Medical University, Changsha, China
- Ju Xiang
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - School of Information Science and Engineering, Changsha Medical University, Changsha, China
  - Academician Workstation, Changsha Medical University, Changsha, China
  - School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, China
  - Department of Basic Medical Sciences and Neuroscience Research Center, Changsha Medical University, Changsha, China
- Liang Tang
  - Academician Workstation, Changsha Medical University, Changsha, China
  - Department of Basic Medical Sciences and Neuroscience Research Center, Changsha Medical University, Changsha, China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
  - Geneis Beijing Co., Ltd, Beijing, China
- Jianming Li
  - Academician Workstation, Changsha Medical University, Changsha, China
  - Department of Basic Medical Sciences and Neuroscience Research Center, Changsha Medical University, Changsha, China
|
45
|
de la Fuente L, Del Pozo-Valero M, Perea-Romero I, Blanco-Kelly F, Fernández-Caballero L, Cortón M, Ayuso C, Mínguez P. Prioritization of New Candidate Genes for Rare Genetic Diseases by a Disease-Aware Evaluation of Heterogeneous Molecular Networks. Int J Mol Sci 2023; 24:ijms24021661. [PMID: 36675175 PMCID: PMC9864172 DOI: 10.3390/ijms24021661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Revised: 01/10/2023] [Accepted: 01/11/2023] [Indexed: 01/18/2023] Open
Abstract
Screening for pathogenic variants in the diagnosis of rare genetic diseases can now be performed on all genes thanks to the application of whole exome and genome sequencing (WES, WGS). Yet the repertoire of gene-disease associations is not complete. Several computer-based algorithms and databases integrate distinct gene-gene functional networks to accelerate the discovery of gene-disease associations. We hypothesize that the ability of every type of information to extract relevant insights is disease-dependent. We compiled 33 functional networks classified into 13 knowledge categories (KCs) and observed large variability in their ability to recover genes associated with 91 genetic diseases, as measured using efficiency and exclusivity. We developed GLOWgenes, a network-based algorithm that applies random walk with restart to evaluate KCs' ability to recover genes from a given list associated with a phenotype and modulates the prediction of new candidates accordingly. Comparison with other integration strategies and tools shows that our disease-aware approach can boost the discovery of new gene-disease associations, especially for the less obvious ones. KC contribution also varies if obtained using recently discovered genes. Applied to 15 unsolved WES, GLOWgenes proposed three new genes to be involved in the phenotypes of patients with syndromic inherited retinal dystrophies.
Affiliation(s)
- Lorena de la Fuente
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
  - Bioinformatics Unit, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
- Marta Del Pozo-Valero
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
- Irene Perea-Romero
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
- Fiona Blanco-Kelly
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
- Lidia Fernández-Caballero
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
- Marta Cortón
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
- Carmen Ayuso
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
- Pablo Mínguez
  - Department of Genetics, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III (ISCIII), 28040 Madrid, Spain
  - Bioinformatics Unit, Health Research Institute–Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), 28049 Madrid, Spain
  - Correspondence:
|
46
|
Koch E, Kauppi K, Chen CH. Candidates for drug repurposing to address the cognitive symptoms in schizophrenia. Prog Neuropsychopharmacol Biol Psychiatry 2023; 120:110637. [PMID: 36099967 DOI: 10.1016/j.pnpbp.2022.110637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 07/23/2022] [Accepted: 09/07/2022] [Indexed: 01/24/2023]
Abstract
In the protein-protein interactome, we have previously identified a significant overlap between schizophrenia risk genes and genes associated with cognitive performance. Here, we further studied this overlap to identify potential candidate drugs for repurposing to treat the cognitive symptoms in schizophrenia. We first defined a cognition-related schizophrenia interactome from network propagation analyses, and identified drugs known to target more than one protein within this network. Thereafter, we used gene expression data to further select drugs that could counteract schizophrenia-associated gene expression perturbations. Additionally, we stratified these analyses by sex to identify sex-specific pharmacological treatment options for the cognitive symptoms in schizophrenia. After excluding drugs contraindicated in schizophrenia, we identified 12 drug repurposing candidates, most of which have anti-inflammatory and neuroprotective effects. Sex-stratified analyses showed that out of these 12 drugs, four were identified in females only, three were identified in males only, and five were identified in both sexes. Based on our bioinformatics analyses of disease genetics, we suggest 12 candidate drugs that warrant further examination for repurposing to treat the cognitive symptoms in schizophrenia, and suggest that these symptoms could be addressed by sex-specific pharmacological treatment options.
Affiliation(s)
- Elise Koch
  - Department of Integrative Medical Biology, Umeå University, Umeå, Sweden; NORMENT, Centre for Mental Disorders Research, Division of Mental Health and Addiction, Oslo University Hospital, and Institute of Clinical Medicine, University of Oslo, Oslo, Norway.
- Karolina Kauppi
  - Department of Integrative Medical Biology, Umeå University, Umeå, Sweden; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
- Chi-Hua Chen
  - Department of Radiology and Center for Multimodal Imaging and Genetics, University of California San Diego, USA.
|
47
|
Zheng X, Li F, Zhao H, Tang Y, Xue K, Zhang X, Liang W, Zhao R, Lv X, Song X, Zhang C, Xu Y, Zhang Y. A novel method to identify and characterize personalized functional driver lncRNAs in cancer samples. Comput Struct Biotechnol J 2023; 21:2471-2482. [PMID: 37077174 PMCID: PMC10106482 DOI: 10.1016/j.csbj.2023.03.041] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/27/2022] [Revised: 03/17/2023] [Accepted: 03/23/2023] [Indexed: 04/21/2023] Open
Abstract
Cancer is a highly heterogeneous disease, and patients with the same cancer type can differ in therapeutic response and prognosis. Genetic variation of long non-coding RNA is the key factor driving tumor development, and plays an important role in genetic and biological heterogeneity. Identifying driver lncRNAs in non-coding regions and explaining their function in tumors is therefore important for revealing the pathogenesis of cancer. In this study, we developed an integrated method to identify Personalized Functional Driver lncRNAs (PFD-lncRNAs) by integrating DNA copy number data, gene expression data, and biological subpathway information. We then applied the method to identify 2695 PFD-lncRNAs in 5334 samples across 19 cancer types. We analyzed the association between PFD-lncRNAs and drug sensitivity, which can inform medication guidance in therapy and drug discovery for individual patients. Our research is of great significance for elucidating the biological roles of lncRNA genetic variation in cancer, revealing the related mechanisms of cancer, and providing novel insights for individualized medicine.
|
48
|
Ma J, Qin T, Xiang J. Disease-gene prediction based on preserving structure network embedding. Front Aging Neurosci 2023; 15:1061892. [PMID: 36896421 PMCID: PMC9990751 DOI: 10.3389/fnagi.2023.1061892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Accepted: 01/30/2023] [Indexed: 02/23/2023] Open
Abstract
Many diseases, such as Alzheimer's disease (AD) and Parkinson's disease (PD), are caused by abnormalities or mutations of related genes. Many computational methods based on the network relationship between diseases and genes have been proposed to predict potential pathogenic genes. However, how to mine the disease-gene relationship network effectively to better predict disease genes is still an open problem. In this paper, a disease-gene prediction method based on preserving structure network embedding (PSNE) is introduced. To predict pathogenic genes more effectively, a heterogeneous network with multiple types of bio-entities was constructed by integrating disease-gene associations, the human protein network, and disease-disease associations. Furthermore, the low-dimensional features of nodes extracted from the network were used to reconstruct a new disease-gene heterogeneous network. Compared with other advanced methods, PSNE proved more effective in disease-gene prediction. Finally, we applied the PSNE method to predict potential pathogenic genes for age-associated diseases such as AD and PD, and checked the predicted genes against the literature. Overall, this work provides an effective method for disease-gene prediction, and a series of high-confidence potential pathogenic genes of AD and PD that may be helpful for the experimental discovery of disease genes.
Affiliation(s)
- Jinlong Ma
  - School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, China
- Tian Qin
  - School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, China
- Ju Xiang
  - School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, China; Department of Basic Medical Sciences, Changsha Medical University, Changsha, China
|
49
|
Peng J, Yang K, Tian H, Lin Y, Hou M, Gao Y, Zhou X, Gao Z, Ren J. The mechanisms of Qizhu Tangshen formula in the treatment of diabetic kidney disease: Network pharmacology, machine learning, molecular docking and experimental assessment. Phytomedicine 2023; 108:154525. [PMID: 36413925 DOI: 10.1016/j.phymed.2022.154525] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/08/2022] [Revised: 09/04/2022] [Accepted: 10/25/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND Qizhu Tangshen Formula (QZTS) has shown therapeutic effects on diabetic kidney disease (DKD). However, to date, its pharmacological mechanisms remain unclear. METHODS We explored the underlying mechanisms of QZTS in treating DKD using network pharmacology, machine learning, molecular docking and experimental assessment. RESULTS First, we found that QZTS improved glycolipid metabolism disorder, decreased proteinuria and alleviated kidney tissue injury in KKAy mice, a DKD model. Then, by integrating multiple databases, a total of 96 targets of 74 active compounds in QZTS and 759 DKD-related genes were acquired. Next, we identified 13 hub targets of QZTS in DKD using three ranking algorithms: functional similarity, topological similarity and shortest path. Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses demonstrated that the pathways mainly centered on the processes of glycolipid metabolism disorder, inflammation and angiogenesis. Among them, the VEGF signaling pathway was significantly enriched. Molecular docking showed that the key active compounds of QZTS all had relatively good binding affinity with the predicted hub targets. Finally, animal experiments showed that QZTS significantly inhibited the secretion of plasma VEGF and downregulated the protein and mRNA expression levels of AKT, p38MAPK and VEGFR2. CONCLUSION Our results indicate that QZTS treats DKD via multiple targets and pathways, and that the VEGF signaling pathway may be highly involved in this process.
Affiliation(s)
- Juqin Peng
  - China Academy of Chinese Medical Sciences, Xiyuan Hospital, Beijing 100091, China
- Kuo Yang
  - School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Haoyu Tian
  - School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Yadong Lin
  - China Academy of Chinese Medical Sciences, Xiyuan Hospital, Beijing 100091, China
- Min Hou
  - China Academy of Chinese Medical Sciences, Xiyuan Hospital, Beijing 100091, China; Beijing University of Chinese Medicine, Beijing 100029, China
- Yunxiao Gao
  - China Academy of Chinese Medical Sciences, Xiyuan Hospital, Beijing 100091, China
- Xuezhong Zhou
  - School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
- Zhuye Gao
  - China Academy of Chinese Medical Sciences, Xiyuan Hospital, Beijing 100091, China.
- Junguo Ren
  - China Academy of Chinese Medical Sciences, Xiyuan Hospital, Beijing 100091, China.
|
50
|
Delmas M, Filangi O, Duperier C, Paulhe N, Vinson F, Rodriguez-Mier P, Giacomoni F, Jourdan F, Frainay C. Suggesting disease associations for overlooked metabolites using literature from metabolic neighbors. Gigascience 2022; 12:giad065. [PMID: 37712592 PMCID: PMC10502579 DOI: 10.1093/gigascience/giad065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 06/13/2023] [Accepted: 07/28/2023] [Indexed: 09/16/2023] Open
Abstract
In human health research, metabolic signatures extracted from metabolomics data have a strong added value for stratifying patients and identifying biomarkers. Nevertheless, one of the main challenges is to interpret and relate these lists of discriminant metabolites to pathological mechanisms. This task requires experts to combine their knowledge with information extracted from databases and the scientific literature. However, we show that most compounds (>99%) in the PubChem database lack annotated literature. This dearth of available information can have a direct impact on the interpretation of metabolic signatures, which is often restricted to a subset of significant metabolites. To suggest potential pathological phenotypes related to overlooked metabolites that lack annotated literature, we extend the "guilt-by-association" principle to literature information by using a Bayesian framework. The underlying assumption is that the literature associated with the metabolic neighbors of a compound can provide valuable insights, or an a priori, into its biomedical context. The metabolic neighborhood of a compound can be defined from a metabolic network and corresponds to the metabolites to which it is connected through biochemical reactions. With the proposed approach, we suggest more than 35,000 associations between 1,047 overlooked metabolites and 3,288 diseases (or disease families). All these newly inferred associations are freely available on the FORUM ftp server (see information at https://github.com/eMetaboHUB/Forum-LiteraturePropagation).
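The idea of using the literature of metabolic neighbors as a prior can be illustrated with a deliberately simplified Beta-binomial sketch. This is not the authors' model: the function name `neighbor_informed_posterior`, the neighbor weights, the `prior_strength` parameter, and all counts below are hypothetical values chosen only to show how a neighborhood prior and a compound's own (possibly empty) literature combine.

```python
def neighbor_informed_posterior(own_hits, own_total, neighbors, prior_strength=10.0):
    """Posterior mean of the probability that a paper about a compound mentions a disease.

    own_hits / own_total: papers on the compound that mention the disease / all its papers.
    neighbors: list of (weight, hits, total) tuples for metabolic neighbors (hypothetical).
    prior_strength: pseudo-count controlling how much the neighborhood prior weighs.
    """
    informative = [(w, hits, total) for w, hits, total in neighbors if total > 0]
    weight_sum = sum(w for w, _, _ in informative)
    if weight_sum == 0:
        raise ValueError("no neighbor with annotated literature to build a prior from")
    # Prior mean: weighted average of the neighbors' observed disease-mention rates.
    prior_mean = sum(w * hits / total for w, hits, total in informative) / weight_sum
    # Beta prior parameters plus the compound's own counts (conjugate update).
    alpha = prior_mean * prior_strength + own_hits
    beta = (1.0 - prior_mean) * prior_strength + (own_total - own_hits)
    return alpha / (alpha + beta)

# Two equally weighted neighbors with 30/100 and 10/100 disease-related papers.
neighbors = [(0.5, 30, 100), (0.5, 10, 100)]
overlooked = neighbor_informed_posterior(0, 0, neighbors)    # compound with no literature
documented = neighbor_informed_posterior(5, 10, neighbors)   # compound with 5 of 10 papers
print(overlooked, documented)
```

For a compound with no annotated literature, the estimate in this sketch falls back entirely on the neighborhood prior; as the compound's own literature grows, its own counts progressively dominate.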
Affiliation(s)
- Maxime Delmas
  - Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
- Olivier Filangi
  - IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
- Christophe Duperier
  - Université Clermont Auvergne, INRAE, UNH, Plateforme d’Exploration du Métabolisme, MetaboHUB Clermont, F-63000 Clermont-Ferrand, France
- Nils Paulhe
  - Université Clermont Auvergne, INRAE, UNH, Plateforme d’Exploration du Métabolisme, MetaboHUB Clermont, F-63000 Clermont-Ferrand, France
- Florence Vinson
  - Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
  - MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, 31300, France
- Pablo Rodriguez-Mier
  - Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
- Franck Giacomoni
  - Université Clermont Auvergne, INRAE, UNH, Plateforme d’Exploration du Métabolisme, MetaboHUB Clermont, F-63000 Clermont-Ferrand, France
- Fabien Jourdan
  - Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
  - MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, 31300, France
- Clément Frainay
  - Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
|