1
Jin DM, Morton JT, Bonneau R. Meta-analysis of the human gut microbiome uncovers shared and distinct microbial signatures between diseases. mSystems 2024; 9:e0029524. [PMID: 39078158 PMCID: PMC11334437 DOI: 10.1128/msystems.00295-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 05/08/2024] [Indexed: 07/31/2024] [Open access]
Abstract
Microbiome studies have revealed gut microbiota's potential impact on complex diseases. However, many studies focus on one disease per cohort. We developed a meta-analysis workflow for gut microbiome profiles and analyzed shotgun metagenomic data covering 11 diseases. Using interpretable machine learning and differential abundance analysis, our findings reinforce the generalization of binary classifiers for Crohn's disease (CD) and colorectal cancer (CRC) to hold-out cohorts and highlight the key microbes driving these classifications. We identified high microbial similarity in disease pairs like CD vs ulcerative colitis (UC), CD vs CRC, Parkinson's disease vs type 2 diabetes (T2D), and schizophrenia vs T2D. We also found strong inverse correlations in Alzheimer's disease vs CD and UC. These findings, detected by our pipeline, provide valuable insights into these diseases.
IMPORTANCE Assessing disease similarity is an essential initial step preceding a disease-based approach for drug repositioning. Our study provides a modest first step in underscoring the potential of integrating microbiome insights into the disease similarity assessment. Recent microbiome research has predominantly focused on analyzing individual diseases to understand their unique characteristics, which by design excludes comorbidities in individuals. We analyzed shotgun metagenomic data from existing studies and identified previously unknown similarities between diseases. Our research represents a pioneering effort that utilizes both interpretable machine learning and differential abundance analysis to assess microbial similarity between diseases.
Affiliation(s)
- Dong-Min Jin
  - Center for Genomics and Systems Biology, New York University, New York, New York, USA
- James T. Morton
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, New York, USA
- Richard Bonneau
  - Center for Genomics and Systems Biology, New York University, New York, New York, USA
  - Genentech, New York, New York, USA
2
Jahangir M, Nazari M, Babakhanzadeh E, Manshadi SD. Where do obesity and male infertility collide? BMC Med Genomics 2024; 17:128. [PMID: 38730451 PMCID: PMC11088066 DOI: 10.1186/s12920-024-01897-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 04/30/2024] [Indexed: 05/12/2024] [Open access]
Abstract
The parallel rise in obesity and male infertility in modern societies necessitates the identification of susceptibility genes underlying these interconnected health issues. In our study, we conducted a comprehensive search in the OMIM database to identify genes commonly associated with male infertility and obesity. Subsequently, we performed an in silico analysis using the REVEL algorithm to detect pathogenic single nucleotide polymorphisms (SNPs) in the coding region of these candidate genes. To validate our findings in vivo, we analyzed SNPs and gene expression of candidate genes in 200 obese infertile subjects and 240 obese fertile individuals using ARMS-PCR. Additionally, we analyzed 20 fertile and 22 infertile obese individuals using real-time qPCR. After removing duplicated queries, we obtained 197 obesity-related genes and 102 male infertility-related genes from the OMIM database. The APOB gene was common to the two datasets. REVEL identified the rs13306194 variant as potentially pathogenic, with a calculated score of 0.524. The AA genotype (P = 0.001) and the A allele (P = 0.003) of the APOB rs13306194 variant were significantly associated with infertility in obese men. APOB expression levels were significantly lower in obese infertile men than in obese fertile controls (P < 0.01). Moreover, the AA genotype of APOB rs13306194 was associated with a significant decrease in APOB gene expression in obese infertile men (P = 0.05). Waist-to-hip ratio (WHR) and LH were also significantly associated with infertility in the obese infertile group. These results are likely to contribute to a better understanding of the causes of male infertility and its association with obesity.
Affiliation(s)
- Melika Jahangir
  - Department of Pharmacy, Tehran University of Medical Sciences, P.O. Box: 64155-65117, Tehran, Iran
- Majid Nazari
  - Department of Medical Genetics, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
- Emad Babakhanzadeh
  - Department of Medical Genetics, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
3
Jin DM, Morton JT, Bonneau R. Meta-analysis of the human gut microbiome uncovers shared and distinct microbial signatures between diseases. bioRxiv 2024:2024.02.27.582333. [PMID: 38464323 PMCID: PMC10925178 DOI: 10.1101/2024.02.27.582333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Microbiome studies have revealed gut microbiota's potential impact on complex diseases. However, many studies focus on one disease per cohort. We developed a meta-analysis workflow for gut microbiome profiles and analyzed shotgun metagenomic data covering 11 diseases. Using interpretable machine learning and differential abundance analysis, our findings reinforce the generalization of binary classifiers for Crohn's disease (CD) and colorectal cancer (CRC) to hold-out cohorts and highlight the key microbes driving these classifications. We identified high microbial similarity in disease pairs like CD vs ulcerative colitis (UC), CD vs CRC, Parkinson's disease vs type 2 diabetes (T2D), and schizophrenia vs T2D. We also found strong inverse correlations in Alzheimer's disease vs CD and UC. These findings, detected by our pipeline, provide valuable insights into these diseases.
Affiliation(s)
- Dong-Min Jin
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
- James T. Morton
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Richard Bonneau
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
  - Genentech, New York, NY, USA
4
Cheng K, Yuan J, Liu J, Zhang S, Xu Q, Xie Y, Zhao J, Zhang X, Tang X, Zheng Y, Wang Z. Identifying Qingkailing ingredients-dependent mesenchymal-epithelial transition factor-axiation "π" structuring module with angiogenesis and neurogenesis effects. J Tradit Chin Med 2024; 44:35-43. [PMID: 38213237 PMCID: PMC10774727 DOI: 10.19852/j.cnki.jtcm.2024.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 05/22/2023] [Indexed: 01/13/2024]
Abstract
OBJECTIVE To explore the role of the drug-dependent mesenchymal-epithelial transition factor (Met)-axiation "π" structural module, after processing by three components of Qingkailing injection, in neurogenesis and angiogenesis in cerebral ischemia.
METHODS We used a glutathione S-transferase (GST) pull-down assay, an isothermal titration calorimetry assay, and other related methods to identify the relationships among Met, inositol polyphosphate phosphatase like 1 (Inppl1), and death associated protein kinase 3 (Dapk3) in this allosteric module. The biological effects of the neuron-generation modules composed of Met, Inppl1, and Dapk3 were measured by Western blot, apoptosis analysis, and double immunofluorescence labeling.
RESULTS The GST pull-down assay revealed that the proline-serine-threonine rich domain of Met binds to the Src homology domain of Inppl1 to form a protein-protein complex; Dapk3 with a C-terminal domain interacts weakly with the protein kinase C domain of Met in the intracellular region. Thus, we obtained a "π" structuring module considered to be a neural regeneration module. The biological effects of the angiogenesis and neurogenesis modules composed of Met, Inppl1, and Dapk3 were also verified.
CONCLUSION The study suggests that understanding the functional modules that contribute to pharmaceutics might provide novel signatures that can be used as endpoints to define disease processes under stroke or cerebral ischemia conditions.
Affiliation(s)
- Kunming Cheng
  - Provincial Engineering Laboratory for Screening and Re-evaluation of Active Compounds of Herbal Medicines in Southern Anhui, Teaching and Research Section of Traditional Chinese Medicine, School of Pharmacy, Wannan Medical College, Wuhu 241000, China
- Jianan Yuan
  - Provincial Engineering Laboratory for Screening and Re-evaluation of Active Compounds of Herbal Medicines in Southern Anhui, Teaching and Research Section of Traditional Chinese Medicine, School of Pharmacy, Wannan Medical College, Wuhu 241000, China
- Jun Liu
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Shengpeng Zhang
  - Provincial Engineering Laboratory for Screening and Re-evaluation of Active Compounds of Herbal Medicines in Southern Anhui, Teaching and Research Section of Traditional Chinese Medicine, School of Pharmacy, Wannan Medical College, Wuhu 241000, China
- Qixiang Xu
  - Provincial Engineering Laboratory for Screening and Re-evaluation of Active Compounds of Herbal Medicines in Southern Anhui, Teaching and Research Section of Traditional Chinese Medicine, School of Pharmacy, Wannan Medical College, Wuhu 241000, China
- Yong Xie
  - Key Laboratory of Bioactive Substances and Resources Utilization of Chinese Herbal Medicine, Ministry of Education, Department of Medicinal Plant Development, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100193, China
- Jingfeng Zhao
  - Key Laboratory of Bioactive Substances and Resources Utilization of Chinese Herbal Medicine, Ministry of Education, Department of Medicinal Plant Development, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100193, China
- Xiaoxu Zhang
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Xudong Tang
  - Department of Gastroenterology, Xiyuan Hospital, China Academy of Chinese Medical Sciences, Beijing 100091, China
- Yongqiu Zheng
  - Provincial Engineering Laboratory for Screening and Re-evaluation of Active Compounds of Herbal Medicines in Southern Anhui, Teaching and Research Section of Traditional Chinese Medicine, School of Pharmacy, Wannan Medical College, Wuhu 241000, China
- Zhong Wang
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
5
Sánchez-Valle J, Valencia A. Molecular bases of comorbidities: present and future perspectives. Trends Genet 2023; 39:773-786. [PMID: 37482451 DOI: 10.1016/j.tig.2023.06.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2023] [Revised: 06/12/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023]
Abstract
Co-occurrence of diseases decreases patient quality of life, complicates treatment choices, and increases mortality. Analyses of electronic health records present a complex scenario of comorbidity relationships that vary by age, sex, and cohort under study. The study of similarities between diseases using 'omics data, such as genes altered in diseases, gene expression, proteome, and microbiome, is fundamental to uncovering the origin of, and potential treatment for, comorbidities. Recent studies have produced a first generation of genetic interpretations for as much as 46% of the comorbidities described in large cohorts. Integrating different sources of molecular information and using artificial intelligence (AI) methods are promising approaches for the study of comorbidities. They may help to improve the treatment of comorbidities, including the potential repositioning of drugs.
Affiliation(s)
- Jon Sánchez-Valle
  - Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain
- Alfonso Valencia
  - Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain; ICREA, Barcelona, 08010, Spain
6
Shi W, Feng H, Li J, Liu T, Liu Z. DapBCH: a disease association prediction model Based on Cross-species and Heterogeneous graph embedding. Front Genet 2023; 14:1222346. [PMID: 37811150 PMCID: PMC10556742 DOI: 10.3389/fgene.2023.1222346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Accepted: 09/11/2023] [Indexed: 10/10/2023] [Open access]
Abstract
The study of comorbidity can provide new insights into the pathogenesis of the disease and has important economic significance in the clinical evaluation of treatment difficulty, medical expenses, length of stay, and prognosis of the disease. In this paper, we propose a disease association prediction model DapBCH, which constructs a cross-species biological network and applies heterogeneous graph embedding to predict disease association. First, we combine the human disease-gene network, mouse gene-phenotype network, human-mouse homologous gene network, and human protein-protein interaction network to reconstruct a heterogeneous biological network. Second, we apply heterogeneous graph embedding based on meta-path aggregation to generate the feature vector of disease nodes. Finally, we employ link prediction to obtain the similarity of disease pairs. The experimental results indicate that our model is highly competitive in predicting the disease association and is promising for finding potential disease associations.
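Illustrative sketch (not from the paper): the final step described above, scoring disease pairs from node feature vectors, can be pictured as ranking pairs by the similarity of their embeddings. The abstract does not state which scoring function DapBCH uses; cosine similarity is an assumption made here for illustration, and the function names, disease labels, and three-dimensional vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_disease_pairs(embeddings):
    """Score every unordered disease pair and sort from most to least similar."""
    names = sorted(embeddings)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = cosine(embeddings[first], embeddings[second])
            pairs.append((first, second, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical embeddings; a real model would learn these from the network.
toy_embeddings = {
    "disease_A": [1.0, 0.0, 1.0],
    "disease_B": [0.9, 0.1, 1.1],
    "disease_C": [0.0, 1.0, 0.0],
}
ranked = rank_disease_pairs(toy_embeddings)
```

In this toy input, disease_A and disease_B point in nearly the same direction and therefore rank first; a learned link-prediction head could replace `cosine` without changing the ranking step.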
Affiliation(s)
- Wanqi Shi
  - School of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou, Zhejiang, China
- Hailin Feng
  - School of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou, Zhejiang, China
- Jian Li
  - School of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou, Zhejiang, China
- Tongcun Liu
  - School of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou, Zhejiang, China
- Zhe Liu
  - College of Media Engineering, Zhejiang University of Media and Communications, Hangzhou, Zhejiang, China
7
Chen L, Tang Q, Zhang K, Huang Q, Ding Y, Jin B, Liu S, Hwa K, Chou CJ, Zhang Y, Thyparambil S, Liao W, Han Z, Mortensen R, Schilling J, Li Z, Heaton R, Tian L, Cohen HJ, Sylvester KG, Arent RC, Zhao X, McElhinney DB, Wu Y, Bai W, Ling XB. Altered expression of the L-arginine/nitric oxide pathway in ovarian cancer: metabolic biomarkers and biological implications. BMC Cancer 2023; 23:844. [PMID: 37684587 PMCID: PMC10492322 DOI: 10.1186/s12885-023-11192-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 07/19/2023] [Indexed: 09/10/2023] [Open access]
Abstract
MOTIVATION Ovarian cancer (OC) is a highly lethal gynecological malignancy. Extensive research has shown that OC cells undergo significant metabolic alterations during tumorigenesis. In this study, we aim to leverage these metabolic changes as potential biomarkers for assessing OC.
METHODS A functional module-based approach was used to identify key gene expression pathways that distinguish different stages of OC within a tissue biopsy cohort. This cohort consisted of control samples (n = 79), stage I/II samples (n = 280), and stage III/IV samples (n = 1016). To further explore these altered molecular pathways, minimal spanning tree (MST) analysis was applied, leading to the formulation of metabolic biomarker hypotheses for OC liquid biopsy. To validate these hypotheses, a multiple reaction monitoring (MRM)-based quantitative LC-MS/MS method was developed. This method allowed for the precise quantification of targeted metabolite biomarkers using an OC blood cohort comprising control samples (n = 464), benign samples (n = 3), and OC samples (n = 13).
RESULTS Eleven functional modules were identified as significant differentiators (false discovery rate, FDR < 0.05) between normal and early-stage, or early-stage and late-stage, OC tumor tissues. MST analysis revealed that the metabolic L-arginine/nitric oxide (L-ARG/NO) pathway was reprogrammed, and the modules related to "DNA replication" and "DNA repair and recombination" served as anchor modules connecting the other nine modules. Based on this analysis, symmetric dimethylarginine (SDMA) and arginine were proposed as potential liquid biopsy biomarkers for OC assessment. Our quantitative LC-MS/MS analysis of our OC blood cohort provided direct evidence supporting the use of the SDMA-to-arginine ratio as a liquid biopsy panel to distinguish between normal and OC samples, with an area under the ROC curve (AUC) of 98.3%.
CONCLUSION Our comprehensive analysis of tissue genomics and blood quantitative LC-MS/MS metabolic data sheds light on the metabolic reprogramming underlying OC pathophysiology. These findings offer new insights into the potential diagnostic utility of the SDMA-to-arginine ratio for OC assessment. Further validation studies using adequately powered OC cohorts are warranted to fully establish the clinical effectiveness of this diagnostic test.
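Illustrative sketch (not from the paper): an AUC for a ratio biomarker such as SDMA-to-arginine can be computed as the fraction of case/control pairs in which the case has the higher score, with ties counted as one half. The concentration pairs below are invented for illustration and are not the study's data, and treating a higher ratio as indicating disease is an assumption the abstract does not confirm. The function names are hypothetical.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def ratio_scores(samples):
    """Turn (sdma, arginine) concentration pairs into SDMA-to-arginine ratios."""
    return [sdma / arginine for sdma, arginine in samples]


# Invented (SDMA, arginine) pairs in arbitrary units, for illustration only.
toy_cases = [(0.9, 60.0), (0.8, 50.0), (0.6, 100.0)]
toy_controls = [(0.5, 100.0), (0.4, 80.0), (0.6, 80.0)]
toy_auc = auc(ratio_scores(toy_cases), ratio_scores(toy_controls))
```

The pairwise definition is quadratic in the number of samples, which is fine for small cohorts; a rank-based formulation gives the same value more efficiently for large ones.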
Affiliation(s)
- Linfeng Chen
  - Beijing Shijitan Hospital, Capital Medical University, Beijing, China
- Qiming Tang
  - Shanghai Yunxiang Medical Technology Co., Ltd., Shanghai, China
  - Binhai Industrial Technology Research Institute, Zhejiang University, Tianjin, China
- Keying Zhang
  - Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital, Beijing, China
- Bo Jin
  - Tianjin Yunjian Medical Laboratory Institute Co., Ltd, Tianjin, China
- Szumam Liu
  - School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- C James Chou
  - School of Medicine, Stanford University, Stanford, CA, USA
- Yani Zhang
  - Tianjin Yunjian Medical Laboratory Institute Co., Ltd, Tianjin, China
- Zhi Han
  - School of Medicine, Stanford University, Stanford, CA, USA
- Zhen Li
  - Shanghai Yunxiang Medical Technology Co., Ltd., Shanghai, China
  - Binhai Industrial Technology Research Institute, Zhejiang University, Tianjin, China
- Lu Tian
  - School of Medicine, Stanford University, Stanford, CA, USA
- Harvey J Cohen
  - School of Medicine, Stanford University, Stanford, CA, USA
- Rebecca C Arent
  - School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Xinyang Zhao
  - School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Yumei Wu
  - Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital, Beijing, China
- Wenpei Bai
  - Beijing Shijitan Hospital, Capital Medical University, Beijing, China
- Xuefeng B Ling
  - School of Medicine, Stanford University, Stanford, CA, USA
8
Xu W, Duan L, Zheng H, Li-Ling J, Jiang W, Zhang Y, Wang T, Qin R. An Integrative Disease Information Network Approach to Similar Disease Detection. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2724-2735. [PMID: 34478379 DOI: 10.1109/tcbb.2021.3110127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Disease similarity analysis contributes significantly to revealing pathogenesis, recommending treatments, and predicting disease-causing genes. Previous works study disease similarity based on semantics obtained from biomedical ontologies (e.g., disease ontology) or on the function of disease-causing molecules. However, such methods mostly rely on a single perspective for obtaining disease features, which may lead to biased results in similar disease detection. To address this issue, we propose a disease information network-based integrative approach named MISSION for detecting similar diseases. First, the disease information network is established by leveraging the associations between diseases and other biomedical entities. Then, the disease similarity features extracted from disease taxonomy, attributes, literature, and annotations are integrated into the disease information network. Finally, the top-k similar disease query is performed based on the integrative disease information. Experiments conducted on real-world datasets demonstrate that MISSION is effective and useful in similar disease detection.
9
Wang W, Meng X, Xiang J, Shuai Y, Bedru HD, Li M. CACO: A Core-Attachment Method With Cross-Species Functional Ortholog Information to Detect Human Protein Complexes. IEEE J Biomed Health Inform 2023; 27:4569-4578. [PMID: 37399160 DOI: 10.1109/jbhi.2023.3289490] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/05/2023]
Abstract
Protein complexes play an essential role in living cells. Detecting protein complexes is crucial to understanding protein functions and treating complex diseases. Because experimental approaches consume considerable time and resources, many computational approaches have been proposed to detect protein complexes. However, most of them are based only on protein-protein interaction (PPI) networks and suffer heavily from the noise in those networks. Therefore, we propose a novel core-attachment method, named CACO, to detect human protein complexes by integrating functional information from other species via protein ortholog relations. First, CACO constructs a cross-species ortholog relation matrix and transfers GO terms from other species as a reference to evaluate the confidence of PPIs. Then, a PPI filter strategy is adopted to clean the PPI network, yielding a weighted clean PPI network. Finally, a new core-attachment algorithm is proposed to detect protein complexes from the weighted PPI network. Compared with thirteen other state-of-the-art methods, CACO outperforms all of them in terms of F-measure and Composite Score, showing that integrating ortholog information and the proposed core-attachment algorithm are effective in detecting protein complexes.
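Illustrative sketch (not from the paper): the general core-attachment idea can be pictured in two steps, dropping low-confidence interactions and then attaching to a dense core every protein that interacts with enough of its members. The abstract does not give CACO's confidence formula, core-detection rule, or attachment criterion, so the threshold, the one-half attachment fraction, the function names, and the toy network below are all assumptions.

```python
def filter_ppis(weighted_edges, min_confidence):
    """Build an undirected adjacency map, keeping only edges at or above the confidence cutoff."""
    graph = {}
    for protein_a, protein_b, confidence in weighted_edges:
        if confidence >= min_confidence:
            graph.setdefault(protein_a, set()).add(protein_b)
            graph.setdefault(protein_b, set()).add(protein_a)
    return graph


def attach_to_core(graph, core, min_fraction=0.5):
    """Return the core plus every outside protein linked to at least min_fraction of the core."""
    core = set(core)
    if not core:
        return set()
    complex_members = set(core)
    for protein, neighbors in graph.items():
        if protein in core:
            continue
        if len(neighbors & core) / len(core) >= min_fraction:
            complex_members.add(protein)
    return complex_members


# Hypothetical weighted PPI network: (protein, protein, confidence).
toy_edges = [
    ("A", "B", 0.90), ("A", "C", 0.80), ("B", "C", 0.85),  # dense core
    ("D", "A", 0.70), ("D", "B", 0.75),                    # linked to 2 of 3 core proteins
    ("E", "C", 0.90),                                      # linked to 1 of 3 core proteins
    ("F", "A", 0.20),                                      # low confidence, filtered out
]
toy_graph = filter_ppis(toy_edges, min_confidence=0.5)
toy_complex = attach_to_core(toy_graph, core={"A", "B", "C"})
```

Here the core is supplied by hand; a full method would first detect cores (for example as dense subgraphs) before the attachment step.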
10
He H, Duo H, Hao Y, Zhang X, Zhou X, Zeng Y, Li Y, Li B. Computational drug repurposing by exploiting large-scale gene expression data: Strategy, methods and applications. Comput Biol Med 2023; 155:106671. [PMID: 36805225 DOI: 10.1016/j.compbiomed.2023.106671] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/25/2022] [Revised: 02/05/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023]
Abstract
De novo drug development is an extremely complex, time-consuming and costly task. Urgent needs for therapies for various diseases have greatly accelerated the search for more effective drug development methods. Drug repurposing provides a new and effective perspective on disease treatment. The rapid growth of large-scale transcriptome data paints a detailed picture of gene expression during disease onset and has thus received wide attention in the field of computational drug repurposing. However, how to efficiently mine transcriptome data and identify new indications for old drugs remains a critical challenge. This review discusses the irreplaceable role of transcriptome data in computational drug repurposing and summarizes representative databases, tools and strategies. More importantly, it proposes a practical guideline by establishing the correspondence between three gene expression data types and five strategies, which should help researchers adopt appropriate strategies to mine large-scale transcriptome data in depth and discover more effective therapies.
Affiliation(s)
- Hao He
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai, 200032, PR China
- Hongrui Duo
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Youjin Hao
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Xiaoxi Zhang
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Xinyi Zhou
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Yujie Zeng
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Yinghong Li
  - The Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, PR China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
11
Chen H, Cai Y, Ji C, Selvaraj G, Wei D, Wu H. AdaPPI: identification of novel protein functional modules via adaptive graph convolution networks in a protein-protein interaction network. Brief Bioinform 2023; 24:6918779. [PMID: 36526282 DOI: 10.1093/bib/bbac523] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/16/2022] [Revised: 10/10/2022] [Accepted: 11/02/2022] [Indexed: 12/23/2022] [Open access]
Abstract
Identifying unknown protein functional modules, such as protein complexes and biological pathways, from protein-protein interaction (PPI) networks provides biologists with an opportunity to efficiently understand cellular function and organization. Finding complex nonlinear relationships in underlying functional modules may involve long chains of PPIs and poses great challenges in a PPI network with an unevenly sparse and dense node distribution. To overcome these challenges, we propose AdaPPI, an adaptive graph convolution network for predicting protein functional modules in PPI networks. We first propose an attributed graph node representation algorithm. It effectively integrates protein gene ontology attributes and network topology, and adaptively aggregates low- or high-order graph structural information according to the node distribution by considering graph node smoothness. Based on the obtained node representations, core clique and expansion algorithms are applied to find functional modules in PPI networks. Comprehensive performance evaluations and case studies indicate that the framework significantly outperforms state-of-the-art methods. We also present potential functional modules based on their confidence.
12
Yang X, Xu W, Leng D, Wen Y, Wu L, Li R, Huang J, Bo X, He S. Exploring novel disease-disease associations based on multi-view fusion network. Comput Struct Biotechnol J 2023; 21:1807-1819. [PMID: 36923471 PMCID: PMC10009443 DOI: 10.1016/j.csbj.2023.02.038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/02/2022] [Revised: 02/02/2023] [Accepted: 02/22/2023] [Indexed: 03/06/2023] [Open access]
Abstract
Established taxonomy systems based on disease symptoms and tissue characteristics have provided an important basis for physicians to correctly identify diseases and treat them successfully. However, these classifications tend to be based on phenotypic observations and lack a molecular biological foundation. There is therefore an urgent need to integrate multi-dimensional molecular biological information or multi-omics data to redefine disease classification and provide a powerful perspective for understanding the molecular structure of diseases. We offer a flexible disease classification that integrates the biological process, gene expression, and symptom phenotype of diseases, and propose a disease-disease association network based on multi-view fusion. We applied the fusion approach to 223 diseases and divided them into 24 disease clusters. The contributions of internal and external edges of disease clusters were analyzed. The results of the fusion model were compared with Medical Subject Headings, a traditional and commonly used disease taxonomy. Experimental results of model performance comparison show that our approach performs better than other integration methods. The obtained clusters provided more interesting and novel disease-disease associations. This multi-view human disease association network describes relationships between diseases based on multiple molecular levels, thus breaking through the limitation of disease classification systems based on tissues and organs. This approach, which motivates clinicians and researchers to reposition the understanding of diseases and explore diagnosis and therapy strategies, extends the existing disease taxonomy.
Availability of data and materials: The preprocessed dataset and source code supporting the conclusions of this article are available at the GitHub repository https://github.com/yangxiaoxi89/mvHDN.
Affiliation(s)
- Xiaoxi Yang
- Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China.,Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Wenjian Xu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China.,Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China.,MOE Key Laboratory of Major Diseases in Children, Beijing 100045, China.,Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, Beijing 100045, China
| | - Dongjin Leng
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Yuqi Wen
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Lianlian Wu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Ruijiang Li
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Jian Huang
- Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
| | - Xiaochen Bo
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Song He
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| |
|
13
|
Kim Y, Jeon SJ, Gonzales EL, Shin D, Remonde CG, Ahn T, Shin CY. Pirenperone relieves the symptoms of fragile X syndrome in Fmr1 knockout mice. Sci Rep 2022; 12:20966. [PMID: 36470953 PMCID: PMC9723111 DOI: 10.1038/s41598-022-25582-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022]
Abstract
Fragile X syndrome (FXS) is a neurodevelopmental disorder caused by the loss of Fragile X-linked mental retardation protein (FMRP), an RNA-binding protein that can recognize and bind different RNA structures and regulate the translation of target mRNAs involved in neuronal synaptic plasticity. Perturbations of this gene expression network have been related to abnormal behavioral symptoms such as hyperactivity and impulsivity. Considering the roles of FMRP in the modulation of mRNA translation, we investigated the differentially expressed genes that might be targeted to revert to normal and ameliorate behavioral symptoms. We analyzed gene expression data and used the connectivity map (CMap) to understand the changes in gene expression in FXS and to predict effective drug candidates. We analyzed the GSE7329 dataset, which contains lymphoblastoid samples from 15 controls and 8 FXS patients. Among 924 genes, 42 were selected as signatures for CMap analysis, and 24 associated drugs were found. Pirenperone was selected as a potential drug candidate for FXS for its possible antipsychotic effect. Pirenperone treatment increased the expression level of the Fmr1 gene. Moreover, pirenperone rescued behavioral deficits in Fmr1 KO mice, including hyperactivity, spatial memory, and impulsivity. These results suggest that pirenperone is a new drug candidate for FXS, which should be verified in future studies.
Affiliation(s)
- Yujeong Kim
- Department of Pharmacology and Department of Advanced Translational Medicine, School of Medicine, Konkuk University, Seoul, 05029 Republic of Korea (GRID grid.258676.8; ISNI 0000 0004 0532 8339)
| | - Se Jin Jeon
- Department of Integrative Biotechnology, College of Science and Technology, Sahmyook University, Seoul, 01795 Republic of Korea (GRID grid.412357.6; ISNI 0000 0004 0533 2063)
| | - Edson Luck Gonzales
- Department of Pharmacology and Department of Advanced Translational Medicine, School of Medicine, Konkuk University, Seoul, 05029 Republic of Korea (GRID grid.258676.8; ISNI 0000 0004 0532 8339)
| | - Dongpil Shin
- Department of Pharmacology and Department of Advanced Translational Medicine, School of Medicine, Konkuk University, Seoul, 05029 Republic of Korea (GRID grid.258676.8; ISNI 0000 0004 0532 8339)
| | - Chilly Gay Remonde
- Department of Pharmacology and Department of Advanced Translational Medicine, School of Medicine, Konkuk University, Seoul, 05029 Republic of Korea (GRID grid.258676.8; ISNI 0000 0004 0532 8339)
| | - TaeJin Ahn
- Department of Life Science, Handong Global University, Nehemiah 36, Handong-ro 558, Pohang, 37554 Republic of Korea (GRID grid.411957.f; ISNI 0000 0004 0647 2543)
| | - Chan Young Shin
- Department of Pharmacology and Department of Advanced Translational Medicine, School of Medicine, Konkuk University, Seoul, 05029 Republic of Korea (GRID grid.258676.8; ISNI 0000 0004 0532 8339)
| |
|
14
|
Yue R, Dutta A. Computational systems biology in disease modeling and control, review and perspectives. NPJ Syst Biol Appl 2022; 8:37. [PMID: 36192551 PMCID: PMC9528884 DOI: 10.1038/s41540-022-00247-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/30/2022] [Accepted: 09/05/2022] [Indexed: 02/02/2023]
Abstract
Omics-based approaches have become increasingly influential in identifying disease mechanisms and drug responses. Considering that diseases and drug responses are co-expressed and regulated in the relevant omics data interactions, the traditional approach of drawing omics data from single, isolated layers cannot always yield valuable inferences. In addition, drugs have adverse effects that may harm patients, and launching new medicines is costly. To resolve these difficulties, systems biology is applied to predict potential molecular interactions by integrating omics data from the genomic, proteomic, transcriptional, and metabolic layers. Combined with known drug reactions, the resulting models improve the therapeutic performance of medicines by re-purposing existing drugs and combining drug molecules without off-target effects. Based on the identified computational models, drug administration control laws are designed to balance toxicity and efficacy. This review introduces biomedical applications and analyses of interactions among gene, protein, and drug molecules for modeling disease mechanisms and drug responses. Therapeutic performance can be improved by combining the predictive and computational models with drug administration designed by control laws. The challenges for clinical use are also discussed.
Affiliation(s)
- Rongting Yue
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA.
| | - Abhishek Dutta
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA
| |
|
15
|
Chen Y, Hu Y, Hu X, Feng C, Chen M. CoGO: a contrastive learning framework to predict disease similarity based on gene network and ontology structure. Bioinformatics 2022; 38:4380-4386. [PMID: 35900147 DOI: 10.1093/bioinformatics/btac520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 06/16/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION Quantifying the similarity of human diseases provides guiding insights for discovering microscopic mechanisms from a macro scale. Previous work demonstrated that better performance can be gained by integrating multiview data sources or applying machine learning techniques. However, designing an efficient framework to extract and incorporate information from different biological data using deep learning models remains unexplored. RESULTS We present CoGO, a Contrastive learning framework to predict disease similarity based on Gene network and Ontology structure, which incorporates the gene interaction network and gene ontology (GO) domain knowledge using graph deep learning models. First, graph deep learning models are applied to encode the features of genes and GO terms from separate graph structure data. Next, gene and GO features are projected to a common embedding space via a nonlinear projection. Then a cross-view contrastive loss is applied to maximize the agreement of corresponding gene-GO associations and lead to meaningful gene representations. Finally, CoGO infers the similarity between diseases by the cosine similarity of disease representation vectors derived from related gene embeddings. In our experiments, CoGO outperforms the most competitive baseline method on both AUROC and AUPRC, with an especially large improvement of 19.57% in AUPRC (0.7733). The prediction results are significantly comparable with other disease similarity studies and thus highly credible. Furthermore, we conduct a detailed case study of the top similar disease pairs, which are supported by other studies. Empirical results show that CoGO achieves strong performance on the disease similarity problem. AVAILABILITY AND IMPLEMENTATION https://github.com/yhchen1123/CoGO.
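The final step described in this abstract, comparing diseases by the cosine similarity of vectors derived from gene embeddings, can be sketched as follows. This is only an illustration and not the CoGO implementation: the gene names, the 3-dimensional embedding values, the disease-gene mapping, and the use of a simple mean to pool gene embeddings into a disease vector are all assumptions made for the example.

```python
import math

def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical gene embeddings (illustrative values, not learned by any model).
gene_embeddings = {
    "GENE_A": [1.0, 0.0, 0.0],
    "GENE_B": [0.0, 1.0, 0.0],
    "GENE_C": [0.0, 0.0, 1.0],
}

# Hypothetical disease-to-gene associations.
disease_genes = {
    "disease_1": ["GENE_A", "GENE_B"],
    "disease_2": ["GENE_B", "GENE_C"],
    "disease_3": ["GENE_A", "GENE_B"],
}

def disease_vector(disease):
    """Assumed pooling: the mean of the embeddings of the disease's genes."""
    return mean_vector([gene_embeddings[g] for g in disease_genes[disease]])

def disease_similarity(d1, d2):
    """Cosine similarity between two pooled disease vectors."""
    return cosine_similarity(disease_vector(d1), disease_vector(d2))

print(disease_similarity("disease_1", "disease_2"))
print(disease_similarity("disease_1", "disease_3"))
```

In this toy setup, diseases that share one of their two genes score about 0.5 and diseases with identical gene sets score about 1.0; a real pipeline would substitute learned embeddings and whatever pooling the method actually uses.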
Affiliation(s)
- Yuhao Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Yanshi Hu
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Xiaotian Hu
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Cong Feng
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Ming Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China; Biomedical Big Data Center, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, China; Institute of Hematology, Zhejiang University, Hangzhou, 310058, China
| |
|
16
|
Thistlethwaite LR, Li X, Burrage LC, Riehle K, Hacia JG, Braverman N, Wangler MF, Miller MJ, Elsea SH, Milosavljevic A. Clinical diagnosis of metabolic disorders using untargeted metabolomic profiling and disease-specific networks learned from profiling data. Sci Rep 2022; 12:6556. [PMID: 35449147 PMCID: PMC9023513 DOI: 10.1038/s41598-022-10415-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/05/2021] [Accepted: 03/14/2022] [Indexed: 02/06/2023]
Abstract
Untargeted metabolomics is a global molecular profiling technology that can be used to screen for inborn errors of metabolism (IEMs). Metabolite perturbations are evaluated based on current knowledge of specific metabolic pathway deficiencies, a manual diagnostic process that is qualitative, has limited scalability, and is not equipped to learn from accumulating clinical data. Our purpose was to improve upon manual diagnosis of IEMs in the clinic by developing novel computational methods for analyzing untargeted metabolomics data. We employed CTD, an automated computational diagnostic method that "connects the dots" between metabolite perturbations observed in individual metabolomics profiling data and modules identified in disease-specific metabolite co-perturbation networks learned from prior profiling data. We also extended CTD to calculate distances between any two individuals (CTDncd) and between an individual and a disease state (CTDdm), to provide additional network-quantified predictors for use in diagnosis. We show that across 539 plasma samples, CTD-based network-quantified measures can reproduce accurate diagnosis of 16 different IEMs, including adenylosuccinase deficiency, argininemia, argininosuccinic aciduria, aromatic L-amino acid decarboxylase deficiency, cerebral creatine deficiency syndrome type 2, citrullinemia, cobalamin biosynthesis defect, GABA-transaminase deficiency, glutaric acidemia type 1, maple syrup urine disease, methylmalonic aciduria, ornithine transcarbamylase deficiency, phenylketonuria, propionic acidemia, rhizomelic chondrodysplasia punctata, and the Zellweger spectrum disorders. Our approach can be used to supplement information from biochemical pathways and has the potential to significantly enhance the interpretation of variants of uncertain significance uncovered by exome sequencing. 
CTD, CTDdm, and CTDncd can serve as an essential toolset for biological interpretation of untargeted metabolomics data that overcomes limitations associated with manual diagnosis to assist diagnosticians in clinical decision-making. By automating and quantifying the interpretation of perturbation patterns, CTD can improve the speed and confidence by which clinical laboratory directors make diagnostic and treatment decisions, while automatically improving performance with new case data.
Affiliation(s)
- Lillian R Thistlethwaite
- Quantitative and Computational Biosciences Program, Baylor College of Medicine, One Baylor Plaza, 400D, Houston, TX, 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Xiqi Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Lindsay C Burrage
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Texas Children's Hospital, Houston, TX, USA
| | - Kevin Riehle
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Joseph G Hacia
- Department of Biochemistry and Molecular Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
| | - Nancy Braverman
- Department of Pediatrics and Human Genetics, McGill University, Montreal, QC, Canada
| | - Michael F Wangler
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Texas Children's Hospital, Houston, TX, USA
- Jan and Dan Duncan Texas Children's Hospital Neurological Research Institute, Houston, TX, USA
| | - Marcus J Miller
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Sarah H Elsea
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Aleksandar Milosavljevic
- Quantitative and Computational Biosciences Program, Baylor College of Medicine, One Baylor Plaza, 400D, Houston, TX, 77030, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
| |
|
17
|
Zhang N, Zang T. A multi-network integration approach for measuring disease similarity based on ncRNA regulation and heterogeneous information. BMC Bioinformatics 2022; 23:89. [PMID: 35255810 PMCID: PMC8902705 DOI: 10.1186/s12859-022-04613-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Accepted: 02/14/2022] [Indexed: 11/28/2022]
Abstract
Background Measuring similarity between complex diseases has significant implications for revealing the pathogenesis of diseases and for development in the domain of biomedicine. There is a consensus that functional associations between disease-related genes and semantic associations can be applied to calculate disease similarity. A growing number of studies have demonstrated the profound involvement of non-coding RNA (ncRNA) in the regulation of genome organization and gene expression. Thus, taking ncRNA into account can be useful in measuring disease similarities. However, existing methods ignore the regulatory functions of ncRNA in biological processes. In this study, we proposed a novel deep-learning method to deduce disease similarity. Results In this article, we proposed a novel method, ImpAESim, a framework that integrates multiple network embeddings to learn compact feature representations and calculate disease similarity. We first use three different disease-related information networks to build a heterogeneous network. After a network diffusion process, RWR, a compact feature learning model composed of a classic Auto Encoder (AE) and an improved AE model is used to extract constraints and low-dimensional feature representations. We finally obtain an accurate, low-dimensional feature representation of diseases and employ the cosine distance as the measure of disease similarity. Conclusion ImpAESim focuses on extracting a low-dimensional vector representation of features based on ncRNA regulation and the gene–gene interaction network. Our method can significantly reduce the calculation bias resulting from the sparse disease associations derived from semantic associations.
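The abstract names RWR as its network diffusion step. Assuming RWR denotes random walk with restart, the general technique can be sketched as below. This is a generic illustration and not the ImpAESim code: the four-node path graph, the restart probability of 0.5, and the convergence tolerance are all assumptions chosen for the example.

```python
def column_normalize(adj):
    """Scale each column of a square adjacency matrix to sum to 1."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stops changing."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is below the tolerance.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical toy network: a path graph 0 - 1 - 2 - 3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

scores = random_walk_with_restart(adjacency, seed=0)
print(scores)
```

The resulting scores form a probability distribution that decays with distance from the seed node, which is the kind of diffusion profile that can then be passed to a downstream feature learner.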
Affiliation(s)
- Ningyi Zhang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Tianyi Zang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
| |
|
18
|
Cheong JH, Wang SC, Park S, Porembka MR, Christie AL, Kim H, Kim HS, Zhu H, Hyung WJ, Noh SH, Hu B, Hong C, Karalis JD, Kim IH, Lee SH, Hwang TH. Development and validation of a prognostic and predictive 32-gene signature for gastric cancer. Nat Commun 2022; 13:774. [PMID: 35140202 PMCID: PMC8828873 DOI: 10.1038/s41467-022-28437-y] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 05/12/2021] [Accepted: 01/21/2022] [Indexed: 12/12/2022]
Abstract
Genomic profiling can provide prognostic and predictive information to guide clinical care. Biomarkers that reliably predict patient response to chemotherapy and immune checkpoint inhibition in gastric cancer are lacking. In this retrospective analysis, we use our machine learning algorithm NTriPath to identify a gastric-cancer specific 32-gene signature. Using unsupervised clustering on expression levels of these 32 genes in tumors from 567 patients, we identify four molecular subtypes that are prognostic for survival. We then built a support vector machine with linear kernel to generate a risk score that is prognostic for five-year overall survival and validate the risk score using three independent datasets. We also find that the molecular subtypes predict response to adjuvant 5-fluorouracil and platinum therapy after gastrectomy and to immune checkpoint inhibitors in patients with metastatic or recurrent disease. In sum, we show that the 32-gene signature is a promising prognostic and predictive biomarker to guide the clinical care of gastric cancer patients and should be validated using large patient cohorts in a prospective manner. The ability to predict the survival and response to treatment of cancer patients may improve patient care. Here, the authors generate a 32 gene signature that can predict the survival and response to treatment in gastric cancer patients.
Affiliation(s)
- Jae-Ho Cheong
- Department of Surgery, Yonsei University College of Medicine, Seoul, South Korea; Department of Biochemistry and Molecular Biology, Yonsei University College of Medicine, Seoul, South Korea; Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, South Korea
| | - Sam C Wang
- Division of Surgical Oncology, Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Sunho Park
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, USA
| | - Matthew R Porembka
- Division of Surgical Oncology, Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Alana L Christie
- Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Hyunki Kim
- Department of Pathology, Yonsei University College of Medicine, Seoul, South Korea
| | - Hyo Song Kim
- Department of Internal Medicine, Division of Medical Oncology, Yonsei University College of Medicine, Seoul, South Korea
| | - Hong Zhu
- Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Woo Jin Hyung
- Department of Surgery, Yonsei University College of Medicine, Seoul, South Korea
| | - Sung Hoon Noh
- Department of Surgery, Yonsei University College of Medicine, Seoul, South Korea
| | - Bo Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Changjin Hong
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, USA
| | - John D Karalis
- Division of Surgical Oncology, Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - In-Ho Kim
- Department of Internal Medicine, Division of Medical Oncology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Sung Hak Lee
- Department of Hospital Pathology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, South Korea
| | - Tae Hyun Hwang
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, USA; Department of Immunology, Mayo Clinic, Jacksonville, FL, USA
| |
|
19
|
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022]
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, a lot of computational methods and tools/platforms have been developed to reveal intrinsic relationship between diseases, which can provide useful insights to the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatment of diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of the usages of biomedical data, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview for current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, China
| | - Jiashuai Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Yichao Zhao
- School of Computer Science and Engineering, Central South University, China
| | - Fang-Xiang Wu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Min Li
- Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
| |
|
20
|
Liu L, Zhang Y, Niu G, Li Q, Li Z, Zhu T, Feng C, Liu X, Zhang Y, Xu T, Chen R, Teng X, Zhang R, Zou D, Ma L, Zhang Z. BrainBase: a curated knowledgebase for brain diseases. Nucleic Acids Res 2022; 50:D1131-D1138. [PMID: 34718720 PMCID: PMC8728122 DOI: 10.1093/nar/gkab987] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/15/2021] [Revised: 10/01/2021] [Accepted: 10/07/2021] [Indexed: 12/23/2022]
Abstract
The brain is the central organ of the nervous system, and any brain disease can seriously affect human health. Here we present BrainBase (https://ngdc.cncb.ac.cn/brainbase), a curated knowledgebase for brain diseases that aims to provide a whole picture of brain diseases and associated genes. Specifically, based on manual curation of 2768 published articles along with information retrieval from several public databases, BrainBase features a comprehensive collection of 7175 disease-gene associations spanning a total of 123 brain diseases and linking with 5662 genes, 16 591 drug-target interactions covering 2118 drugs/chemicals and 623 genes, and five types of specific genes in light of expression specificity in brain tissue/regions/cerebrospinal fluid/cells. In addition, considering the severity of glioma among brain tumors, the current version of BrainBase incorporates 21 multi-omics datasets, presents molecular profiles across various samples/conditions, and identifies four groups of glioma featured genes with potential clinical significance. Collectively, BrainBase integrates not only valuable curated disease-gene associations and drug-target interactions but also molecular profiles through multi-omics data analysis, and thus bears great promise to serve as a valuable knowledgebase for brain diseases.
Affiliation(s)
- Lin Liu
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Yang Zhang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Guangyi Niu
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Qianpeng Li
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhao Li
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tongtong Zhu
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Changrui Feng
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiaonan Liu
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yuansheng Zhang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tianyi Xu
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Ruru Chen
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xufei Teng
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Rongqin Zhang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Dong Zou
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Lina Ma
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhang Zhang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
21
|
Maudsley S, Leysen H, van Gastel J, Martin B. Systems Pharmacology: Enabling Multidimensional Therapeutics. Comprehensive Pharmacology 2022:725-769. [DOI: 10.1016/b978-0-12-820472-6.00017-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/03/2025]
|
22
|
Leysen H, Walter D, Christiaenssen B, Vandoren R, Harputluoğlu İ, Van Loon N, Maudsley S. GPCRs Are Optimal Regulators of Complex Biological Systems and Orchestrate the Interface between Health and Disease. Int J Mol Sci 2021; 22:ijms222413387. [PMID: 34948182 PMCID: PMC8708147 DOI: 10.3390/ijms222413387] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/24/2021] [Revised: 12/08/2021] [Accepted: 12/09/2021] [Indexed: 02/06/2023]
Abstract
GPCRs arguably represent the most effective current therapeutic targets for a plethora of diseases. GPCRs also possess a pivotal role in the regulation of the physiological balance between healthy and pathological conditions; thus, their importance in systems biology should not be underestimated. The molecular diversity of GPCR signaling systems is likely to be closely associated with disease-associated changes in organismal tissue complexity and compartmentalization, thus enabling a nuanced GPCR-based capacity to interdict multiple disease pathomechanisms at a systemic level. GPCRs have long been considered controllers of communication between tissues and cells. This communication involves the ligand-mediated control of cell surface receptors that then direct their stimuli to impact cell physiology. Given the tremendous success of GPCRs as therapeutic targets, considerable focus has been placed on the ability of these therapeutics to modulate diseases by acting at cell surface receptors. In the past decade, however, attention has focused upon how stable multiprotein GPCR superstructures, termed receptorsomes, both at the cell surface membrane and in the intracellular domain, dictate and condition long-term GPCR activities associated with the regulation of protein expression patterns, cellular stress responses, and DNA integrity management. The ability of these receptorsomes (often in the absence of typical cell surface ligands) to control complex cellular activities implicates them as key controllers of the functional balance between health and disease. A greater understanding of this function of GPCRs is likely to significantly augment our ability to further employ these proteins in a multitude of diseases.
Affiliation(s)
- Hanne Leysen
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Deborah Walter
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Bregje Christiaenssen
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Romi Vandoren
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- İrem Harputluoğlu
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Department of Chemistry, Middle East Technical University, Çankaya, Ankara 06800, Turkey
- Nore Van Loon
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
- Stuart Maudsley
- Receptor Biology Lab, University of Antwerp, 2610 Wilrijk, Belgium; (H.L.); (D.W.); (B.C.); (R.V.); (İ.H.); (N.V.L.)
23
Gao J, Zhang X, Tian L, Liu Y, Wang J, Li Z, Hu X. MTGNN: Multi-Task Graph Neural Network based few-shot learning for disease similarity measurement. Methods 2021; 198:88-95. [PMID: 34700014 DOI: 10.1016/j.ymeth.2021.10.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/05/2021] [Revised: 10/16/2021] [Accepted: 10/18/2021] [Indexed: 11/24/2022]
Abstract
Similar diseases usually share similar molecular origins or similar phenotypes. Confirming the relationship between diseases can help researchers gain deep insight into the pathogenic mechanisms of emerging complex diseases, and improve the corresponding diagnoses and treatment. Therefore, similar diseases are considerably important in biology and pathology. However, the insufficient number of labelled similar disease pairs cannot support the optimal training of the models. In this paper, we propose a Multi-Task Graph Neural Network (MTGNN) framework to measure disease similarity by few-shot learning. To tackle the problem of an insufficient number of labelled similar disease pairs, we design a multi-task optimization strategy that trains the graph neural network for the disease similarity task (which lacks labelled training data) by introducing a link prediction task (which has sufficient labelled training data). The similarity between diseases can then be obtained by measuring the distance between disease embeddings in the high-dimensional space learned from the two tasks. The experimental results evaluate the performance of MTGNN and illustrate its advantages over previous methods on datasets with few labelled training examples.
Affiliation(s)
- Jianliang Gao
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Xiangchi Zhang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Ling Tian
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yuxin Liu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zhao Li
- Alibaba Group, Hangzhou 310000, China.
- Xiaohua Hu
- College of Computing & Informatics, Drexel University, Philadelphia, PA 19104, USA
24
Li Y, Wang K, Wang G. Evaluating Disease Similarity Based on Gene Network Reconstruction and Representation. Bioinformatics 2021; 37:3579-3587. [PMID: 33978702 DOI: 10.1093/bioinformatics/btab252] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/22/2020] [Revised: 03/01/2021] [Accepted: 04/28/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION Quantifying the associations between diseases is of great significance in increasing our understanding of disease biology, improving disease diagnosis, re-positioning, and developing drugs. Therefore, in recent years, the research of disease similarity has received a lot of attention in the field of bioinformatics. Previous work has shown that the combination of the ontology (such as disease ontology and gene ontology) and disease-gene interactions are worthy to be regarded to elucidate diseases and disease associations. However, most of them are either based on the overlap between disease-related gene sets or distance within the ontology's hierarchy. The diseases in these methods are represented by discrete or sparse feature vectors, which cannot grasp the deep semantic information of diseases. Recently, deep representation learning has been widely studied and gradually applied to various fields of bioinformatics. Based on the hypothesis that disease representation depends on its related gene representations, we propose a disease representation model using two most representative gene resources HumanNet and Gene Ontology to construct a new gene network and learn gene (disease) representations. The similarity between two diseases is computed by the cosine similarity of their corresponding representations. RESULTS We propose a novel approach to compute disease similarity, which integrates two important factors disease-related genes and gene ontology hierarchy to learn disease representation based on deep representation learning. Under the same experimental settings, the AUC value of our method is 0.8074, which improves the most competitive baseline method by 10.1%. The quantitative and qualitative experimental results show that our model can learn effective disease representations and improve the accuracy of disease similarity computation significantly. 
AVAILABILITY The research shows that this method has certain applicability in the prediction of gene-related diseases, the migration of disease treatment methods, drug development, and so on. SUPPLEMENTARY INFORMATION Supplementary data are available at https://github.com/catly/disease_similarity.
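The final step described in this abstract, scoring two diseases by the cosine similarity of their representations, can be illustrated with a minimal sketch. It assumes, purely for illustration, that a disease vector is the mean of its related gene vectors; the abstract does not state how gene representations are aggregated, and the gene names, embeddings, and helper functions below are hypothetical rather than taken from the authors' code.

```python
import math

def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy gene embeddings (in practice these would be learned).
gene_embeddings = {
    "GENE_A": [1.0, 0.0],
    "GENE_B": [0.0, 1.0],
    "GENE_C": [1.0, 1.0],
}

# Hypothetical disease-to-gene associations.
disease_genes = {
    "disease_1": ["GENE_A", "GENE_C"],
    "disease_2": ["GENE_B", "GENE_C"],
}

def disease_vector(disease):
    """Assumed aggregation: mean of the embeddings of the disease's genes."""
    return mean_vector([gene_embeddings[g] for g in disease_genes[disease]])

similarity = cosine_similarity(disease_vector("disease_1"),
                               disease_vector("disease_2"))
print(f"{similarity:.3f}")
```

Cosine similarity depends only on the direction of the two vectors, so diseases with many associated genes are not favoured simply because their vectors are larger.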
Affiliation(s)
- Yang Li
- College of information and Computer Engineering, Northeast Forestry University, Harbin, 150004, China
- Keqi Wang
- College of information and Computer Engineering, Northeast Forestry University, Harbin, 150004, China
- Guohua Wang
- College of information and Computer Engineering, Northeast Forestry University, Harbin, 150004, China
25
Naveed H, Reglin C, Schubert T, Gao X, Arold ST, Maitland ML. Identifying Novel Drug Targets by iDTPnd: A Case Study of Kinase Inhibitors. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:986-997. [PMID: 33794377 PMCID: PMC9403029 DOI: 10.1016/j.gpb.2020.05.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/01/2018] [Revised: 01/08/2020] [Accepted: 05/11/2020] [Indexed: 11/16/2022]
Abstract
Current FDA-approved kinase inhibitors cause diverse adverse effects, some of which are due to the mechanism-independent effects of these drugs. Identifying these mechanism-independent interactions could improve drug safety and support drug repurposing. Here, we develop iDTPnd (integrated Drug Target Predictor with negative dataset), a computational approach for large-scale discovery of novel targets for known drugs. For a given drug, we construct a positive structural signature as well as a negative structural signature that captures the weakly conserved structural features of drug-binding sites. To facilitate assessment of unintended targets, iDTPnd also provides a docking-based interaction score and its statistical significance. We confirm the interactions of sorafenib, imatinib, dasatinib, sunitinib, and pazopanib with their known targets at a sensitivity of 52% and a specificity of 55%. We also validate 10 predicted novel targets by using in vitro experiments. Our results suggest that proteins other than kinases, such as nuclear receptors, cytochrome P450, and MHC class I molecules, can also be physiologically relevant targets of kinase inhibitors. Our method is general and broadly applicable for the identification of protein–small molecule interactions, when sufficient drug–target 3D data are available. The code for constructing the structural signatures is available at https://sfb.kaust.edu.sa/Documents/iDTP.zip.
Affiliation(s)
- Hammad Naveed
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA; Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad 44000, Pakistan.
- Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955, Saudi Arabia
- Stefan T Arold
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Biological and Environmental Sciences and Engineering (BESE) Division, Thuwal 23955, Saudi Arabia
- Michael L Maitland
- Inova Center for Personalized Health and Schar Cancer Institute, Falls Church, VA 22042, USA; University of Virginia Cancer Center, Annandale, Virginia 22003, USA
26
Disease network delineates the disease progression profile of cardiovascular diseases. J Biomed Inform 2021; 115:103686. [PMID: 33493631 DOI: 10.1016/j.jbi.2021.103686] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/29/2020] [Revised: 01/14/2021] [Accepted: 01/15/2021] [Indexed: 11/20/2022]
Abstract
OBJECTIVE As Electronic Health Records (EHR) data have accumulated explosively in recent years, the tremendous amount of patient clinical data has provided opportunities to discover real-world evidence. In this study, a graphical disease network, named progressive cardiovascular disease network (progCDN), was built to delineate the progression profiles of cardiovascular diseases (CVD). MATERIALS AND METHODS The EHR data of 14.3 million patients with CVD diagnoses were collected for building the disease network and for further analysis. We applied a newly designed method, progression rates (PR), to calculate the progression relationship among different diagnoses. Based on the disease network outcome, 23 disease progression pairs were selected to screen for salient features. RESULTS The network depicted the dominant diseases in CVD development, such as heart failure and coronary arteriosclerosis. Novel progression relationships were also discovered, such as the progression path from long QT syndrome to major depression. In addition, three age-group progCDNs identified a series of age-associated disease progression paths and important successor diseases with age bias. Furthermore, a list of important features with sufficient abundance and high correlation was extracted for building disease risk models. DISCUSSION The PR method designed for identifying the progression relationship could be widely applied in any EHR database due to its flexibility and robust functionality. Meanwhile, researchers could use the progCDN network to validate or explore novel disease relationships in real-world data. CONCLUSION The first-time interrogation of such a large cohort of CVD patients enabled us to explore the general and age-specific disease progression patterns in CVD development.
27
Shi W, Chen X, Deng L. A Review of Recent Developments and Progress in Computational Drug Repositioning. Curr Pharm Des 2021; 26:3059-3068. [PMID: 31951162 DOI: 10.2174/1381612826666200116145559] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/04/2019] [Accepted: 01/09/2020] [Indexed: 12/27/2022]
Abstract
Computational drug repositioning is an efficient approach towards discovering new indications for existing drugs. In recent years, with the accumulation of online health-related information and the extensive use of biomedical databases, computational drug repositioning approaches have achieved significant progress in drug discovery. In this review, we summarize recent advancements in drug repositioning. Firstly, we explicitly demonstrated the available data source information which is conducive to identifying novel indications. Furthermore, we provide a summary of the commonly used computing approaches. For each method, we briefly described techniques, case studies, and evaluation criteria. Finally, we discuss the limitations of the existing computing approaches.
Affiliation(s)
- Wanwan Shi
- School of Computer Science and Engineering, Central South University, Changsha, China
- Xuegong Chen
- School of Computer Science and Engineering, Central South University, Changsha, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China
28
Qin R, Duan L, Zheng H, Li-Ling J, Song K, Zhang Y. An Ontology-Independent Representation Learning for Similar Disease Detection Based on Multi-Layer Similarity Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:183-193. [PMID: 31536013 DOI: 10.1109/tcbb.2019.2941475] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/10/2023]
Abstract
Identifying similar diseases has significant implications for revealing the etiology and pathogenesis of diseases and for further research in the domain of biomedicine. Currently, most methods for the measurement of disease similarity utilize either associations of ontological disease concepts or functional interactions between disease-related genes. These methods are heavily dependent on the ontology, which is not always available, and on the selection of datasets. Moreover, many methods suffer from the drawback that they use only a single metric to evaluate disease similarity from an individual data source, which may result in biased conclusions without consideration of other aspects. In this study, we proposed a novel ontology-independent framework, namely RADAR, for learning representations for diseases to deduce their similarities from an integrative perspective. By leveraging the associations between diseases and disease-related biomedical entities, a disease similarity network was built under various metrics. Then, a multi-layer disease similarity network was constructed by integrating multiple disease similarity networks derived from multiple data sources, where the representation learning was derived to provide a comprehensive evaluation of disease similarities. The performance of RADAR was assessed by a benchmark disease set and 100 random disease sets. Experimental results demonstrated that RADAR can detect similar diseases effectively.
29
Karaman Mayack B, Sippl W. Current In Silico Drug Repurposing Strategies. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11523-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
30
Wu Z, Liao Q, Fan S, Liu B. idenPC-CAP: Identify protein complexes from weighted RNA-protein heterogeneous interaction networks using co-assemble partner relation. Brief Bioinform 2020; 22:6041167. [PMID: 33333549 DOI: 10.1093/bib/bbaa372] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/30/2020] [Revised: 11/07/2020] [Accepted: 11/20/2020] [Indexed: 12/18/2022]
Abstract
Protein complexes play important roles in most cellular processes. The available genome-wide protein-protein interaction (PPI) data make it possible for computational methods to identify protein complexes from PPI networks. However, PPI datasets usually contain a large proportion of false positive noise. Moreover, different types of biomolecules in a living cell cooperate to form a union interaction network. Because previous computational methods focus only on PPIs and ignore other types of biomolecule interactions, their predicted protein complexes often contain many false positive proteins. In this study, we develop a novel computational method, idenPC-CAP, to identify protein complexes from the RNA-protein heterogeneous interaction network consisting of RNA-RNA interactions, RNA-protein interactions and PPIs. By considering interactions among proteins and RNAs, the new method reduces the ratio of false positive proteins in predicted protein complexes. The experimental results demonstrate that idenPC-CAP outperforms the other state-of-the-art methods in this field.
Affiliation(s)
- Zhourun Wu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Qing Liao
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Shixi Fan
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
31
Xu H, Wang H, Yuan C, Zhai Q, Tian X, Wu L, Mi Y. Identifying diseases that cause psychological trauma and social avoidance by GCN-Xgboost. BMC Bioinformatics 2020; 21:504. [PMID: 33323103 PMCID: PMC7739481 DOI: 10.1186/s12859-020-03847-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/22/2020] [Accepted: 10/27/2020] [Indexed: 11/10/2022]
Abstract
Background With the rapid development of medical treatment, many patients consider not only survival time but also quality of life. Changes in physical, psychological and social functions during and after treatment cause considerable difficulty for patients and their families. Based on the bio-psycho-social medical model theory, mental health plays an important role in treatment. Therefore, it is necessary for medical staff to know which diseases have high potential to cause psychological trauma and social avoidance (PTSA). Results Firstly, we obtained diseases which can cause PTSA from the literature. Then, we calculated the similarities of related diseases to build a disease network. The similarities between diseases were based on their known related genes. Then, we obtained these disease-related proteins from UniProt. These proteins were extracted as the features of diseases. Therefore, in the disease network, each node denotes a disease and contains the information of its related proteins, and the edges of the network are the similarities of diseases. Then, a graph convolutional network (GCN) was used to encode the disease network. In this way, each disease’s own features and its relationships with other diseases were extracted. Finally, Xgboost was used to identify PTSA diseases. Conclusion We developed a novel method, ‘GCN-Xgboost’, and compared it with some traditional methods. Using leave-one-out cross-validation, the AUC and AUPR were higher than those of some existing methods. In addition, case studies have been done to verify our results. We also discussed the trajectory of social avoidance and distress during acute survival of breast cancer patients.
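The network-construction step in this abstract (nodes are diseases, edge weights are similarities derived from known related genes) can be sketched as follows. The abstract does not say which gene-based similarity measure was used, so the Jaccard index here is an assumption chosen for simplicity, and the disease names, gene sets, and function names are hypothetical. The GCN encoding and Xgboost classification stages are not shown.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_disease_network(disease_genes, threshold=0.0):
    """Return {(disease_1, disease_2): weight} for pairs whose similarity exceeds threshold.

    Assumption: similarity is the Jaccard index of the two diseases' gene sets.
    """
    edges = {}
    for d1, d2 in combinations(sorted(disease_genes), 2):
        weight = jaccard(disease_genes[d1], disease_genes[d2])
        if weight > threshold:
            edges[(d1, d2)] = weight
    return edges

# Hypothetical toy associations between diseases and genes.
disease_genes = {
    "disease_a": {"G1", "G2", "G3"},
    "disease_b": {"G2", "G3", "G4"},
    "disease_c": {"G5"},
}

network = build_disease_network(disease_genes)
print(network)
```

Pairs with no shared genes receive no edge, which keeps the graph sparse before any downstream encoding.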
Affiliation(s)
- Huijuan Xu
- First Department of Breast Surgery, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China
- Hairong Wang
- Department of Nursing, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China.
- Chenshan Yuan
- Department of Nutrition, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China
- Qinghua Zhai
- Department of Medical Records, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China
- Xufeng Tian
- Second Department of Breast Surgery, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China
- Lei Wu
- Second Department of Breast Surgery, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China
- Yuanyuan Mi
- Second Department of Breast Surgery, Shanxi Provincial Cancer Hospital, Taiyuan, People's Republic of China
32
Fang J, Pian C, Xu M, Kong L, Li Z, Ji J, Zhang L, Chen Y. Revealing Prognosis-Related Pathways at the Individual Level by a Comprehensive Analysis of Different Cancer Transcription Data. Genes (Basel) 2020; 11:genes11111281. [PMID: 33138076 PMCID: PMC7692404 DOI: 10.3390/genes11111281] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/24/2020] [Revised: 10/26/2020] [Accepted: 10/26/2020] [Indexed: 02/07/2023]
Abstract
Identifying perturbed pathways at an individual level is important to discover the causes of cancer and develop individualized custom therapeutic strategies. Though prognostic gene lists have had success in prognosis prediction, using single genes that are related to the relevant system or specific network cannot fully reveal the process of tumorigenesis. We hypothesize that in individual samples, the disruption of transcription homeostasis can influence the occurrence, development, and metastasis of tumors and has implications for patient survival outcomes. Here, we introduced the individual-level pathway score, which can measure the correlation perturbation of the pathways in a single sample well. We applied this method to the expression data of 16 different cancer types from The Cancer Genome Atlas (TCGA) database. Our results indicate that different cancer types as well as their tumor-adjacent tissues can be clearly distinguished by the individual-level pathway score. Additionally, we found that there was strong heterogeneity among different cancer types and the percentage of perturbed pathways as well as the perturbation proportions of tumor samples in each pathway were significantly different. Finally, the prognosis-related pathways of different cancer types were obtained by survival analysis. We demonstrated that the individual-level pathway score (iPS) is capable of classifying cancer types and identifying some key prognosis-related pathways.
Affiliation(s)
- Jingya Fang
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China; (J.F.); (M.X.); (L.K.); (Z.L.); (J.J.)
- Cong Pian
- Department of Mathematics, College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Mingmin Xu
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China; (J.F.); (M.X.); (L.K.); (Z.L.); (J.J.)
- Lingpeng Kong
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China; (J.F.); (M.X.); (L.K.); (Z.L.); (J.J.)
- Zutan Li
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China; (J.F.); (M.X.); (L.K.); (Z.L.); (J.J.)
- Jinwen Ji
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China; (J.F.); (M.X.); (L.K.); (Z.L.); (J.J.)
- Liangyun Zhang
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China; (J.F.); (M.X.); (L.K.); (Z.L.); (J.J.)
- Correspondence: (L.Z.); (Y.C.)
- Yuanyuan Chen
- Department of Mathematics, College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Correspondence: (L.Z.); (Y.C.)
33
Li X, Liu G, Chen W, Bi Z, Liang H. Network analysis of autistic disease comorbidities in Chinese children based on ICD-10 codes. BMC Med Inform Decis Mak 2020; 20:268. [PMID: 33069223 PMCID: PMC7568351 DOI: 10.1186/s12911-020-01282-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/15/2019] [Accepted: 10/05/2020] [Indexed: 12/27/2022]
Abstract
BACKGROUND Autism is a lifelong disability associated with several comorbidities that confound diagnosis and treatment. A better understanding of these comorbidities would facilitate diagnosis and improve treatments. Our aim was to improve the detection of comorbid diseases associated with autism. METHODS We used an FP-growth algorithm to retrospectively infer disease associations using data from 1488 patients with autism treated at the Guangzhou Women and Children's Medical Center. The disease network was established using Cytoscape 3.7. The rules were internally validated by 10-fold cross-validation. All rules were further verified using the Columbia Open Health Data (COHD) and by literature search. RESULTS We found 148 comorbid diseases including intellectual disability, developmental speech disorder, and epilepsy. The network comprised 76 nodes and 178 directed links. 158 links were confirmed by literature search and 105 links were validated by COHD. Furthermore, we identified 14 links not previously reported. CONCLUSION We demonstrate that the FP-growth algorithm can detect comorbid disease patterns, including novel ones, in patients with autism.
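The directed links in such a comorbidity network correspond to association rules of the form "diagnosis A implies diagnosis B", each scored by support and confidence. The sketch below derives pairwise rules by brute-force counting; it is not FP-growth, which reaches frequent itemsets more efficiently through a prefix tree, but the rule definitions are the same. The patient records, thresholds, and function name are invented for illustration and do not come from the study.

```python
from collections import Counter
from itertools import permutations

def pairwise_rules(records, min_support, min_confidence):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    support(A -> B)    = fraction of records containing both A and B
    confidence(A -> B) = records containing both A and B / records containing A
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        codes = set(record)  # ignore repeated codes within one record
        item_counts.update(codes)
        pair_counts.update(permutations(sorted(codes), 2))  # ordered pairs
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)

# Invented toy records of ICD-10 codes, one list per patient.
records = [
    ["F84.0", "F79"],
    ["F84.0", "F79", "G40"],
    ["F84.0", "G40"],
    ["F84.0"],
]

rules = pairwise_rules(records, min_support=0.5, min_confidence=0.9)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support}, confidence={confidence}")
```

Confidence is asymmetric, so a rule and its reverse can score differently; that asymmetry is why links in a rule-based network carry a direction.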
Affiliation(s)
- Xiaojun Li
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Guangjian Liu
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Wenxiong Chen
- Department of Neurology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Zhisheng Bi
- School of Basic Medical Sciences, Guangzhou Medical University, Guangzhou, 511436, China.
- Huiying Liang
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China.
34
Gao J, Tian L, Wang J, Chen Y, Song B, Hu X. Similar Disease Prediction With Heterogeneous Disease Information Networks. IEEE Trans Nanobioscience 2020; 19:571-578. [PMID: 32603299 DOI: 10.1109/tnb.2020.2994983] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/07/2022]
Abstract
Studying the similarity of diseases can help us to explore the pathological characteristics of complex diseases, and help provide reliable reference information for inferring the relationship between new diseases and known diseases, so as to develop effective treatment plans. To obtain disease similarity, most previous methods either use a single similarity metric, such as a semantic score or a functional score from a single data source, or utilize weighting coefficients to simply combine multiple metrics with different dimensions. In this paper, we propose a method to predict the similarity of diseases by node representation learning. We first integrate the semantic score and topological score between diseases by combining multiple data sources. Then for each disease, its integrated scores with all other diseases are utilized to map it into a vector of the same spatial dimension, and the vectors are used to measure and comprehensively analyze the similarity between diseases. Lastly, we conduct a comparative experiment based on a benchmark set and other disease nodes outside the benchmark set. Using statistics such as the average, variance, and coefficient of variation in the benchmark set to evaluate multiple methods demonstrates the effectiveness of our approach in the prediction of similar diseases.
35
Saberian N, Peyvandipour A, Donato M, Ansari S, Draghici S. A new computational drug repurposing method using established disease-drug pair knowledge. Bioinformatics 2020; 35:3672-3678. [PMID: 30840053 DOI: 10.1093/bioinformatics/btz156] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/16/2018] [Revised: 01/15/2019] [Accepted: 03/04/2019] [Indexed: 12/23/2022]
Abstract
MOTIVATION Drug repurposing is a potential alternative to the classical drug discovery pipeline. Repurposing involves finding novel indications for already approved drugs. In this work, we present a novel machine learning-based method for drug repurposing. This method explores the anti-similarity between drugs and a disease to uncover new uses for the drugs. More specifically, our proposed method takes into account three sources of information: (i) large-scale gene expression profiles corresponding to human cell lines treated with small molecules, (ii) gene expression profile of a human disease and (iii) the known relationship between Food and Drug Administration (FDA)-approved drugs and diseases. Using these data, our proposed method learns a similarity metric through a supervised machine learning-based algorithm such that a disease and its associated FDA-approved drugs have smaller distance than the other disease-drug pairs. RESULTS We validated our framework by showing that the proposed method incorporating distance metric learning technique can retrieve FDA-approved drugs for their approved indications. Once validated, we used our approach to identify a few strong candidates for repurposing. AVAILABILITY AND IMPLEMENTATION The R scripts are available on demand from the authors. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nafiseh Saberian
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Azam Peyvandipour
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Michele Donato
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Sahar Ansari
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Sorin Draghici
- Department of Computer Science, Wayne State University, Detroit, MI, USA; Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA
36
Emon MA, Domingo-Fernández D, Hoyt CT, Hofmann-Apitius M. PS4DR: a multimodal workflow for identification and prioritization of drugs based on pathway signatures. BMC Bioinformatics 2020; 21:231. [PMID: 32503412 PMCID: PMC7275349 DOI: 10.1186/s12859-020-03568-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/30/2019] [Accepted: 05/28/2020] [Indexed: 12/21/2022]
Abstract
Background During the last decade, there has been a surge towards computational drug repositioning owing to constantly increasing -omics data in the biomedical research field. While numerous existing methods focus on the integration of heterogeneous data to propose candidate drugs, it is still challenging to substantiate their results with mechanistic insights of these candidate drugs. Therefore, there is a need for more innovative and efficient methods which can enable better integration of data and knowledge for drug repositioning. Results Here, we present a customizable workflow (PS4DR) which not only integrates high-throughput data such as genome-wide association study (GWAS) data and gene expression signatures from disease and drug perturbations but also takes pathway knowledge into consideration to predict drug candidates for repositioning. We have collected and integrated publicly available GWAS data and gene expression signatures for several diseases and hundreds of FDA-approved drugs or those under clinical trial in this study. Additionally, different pathway databases were used for mechanistic knowledge integration in the workflow. Using this systematic consolidation of data and knowledge, the workflow computes pathway signatures that assist in the prediction of new indications for approved and investigational drugs. Conclusion We showcase PS4DR with applications demonstrating how this tool can be used for repositioning and identifying new drugs as well as proposing drugs that can simulate disease dysregulations. We were able to validate our workflow by demonstrating its capability to predict FDA-approved drugs for their known indications for several diseases. Further, PS4DR returned many potential drug candidates for repositioning that were backed up by epidemiological evidence extracted from scientific literature. Source code is freely available at https://github.com/ps4dr/ps4dr.
Affiliation(s)
- Mohammad Asif Emon
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (Fraunhofer SCAI), 53757, Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53117, Bonn, Germany.
- Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (Fraunhofer SCAI), 53757, Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53117, Bonn, Germany.
- Charles Tapley Hoyt
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (Fraunhofer SCAI), 53757, Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53117, Bonn, Germany.
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (Fraunhofer SCAI), 53757, Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53117, Bonn, Germany.
37
Interpreting molecular similarity between patients as a determinant of disease comorbidity relationships. Nat Commun 2020; 11:2854. [PMID: 32504002 PMCID: PMC7275044 DOI: 10.1038/s41467-020-16540-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/25/2019] [Accepted: 05/09/2020] [Indexed: 12/31/2022] Open
Abstract
Comorbidity is a medical condition attracting increasing attention in healthcare and biomedical research. Little is known about the involvement of potential molecular factors leading to the emergence of a specific disease in patients affected by other conditions. We present here a disease interaction network inferred from similarities between patients’ molecular profiles, which significantly recapitulates epidemiologically documented comorbidities. Furthermore, we identify disease patient-subgroups that present different molecular similarities with other diseases, some of them opposing the general tendencies observed at the disease level. Analyzing the generated patient-subgroup network, we identify genes involved in such relations, together with drugs whose effects are potentially associated with the observed comorbidities. All the obtained associations are available at the disease PERCEPTION portal (http://disease-perception.bsc.es). Disease comorbidity is attracting increasing attention, but the involvement of molecular factors in forecasting risk of a disease in the presence of other diseases is poorly understood. Here the authors build a disease interaction network based on gene expression profile and discover new comorbidity relationships in patient subgroups.
38
Ni P, Wang J, Zhong P, Li Y, Wu FX, Pan Y. Constructing Disease Similarity Networks Based on Disease Module Theory. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:906-915. [PMID: 29993782 DOI: 10.1109/tcbb.2018.2817624] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/08/2023]
Abstract
Quantifying the associations between diseases now plays an important role in modern biology and medicine. Discovering associations between diseases could help us gain deeper insights into the pathogenic mechanisms of complex diseases, and thus could lead to improvements in disease diagnosis, drug repositioning, and drug development. Due to the growing body of high-throughput biological data, a number of methods have been developed for computing similarity between diseases during the past decade. However, these methods rarely consider the interconnections of genes related to each disease in the protein-protein interaction network (PPIN). Recently, the disease module theory has been proposed, which states that disease-related genes or proteins tend to interact with each other in the same neighborhood of a PPIN. In this study, we propose a new method called ModuleSim to measure associations between diseases by using disease-gene association data and PPIN data based on disease module theory. The experimental results show that by considering the interactions between disease modules and their modularity, the disease similarity calculated by ModuleSim has a significant correlation with the disease classification of Disease Ontology (DO). Furthermore, ModuleSim outperforms four other popular methods, all of which use disease-gene association data and PPIN data to measure disease-disease associations. In addition, the disease similarity network constructed by ModuleSim suggests that ModuleSim is capable of finding potential associations between diseases.
39
Parisi D, Adasme MF, Sveshnikova A, Bolz SN, Moreau Y, Schroeder M. Drug repositioning or target repositioning: A structural perspective of drug-target-indication relationship for available repurposed drugs. Comput Struct Biotechnol J 2020; 18:1043-1055. [PMID: 32419905 PMCID: PMC7215100 DOI: 10.1016/j.csbj.2020.04.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 11/08/2019] [Revised: 03/31/2020] [Accepted: 04/04/2020] [Indexed: 12/18/2022] Open
Abstract
Drug repositioning aims to find new indications for existing drugs in order to reduce drug development cost and time. Numerous successful cases of drug repositioning have been reported, and many repurposed drugs are already available on the market. Although drug repositioning is often a product of serendipity, repositioning opportunities can be uncovered systematically. There are three systematic approaches to drug repositioning: disease-centric, target-centric and drug-centric. Disease-centric approaches identify close relationships between an old and a new indication. A target-centric approach links a known target and its established drug to a new indication. Lastly, a drug-centric approach connects a known drug to a new target and its associated indication. These three approaches differ in their potential and their limitations, but above all else, in the required start information and computing power. This raises the question of which approach prevails in current drug discovery and what that implies for future developments. To address this question, we systematically evaluated over 100 drugs, 200 target structures and over 300 indications from the Drug Repositioning Database. Each analyzed case was classified as one of the three repositioning approaches. For the majority of cases (more than 60%) the disease-centric definition was assigned. Almost 30% of the cases were classified as target-centric and less than 10% as drug-centric approaches. We concluded that, despite the use of the umbrella term “drug” repositioning, disease- and target-centric approaches have dominated the field until now. We propose the use of drug-centric approaches while discussing reasons, such as structure-based repositioning techniques, to exploit the full potential of drug-target-disease connections.
Affiliation(s)
- Melissa F Adasme
- Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany
- Anastasia Sveshnikova
- Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany
- Yves Moreau
- ESAT-STADIUS, KU Leuven, B-3001 Heverlee, Belgium
- Michael Schroeder
- Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany
40
Oerton E, Roberts I, Lewis PSH, Guilliams T, Bender A. Understanding and predicting disease relationships through similarity fusion. Bioinformatics 2020; 35:1213-1220. [PMID: 30169824 PMCID: PMC6449746 DOI: 10.1093/bioinformatics/bty754] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/23/2018] [Revised: 08/09/2018] [Accepted: 08/29/2018] [Indexed: 12/15/2022] Open
Abstract
Motivation Combining disease relationships across multiple biological levels could aid our understanding of common processes taking place in disease, potentially indicating opportunities for drug sharing. Here, we propose a similarity fusion approach which accounts for differences in information content between different data types, allowing combination of each data type in a balanced manner. Results We apply this method to six different types of biological data (ontological, phenotypic, literature co-occurrence, genetic association, gene expression and drug indication data) for 84 diseases to create a ‘disease map’: a network of diseases connected at one or more biological levels. As well as reconstructing known disease relationships, 15% of links in the disease map are novel links spanning traditional ontological classes, such as between psoriasis and inflammatory bowel disease. 62% of links in the disease map represent drug-sharing relationships, illustrating the relevance of the similarity fusion approach to the identification of potential therapeutic relationships. Availability and implementation Freely available under the MIT license at https://github.com/e-oerton/disease-similarity-fusion Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Erin Oerton
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK; Healx Ltd, Park House, Castle Park, Cambridge, UK
- Ian Roberts
- Healx Ltd, Park House, Castle Park, Cambridge, UK
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK; Healx Ltd, Park House, Castle Park, Cambridge, UK
41
A systems biology approach to identifying genetic factors affected by aging, lifestyle factors, and type 2 diabetes that influences Parkinson's disease progression. Informatics in Medicine Unlocked 2020. [DOI: 10.1016/j.imu.2020.100448] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022] Open
42
Network-based identification of genetic factors in ageing, lifestyle and type 2 diabetes that influence to the progression of Alzheimer's disease. Informatics in Medicine Unlocked 2020. [DOI: 10.1016/j.imu.2020.100309] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/30/2022] Open
43
Abstract
BACKGROUND A collection of disease-associated data contributes to study the association between diseases. Discovering closely related diseases plays a crucial role in revealing their common pathogenic mechanisms. This might further imply treatment that can be appropriated from one disease to another. During the past decades, a number of approaches for calculating disease similarity have been developed. However, most of them are designed to take advantage of single or few data sources, which results in their low accuracy. METHODS In this paper, we propose a novel method, called MultiSourcDSim, to calculate disease similarity by integrating multiple data sources, namely, gene-disease associations, GO biological process-disease associations and symptom-disease associations. Firstly, we establish three disease similarity networks according to the three disease-related data sources respectively. Secondly, the representation of each node is obtained by integrating the three small disease similarity networks. In the end, the learned representations are applied to calculate the similarity between diseases. RESULTS Our approach shows the best performance compared to the other three popular methods. Besides, the similarity network built by MultiSourcDSim suggests that our method can also uncover the latent relationships between diseases. CONCLUSIONS MultiSourcDSim is an efficient approach to predict similarity between diseases.
Affiliation(s)
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, 410075 China
- Danyi Ye
- School of Computer Science and Engineering, Central South University, Changsha, 410075 China
- Junmin Zhao
- School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000 China
- Jingpu Zhang
- School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000 China
44
Cheng L, Zhao H, Wang P, Zhou W, Luo M, Li T, Han J, Liu S, Jiang Q. Computational Methods for Identifying Similar Diseases. Molecular Therapy. Nucleic Acids 2019; 18:590-604. [PMID: 31678735 PMCID: PMC6838934 DOI: 10.1016/j.omtn.2019.09.019] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 04/25/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 02/01/2023]
Abstract
Although our knowledge of human diseases has increased dramatically, the molecular basis, phenotypic traits, and therapeutic targets of most diseases still remain unclear. An increasing number of studies have observed that similar diseases often are caused by similar molecules, can be diagnosed by similar markers or phenotypes, or can be cured by similar drugs. Thus, the identification of diseases similar to known ones has attracted considerable attention worldwide. To this end, the associations between diseases at the molecular, phenotypic, and taxonomic levels were used to measure the pairwise similarity in diseases. The corresponding performance assessment strategies for these methods involving the terms “category-based,” “simulated-patient-based,” and “benchmark-data-based” were thus further emphasized. Then, frequently used methods were evaluated using a benchmark-data-based strategy. To facilitate the assessment of disease similarity scores, researchers have designed dozens of tools that implement these methods for calculating disease similarity. Currently, disease similarity has been advantageous in predicting noncoding RNA (ncRNA) function and therapeutic drugs for diseases. In this article, we review disease similarity methods, evaluation strategies, tools, and their applications in the biomedical community. We further evaluate the performance of these methods and discuss the current limitations and future trends for calculating disease similarity.
Affiliation(s)
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hengqiang Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Pingping Wang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Wenyang Zhou
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Meng Luo
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Tianxin Li
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Shulin Liu
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, Heilongjiang, China; Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, AB, Canada
- Qinghua Jiang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
45
A computational approach to identify blood cell-expressed Parkinson's disease biomarkers that are coordinately expressed in brain tissue. Comput Biol Med 2019; 113:103385. [PMID: 31437626 DOI: 10.1016/j.compbiomed.2019.103385] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 04/18/2019] [Revised: 08/06/2019] [Accepted: 08/07/2019] [Indexed: 01/09/2023]
Abstract
Identification of genes whose regulation of expression is functionally similar in both brain tissue and blood cells could in principle enable monitoring of significant neurological traits and disorders by analysis of blood samples. We thus employed transcriptional analysis of pathologically affected tissues, using agnostic approaches to identify overlapping gene functions and integrating this transcriptomic information with expression quantitative trait loci (eQTL) data. Here, we estimate the correlation of gene expression in the top-associated cis-eQTLs of brain tissue and blood cells in Parkinson's Disease (PD). We introduced quantitative frameworks to reveal the complex relationship of various biasing genetic factors in PD, a neurodegenerative disease. We examined gene expression microarray and RNA-Seq datasets from human brain and blood tissues from PD-affected and control individuals. Differentially expressed genes (DEG) were identified for both brain and blood cells to determine common DEG overlaps. Based on neighborhood-based benchmarking and multilayer network topology approaches we then developed genetic associations of factors with PD. Overlapping DEG sets underwent gene enrichment using pathway analysis and gene ontology methods, which identified candidate common genes and pathways. We identified 12 significantly dysregulated genes shared by brain and blood cells, which were validated using dbGaP (gene SNP-disease linkage) database for gold-standard benchmarking of their significance in disease processes. Ontological and pathway analyses identified significant gene ontology and molecular pathways that indicate PD progression. In sum, we found possible novel links between pathological processes in brain tissue and blood cells by examining cell pathway commonalities, corroborating these associations using well validated datasets. 
This demonstrates that for brain-related pathologies combining gene expression analysis and blood cell cis-eQTL is a potentially powerful analytical approach. Thus, our methodologies facilitate data-driven approaches that can advance knowledge of disease mechanisms and may, with clinical validation, enable prediction of neurological dysfunction using blood cell transcript profiling.
46
Inferring Drug-Related Diseases Based on Convolutional Neural Network and Gated Recurrent Unit. Molecules 2019; 24:2712. [PMID: 31349692 PMCID: PMC6696443 DOI: 10.3390/molecules24152712] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/13/2019] [Revised: 07/18/2019] [Accepted: 07/19/2019] [Indexed: 12/15/2022] Open
Abstract
Predicting novel uses for drugs using their chemical, pharmacological, and indication information contributes to minimizing costs and development periods. Most previous prediction methods focused on integrating the similarity and association information of drugs and diseases. However, they tended to construct shallow prediction models to predict drug-associated diseases, which make deeply integrating the information difficult. Further, path information between drugs and diseases is important auxiliary information for association prediction, while it is not deeply integrated. We present a deep learning-based method, CGARDP, for predicting drug-related candidate disease indications. CGARDP establishes a feature matrix by exploiting a variety of biological premises related to drugs and diseases. A novel model based on convolutional neural network (CNN) and gated recurrent unit (GRU) is constructed to learn the local and path representations for a drug-disease pair. The CNN-based framework on the left of the model learns the local representation of the drug-disease pair from their feature matrix. As the different paths have discriminative contributions to the drug-disease association prediction, we construct an attention mechanism at the path level to learn the informative paths. In the right part, a GRU-based framework learns the path representation based on path information between the drug and the disease. Cross-validation results indicate that CGARDP performs better than several state-of-the-art methods. Further, CGARDP retrieves more real drug-disease associations in the top part of the prediction result that are of concern to biologists. Case studies on five drugs demonstrate that CGARDP can discover potential drug-related disease indications.
47
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 369] [Impact Index Per Article: 61.5] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
48
Luo L, Zheng C, Wang J, Tan M, Li Y, Xu R. Analysis of disease organ as a novel phenotype towards disease genetics understanding. J Biomed Inform 2019; 95:103235. [PMID: 31207382 PMCID: PMC6644057 DOI: 10.1016/j.jbi.2019.103235] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/03/2018] [Revised: 06/06/2019] [Accepted: 06/13/2019] [Indexed: 11/24/2022]
Abstract
Discerning the modular nature of human diseases through computational approaches calls for diverse data. The finding sites of diseases, like other disease phenotypes, carry rich information for understanding disease genetics. Yet this knowledge of disease finding sites has not been comprehensively investigated. In this study, we built a large-scale disease organ network (DON) based on 76,561 disease-organ associations (for 37,615 diseases and 3492 organs) extracted from the Unified Medical Language System (UMLS) Metathesaurus. We investigated how phenotypic organ similarity among diseases in DON reflects disease gene sharing. We constructed a disease genetic network (DGN) using curated disease-gene associations and demonstrated that disease pairs with higher organ similarities not only are more likely to share genes, but also tend to share more genes. Using a community detection algorithm, we showed that phenotypic disease clusters on DON significantly correlated with genetic disease clusters on DGN. We compared DON with a state-of-the-art disease phenotype network, the disease manifestation network (DMN), that we have recently constructed, and demonstrated that DON contains complementary knowledge for disease genetics understanding.
Affiliation(s)
- Lingyun Luo
- School of Computer Science, University of South China, Hengyang, Hunan 421001, China; Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Chunlei Zheng
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Jiaolong Wang
- School of Computer Science, University of South China, Hengyang, Hunan 421001, China
- Minsheng Tan
- School of Computer Science, University of South China, Hengyang, Hunan 421001, China
- Yanshu Li
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Rong Xu
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
49
Dozmorov MG. Disease classification: from phenotypic similarity to integrative genomics and beyond. Brief Bioinform 2019; 20:1769-1780. [DOI: 10.1093/bib/bby049] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/21/2018] [Revised: 05/01/2018] [Indexed: 02/06/2023] Open
Abstract
A fundamental challenge of modern biomedical research is understanding how diseases that are similar on the phenotypic level are similar on the molecular level. Integration of various genomic data sets with the traditionally used phenotypic disease similarity revealed novel genetic and molecular mechanisms and blurred the distinction between monogenic (Mendelian) and complex diseases. Network-based medicine has emerged as a complementary approach for identifying disease-causing genes, genetic mediators, disruptions in the underlying cellular functions and for drug repositioning. The recent development of machine and deep learning methods allows for leveraging real-life information about diseases to refine genetic and phenotypic disease relationships. This review describes the historical development and recent methodological advancements for studying disease classification (nosology).
Affiliation(s)
- Mikhail G Dozmorov
- Department of Biostatistics, Virginia Commonwealth University, 830 East Main Street, Richmond, VA, USA
50
García del Valle EP, Lagunes García G, Prieto Santamaría L, Zanin M, Menasalvas Ruiz E, Rodríguez-González A. Disease networks and their contribution to disease understanding: A review of their evolution, techniques and data sources. J Biomed Inform 2019; 94:103206. [DOI: 10.1016/j.jbi.2019.103206] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/08/2019] [Revised: 04/14/2019] [Accepted: 05/06/2019] [Indexed: 12/14/2022]