1
Wu X, Luo G, Dong Z, Zheng W, Jia G. Integrated Pleiotropic Gene Set Unveils Comorbidity Insights across Digestive Cancers and Other Diseases. Genes (Basel) 2024; 15:478. [PMID: 38674412 PMCID: PMC11049963 DOI: 10.3390/genes15040478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 03/31/2024] [Accepted: 04/04/2024] [Indexed: 04/28/2024] Open
Abstract
Comorbidities are prevalent in digestive cancers, intensifying patient discomfort and complicating prognosis. Systematically identifying potential comorbidities and investigating their genetic connections is instrumental in averting additional health challenges during digestive cancer management. Here, we investigated 150 diseases across 18 categories by collecting and integrating various factors related to disease comorbidity, such as disease-associated SNPs or genes from sources like MalaCards, GWAS Catalog and UK Biobank. Through this extensive analysis, we have established an integrated pleiotropic gene set comprising 548 genes in total. Notably, the set includes genes encoding the major histocompatibility complex or related to antigen presentation. Additionally, we have unveiled patterns in protein-protein interactions and key hub genes/proteins including TP53, KRAS, CTNNB1 and PIK3CA, which may elucidate the co-occurrence of digestive cancers with certain diseases. These findings provide valuable insights into the molecular origins of comorbidity, offering potential avenues for patient stratification and the development of targeted therapies in clinical trials.
Affiliation(s)
- Xinnan Wu
  - Institute of Public-Safety and Big Data, College of Data Science, Taiyuan University of Technology, University Street, Yuci District, Jinzhong 030600, China
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Guangwen Luo
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Zhaonian Dong
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Wen Zheng
  - Institute of Public-Safety and Big Data, College of Data Science, Taiyuan University of Technology, University Street, Yuci District, Jinzhong 030600, China
- Gengjie Jia
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
2
Jiang F, Ruan Y, Chen XH, Yu HL, Cheng T, Duan XY, Liu YG, Zhang HY, Zhang QY. Metabolites of pathogenic microorganisms database (MPMdb) and its seed metabolite applications. Microbiol Spectr 2024; 12:e0234223. [PMID: 38391229 PMCID: PMC10986615 DOI: 10.1128/spectrum.02342-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 01/23/2024] [Indexed: 02/24/2024] Open
Abstract
Seed metabolites are the combination of essential compounds required by an organism across various potential environmental conditions. The seed metabolites screening framework based on the network topology approach can capture important biological information of species. This study aims to comprehensively identify the relationship between seed metabolites and pathogenic bacteria. A large-scale data set was compiled, describing the seed metabolite sets and metabolite sets of 124,192 pathogenic strains from 34 genera, by constructing genome-scale metabolic models. The enrichment analysis method was used to screen the specific seed metabolites of each species/genus of pathogenic bacteria. The metabolites of pathogenic microorganisms database (MPMdb) (http://qyzhanglab.hzau.edu.cn/MPMdb/) was established for browsing, searching, predicting, or downloading metabolites and seed metabolites of pathogenic microorganisms. Based on the MPMdb, taxonomic and phylogenetic analyses of pathogenic bacteria were performed according to the function of seed metabolites and metabolites. The results showed that the seed metabolites could be used as a feature for microorganism chemotaxonomy, and they could mirror the phylogeny of pathogenic bacteria. In addition, our screened specific seed metabolites of pathogenic bacteria can be used not only for further tapping the nutritional resources and identifying auxotrophies of pathogenic bacteria but also for designing targeted bactericidal compounds by combining with existing antimicrobial agents.
IMPORTANCE Metabolites serve as key communication links between pathogenic microorganisms and hosts, with seed metabolites being crucial for microbial growth, reproduction, external communication, and host infection. However, the large-scale screening of metabolites and the identification of seed metabolites have long been the main technical bottleneck because of low throughput and costly analysis. Genome-scale metabolic models have become a recognized research paradigm to investigate the metabolic characteristics of species. The metabolites of pathogenic microorganisms database developed in this study is committed to systematically predicting and identifying the metabolites and seed metabolites of pathogenic microorganisms, which could provide a powerful resource platform for pathogenic bacteria research.
Affiliation(s)
- Feng Jiang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Yao Ruan
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Xiao-Hui Chen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Hai-Long Yu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Ting Cheng
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Xin-Ya Duan
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Yan-Guang Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Hong-Yu Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Qing-Ye Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
3
Tsekenis G, Cimini G, Kalafatis M, Giacometti A, Gili T, Caldarelli G. Network topology mapping of chemical compounds space. Sci Rep 2024; 14:5266. [PMID: 38438443 PMCID: PMC10912673 DOI: 10.1038/s41598-024-54594-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 02/14/2024] [Indexed: 03/06/2024] Open
Abstract
We define bipartite and monopartite relational networks of chemical elements and compounds using two different datasets of inorganic chemical and material compounds, and study their topology. We discover that the connectivity between elements and compounds is distributed exponentially for materials, and with a fat tail for chemicals. Compound networks show a similar distribution of degrees, and feature a highly connected club due to oxygen. Chemical compound networks appear more modular than material ones, while the communities detected reveal different dominant elements specific to the topology. We successfully reproduce the connectivity of the empirical chemicals and materials networks by using a family of fitness models, where the fitness values are derived from the abundances of the elements in the aggregate compound data. Our results pave the way towards a relational network-based understanding of the inherent complexity of the vast chemical knowledge atlas, and our methodology can be applied to other systems with the ingredient-composite structure.
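The fitness-model idea in this abstract can be illustrated with a small Python sketch. The abstract does not give the functional form, so the link probability used here, p = z·x·y / (1 + z·x·y), is an assumption borrowed from commonly used fitness-based network models; the function names, the toy abundances, the compound fitness values and the scale parameter `z` are all hypothetical and are not taken from the paper.

```python
def link_probability(x, y, z):
    """Assumed fitness-model form: probability that an element with
    fitness x appears in a compound with fitness y, at overall scale z."""
    s = z * x * y
    return s / (1.0 + s)


def expected_element_degrees(element_fitness, compound_fitness, z):
    """Expected number of compounds each element is linked to in the
    bipartite element-compound network under the assumed model."""
    return {
        element: sum(link_probability(x, y, z) for y in compound_fitness.values())
        for element, x in element_fitness.items()
    }


# Hypothetical toy data: fitness taken as relative abundance (illustrative only).
element_fitness = {"O": 0.6, "Fe": 0.3, "Au": 0.1}
compound_fitness = {"compound_1": 1.0, "compound_2": 1.0}

degrees = expected_element_degrees(element_fitness, compound_fitness, z=1.0)
for element, k in degrees.items():
    print(f"{element}: expected degree {k:.3f}")
```

Under this assumed form, more abundant elements receive higher expected degree, which is the qualitative behaviour a fitness model is meant to capture; fitting `z` to an empirical link count would be a separate step not shown here.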
Affiliation(s)
- Georgios Tsekenis
  - Institute for Complex Systems, National Research Council, Rome, Italy
  - Department of Molecular Sciences and Nanosystems (DMSN), "Ca' Foscari" University of Venice, Venice, Italy
- Giulio Cimini
  - Physics Department and INFN, University of Rome Tor Vergata, Rome, Italy
- Marinos Kalafatis
  - Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Achille Giacometti
  - Department of Molecular Sciences and Nanosystems (DMSN), "Ca' Foscari" University of Venice, Venice, Italy
  - European Centre of Living Technologies (ECLT), "Ca' Foscari" University of Venice, Venice, Italy
- Tommaso Gili
  - Networks Unit, IMT School for Advanced Studies Lucca, 55100, Lucca, Italy
- Guido Caldarelli
  - Institute for Complex Systems, National Research Council, Rome, Italy
  - Department of Molecular Sciences and Nanosystems (DMSN), "Ca' Foscari" University of Venice, Venice, Italy
  - European Centre of Living Technologies (ECLT), "Ca' Foscari" University of Venice, Venice, Italy
  - Rara Foundation - Sustainable Materials and Technologies ETS, 30171, Venice, Italy
4
Duman ET, Tuna G, Ak E, Avsar G, Pir P. Optimized network based natural language processing approach to reveal disease comorbidities in COVID-19. Sci Rep 2024; 14:2325. [PMID: 38282038 PMCID: PMC10822845 DOI: 10.1038/s41598-024-52819-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 01/24/2024] [Indexed: 01/30/2024] Open
Abstract
A novel virus emerged from Wuhan, China, at the end of 2019 and quickly evolved into a pandemic, significantly impacting various industries, especially healthcare. One critical lesson from COVID-19 is the importance of understanding and predicting underlying comorbidities to better prioritize care and pharmacological therapies. Factors like age, race, and comorbidity history are crucial in determining disease mortality. While clinical data from hospitals and cohorts have led to the identification of these comorbidities, traditional approaches often lack a mechanistic understanding of the connections between them. In response, we utilized a deep learning approach to integrate COVID-19 data with data from other diseases, aiming to detect comorbidities with mechanistic insights. Our modified algorithm in the mpDisNet package, based on word-embedding deep learning techniques, incorporates miRNA expression profiles from SARS-CoV-2 infected cell lines and their target transcription factors. This approach is aligned with the emerging field of network medicine, which seeks to define diseases based on distinct pathomechanisms rather than just phenotypes. The main aim is the discovery of possible unknown comorbidities by connecting diseases through their miRNA-mediated regulatory interactions. The algorithm can predict the majority of COVID-19's known comorbidities, as well as several diseases that have not yet been reported as comorbid with COVID-19. These potentially comorbid diseases should be investigated further to raise awareness, support prevention, and inform comorbidity research for the next possible outbreak.
Affiliation(s)
- Emre Taylan Duman
  - Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
  - NGS-Core Unit for Integrative Genomics, Institute of Pathology, University Medical Center Göttingen, Göttingen, Germany
- Gizem Tuna
  - Department of Molecular Biology, Gebze Technical University, Kocaeli, Turkey
- Enes Ak
  - Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
- Gülben Avsar
  - Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
- Pinar Pir
  - Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
5
Probst D. An explainability framework for deep learning on chemical reactions exemplified by enzyme-catalysed reaction classification. J Cheminform 2023; 15:113. [PMID: 37996942 PMCID: PMC10668483 DOI: 10.1186/s13321-023-00784-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 11/13/2023] [Indexed: 11/25/2023] Open
Abstract
Assigning or proposing a catalysing enzyme given a chemical or biochemical reaction is of great interest to life sciences and chemistry alike. The exploration and design of metabolic pathways and the challenge of finding more sustainable enzyme-catalysed alternatives to traditional organic reactions are just two examples of tasks that require an association between reaction and enzyme. However, given the lack of large and balanced annotated data sets of enzyme-catalysed reactions, assigning an enzyme to a reaction still relies on expert-curated rules and databases. Here, we present a data-driven explainable human-in-the-loop machine learning approach to support and ultimately automate the association of a catalysing enzyme with a given biochemical reaction. In addition, the proposed method is capable of predicting enzymes as candidate catalysts for organic reactions amenable to biocatalysis. Finally, the introduced explainability and visualisation methods can easily be generalised to support other machine-learning approaches involving chemical and biochemical reactions.
Affiliation(s)
- Daniel Probst
  - Signal Processing Laboratory 2, Institute of Electrical and Micro Engineering, School of Engineering, EPFL, Rte Cantonale, 1015, Lausanne, Vaud, Switzerland
6
Wu S, Liu X, Dong A, Gragnoli C, Griffin C, Wu J, Yau ST, Wu R. The metabolomic physics of complex diseases. Proc Natl Acad Sci U S A 2023; 120:e2308496120. [PMID: 37812720 PMCID: PMC10589719 DOI: 10.1073/pnas.2308496120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 08/15/2023] [Indexed: 10/11/2023] Open
Abstract
Human diseases involve metabolic alterations. Metabolomic profiles have served as vital biomarkers for the early identification of high-risk individuals and disease prevention. However, current approaches can only characterize individual key metabolites, without taking into account the reality that complex diseases are multifactorial, dynamic, heterogeneous, and interdependent. Here, we leverage a statistical physics model to combine all metabolites into bidirectional, signed, and weighted interaction networks and trace how the flow of information from one metabolite to the next causes changes in health state. Viewing a disease outcome as the consequence of complex interactions among its interconnected components (metabolites), we integrate concepts from ecosystem theory and evolutionary game theory to model how the health state-dependent alteration of a metabolite is shaped by its intrinsic properties and through extrinsic influences from its conspecifics. We code intrinsic contributions as nodes and extrinsic contributions as edges into quantitative networks and implement GLMY homology theory to analyze and interpret the topological change of health state from symbiosis to dysbiosis and vice versa. The application of this model to real data allows us to identify several hub metabolites and their interaction webs, which play a part in the formation of inflammatory bowel diseases. The findings of our model could provide important information on drug design to treat these diseases and beyond.
Affiliation(s)
- Shuang Wu
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Xiang Liu
  - Chern Institute of Mathematics, Nankai University, Tianjin 300071, China
  - Beijing Yanqi Lake Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Ang Dong
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Claudia Gragnoli
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033
  - Department of Medicine, Creighton University School of Medicine, Omaha, NE 68124
  - Molecular Biology Laboratory, Bios Biotech Multi-Diagnostic Health Center, Rome 00197, Italy
- Christopher Griffin
  - Applied Research Laboratory, The Pennsylvania State University, University Park, PA 16802
- Jie Wu
  - Beijing Yanqi Lake Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Shing-Tung Yau
  - Beijing Yanqi Lake Institute of Mathematical Sciences and Applications, Beijing 101408, China
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China
- Rongling Wu
  - Beijing Yanqi Lake Institute of Mathematical Sciences and Applications, Beijing 101408, China
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China
7
Sánchez-Valle J, Valencia A. Molecular bases of comorbidities: present and future perspectives. Trends Genet 2023; 39:773-786. [PMID: 37482451 DOI: 10.1016/j.tig.2023.06.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 06/12/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023]
Abstract
Co-occurrence of diseases decreases patient quality of life, complicates treatment choices, and increases mortality. Analyses of electronic health records present a complex scenario of comorbidity relationships that vary by age, sex, and cohort under study. The study of similarities between diseases using 'omics data, such as genes altered in diseases, gene expression, proteome, and microbiome, is fundamental to uncovering the origin of, and potential treatment for, comorbidities. Recent studies have produced a first generation of genetic interpretations for as much as 46% of the comorbidities described in large cohorts. Integrating different sources of molecular information and using artificial intelligence (AI) methods are promising approaches for the study of comorbidities. They may help to improve the treatment of comorbidities, including the potential repositioning of drugs.
Affiliation(s)
- Jon Sánchez-Valle
  - Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain
- Alfonso Valencia
  - Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain; ICREA, Barcelona, 08010, Spain
8
Skinner DJ, Jeckel H, Martin AC, Drescher K, Dunkel J. Topological packing statistics of living and nonliving matter. Sci Adv 2023; 9:eadg1261. [PMID: 37672580 PMCID: PMC10482333 DOI: 10.1126/sciadv.adg1261] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/04/2022] [Accepted: 07/27/2023] [Indexed: 09/08/2023]
Abstract
Complex disordered matter is of central importance to a wide range of disciplines, from bacterial colonies and embryonic tissues in biology to foams and granular media in materials science to stellar configurations in astrophysics. Because of the vast differences in composition and scale, comparing structural features across such disparate systems remains challenging. Here, by using the statistical properties of Delaunay tessellations, we introduce a mathematical framework for measuring topological distances between general three-dimensional point clouds. The resulting system-agnostic metric reveals subtle structural differences between bacterial biofilms as well as between zebrafish brain regions, and it recovers temporal ordering of embryonic development. We apply the metric to construct a universal topological atlas encompassing bacterial biofilms, snowflake yeast, plant shoots, zebrafish brain matter, organoids, and embryonic tissues as well as foams, colloidal packings, glassy materials, and stellar configurations. Living systems localize within a bounded island-like region of the atlas, reflecting that biological growth mechanisms result in characteristic topological properties.
Affiliation(s)
- Dominic J Skinner
  - Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
  - NSF-Simons Center for Quantitative Biology, Northwestern University, 2205 Tech Drive, Evanston, IL 60208, USA
- Hannah Jeckel
  - Department of Physics, Philipps-Universität Marburg, Renthof 6, 35032 Marburg, Germany
  - Biozentrum, University of Basel, Spitalstrasse 41, 4056 Basel, Switzerland
- Adam C Martin
  - Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, MA 02139, USA
- Knut Drescher
  - Biozentrum, University of Basel, Spitalstrasse 41, 4056 Basel, Switzerland
- Jörn Dunkel
  - Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
9
Jia G, Li Y, Zhong X, Wang K, Pividori M, Alomairy R, Esposito A, Ltaief H, Terao C, Akiyama M, Matsuda K, Keyes DE, Im HK, Gojobori T, Kamatani Y, Kubo M, Cox NJ, Evans J, Gao X, Rzhetsky A. The high-dimensional space of human diseases built from diagnosis records and mapped to genetic loci. Nat Comput Sci 2023; 3:403-417. [PMID: 38177845 PMCID: PMC10766526 DOI: 10.1038/s43588-023-00453-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2021] [Accepted: 04/13/2023] [Indexed: 01/06/2024]
Abstract
Human diseases are traditionally studied as singular, independent entities, limiting researchers' capacity to view human illnesses as dependent states in a complex, homeostatic system. Here, using time-stamped clinical records of over 151 million unique Americans, we construct a disease representation as points in a continuous, high-dimensional space, where diseases with similar etiology and manifestations lie near one another. We use the UK Biobank cohort, with half a million participants, to perform a genome-wide association study of newly defined human quantitative traits reflecting individuals' health states, corresponding to patient positions in our disease space. We discover 116 genetic associations involving 108 genetic loci and then use ten disease constellations resulting from clustering analysis of diseases in the embedding space, as well as 30 common diseases, to demonstrate that these genetic associations can be used to robustly predict various morbidities.
Affiliation(s)
- Gengjie Jia
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yu Li
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Xue Zhong
  - Department of Medicine and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, US
- Kanix Wang
  - Department of Medicine, Institute of Genomics and Systems Biology, Committee on Genomics, Genetics, and Systems Biology, University of Chicago, Chicago, IL, US
  - Department of Operations, Business Analytics, and Information Systems, University of Cincinnati, Cincinnati, OH, US
- Milton Pividori
  - Department of Medicine, Institute of Genomics and Systems Biology, Committee on Genomics, Genetics, and Systems Biology, University of Chicago, Chicago, IL, US
  - Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, US
- Rabab Alomairy
  - Extreme Computing Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia
- Hatem Ltaief
  - Extreme Computing Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Chikashi Terao
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
  - Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
- Masato Akiyama
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Department of Ophthalmology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Koichi Matsuda
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- David E Keyes
  - Extreme Computing Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Hae Kyung Im
  - Department of Medicine, Institute of Genomics and Systems Biology, Committee on Genomics, Genetics, and Systems Biology, University of Chicago, Chicago, IL, US
- Takashi Gojobori
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Yoichiro Kamatani
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Michiaki Kubo
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Nancy J Cox
  - Department of Medicine and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, US
- James Evans
  - Department of Sociology, University of Chicago, Chicago, IL, US
- Xin Gao
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Andrey Rzhetsky
  - Department of Medicine, Institute of Genomics and Systems Biology, Committee on Genomics, Genetics, and Systems Biology, University of Chicago, Chicago, IL, US
  - Department of Human Genetics, University of Chicago, Chicago, IL, US
10
Discovering Entities Similarities in Biological Networks Using a Hybrid Immune Algorithm. Informatics 2023. [DOI: 10.3390/informatics10010018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023] Open
Abstract
Disease phenotypes are generally caused by the failure of gene modules which often have similar biological roles. Through the study of biological networks, it is possible to identify the intrinsic structure of molecular interactions in order to identify the so-called “disease modules”. Community detection is a valuable approach to discovering the community structure of a complex network, revealing the internal organization of the nodes, and has become a leading research topic in the analysis of complex networks. This work investigates the link between biological modules and network communities in test-case biological networks that are commonly used as a reference point and which include Protein–Protein Interaction Networks, Metabolic Networks and Transcriptional Regulation Networks. In order to identify small and structurally well-defined communities in the biological context, a hybrid immune metaheuristic algorithm, Hybrid-IA, is proposed and compared with several metaheuristics, hyper-heuristics, and the well-known greedy algorithm Louvain, with respect to modularity maximization. Considering the limitation of modularity optimization, which can fail to identify smaller communities, the reliability of Hybrid-IA was also analyzed with respect to three well-known sensitivity analysis measures (NMI, ARI and NVI) that assess how similar the detected communities are to real ones. Across all outcomes and comparisons, Hybrid-IA finds slightly lower modularity values than Louvain but outperforms all other metaheuristics; at the same time, it detects communities that are more similar to the real ones than those detected by Louvain.
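The quantity that both Hybrid-IA and Louvain are described as maximizing is modularity. As a minimal Python sketch, the standard Newman modularity of a given partition of an undirected, unweighted graph can be computed as below. This is not the Hybrid-IA algorithm; the function name `modularity`, the toy graph and the community labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal edge share - expected share)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph (illustrative): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {node: "all" for node in range(6)}

q_split = modularity(edges, two_triangles)
q_single = modularity(edges, one_block)
print(f"two communities: Q = {q_split:.4f}")
print(f"one community:   Q = {q_single:.4f}")
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the behaviour a modularity-maximizing method exploits. It also hints at the limitation noted in the abstract: the score depends on the whole-graph edge count, so small communities can be merged when the graph is large.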
11
Nam Y, Jung SH, Yun JS, Sriram V, Singhal P, Byrska-Bishop M, Verma A, Shin H, Park WY, Won HH, Kim D. Discovering comorbid diseases using an inter-disease interactivity network based on biobank-scale PheWAS data. Bioinformatics 2023; 39:6960923. [PMID: 36571484 PMCID: PMC9825330 DOI: 10.1093/bioinformatics/btac822] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/02/2022] [Revised: 12/03/2022] [Accepted: 12/23/2022] [Indexed: 12/27/2022] Open
Abstract
MOTIVATION: Understanding comorbidity is essential for disease prevention, treatment and prognosis. In particular, insight into which pairs of diseases are likely or unlikely to co-occur may help elucidate the potential relationships between complex diseases. Here, we introduce the use of an inter-disease interactivity network to discover/prioritize comorbidities. Specifically, we determine disease associations by accounting for the direction of effects of genetic components shared between diseases, and categorize those associations as synergistic or antagonistic. We further develop a comorbidity scoring algorithm to predict whether diseases are more or less likely to co-occur in the presence of a given index disease. This algorithm can handle networks that incorporate relationships with opposite signs.
RESULTS: We finally investigate inter-disease associations among 427 phenotypes in UK Biobank PheWAS data and predict the priority of comorbid diseases. The predicted comorbidities were verified using the UK Biobank inpatient electronic health records. Our findings demonstrate that considering the interaction of phenotype associations might be helpful in better predicting comorbidity.
AVAILABILITY AND IMPLEMENTATION: The source code and data of this study are available at https://github.com/dokyoonkimlab/DiseaseInteractiveNetwork.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jae-Seung Yun
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Division of Endocrinology and Metabolism, Department of Internal Medicine, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Seoul 06591, Republic of Korea
- Vivek Sriram
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Pankhuri Singhal
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Anurag Verma
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hyunjung Shin
- Department of Artificial Intelligence, Ajou University, Suwon 16499, Republic of Korea
- Woong-Yang Park
- Samsung Genome Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- Dokyoon Kim
- To whom correspondence should be addressed.
12
Demirbilek O, Rekik I. Predicting the evolution trajectory of population-driven connectional brain templates using recurrent multigraph neural networks. Med Image Anal 2023; 83:102649. [PMID: 36257134 DOI: 10.1016/j.media.2022.102649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 09/28/2022] [Accepted: 09/30/2022] [Indexed: 11/05/2022]
Abstract
The mapping of the time-dependent evolution of the human brain connectivity using longitudinal and multimodal neuroimaging datasets provides insights into the development of neurological disorders and the way they alter the brain morphology, structure and function over time. Recently, the connectional brain template (CBT) was introduced as a compact representation integrating a population of brain multigraphs, where two brain regions can have multiple connections, into a single graph. Given a population of brain multigraphs observed at a baseline timepoint t1, we aim to learn how to predict the evolution of the population CBT at follow-up timepoints t>t1. Such model will allow us to foresee the evolution of the connectivity patterns of healthy and disordered individuals at the population level. Here we present recurrent multigraph integrator network (ReMI-Net⋆) to forecast population templates at consecutive timepoints from a given single timepoint. In particular, we unprecedentedly design a graph neural network architecture to model the changes in the brain multigraph and identify the biomarkers that differentiate between the typical and atypical populations. Addressing such issues is of paramount importance in diagnosing neurodegenerative disorders at early stages and promoting new clinical studies based on the pinned-down biomarker brain regions or connectivities. In this paper, we demonstrate the design and use of the ReMI-Net⋆ model, which learns both the multigraph node level and time level dependencies concurrently. Thanks to its novel graph convolutional design and normalization layers, ReMI-Net⋆ predicts well-centered, discriminative, and topologically sound connectional templates over time. Additionally, the results show that our model outperforms all benchmarks and state-of-the-art methods by comparing and discovering the atypical connectivity alterations over time. Our ReMI-Net⋆ code is available on GitHub at https://github.com/basiralab/ReMI-Net-Star.
Affiliation(s)
- Oytun Demirbilek
- BASIRA lab, Faculty of Computer and Informatics, Istanbul Technical University, Istanbul, Turkey
- Islem Rekik
- BASIRA lab, Faculty of Computer and Informatics, Istanbul Technical University, Istanbul, Turkey; School of Science and Engineering, Computing, University of Dundee, UK.
13
Yang X, Xu W, Leng D, Wen Y, Wu L, Li R, Huang J, Bo X, He S. Exploring novel disease-disease associations based on multi-view fusion network. Comput Struct Biotechnol J 2023; 21:1807-1819. [PMID: 36923471 PMCID: PMC10009443 DOI: 10.1016/j.csbj.2023.02.038] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/02/2022] [Revised: 02/02/2023] [Accepted: 02/22/2023] [Indexed: 03/06/2023] Open
Abstract
Established taxonomy systems based on disease symptoms and tissue characteristics have provided an important basis for physicians to correctly identify diseases and treat them successfully. However, these classifications tend to be based on phenotypic observations and lack a molecular biological foundation. There is therefore an urgent need to integrate multi-dimensional molecular biological information or multi-omics data to redefine disease classification and provide a powerful perspective for understanding the molecular structure of diseases. We offer a flexible disease classification that integrates the biological process, gene expression, and symptom phenotype of diseases, and propose a disease-disease association network based on multi-view fusion. We applied the fusion approach to 223 diseases and divided them into 24 disease clusters. The contributions of internal and external edges of the disease clusters were analyzed. The results of the fusion model were compared with Medical Subject Headings, a traditional and commonly used disease taxonomy. Experimental comparisons of model performance show that our approach performs better than other integration methods. The obtained clusters provided more interesting and novel disease-disease associations. This multi-view human disease association network describes relationships between diseases based on multiple molecular levels, thus breaking through the limitation of the disease classification system based on tissues and organs. This approach, which motivates clinicians and researchers to reposition the understanding of diseases and explore diagnosis and therapy strategies, extends the existing disease taxonomy. Availability of data and materials: The preprocessed dataset and source code supporting the conclusions of this article are available at the GitHub repository https://github.com/yangxiaoxi89/mvHDN.
Affiliation(s)
- Xiaoxi Yang
- Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Wenjian Xu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China
- MOE Key Laboratory of Major Diseases in Children, Beijing 100045, China
- Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, Beijing 100045, China
- Dongjin Leng
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yuqi Wen
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Lianlian Wu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Ruijiang Li
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Jian Huang
- Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
- Xiaochen Bo
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Song He
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
14
Emmert-Streib F. Severe testing with high-dimensional omics data for enhancing biomedical scientific discovery. NPJ Syst Biol Appl 2022; 8:40. [PMID: 36271093 PMCID: PMC9587237 DOI: 10.1038/s41540-022-00251-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 09/27/2022] [Indexed: 12/02/2022] Open
Abstract
High-throughput omics experiments provide a wealth of data for exploring biomedical questions and for advancing translational research. However, despite this great potential, results that enter the clinical practice are scarce even twenty years after the completion of the human genome project. For this reason in this paper, we revisit problems with scientific discovery commonly summarized under the term reproducibility crisis. We will argue that the major problem that hampers progress in translational research is threefold. First, in order to establish biological foundations of disorders or general complex phenotypes, one needs to embrace emergence. Second, there seems to be confusion about the underlying hypotheses tested by omics studies. Third, most contemporary omics studies are designed to perform what can be seen as incremental corroborations of a hypothesis. In order to improve upon these shortcomings, we define a severe testing framework (STF) that can be applied to a large number of omics studies for enhancing scientific discovery in the biomedical sciences. Briefly, STF provides systematic means to trim wild-grown omics studies in a constructive way.
Affiliation(s)
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland.
15
García-Sancha N, Corchado-Cobos R, Gómez-Vecino A, Jiménez-Navas A, Pérez-Baena MJ, Blanco-Gómez A, Holgado-Madruga M, Mao JH, Cañueto J, Castillo-Lluva S, Mendiburu-Eliçabe M, Pérez-Losada J. Evolutionary Origins of Metabolic Reprogramming in Cancer. Int J Mol Sci 2022; 23:ijms232012063. [PMID: 36292921 PMCID: PMC9603151 DOI: 10.3390/ijms232012063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 09/29/2022] [Accepted: 10/06/2022] [Indexed: 11/23/2022] Open
Abstract
Metabolic changes that facilitate tumor growth are one of the hallmarks of cancer. These changes are not specific to tumors but also take place during the physiological growth of tissues. Indeed, the cellular and tissue mechanisms present in the tumor have their physiological counterpart in the repair of tissue lesions and wound healing. These molecular mechanisms have been acquired during metazoan evolution, first to eliminate the infection of the tissue injury, then to enter an effective regenerative phase. Cancer itself could be considered a phenomenon of antagonistic pleiotropy of the genes involved in effective tissue repair. Cancer and tissue repair are complex traits that share many intermediate phenotypes at the molecular, cellular, and tissue levels, and all of these are integrated within a Systems Biology structure. Complex traits are influenced by a multitude of common genes, each with a weak effect. This polygenic component of complex traits is mainly unknown and so makes up part of the missing heritability. Here, we try to integrate these different perspectives from the point of view of the metabolic changes observed in cancer.
Affiliation(s)
- Natalia García-Sancha
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Roberto Corchado-Cobos
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Aurora Gómez-Vecino
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Alejandro Jiménez-Navas
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Manuel Jesús Pérez-Baena
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Adrián Blanco-Gómez
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Marina Holgado-Madruga
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Departamento de Fisiología y Farmacología, Universidad de Salamanca, 37007 Salamanca, Spain
- Instituto de Neurociencias de Castilla y León (INCyL), 37007 Salamanca, Spain
- Jian-Hua Mao
- Lawrence Berkeley National Laboratory, Biological Systems and Engineering Division, Berkeley, CA 94720, USA
- Berkeley Biomedical Data Science Center, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Javier Cañueto
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Departamento de Dermatología, Hospital Universitario de Salamanca, Paseo de San Vicente 58-182, 37007 Salamanca, Spain
- Sonia Castillo-Lluva
- Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Químicas, Universidad Complutense, 28040 Madrid, Spain
- Instituto de Investigaciones Sanitarias San Carlos (IdISSC), 28040 Madrid, Spain
- Marina Mendiburu-Eliçabe
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Correspondence: (M.M.-E.); (J.P.-L.)
- Jesús Pérez-Losada
- Instituto de Biología Molecular y Celular del Cáncer (IBMCC-CIC), Universidad de Salamanca/CSIC, 37007 Salamanca, Spain
- Instituto de Investigación Biosanitaria de Salamanca (IBSAL), 37007 Salamanca, Spain
- Correspondence: (M.M.-E.); (J.P.-L.)
16
Ferolito B, do Valle IF, Gerlovin H, Costa L, Casas JP, Gaziano JM, Gagnon DR, Begoli E, Barabási AL, Cho K. Visualizing novel connections and genetic similarities across diseases using a network-medicine based approach. Sci Rep 2022; 12:14914. [PMID: 36050444 PMCID: PMC9436158 DOI: 10.1038/s41598-022-19244-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Accepted: 08/26/2022] [Indexed: 11/08/2022] Open
Abstract
Understanding the genetic relationships between human disorders could lead to better treatment and prevention strategies, especially for individuals with multiple comorbidities. A common resource for studying genetic-disease relationships is the GWAS Catalog, a large and well curated repository of SNP-trait associations from various studies and populations. Some of these populations are contained within mega-biobanks such as the Million Veteran Program (MVP), which has enabled the genetic classification of several diseases in a large well-characterized and heterogeneous population. Here we aim to provide a network of the genetic relationships among diseases and to demonstrate the utility of quantifying the extent to which a given resource such as MVP has contributed to the discovery of such relations. We use a network-based approach to evaluate shared variants among thousands of traits in the GWAS Catalog repository. Our results indicate many more novel disease relationships that did not exist in early studies and demonstrate that the network can reveal clusters of diseases mechanistically related. Finally, we show novel disease connections that emerge when MVP data is included, highlighting methodology that can be used to indicate the contributions of a given biobank.
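The idea of linking traits through shared variants, as summarized in this abstract, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `build_shared_variant_network`, the `min_shared` threshold, the Jaccard edge weight, and the toy trait and SNP identifiers are all assumptions introduced here for illustration.

```python
from itertools import combinations


def build_shared_variant_network(trait_to_snps, min_shared=1):
    """Link every pair of traits that shares at least `min_shared` variants.

    Returns a dict mapping (trait_a, trait_b) to the Jaccard index of the
    two variant sets. The Jaccard weight is an illustrative choice only.
    """
    edges = {}
    for a, b in combinations(sorted(trait_to_snps), 2):
        shared = trait_to_snps[a] & trait_to_snps[b]
        if len(shared) >= min_shared:
            union = trait_to_snps[a] | trait_to_snps[b]
            edges[(a, b)] = len(shared) / len(union)
    return edges


# Toy, made-up SNP-trait associations (hypothetical, not GWAS Catalog data).
toy_associations = {
    "trait_A": {"rs1", "rs2", "rs3"},
    "trait_B": {"rs2", "rs3", "rs4"},
    "trait_C": {"rs9"},
}

network = build_shared_variant_network(toy_associations)
```

In this toy input only `trait_A` and `trait_B` overlap, so the resulting network has a single edge, while `trait_C` stays isolated. Raising `min_shared` would be one simple way to require stronger genetic overlap before drawing an edge.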
Affiliation(s)
- Brian Ferolito
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA.
- Italo Faria do Valle
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- Center for Complex Network Research, Department of Physics, Northeastern University, Boston, 02115, USA
- Hanna Gerlovin
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- Lauren Costa
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- Juan P Casas
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- Brigham and Women's Hospital, Division of Aging, Department of Medicine, Harvard Medical School, Boston, 02115, USA
- J Michael Gaziano
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- Brigham and Women's Hospital, Division of Aging, Department of Medicine, Harvard Medical School, Boston, 02115, USA
- David R Gagnon
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- School of Public Health, Department of Biostatistics, Boston University, Boston, 02215, USA
- Edmon Begoli
- Oak Ridge National Laboratory, Oak Ridge, 37830, USA
- Albert-László Barabási
- Center for Complex Network Research, Department of Physics, Northeastern University, Boston, 02115, USA
- Kelly Cho
- VA Boston Healthcare System, Massachusetts Veterans Epidemiology and Research Information Center, (MAVERIC), 150 S. Huntington Avenue, Boston, 02130, USA
- Brigham and Women's Hospital, Division of Aging, Department of Medicine, Harvard Medical School, Boston, 02115, USA
17
Astore C, Zhou H, Ilkowski B, Forness J, Skolnick J. LeMeDISCO is a computational method for large-scale prediction & molecular interpretation of disease comorbidity. Commun Biol 2022; 5:870. [PMID: 36008469 PMCID: PMC9411158 DOI: 10.1038/s42003-022-03816-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Accepted: 08/08/2022] [Indexed: 11/09/2022] Open
Abstract
To understand the origin of disease comorbidity and to identify the essential proteins and pathways underlying comorbid diseases, we developed LeMeDISCO (Large-Scale Molecular Interpretation of Disease Comorbidity), an algorithm that predicts disease comorbidities from shared mode of action proteins predicted by the artificial intelligence-based MEDICASCY algorithm. LeMeDISCO was applied to predict the occurrence of comorbid diseases for 3608 distinct diseases. Benchmarking shows that LeMeDISCO has much better comorbidity recall than the two molecular methods XD-score (44.5% vs. 6.4%) and the SAB score (68.6% vs. 8.0%). Its performance is somewhat comparable to the phenotype method-based Symptom Similarity Score, 63.7% vs. 100%, but LeMeDISCO works for far more cases and its large comorbidity recall is attributed to shared proteins that can help provide an understanding of the molecular mechanism(s) underlying disease comorbidity. The LeMeDISCO web server is available for academic users at: http://sites.gatech.edu/cssb/LeMeDISCO .
Affiliation(s)
- Courtney Astore
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Bartosz Ilkowski
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Jessica Forness
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332, USA.
18
Dong G, Zhang ZC, Feng J, Zhao XM. MorbidGCN: prediction of multimorbidity with a graph convolutional network based on integration of population phenotypes and disease network. Brief Bioinform 2022; 23:6627601. [PMID: 35780382 DOI: 10.1093/bib/bbac255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/14/2022] [Revised: 05/17/2022] [Accepted: 06/01/2022] [Indexed: 02/06/2023] Open
Abstract
Exploring multimorbidity relationships among diseases is of great importance for understanding their shared mechanisms, precise diagnosis and treatment. However, the landscape of multimorbidities is still far from complete due to the complex nature of multimorbidity. Although various types of biological data, such as biomolecules and clinical symptoms, have been used to identify multimorbidities, the population phenotype information (e.g. physical activity and diet) remains less explored for multimorbidity. Here, we present a graph convolutional network (GCN) model, named MorbidGCN, for multimorbidity prediction by integrating population phenotypes and disease network. Specifically, MorbidGCN treats the multimorbidity prediction as a missing link prediction problem in the disease network, where a novel feature selection method is embedded to select important phenotypes. Benchmarking results on two large-scale multimorbidity data sets, i.e. the UK Biobank (UKB) and Human Disease Network (HuDiNe) data sets, demonstrate that MorbidGCN outperforms other competitive methods. With MorbidGCN, 9742 and 14 010 novel multimorbidities are identified in the UKB and HuDiNe data sets, respectively. Moreover, we notice that the selected phenotypes that are generally differentially distributed between multimorbidity patients and single-disease patients can help interpret multimorbidities and show potential for prognosis of multimorbidities.
Affiliation(s)
- Guiying Dong
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Zi-Chao Zhang
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Jianfeng Feng
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433, China
- Xing-Ming Zhao
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433, China
19
Li X, Xiang J, Wu FX, Li M. A Dual Ranking Algorithm Based on the Multiplex Network for Heterogeneous Complex Disease Analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1993-2002. [PMID: 33577455 DOI: 10.1109/tcbb.2021.3059046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Identifying biomarkers of heterogeneous complex diseases has always been one of the focuses in medical research. In previous studies, the powerful network propagation methods have been applied to finding marker genes related to specific diseases, but existing methods are mostly based on a single network, which may be greatly affected by the incompleteness of the network and the ignorance of a large amount of information about physical and functional interactions between biological components. Other methods that directly integrate multiple types of interactions into an aggregate network have the risks that different types of data may conflict with each other and the characteristics and topologies of each individual network are lost. Meanwhile, biomarkers used in clinical trials should have the characteristics of small quantity and strong discriminate ability. In this study, we developed a multiplex network-based dual ranking framework (DualRank) for heterogeneous complex disease analysis. We applied the proposed method to heterogeneous complex diseases for diagnosis, prognosis, and classification. The results showed that DualRank outperformed competing methods and could identify biomarkers with the small quantity, great prediction performance (average AUC = 0.818) and biological interpretability.
20
Wang Y, Juan L, Peng J, Wang T, Zang T, Wang Y. Explore potential disease related metabolites based on latent factor model. BMC Genomics 2022; 23:269. [PMID: 35387615 PMCID: PMC8985251 DOI: 10.1186/s12864-022-08504-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/24/2022] [Accepted: 03/25/2022] [Indexed: 11/17/2022] Open
Abstract
Background In biological systems, metabolomics can not only contribute to the discovery of metabolic signatures for disease diagnosis, but is very helpful to illustrate the underlying molecular disease-causing mechanism. Therefore, identification of disease-related metabolites is of great significance for comprehensively understanding the pathogenesis of diseases and improving clinical medicine. Results In the paper, we propose a disease and literature driven metabolism prediction model (DLMPM) to identify the potential associations between metabolites and diseases based on latent factor model. We build the disease glossary with disease terms from different databases and an association matrix based on the mapping between diseases and metabolites. The similarity of diseases and metabolites is used to complete the association matrix. Finally, we predict potential associations between metabolites and diseases based on the matrix decomposition method. In total, 1,406 direct associations between diseases and metabolites are found. There are 119,206 unknown associations between diseases and metabolites predicted with a coverage rate of 80.88%. Subsequently, we extract training sets and testing sets based on data increment from the database of disease-related metabolites and assess the performance of DLMPM on 19 diseases. As a result, DLMPM is proven to be successful in predicting potential metabolic signatures for human diseases with an average AUC value of 82.33%. Conclusion In this paper, a computational model is proposed for exploring metabolite-disease pairs and has good performance in predicting potential metabolites related to diseases through adequate validation. The results show that DLMPM has a better performance in prioritizing candidate diseases-related metabolites compared with the previous methods and would be helpful for researchers to reveal more information about human diseases.
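The matrix decomposition step mentioned in this abstract can be illustrated with a generic latent factor sketch in plain Python. It is not the DLMPM implementation: the function names `factorize` and `score`, the rank, learning rate, regularization, epoch count, and the toy disease-by-metabolite matrix are all assumptions chosen for illustration, and plain stochastic gradient descent stands in for whatever optimizer the authors used.

```python
import random


def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=3000, seed=0):
    """Fit row factors P and column factors Q so that P[i] . Q[j] approximates
    each observed cell, using stochastic gradient descent with L2 shrinkage."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    cells = sorted(observed)
    for _ in range(epochs):
        for i, j in cells:
            pred = sum(P[i][f] * Q[j][f] for f in range(rank))
            err = observed[(i, j)] - pred
            for f in range(rank):
                p, q = P[i][f], Q[j][f]
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q


def score(P, Q, i, j):
    """Predicted association strength between row i and column j."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


# Toy 4 x 3 disease-by-metabolite matrix (hypothetical values).
# Cells (1, 2) and (3, 1) are treated as unknown and left out of training.
observed = {
    (0, 0): 1.0, (0, 1): 1.0, (0, 2): 0.0,
    (1, 0): 1.0, (1, 1): 1.0,
    (2, 0): 0.0, (2, 1): 0.0, (2, 2): 1.0,
    (3, 0): 0.0, (3, 2): 1.0,
}

P, Q = factorize(observed, n_rows=4, n_cols=3)
held_out_scores = {cell: score(P, Q, *cell) for cell in [(1, 2), (3, 1)]}
```

The scores in `held_out_scores` play the role of predicted associations for pairs with no recorded label. In a realistic setting the candidate pairs would be ranked by such scores and checked against held-out data, for example with AUC as the abstract reports.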
Affiliation(s)
- Yongtian Wang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Key Laboratory of Big Data Storage and Management Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an, China
- Liran Juan
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Key Laboratory of Big Data Storage and Management Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an, China
- Tao Wang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Key Laboratory of Big Data Storage and Management Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an, China
- Tianyi Zang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
- Yadong Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
21
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022] Open
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, a lot of computational methods and tools/platforms have been developed to reveal intrinsic relationship between diseases, which can provide useful insights to the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatment of diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of the usages of biomedical data, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview for current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, China
| | - Jiashuai Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Yichao Zhao
- School of Computer Science and Engineering, Central South University, China
| | - Fang-Xiang Wu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Min Li
- Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
| |
|
22
|
Convalescing the Process of Ranking Metabolites for Diseases using Subcellular Localization. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-021-06023-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
23
|
López-Sánchez M, Loucera C, Peña-Chilet M, Dopazo J. Discovering potential interactions between rare diseases and COVID-19 by combining mechanistic models of viral infection with statistical modeling. Hum Mol Genet 2022; 31:2078-2089. [PMID: 35022696 PMCID: PMC9239744 DOI: 10.1093/hmg/ddac007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Revised: 12/30/2021] [Accepted: 01/10/2022] [Indexed: 11/28/2022] Open
Abstract
Recent studies have demonstrated a relevant role of the host genetics in the coronavirus disease 2019 (COVID-19) prognosis. Most of the 7000 rare diseases described to date have a genetic component, typically highly penetrant. However, this vast spectrum of genetic variability remains yet unexplored with respect to possible interactions with COVID-19. Here, a mathematical mechanistic model of the COVID-19 molecular disease mechanism has been used to detect potential interactions between rare disease genes and the COVID-19 infection process and downstream consequences. Out of the 2518 disease genes analyzed, causative of 3854 rare diseases, a total of 254 genes have a direct effect on the COVID-19 molecular disease mechanism and 207 have an indirect effect revealed by a significant strong correlation. This remarkable potential of interaction occurs for >300 rare diseases. Mechanistic modeling of COVID-19 disease map has allowed a holistic systematic analysis of the potential interactions between the loss of function in known rare disease genes and the pathological consequences of COVID-19 infection. The results identify links between disease genes and COVID-19 hallmarks and demonstrate the usefulness of the proposed approach for future preventive measures in some rare diseases.
Affiliation(s)
- Macarena López-Sánchez
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio. 41013. Sevilla. Spain.,Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio. 41013. Sevilla. Spain
| | - Carlos Loucera
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio. 41013. Sevilla. Spain.,Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio. 41013. Sevilla. Spain
| | - María Peña-Chilet
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio. 41013. Sevilla. Spain.,Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio. 41013. Sevilla. Spain.,Bioinformatics in Rare Diseases (BiER). Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío. 41013. Sevilla, Spain
| | - Joaquín Dopazo
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio. 41013. Sevilla. Spain.,Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio. 41013. Sevilla. Spain.,Bioinformatics in Rare Diseases (BiER). Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío. 41013. Sevilla, Spain.,FPS/ELIXIR-es, Hospital Virgen del Rocío, Sevilla, 42013, Spain
| |
|
24
|
Nam Y, Jung SH, Verma A, Sriram V, Won HH, Yun JS, Kim D. netCRS: Network-based comorbidity risk score for prediction of myocardial infarction using biobank-scaled PheWAS data. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2022; 27:325-336. [PMID: 34890160 PMCID: PMC8682919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
The polygenic risk score (PRS) can help to identify individuals' genetic susceptibility for various diseases by combining patient genetic profiles and identified single-nucleotide polymorphisms (SNPs) from genome-wide association studies. Although multiple diseases will usually afflict patients at once or in succession, conventional PRSs fail to consider genetic relationships across multiple diseases. Even multi-trait PRSs, which take into account genetic effects for more than one disease at a time, fail to consider a sufficient number of phenotypes to accurately reflect the state of disease comorbidity in a patient, or are biased in terms of the traits that are selected. Thus, we developed novel network-based comorbidity risk scores to quantify associations among multiple phenotypes from phenome-wide association studies (PheWAS). We first constructed a disease-SNP heterogeneous multi-layered network (DS-Net), which consists of a disease network (disease-layer) and SNP network (SNP-layer). The disease-layer describes the population-level interactome from PheWAS data. The SNP-layer was constructed according to linkage disequilibrium. Both layers were attached to transform the information from a population-level interactome to individual-level inferences. Then, graph-based semi-supervised learning was applied to predict possible comorbidity scores on the disease-layer for each subject. The SNP-layer receives individual genotyping data in the scoring process, and the disease-layer provides the propagated output for an individual's multiple disease comorbidity scores. The possible comorbidity scores were combined by logistic regression, and the result is denoted netCRS. The DS-Net was constructed from UK Biobank PheWAS data, and the individual genetic profiles were collected from the Penn Medicine Biobank. As a proof-of-concept study, myocardial infarction (MI) was selected to compare netCRS with the PRS with pruning and thresholding (PRS-PT). The combined model (netCRS + PRS-PT + covariates) achieved an AUC improvement of 6.26% compared to the (PRS-PT + covariates) model. In terms of risk stratification, the combined model was able to capture the risk of MI up to approximately eight-fold higher than that of the low-risk group. The netCRS and PRS-PT complement each other in predicting high-risk groups of patients with MI. We expect that using these risk prediction models will allow for the development of prevention strategies and reduction of MI morbidity and mortality.
Affiliation(s)
- Yonghyun Nam
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Sang-Hyuk Jung
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Samsung Advanced Institute for Health Sciences and Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul 06351, Republic of Korea
| | - Anurag Verma
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Vivek Sriram
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Hong-Hee Won
- Samsung Advanced Institute for Health Sciences and Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul 06351, Republic of Korea
| | - Jae-Seung Yun
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Division of Endocrinology and Metabolism, Department of Internal Medicine, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Seoul 06591, Republic of Korea
| | - Dokyoon Kim
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
| |
|
25
|
Ma Y, Ma Y. Hypergraph-based logistic matrix factorization for metabolite-disease interaction prediction. Bioinformatics 2022; 38:435-443. [PMID: 34499104 DOI: 10.1093/bioinformatics/btab652] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/08/2021] [Revised: 08/08/2021] [Accepted: 09/06/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Function-related metabolites, the terminal products of the cell regulation, show a close association with complex diseases. The identification of disease-related metabolites is critical to the diagnosis, prevention and treatment of diseases. However, most existing computational approaches build networks by calculating pairwise relationships, which is inappropriate for mining higher-order relationships. RESULTS In this study, we presented a novel approach with hypergraph-based logistic matrix factorization, HGLMF, to predict the potential interactions between metabolites and disease. First, the molecular structures and gene associations of metabolites and the hierarchical structures and GO functional annotations of diseases were extracted to build various similarity measures of metabolites and diseases. Next, the kernel neighborhood similarity of metabolites (or diseases) was calculated according to the completed interactive network. Second, multiple networks of metabolites and diseases were fused, respectively, and the hypergraph structures of metabolites and diseases were built. Finally, a logistic matrix factorization based on hypergraph was proposed to predict potential metabolite-disease interactions. In computational experiments, HGLMF accurately predicted the metabolite-disease interaction, and performed better than other state-of-the-art methods. Moreover, HGLMF could be used to predict new metabolites (or diseases). As suggested from the case studies, the proposed method could discover novel disease-related metabolites, which has been confirmed in existing studies. AVAILABILITY AND IMPLEMENTATION The codes and dataset are available at: https://github.com/Mayingjun20179/HGLMF. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yingjun Ma
- School of Applied Mathematics, Xiamen University of Technology, Xiamen 361024, China
| | - Yuanyuan Ma
- School of Computer & Information Engineering, Anyang Normal University, Anyang 455000, China
| |
|
26
|
Redhu N, Thakur Z. Network biology and applications. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00024-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
|
27
|
Li X, Xiang J, Wang J, Li J, Wu FX, Li M. FUNMarker: Fusion Network-Based Method to Identify Prognostic and Heterogeneous Breast Cancer Biomarkers. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2483-2491. [PMID: 32070993 DOI: 10.1109/tcbb.2020.2973148] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Breast cancer is a heterogeneous disease with many clinically distinguishable molecular subtypes, each corresponding to a cluster of patients. Identifying prognostic and heterogeneous biomarkers for breast cancer means detecting cluster-specific gene biomarkers that can be used for accurate survival prediction of breast cancer outcomes. In this study, we proposed a FUsion Network-based method (FUNMarker) to identify prognostic and heterogeneous breast cancer biomarkers by considering the heterogeneity of patient samples and biological information from multiple sources. To reduce the effect of patient heterogeneity, samples were first clustered using the K-means algorithm based on the principal components of gene expression. For each cluster, to comprehensively evaluate the influence of genes on breast cancer, genes were weighted from three aspects: biological function, prognostic ability and correlation with known disease genes. Then they were ranked via a label propagation model on a fusion network that combined physical protein interactions from seven types of networks and thus could reduce the impact of the incompleteness of the interactome. We compared FUNMarker with three state-of-the-art methods, and the results showed that biomarkers identified by FUNMarker were biologically interpretable and had stronger discriminative power than the existing methods in differentiating patients with different prognostic outcomes.
|
28
|
Ma L, Wang S, Lin Q, Li J, You Z, Huang J, Gong M. Multi-Neighborhood Learning for Global Alignment in Biological Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2598-2611. [PMID: 32305933 DOI: 10.1109/tcbb.2020.2985838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
The global alignment of biological networks (GABN) aims to find an optimal alignment between proteins across species, such that both the biological structures and the topological structures of the proteins are maximally conserved. Research on GABN has attracted great attention due to its applications in species evolution, orthology detection and genetic analyses. Most existing methods for GABN struggle to achieve a good tradeoff between the conservation of the biological structures and topological structures. In this paper, we propose a multi-neighborhood learning method for solving GABN (called CLMNA). CLMNA first models GABN as an optimization of a weighted similarity which evaluates the conserved biological and topological similarities of an alignment, and then it combines a first-proximity, second-proximity and individual-aware proximity learning algorithm to solve the modeled problem. Finally, systematic experiments on 10 pairs of biological networks across 5 species show the superiority of CLMNA over the state-of-the-art network alignment algorithms. They also validate the effectiveness of CLMNA as a refinement method for improving the performance of the compared algorithms.
|
29
|
Liu J, Zhu H, Qiu J. Locally Adjust Networks Based on Connectivity and Semantic Similarities for Disease Module Detection. Front Genet 2021; 12:726596. [PMID: 34759955 PMCID: PMC8575408 DOI: 10.3389/fgene.2021.726596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2021] [Accepted: 09/22/2021] [Indexed: 11/13/2022] Open
Abstract
For studying the pathogenesis of complex diseases, it is important to identify disease modules at the system level. Since protein-protein interaction (PPI) networks contain many missing and incorrect interactions, most existing methods leave many disease proteins isolated from disease modules. In this paper, we propose an effective disease module identification method, IDMCSS, in which the human PPI network is obtained by adding potential missing interactions to existing PPI networks and removing potential incorrect interactions. In IDMCSS, a network adjustment strategy is developed to add or remove links around disease proteins based on both topological and semantic information. Next, neighboring proteins of disease proteins are prioritized according to a suggested similarity between each of them and the disease proteins, and the protein with the largest similarity to the disease proteins is added to a candidate disease protein set, one at a time. The stopping criterion is set to the boundary of the disease proteins. Finally, the connected subnetwork having the largest number of disease proteins is selected as a disease module. Experimental results on asthma demonstrate the effectiveness of the method in comparison to existing algorithms for disease module identification. It is also shown that the proposed IDMCSS can obtain disease modules covering crucial biological processes of asthma, and that 12 targets for drug intervention can be predicted.
Affiliation(s)
- Jia Liu
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
| | - Huole Zhu
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei, China
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Artificial Intelligence, Anhui University, Hefei, China
| | - Jianfeng Qiu
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei, China
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Artificial Intelligence, Anhui University, Hefei, China
| |
|
30
|
Rosário-Ferreira N, Guimarães V, Costa VS, Moreira IS. SicknessMiner: a deep-learning-driven text-mining tool to abridge disease-disease associations. BMC Bioinformatics 2021; 22:482. [PMID: 34607568 PMCID: PMC8491382 DOI: 10.1186/s12859-021-04397-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2021] [Accepted: 09/24/2021] [Indexed: 12/24/2022] Open
Abstract
Background: Blood cancers (BCs) are responsible for over 720 K yearly deaths worldwide. Their prevalence and mortality-rate uphold the relevance of research related to BCs. Despite the availability of different resources establishing Disease-Disease Associations (DDAs), the knowledge is scattered and not accessible in a straightforward way to the scientific community. Here, we propose SicknessMiner, a biomedical Text-Mining (TM) approach towards the centralization of DDAs. Our methodology encompasses Named Entity Recognition (NER) and Named Entity Normalization (NEN) steps, and the DDAs retrieved were compared to the DisGeNET resource for qualitative and quantitative comparison. Results: We obtained the DDAs via co-mention using our SicknessMiner or gene- or variant-disease similarity on DisGeNET. SicknessMiner was able to retrieve around 92% of the DisGeNET results and nearly 15% of the SicknessMiner results were specific to our pipeline. Conclusions: SicknessMiner is a valuable tool to extract disease-disease relationships from a raw input corpus. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04397-w.
Affiliation(s)
- Nícia Rosário-Ferreira
- CQC - Coimbra Chemistry Center, Chemistry Department, Faculty of Science and Technology, University of Coimbra, 3004-535, Coimbra, Portugal. .,CNC - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal.
| | - Victor Guimarães
- Department of Sciences, University of Porto, Porto, Portugal.,INESC-TEC - Centre of Advanced Computing Systems, Porto, Portugal
| | - Vítor S Costa
- Department of Sciences, University of Porto, Porto, Portugal.,INESC-TEC - Centre of Advanced Computing Systems, Porto, Portugal
| | - Irina S Moreira
- Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456, Coimbra, Portugal. .,CNC - Center for Neuroscience and Cell Biology, CIBB - Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal.
| |
|
31
|
Dong G, Feng J, Sun F, Chen J, Zhao XM. A global overview of genetically interpretable multimorbidities among common diseases in the UK Biobank. Genome Med 2021; 13:110. [PMID: 34225788 PMCID: PMC8258962 DOI: 10.1186/s13073-021-00927-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 03/12/2021] [Accepted: 06/22/2021] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Multimorbidities greatly increase the global health burdens, but the landscapes of their genetic risks have not been systematically investigated. METHODS We used the hospital inpatient data of 385,335 patients in the UK Biobank to investigate the multimorbid relations among 439 common diseases. Post-GWAS analyses were performed to identify multimorbidity shared genetic risks at the genomic loci, network, as well as overall genetic architecture levels. We conducted network decomposition for the networks of genetically interpretable multimorbidities to detect the hub diseases and the involved molecules and functions in each module. RESULTS In total, 11,285 multimorbidities among 439 common diseases were identified, and 46% of them were genetically interpretable at the loci, network, or overall genetic architecture levels. Multimorbidities affecting the same and different physiological systems displayed different patterns of the shared genetic components, with the former more likely to share loci-level genetic components while the latter more likely to share network-level genetic components. Moreover, both the loci- and network-level genetic components shared by multimorbidities converged on cell immunity, protein metabolism, and gene silencing. Furthermore, we found that the genetically interpretable multimorbidities tend to form network modules, mediated by hub diseases and featuring physiological categories. Finally, we showcased how hub diseases mediating the multimorbidity modules could help provide useful insights for the genetic contributors of multimorbidities. CONCLUSIONS Our results provide a systematic resource for understanding the genetic predispositions of multimorbidities and indicate that hub diseases and converged molecules and functions may be the key for treating multimorbidities. We have created an online database that lets researchers and physicians browse, search, or download these multimorbidities (https://multimorbidity.comp-sysbio.org).
Affiliation(s)
- Guiying Dong
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433 China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433 China
| | - Jianfeng Feng
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433 China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433 China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433 China
| | - Fengzhu Sun
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089 USA
| | - Jingqi Chen
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433 China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433 China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433 China
| | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, 200433 China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433 China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433 China
| |
|
32
|
Ozden F, Siper MC, Acarsoy N, Elmas T, Marty B, Qi X, Cicek AE. DORMAN: Database of Reconstructed MetAbolic Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1474-1480. [PMID: 31581093 DOI: 10.1109/tcbb.2019.2944905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Genome-scale reconstructed metabolic networks have provided an organism-specific understanding of cellular processes and their relations to phenotype. As they are deemed essential to study metabolism, the number of organisms with reconstructed metabolic networks continues to increase. This sustained research interest led to the development of online systems/repositories that store existing reconstructions and enable new model generation, integration, and constraint-based analyses. While features that support model reconstruction are widely available, current systems lack the means to help users who are interested in analyzing the topology of the reconstructed networks. Here, we present the Database of Reconstructed Metabolic Networks - DORMAN. DORMAN is a centralized online database that stores SBML-based reconstructed metabolic networks published in the literature, and provides web-based computational tools for visualizing and analyzing the model topology. Novel features of DORMAN are (i) an interactive visualization interface that allows rendering of the complete network as well as editing and exporting the model, (ii) hierarchical navigation that provides efficient access to connected entities in the model, (iii) a built-in query interface that allows posing topological queries, and (iv) a model comparison tool that enables comparing models with different nomenclatures, using approximate string matching. DORMAN is online and freely accessible at http://ciceklab.cs.bilkent.edu.tr/dorman.
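The model comparison feature described above rests on approximate string matching between nomenclatures. The abstract does not say which matcher DORMAN uses, so the following is only a minimal sketch of the general idea using Python's standard difflib; the metabolite names, the map_names helper and the 0.8 cutoff are illustrative assumptions, not taken from DORMAN.

```python
from difflib import get_close_matches

def map_names(names_a, names_b, cutoff=0.8):
    """Map each name in names_a to its closest name in names_b, or None.

    Hypothetical helper: matching is case-insensitive and uses difflib's
    similarity ratio, which is an assumption rather than DORMAN's method.
    """
    lowered = {name.lower(): name for name in names_b}
    mapping = {}
    for name in names_a:
        hits = get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
        mapping[name] = lowered[hits[0]] if hits else None
    return mapping

# Two toy models that label the same metabolites differently.
model_a = ["alpha-D-glucose", "Pyruvate", "ATP"]
model_b = ["a-D-glucose", "pyruvate", "NADH"]
mapping = map_names(model_a, model_b)
for source, target in mapping.items():
    print(source, "->", target)
```

A lower cutoff pairs more names at the cost of false matches, so in practice the threshold would need tuning against a curated mapping.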
|
33
|
Hossain MJ, Chowdhury UN, Islam MB, Uddin S, Ahmed MB, Quinn JMW, Moni MA. Machine learning and network-based models to identify genetic risk factors to the progression and survival of colorectal cancer. Comput Biol Med 2021; 135:104539. [PMID: 34153790 DOI: 10.1016/j.compbiomed.2021.104539] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/17/2021] [Revised: 05/12/2021] [Accepted: 05/26/2021] [Indexed: 01/04/2023]
Abstract
Colorectal cancer (CRC) is one of the most common and lethal malignant lesions. Determining how the identified risk factors drive the formation and development of CRC could be an essential means for effective therapeutic development. To this end, we investigated how the altered gene expression resulting from exposure to putative CRC risk factors contributes to prognostic biomarker identification. Differentially expressed genes (DEGs) were first identified for CRC and eight other risk factors. Gene set enrichment analysis (GSEA) through the molecular pathway and gene ontology (GO), as well as protein-protein interaction (PPI) network analysis, were then conducted to predict the functions of these DEGs. Our identified genes were explored through the dbGaP and OMIM databases to compare with the already identified and known prognostic CRC biomarkers. The survival time of CRC patients was also examined using a Cox Proportional Hazard regression-based prognostic model by integrating transcriptome data from The Cancer Genome Atlas (TCGA). In this study, PPI analysis identified 4 sub-networks and 8 hub genes that may be potential therapeutic targets, including CXCL8, ICAM1, SOD2, CXCL2, CCL20, OIP5, BUB1, ASPM and IL1RN. We also identified seven signature genes (PRR5.ARHGAP8, CA7, NEDD4L, GFR2, ARHGAP8, SMTN, OIP5) in independent analysis, among which PRR5.ARHGAP8 was found both in multivariate analyses and in analyses that combined gene expression and clinical information. This approach provides both mechanistic information and, when combined with predictive clinical information, good evidence that the identified genes are significant biomarkers of processes involved in CRC progression and survival.
Affiliation(s)
- Md Jakir Hossain
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
| | - Utpala Nanda Chowdhury
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
| | - M Babul Islam
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
| | - Shahadat Uddin
- Complex Systems Research Group & Project Management Program, Faculty of Engineering, The University of Sydney, NSW, 2006, Australia
| | - Mohammad Boshir Ahmed
- School of Material Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, 61005, Republic of Korea
| | - Julian M W Quinn
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia
| | - Mohammad Ali Moni
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia; WHO Collaborating Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, NSW, 2052, Australia.
| |
|
34
|
Shin EK, Choi HY, Hayes N. The anatomy of COVID-19 comorbidity networks among hospitalized Korean patients. Epidemiol Health 2021; 43:e2021035. [PMID: 33971700 PMCID: PMC8289479 DOI: 10.4178/epih.e2021035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/24/2021] [Accepted: 05/07/2021] [Indexed: 12/17/2022] Open
Abstract
OBJECTIVES We aimed to examine how comorbidities were associated with outcomes (illness severity or death) among hospitalized patients with coronavirus disease 2019 (COVID-19). METHODS Data were provided by the National Medical Center of the Korea Disease Control and Prevention Agency. These data included the clinical and epidemiological information of all patients hospitalized with COVID-19 who were discharged on or before April 30, 2020 in Korea. We conducted comorbidity network and multinomial logistic regression analyses to identify risk factors associated with COVID-19 disease severity and mortality. The outcome variable was the clinical severity score (CSS), categorized as mild (oxygen treatment not needed), severe (oxygen treatment needed), or death. RESULTS In total, 5,771 patients were included. In the fully adjusted model, chronic kidney disease (CKD) (odds ratio [OR], 2.58; 95% confidence interval [CI], 1.19 to 5.61) and chronic obstructive pulmonary disease (COPD) (OR, 3.19; 95% CI, 1.35 to 7.52) were significantly associated with disease severity. CKD (OR, 5.35; 95% CI, 2.00 to 14.31), heart failure (HF) (OR, 3.15; 95% CI, 1.22 to 8.15), malignancy (OR, 3.38; 95% CI, 1.59 to 7.17), dementia (OR, 2.62; 95% CI, 1.45 to 4.72), and diabetes mellitus (OR, 2.26; 95% CI, 1.46 to 3.49) were associated with an increased risk of death. Asthma and hypertension showed statistically insignificant associations with an increased risk of death. CONCLUSIONS Underlying diseases contribute differently to the severity of COVID-19. To efficiently allocate limited medical resources, underlying comorbidities should be closely monitored, particularly CKD, COPD, and HF.
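The odds ratios and confidence intervals quoted above can be sanity-checked on the log scale. The sketch below assumes the intervals are Wald-type, i.e. exp(log(OR) ± 1.96·SE), which is common for logistic regression output but is not stated in the abstract; the wald_check helper is hypothetical. Under that assumption the geometric mean of the two bounds should reproduce the reported OR, and the width of the interval gives the standard error of log(OR).

```python
import math

def wald_check(ci_low, ci_high, z=1.96):
    """Return (SE of log OR, OR implied by the interval bounds).

    Assumes a Wald-type interval, exp(log(OR) +/- z * SE); this is an
    assumption, since the abstract does not describe the interval method.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    # Bounds are symmetric around log(OR), so their geometric mean is the OR.
    implied_or = math.sqrt(ci_low * ci_high)
    return se, implied_or

# (reported OR, lower bound, upper bound) for disease severity, from the abstract.
reported = {
    "CKD": (2.58, 1.19, 5.61),
    "COPD": (3.19, 1.35, 7.52),
}
checks = {name: wald_check(low, high) for name, (_, low, high) in reported.items()}
for name, (se, implied_or) in checks.items():
    print(f"{name}: reported OR {reported[name][0]}, "
          f"implied OR {implied_or:.2f}, SE(log OR) {se:.3f}")
```

Agreement between the implied and reported values is consistent with Wald-type intervals but does not prove it.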
Affiliation(s)
- Hyo Young Choi
- Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
- Neil Hayes
- Department of Medicine, Division of Hematology and Oncology, University of Tennessee Health Science Center, Memphis, TN, USA
|
35
|
Xiang J, Zhang J, Zheng R, Li X, Li M. NIDM: network impulsive dynamics on multiplex biological network for disease-gene prediction. Brief Bioinform 2021; 22:6236070. [PMID: 33866352 DOI: 10.1093/bib/bbab080] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/12/2021] [Revised: 02/11/2021] [Accepted: 02/21/2021] [Indexed: 12/12/2022] Open
Abstract
Predicting disease-related genes is important for the study of diseases because biological experiments are costly and time-consuming. Network propagation is a popular strategy for disease-gene prediction. However, existing methods focus on the stable solution of dynamics while ignoring the useful information hidden in the dynamical process, and it is still a challenge to make use of multiple types of physical/functional relationships between proteins/genes to effectively predict disease-related genes. Therefore, we proposed a framework of network impulsive dynamics on multiplex biological network (NIDM) to predict disease-related genes, along with four variants of NIDM models and four kinds of impulsive dynamical signatures (IDSs). NIDM identifies disease-related genes by mining the dynamical responses of nodes to impulsive signals exerted at specific nodes. Through a series of experimental evaluations in various types of biological networks, we confirmed the advantage of multiplex networks and the important roles of functional associations in disease-gene prediction, demonstrated superior performance of NIDM compared with four types of network-based algorithms and then gave effective recommendations of NIDM models and IDS signatures. To facilitate the prioritization and analysis of (candidate) genes associated with specific diseases, we developed a user-friendly web server, which provides three kinds of filtering patterns for genes, network visualization, enrichment analysis and a wealth of external links (http://bioinformatics.csu.edu.cn/DGP/NID.jsp). NIDM is a protocol for disease-gene prediction integrating different types of biological networks, which may become a very useful computational tool for the study of disease-related genes.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, Hunan, China
- Jiashuai Zhang
- School of Computer Science and Engineering, Central South University, Hunan, China
- Ruiqing Zheng
- School of Computer Science and Engineering, Central South University, China
- Xingyi Li
- School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
- School of Computer Science and Engineering, Central South University, Changsha, China
|
36
|
Burak Gürbüz M, Rekik I. MGN-Net: A multi-view graph normalizer for integrating heterogeneous biological network populations. Med Image Anal 2021; 71:102059. [PMID: 33930831 DOI: 10.1016/j.media.2021.102059] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/22/2020] [Revised: 03/21/2021] [Accepted: 03/29/2021] [Indexed: 11/17/2022]
Abstract
With recent technological advances, biological datasets, often represented by networks (i.e., graphs) of interacting entities, proliferate with unprecedented complexity and heterogeneity. Although modern network science opens new frontiers of analyzing connectivity patterns in such datasets, we still lack data-driven methods for extracting an integral connectional fingerprint of a multi-view graph population, let alone disentangling the typical from the atypical variations across the population samples. We present the multi-view graph normalizer network (MGN-Net), a graph neural network-based method to normalize and integrate a set of multi-view biological networks into a single connectional template that is centered, representative, and topologically sound. We demonstrate the use of MGN-Net by discovering the connectional fingerprints of healthy and neurologically disordered brain network populations including Alzheimer's disease and autism spectrum disorder patients. Additionally, by comparing the learned templates of healthy and disordered populations, we show that MGN-Net significantly outperforms conventional network integration methods across extensive experiments in terms of producing the most centered templates, recapitulating unique traits of populations, and preserving the complex topology of biological networks. Our evaluations showed that MGN-Net is powerfully generic and easily adaptable in design to different graph-based problems such as identification of relevant connections, normalization and integration.
Affiliation(s)
- Mustafa Burak Gürbüz
- BASIRA Lab, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Istanbul, Turkey
- Islem Rekik
- BASIRA Lab, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Istanbul, Turkey; School of Science and Engineering, Computing, University of Dundee, UK. http://www.basira-lab.com/
|
37
|
Dikaios N. Deep learning magnetic resonance spectroscopy fingerprints of brain tumours using quantum mechanically synthesised data. NMR IN BIOMEDICINE 2021; 34:e4479. [PMID: 33448078 DOI: 10.1002/nbm.4479] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/17/2020] [Revised: 11/24/2020] [Accepted: 01/05/2021] [Indexed: 06/12/2023]
Abstract
Metabolic fingerprints are valuable biomarkers for diseases that are associated with metabolic disorders. 1H magnetic resonance spectroscopy (MRS) is a unique noninvasive diagnostic tool that can depict the metabolic fingerprint based solely on the proton signal of different molecules present in the tissue. However, its performance is severely hindered by low SNR, field inhomogeneities and overlapping spectra of metabolites, which affect the quantification of metabolites. Consequently, MRS is rarely included in routine clinical protocols and has not been proven in multi-institutional trials. This work proposes an alternative approach, where instead of quantifying metabolites' concentration, deep learning (DL) is used to model the complex nonlinear relationship between diseases and their spectroscopic metabolic fingerprint (pattern). DL requires large training datasets, acquired (ideally) with the same protocol/scanner, which are very rarely available. To overcome this limitation, a novel method is proposed that can quantum mechanically synthesise MRS data for any scanner/acquisition protocol. The proposed methodology is applied to the challenging clinical problem of differentiating metastasis from glioblastoma brain tumours on data acquired across multiple institutions. DL algorithms were trained on the augmented synthetic spectra and tested on two independent datasets acquired by different scanners, achieving a receiver operating characteristic area under the curve of up to 0.96 and 0.97, respectively.
Affiliation(s)
- Nikolaos Dikaios
- Mathematics Research Center, Academy of Athens, Athens, Greece
- Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
|
38
|
Guo Z, Fu Y, Huang C, Zheng C, Wu Z, Chen X, Gao S, Ma Y, Shahen M, Li Y, Tu P, Zhu J, Wang Z, Xiao W, Wang Y. NOGEA: A Network-oriented Gene Entropy Approach for Dissecting Disease Comorbidity and Drug Repositioning. GENOMICS, PROTEOMICS & BIOINFORMATICS 2021; 19:549-564. [PMID: 33744433 PMCID: PMC9040018 DOI: 10.1016/j.gpb.2020.06.023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/03/2019] [Revised: 04/04/2020] [Accepted: 09/24/2020] [Indexed: 10/31/2022]
Abstract
Rapid development of high-throughput technologies has permitted the identification of an increasing number of disease-associated genes (DAGs), which are important for understanding disease initiation and developing precision therapeutics. However, DAGs often contain large amounts of redundant or false positive information, leading to difficulties in quantifying and prioritizing potential relationships between these DAGs and human diseases. In this study, a network-oriented gene entropy approach (NOGEA) is proposed for accurately inferring master genes that contribute to specific diseases by quantitatively calculating their perturbation abilities on directed disease-specific gene networks. In addition, we confirmed that the master genes identified by NOGEA have a high reliability for predicting disease-specific initiation events and progression risk. Master genes may also be used to extract the underlying information of different diseases, thus revealing mechanisms of disease comorbidity. More importantly, approved therapeutic targets are topologically localized in a small neighborhood of master genes on the interactome network, which provides a new way for predicting drug-disease associations. Through this method, 11 old drugs were newly identified and predicted to be effective for treating pancreatic cancer and then validated by in vitro experiments. Collectively, the NOGEA was useful for identifying master genes that control disease initiation and co-occurrence, thus providing a valuable strategy for drug efficacy screening and repositioning. NOGEA codes are publicly available at https://github.com/guozihuaa/NOGEA.
Affiliation(s)
- Zihu Guo
- College of Life Science, Northwest University, Xi'an 710069, China; College of Life Science, Northwest A & F University, Yangling 712100, China
- Yingxue Fu
- College of Life Science, Northwest A & F University, Yangling 712100, China
- Chao Huang
- College of Life Science, Northwest A & F University, Yangling 712100, China
- Chunli Zheng
- College of Life Science, Northwest University, Xi'an 710069, China
- Ziyin Wu
- College of Life Science, Northwest A & F University, Yangling 712100, China
- Xuetong Chen
- College of Life Science, Northwest A & F University, Yangling 712100, China
- Shuo Gao
- College of Life Science, Northwest A & F University, Yangling 712100, China
- Yaohua Ma
- College of Life Science, Northwest University, Xi'an 710069, China
- Mohamed Shahen
- Zoology Department, Faculty of Science, Tanta University, Tanta 31527, Egypt
- Yan Li
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Faculty of Chemical, Environmental and Biological Science and Technology, Dalian University of Technology, Dalian 116024, China
- Pengfei Tu
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Jingbo Zhu
- School of Food Science and Technology, Dalian Polytechnic University, Dalian 116034, China
- Zhenzhong Wang
- State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China
- Wei Xiao
- State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China.
- Yonghua Wang
- College of Life Science, Northwest University, Xi'an 710069, China; College of Life Science, Northwest A & F University, Yangling 712100, China; State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China.
|
39
|
Karimi M, Hasanzadeh A, Shen Y. Network-principled deep generative models for designing drug combinations as graph sets. Bioinformatics 2021; 36:i445-i454. [PMID: 32657357 PMCID: PMC7355302 DOI: 10.1093/bioinformatics/btaa317] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 02/06/2023] Open
Abstract
Motivation: Combination therapy has been shown to improve therapeutic efficacy while reducing side effects. Importantly, it has become an indispensable strategy to overcome resistance in antibiotics, antimicrobials and anticancer drugs. Facing enormous chemical space and unclear design principles for small-molecule combinations, computational drug-combination design has not seen generative models to meet its potential to accelerate resistance-overcoming drug combination discovery. Results: We have developed the first deep generative model for drug combination design, by jointly embedding graph-structured domain knowledge and iteratively training a reinforcement learning-based chemical graph-set designer. First, we have developed hierarchical variational graph auto-encoders trained end-to-end to jointly embed gene–gene, gene–disease and disease–disease networks. Novel attentional pooling is introduced here for learning disease representations from associated genes’ representations. Second, targeting diseases in learned representations, we have recast the drug-combination design problem as graph-set generation and developed a deep learning-based model with novel rewards. Specifically, besides chemical validity rewards, we have introduced a novel generative adversarial reward, based on the generalized sliced Wasserstein distance, for chemically diverse molecules with distributions similar to known drugs. We have also designed a network principle-based reward for disease-specific drug combinations. Numerical results indicate that, compared to state-of-the-art graph embedding methods, the hierarchical variational graph auto-encoder learns more informative and generalizable disease representations. Results also show that the deep generative models generate drug combinations following the principle across diseases. Case studies on four diseases show that network-principled drug combinations tend to have low toxicity. The generated drug combinations collectively cover the disease module similarly to FDA-approved drug combinations and could potentially suggest novel systems pharmacology strategies. Our method allows for examining and following a network-based principle or hypothesis to efficiently generate disease-specific drug combinations in a vast chemical combinatorial space. Availability and implementation: https://github.com/Shen-Lab/Drug-Combo-Generator. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mostafa Karimi
- Department of Electrical and Computer Engineering; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
- Yang Shen
- Department of Electrical and Computer Engineering; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
|
40
|
Yang J, Dong C, Duan H, Shu Q, Li H. RDmap: a map for exploring rare diseases. Orphanet J Rare Dis 2021; 16:101. [PMID: 33632281 PMCID: PMC7905868 DOI: 10.1186/s13023-021-01741-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/24/2020] [Accepted: 02/11/2021] [Indexed: 02/01/2023] Open
Abstract
Background: The complexity of the phenotypic characteristics and molecular bases of many rare human genetic diseases makes the diagnosis of such diseases a challenge for clinicians. A map for visualizing, locating and navigating rare diseases based on similarity will help clinicians and researchers understand and easily explore these diseases. Methods: A distance matrix of rare diseases included in Orphanet was computed by calculating the quantitative distance among phenotypes and pathogenic genes based on Human Phenotype Ontology (HPO) and Gene Ontology (GO), and each disease was mapped into Euclidean space. A rare disease map, enhanced by clustering classes and disease information, was developed based on ECharts. Results: A rare disease map called RDmap was published at http://rdmap.nbscn.org. A total of 3287 rare diseases are included in the phenotype-based map, and 3789 rare genetic diseases are included in the gene-based map; 1718 overlapping diseases are connected between the two maps. RDmap works similarly to the widely used Google Map service and supports zooming and panning. The phenotype similarity-based disease location function performed better than traditional keyword searches in an in silico evaluation, and 20 published cases of rare diseases also demonstrated that RDmap can assist clinicians in seeking a rare disease diagnosis. Conclusion: RDmap is the first user-interactive map-style rare disease knowledgebase. It will help clinicians and researchers explore the increasingly complicated realm of rare genetic diseases.
Affiliation(s)
- Jian Yang
- The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, Zhejiang, 310052, China; The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
- Cong Dong
- The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, Zhejiang, 310052, China; The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
- Huilong Duan
- The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
- Qiang Shu
- The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, Zhejiang, 310052, China
- Haomin Li
- The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, Zhejiang, 310052, China.
|
41
|
Towards the routine use of in silico screenings for drug discovery using metabolic modelling. Biochem Soc Trans 2021; 48:955-969. [PMID: 32369553 PMCID: PMC7329353 DOI: 10.1042/bst20190867] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/15/2020] [Revised: 04/01/2020] [Accepted: 04/06/2020] [Indexed: 12/12/2022]
Abstract
Currently, the development of new effective drugs for cancer therapy is not only hindered by development costs, drug efficacy, and drug safety but also by the rapid occurrence of drug resistance in cancer. Hence, new tools are needed to study the underlying mechanisms in cancer. Here, we discuss the current use of metabolic modelling approaches to identify cancer-specific metabolism and find possible new drug targets and drugs for repurposing. Furthermore, we list valuable resources that are needed for the reconstruction of cancer-specific models by integrating various available datasets with genome-scale metabolic reconstructions using model-building algorithms. We also discuss how new drug targets can be determined by using gene essentiality analysis, an in silico method to predict essential genes in a given condition such as cancer and how synthetic lethality studies could greatly benefit cancer patients by suggesting drug combinations with reduced side effects.
|
42
|
Degree Adjusted Large-Scale Network Analysis Reveals Novel Putative Metabolic Disease Genes. BIOLOGY 2021; 10:biology10020107. [PMID: 33546175 PMCID: PMC7913176 DOI: 10.3390/biology10020107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2020] [Revised: 01/24/2021] [Accepted: 01/30/2021] [Indexed: 11/16/2022]
Abstract
Simple Summary: To explore some of the low-degree but topologically important nodes in the metabolic disease (MD) network, we propose a background-corrected betweenness centrality (BC) and identify 16 novel candidates likely to play a role in MD. MD-specific protein–protein interaction networks (PPINs) were constructed using two known databases, the Human Protein Reference Database (HPRD) and BioGRID. The identified candidates have been found to play a role in diverse conditions including co-morbidities of MD, neurological and immune system-related conditions. Abstract: A large percentage of the global population is currently afflicted by metabolic diseases (MD), and the incidence is likely to double in the next decades. MD-associated co-morbidities such as non-alcoholic fatty liver disease (NAFLD) and cardiomyopathy contribute significantly to impaired health. MD are complex and polygenic, with many genes involved in their aetiology. A popular approach to investigate genetic contributions to disease aetiology is biological network analysis. However, data dependence introduces a bias (noise, false positives, over-publication) in the outcome. While several approaches have been proposed to overcome these biases, many of them have constraints, including data integration issues, dependence on arbitrary parameters, database-dependent outcomes, and computational complexity. Network topology is also a critical factor affecting the outcomes. Here, we propose a simple, parameter-free method that takes into account database dependence and network topology to identify central genes in the MD network. Among them, we infer novel candidates that have not yet been annotated as MD genes and show their relevance by highlighting their differential expression in public datasets and carefully examining the literature. The method contributes to uncovering connections in the MD mechanisms and highlights several candidates for in-depth study of their contribution to MD and its co-morbidities.
|
43
|
Jayanthi B, Bachhav B, Wan Z, Martinez Legaspi S, Segatori L. A platform for post-translational spatiotemporal control of cellular proteins. Synth Biol (Oxf) 2021; 6:ysab002. [PMID: 33763602 PMCID: PMC7976946 DOI: 10.1093/synbio/ysab002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/28/2020] [Revised: 12/31/2020] [Accepted: 01/06/2021] [Indexed: 12/11/2022] Open
Abstract
Mammalian cells process information through coordinated spatiotemporal regulation of proteins. Engineering cellular networks thus relies on efficient tools for regulating protein levels in specific subcellular compartments. To address the need to manipulate the extent and dynamics of protein localization, we developed a platform technology for the target-specific control of protein destination. This platform is based on bifunctional molecules comprising a target-specific nanobody and universal sequences determining target subcellular localization or degradation rate. We demonstrate that nanobody-mediated localization depends on the expression level of the target and the nanobody, and the extent of target subcellular localization can be regulated by combining multiple target-specific nanobodies with distinct localization or degradation sequences. We also show that this platform for nanobody-mediated target localization and degradation can be regulated transcriptionally and integrated within orthogonal genetic circuits to achieve the desired temporal control over spatial regulation of target proteins. The platform reported in this study provides an innovative tool to control protein subcellular localization, which will be useful to investigate protein function and regulate large synthetic gene circuits.
Affiliation(s)
- Brianna Jayanthi
- Systems, Synthetic and Physical Biology Graduate Program, Rice University, Houston, TX, USA
- Bhagyashree Bachhav
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, USA
- Zengyi Wan
- Department of Bioengineering, Rice University, Houston, TX, USA
- Laura Segatori
- Systems, Synthetic and Physical Biology Graduate Program, Rice University, Houston, TX, USA
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Department of Biosciences, Rice University, Houston, TX, USA
|
44
|
Disease network delineates the disease progression profile of cardiovascular diseases. J Biomed Inform 2021; 115:103686. [PMID: 33493631 DOI: 10.1016/j.jbi.2021.103686] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/29/2020] [Revised: 01/14/2021] [Accepted: 01/15/2021] [Indexed: 11/20/2022]
Abstract
OBJECTIVE: As Electronic Health Records (EHR) data accumulated explosively in recent years, the tremendous amount of patient clinical data provided opportunities to discover real world evidence. In this study, a graphical disease network, named progressive cardiovascular disease network (progCDN), was built to delineate the progression profiles of cardiovascular diseases (CVD). MATERIALS AND METHODS: The EHR data of 14.3 million patients with CVD diagnoses were collected to build the disease network and for further analysis. We applied a newly designed method, progression rates (PR), to calculate the progression relationship among different diagnoses. Based on the disease network outcome, 23 disease progression pairs were selected to screen for salient features. RESULTS: The network depicted the dominant diseases in CVD development, such as heart failure and coronary arteriosclerosis. Novel progression relationships were also discovered, such as the progression path from long QT syndrome to major depression. In addition, three age-group progCDNs identified a series of age-associated disease progression paths and important successor diseases with age bias. Furthermore, a list of important features with sufficient abundance and high correlation was extracted for building disease risk models. DISCUSSION: The PR method designed for identifying the progression relationship could be widely applied in any EHR database due to its flexibility and robust functionality. Meanwhile, researchers could use the progCDN network to validate or explore novel disease relationships in real world data. CONCLUSION: The first-time interrogation of such a large cohort of CVD patients enabled us to explore the general and age-specific disease progression patterns in CVD development.
|
45
|
Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021; 20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]
Abstract
Drug discovery has focused on the paradigm "one drug, one target" for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.
Affiliation(s)
- Olivia Slater
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Bethany Miller
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Maria Kontoyianni
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
|
46
|
Gómez-Romero L, López-Reyes K, Hernández-Lemus E. The Large Scale Structure of Human Metabolism Reveals Resilience via Extensive Signaling Crosstalk. Front Physiol 2021; 11:588012. [PMID: 33391012 PMCID: PMC7772240 DOI: 10.3389/fphys.2020.588012] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/28/2020] [Accepted: 11/26/2020] [Indexed: 12/20/2022] Open
Abstract
Metabolism is loosely defined as the set of physical and chemical interactions associated with the processes responsible for sustaining life. Two evident features arise whenever one looks at metabolism: first, metabolism is a very complex and intertwined construct of many associated biomolecular processes. Second, metabolism is characterized by a high degree of stability reflected by the organism's resilience to either environmental changes or pathogenic conditions. Here we investigate the relationship between these two features. Using the full set of human metabolic interactions as reported in the highly curated KEGG database, we built an integrated human metabolic network comprising metabolic, transcriptional regulation, and protein-protein interaction networks. We hypothesized that a metabolic process may exhibit resilience if it can recover from perturbations at the pathway level; in other words, metabolic resilience could be due to pathway crosstalk, which may imply that a metabolic process could proceed even when a perturbation has occurred. By analyzing the topological structure of the integrated network, as well as the hierarchical structure of its main modules or subnetworks, we observed that behind biological resilience lies an intricate communication structure at the topological and functional level with pathway crosstalk as the main component. The present findings, alongside the advent of large biomolecular databases such as KEGG, may allow the study of the consequences of this redundancy and resilience for healthy and pathological phenotypes, with many potential applications in biomedical science.
Affiliation(s)
- Laura Gómez-Romero
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico, Mexico
- Karina López-Reyes
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico, Mexico
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico, Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico, Mexico
|
47
|
Digital Health for Enhanced Understanding and Management of Chronic Conditions: COPD as a Use Case. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11690-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
|
48
|
Uppal K. Models of Metabolomic Networks. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11615-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
49
|
Sylvester KG, Hao S, You J, Zheng L, Tian L, Yao X, Mo L, Ladella S, Wong RJ, Shaw GM, Stevenson DK, Cohen HJ, Whitin JC, McElhinney DB, Ling XB. Maternal metabolic profiling to assess fetal gestational age and predict preterm delivery: a two-centre retrospective cohort study in the US. BMJ Open 2020; 10:e040647. [PMID: 33268420 PMCID: PMC7713207 DOI: 10.1136/bmjopen-2020-040647] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
Abstract
OBJECTIVES: The aim of this study was to develop a single blood test that could determine gestational age and estimate the risk of preterm birth by measuring serum metabolites. We hypothesised that serial metabolic modelling of serum analytes throughout pregnancy could be used to describe fetal gestational age and project preterm birth with a high degree of precision.
STUDY DESIGN: A retrospective cohort study.
SETTING: Two medical centres in the USA.
PARTICIPANTS: Thirty-six patients (20 full-term, 16 preterm) enrolled at Stanford University were used to develop gestational age and preterm birth risk algorithms; 22 patients (9 full-term, 13 preterm) enrolled at the University of Alabama were used to validate the algorithms.
OUTCOME MEASURES: Maternal blood was collected serially throughout pregnancy. Metabolic datasets were generated using mass spectrometry.
RESULTS: A model to determine gestational age was developed (R²=0.98) and validated (R²=0.81). During model validation, 66.7% of the estimates fell within ±1 week of ultrasound results. Significant disruptions from full-term pregnancy metabolic patterns were observed in preterm pregnancies (R²=-0.68). A separate algorithm to predict preterm birth was developed using a set of 10 metabolic pathways; it achieved an area under the curve of 0.96 and 0.92, a sensitivity of 0.88 and 0.86, and a specificity of 0.96 and 0.92 during development and validation testing, respectively.
CONCLUSIONS: In this study, metabolic profiling was used to develop and test a model for determining gestational age during full-term pregnancy progression, and to determine risk of preterm birth. With additional patient validation studies, these algorithms may be used to identify at-risk pregnancies, prompting alterations in clinical care, and to gain biological insights into the pathophysiology of preterm birth. Metabolic pathway-based pregnancy modelling is a novel modality for investigation and clinical application development.
Affiliation(s)
- Karl G Sylvester
- Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
- Shiying Hao
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, California, USA
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, California, USA
- Jin You
- Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
- Le Zheng
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, California, USA
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, California, USA
- Lu Tian
- Department of Health Research and Policy, Stanford University, Stanford, California, USA
- Xiaoming Yao
- Translational Medicine Laboratory, West China Hospital, Chengdu, China
- Lihong Mo
- Department of Obstetrics and Gynecology, University of California San Francisco-Fresno, Fresno, California, USA
- Subhashini Ladella
- Department of Obstetrics and Gynecology, University of California San Francisco-Fresno, Fresno, California, USA
- Ronald J Wong
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- Gary M Shaw
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- David K Stevenson
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- Harvey J Cohen
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- John C Whitin
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- Doff B McElhinney
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, California, USA
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, California, USA
- Xuefeng B Ling
- Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, California, USA
|
50
|
Lee LY, Pandey AK, Maron BA, Loscalzo J. Network medicine in Cardiovascular Research. Cardiovasc Res 2020; 117:2186-2202. [PMID: 33165538 DOI: 10.1093/cvr/cvaa321] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/22/2020] [Revised: 09/08/2020] [Accepted: 10/30/2020] [Indexed: 12/21/2022]
Abstract
The ability to generate multi-omics data coupled with deeply characterizing the clinical phenotype of individual patients promises to improve understanding of complex cardiovascular pathobiology. There remains an important disconnection between the magnitude and granularity of these data and our ability to improve phenotype-genotype correlations for complex cardiovascular diseases. This shortcoming may be due to limitations associated with traditional reductionist analytical methods, which tend to emphasize a single molecular event in the pathogenesis of diseases more aptly characterized by crosstalk between overlapping molecular pathways. Network medicine is a rapidly growing discipline that considers diseases as the consequences of perturbed interactions between multiple interconnected biological components. This powerful integrative approach has enabled a number of important discoveries in complex disease mechanisms. In this review, we introduce the basic concepts of network medicine and highlight specific examples by which this approach has accelerated cardiovascular research. We also review how network medicine is well-positioned to promote rational drug design for patients with cardiovascular diseases, with particular emphasis on advancing precision medicine.
Affiliation(s)
- Laurel Y Lee
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
- Arvind K Pandey
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
- Bradley A Maron
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA; Department of Cardiology, Boston VA Healthcare System, Boston, MA, USA
- Joseph Loscalzo
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
|