1. Sriram V, Woerner J, Ahn YY, Kim D. The interplay of sex and genotype in disease associations: a comprehensive network analysis in the UK Biobank. Hum Genomics 2025; 19:4. [PMID: 39825454 PMCID: PMC11740496 DOI: 10.1186/s40246-024-00710-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2024] [Accepted: 12/17/2024] [Indexed: 01/20/2025]
Abstract
BACKGROUND Disease comorbidities and longer-term complications, arising from biologically related associations across phenotypes, can lead to increased risk of severe health outcomes. Given that many diseases exhibit sex-specific differences in their genetics, our objective was to determine whether genotype-by-sex (GxS) interactions similarly influence cross-phenotype associations. Through comparison of sex-stratified disease-disease networks (DDNs), where nodes represent diseases and edges represent their relationships, we investigate sex differences in patterns of polygenicity and pleiotropy between diseases. RESULTS Using UK Biobank summary statistics, we built male- and female-specific DDNs for 103 diseases. This revealed that male and female diseasomes have similar topology and central diseases (e.g., hypertensive, chronic respiratory, and thyroid-based disorders), yet some phenotypes exhibit sex-specific influence in cross-phenotype associations. Multiple sclerosis and osteoarthritis are central only in the female DDN, while cardiometabolic diseases and skin cancer are more prominent in the male DDN. Edge comparison indicated similar shared genetics between the two graphs relative to a random model of disease association, though notable discrepancies in embedding distances and clustering patterns imply a more expansive genetic influence on multimorbidity risk for females than males. Analysis of pleiotropic contributions of two sexually dimorphic single-nucleotide polymorphisms related to thyroid disorders further validated a distinct genetic architecture across sexes that influences associations, confirmed through examination of corresponding gene expression profiles from the GTEx Portal. CONCLUSIONS Our analysis affirms the presence of GxS interactions in cross-phenotype associations, emphasizing the need to investigate the role of sex in disease onset and its importance in biomedical discovery and precision medicine research.
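The edge comparison between the male and female DDNs mentioned in this abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions: the Jaccard index is one common way to compare edge sets and is not confirmed to be the measure the authors used, the helper names `normalize` and `edge_jaccard` are hypothetical, and the toy edges are invented for the example rather than taken from the study.

```python
def normalize(edges):
    """Store undirected edges as frozensets so (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


def edge_jaccard(edges_a, edges_b):
    """Jaccard similarity of two undirected edge lists (assumed measure)."""
    a, b = normalize(edges_a), normalize(edges_b)
    union = a | b
    # Two empty graphs are treated as identical.
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy networks, not results from the paper.
male_edges = [
    ("hypertension", "type 2 diabetes"),
    ("hypertension", "asthma"),
    ("skin cancer", "hypertension"),
]
female_edges = [
    ("type 2 diabetes", "hypertension"),  # same edge, reversed order
    ("hypertension", "asthma"),
    ("multiple sclerosis", "hypothyroidism"),
]

similarity = edge_jaccard(male_edges, female_edges)
# Edges present only in the female network.
female_only = normalize(female_edges) - normalize(male_edges)
print(similarity, female_only)
```

Storing edges as frozensets keeps the comparison independent of the order in which the two diseases of a pair are listed.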
Affiliation(s)
- Vivek Sriram
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Richards Building B304, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA
- Jakob Woerner
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Richards Building B304, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA
- Yong-Yeol Ahn
  - Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University Bloomington, Bloomington, IN, 47405, USA
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Richards Building B304, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA

2. Hall M, Skinderhaug MK, Almaas E. Phenome-wide association network demonstrates close connection with individual disease trajectories from the HUNT study. PLoS One 2024; 19:e0311485. [PMID: 39729424 DOI: 10.1371/journal.pone.0311485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 09/16/2024] [Indexed: 12/29/2024]
Abstract
Disease networks offer a potential road map of connections between diseases. Several studies have created disease networks where diseases are connected based on either shared genes or single-nucleotide polymorphism (SNP) associations. However, it is still unclear to what degree SNP-based networks map to empirical, co-observed diseases within a different, general, adult study population spanning a long time period. We created a SNP-based phenome-wide association network (PheNet) from a large population using the UK Biobank phenome-wide association studies. Importantly, the SNP associations are unbiased towards much-studied diseases and are adjusted for linkage disequilibrium, case/control imbalances, and relatedness. We map the PheNet to significantly co-occurring diseases in the Norwegian HUNT study population and further identify, in the PheNet, consecutively occurring diseases with significant ordering in occurrence, independent of age and gender. Our analysis reveals an overlap far larger than expected by chance between the two disease networks, with diseases typically connecting within their own category. Upon examining the sequential occurrence of diseases in the HUNT dataset, we find a giant component consisting mostly of cardiovascular disorders. This allows us to identify sequentially occurring diseases that are genetically linked and co-occur frequently, while also highlighting non-sequential diseases. Furthermore, we observe that survivors of severe cardiovascular diseases subsequently often face less severe conditions, but with a reduced time until their next fatal illness. The HUNT sub-PheNet, showing diseases that are both genetically linked and co-observed, offers an interesting framework to study groups of diseases and examine whether they are, in fact, comorbidities. We find that the HUNT sub-PheNet offers the possibility to pinpoint exactly which mutation(s) constitute a shared cause of the diseases. This could be of great benefit to both researchers and clinicians studying relationships between diseases.
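The claim that two disease networks overlap "far larger than expected by chance" can be made concrete with a simple null model. The sketch below assumes a hypergeometric null in which the edges of the second network are drawn at random from all possible disease pairs; this is an assumption chosen for illustration and may differ from the test used in the study. The function name `overlap_p_value` and all numbers are hypothetical.

```python
from math import comb


def overlap_p_value(total_pairs, edges_a, edges_b, shared):
    """P(X >= shared) when edges_b pairs are drawn at random from
    total_pairs possible pairs, edges_a of which are edges in network A."""
    upper = min(edges_a, edges_b)
    numerator = sum(
        comb(edges_a, k) * comb(total_pairs - edges_a, edges_b - k)
        for k in range(shared, upper + 1)
    )
    return numerator / comb(total_pairs, edges_b)


# Hypothetical example: 5 diseases give 10 possible pairs.
n_diseases = 5
total_pairs = comb(n_diseases, 2)
p_value = overlap_p_value(total_pairs, edges_a=4, edges_b=3, shared=2)
print(round(p_value, 4))
```

A small p-value under this null would indicate more shared edges than random edge placement would produce; real analyses often prefer degree-preserving randomization, which this sketch does not attempt.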
Affiliation(s)
- Martina Hall
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Marit K Skinderhaug
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Eivind Almaas
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - K. G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway

3. Woerner J, Sriram V, Nam Y, Verma A, Kim D. Uncovering genetic associations in the human diseasome using an endophenotype-augmented disease network. Bioinformatics 2024; 40:btae126. [PMID: 38527901 PMCID: PMC10963079 DOI: 10.1093/bioinformatics/btae126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 01/17/2024] [Indexed: 03/27/2024]
Abstract
MOTIVATION Many diseases, particularly cardiometabolic disorders, exhibit complex multimorbidities with one another. An intuitive way to model the connections between phenotypes is with a disease-disease network (DDN), where nodes represent diseases and edges represent associations, such as shared single-nucleotide polymorphisms (SNPs), between pairs of diseases. To gain further genetic understanding of molecular contributors to disease associations, we propose a novel version of the shared-SNP DDN (ssDDN), denoted as ssDDN+, which includes connections between diseases derived from genetic correlations with intermediate endophenotypes. We hypothesize that a ssDDN+ can provide complementary information to the disease connections in a ssDDN, yielding insight into the role of clinical laboratory measurements in disease interactions. RESULTS Using PheWAS summary statistics from the UK Biobank, we constructed a ssDDN+ revealing hundreds of genetic correlations between diseases and quantitative traits. Our augmented network uncovers genetic associations across different disease categories, connects relevant cardiometabolic diseases, and highlights specific biomarkers that are associated with cross-phenotype associations. Out of the 31 clinical measurements under consideration, HDL-C connects the greatest number of diseases and is strongly associated with both type 2 diabetes and heart failure. Triglycerides, another blood lipid with known genetic causes in non-mendelian diseases, also adds a substantial number of edges to the ssDDN. This work demonstrates how association with clinical biomarkers can better explain the shared genetics between cardiometabolic disorders. Our study can facilitate future network-based investigations of cross-phenotype associations involving pleiotropy and genetic heterogeneity, potentially uncovering sources of missing heritability in multimorbidities. 
AVAILABILITY AND IMPLEMENTATION The generated ssDDN+ can be explored at https://hdpm.biomedinfolab.com/ddn/biomarkerDDN.
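The shared-SNP DDN described in this abstract, in which two diseases are linked when they share associated SNPs, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the function `build_ssddn`, the `min_shared` threshold, and the placeholder SNP identifiers are all assumptions, and the endophenotype edges of the ssDDN+ (derived from genetic correlations) are not modeled here.

```python
from itertools import combinations


def build_ssddn(disease_snps, min_shared=1):
    """Return {(disease_1, disease_2): number_of_shared_snps} for every
    disease pair sharing at least min_shared associated SNPs."""
    edges = {}
    for d1, d2 in combinations(sorted(disease_snps), 2):
        shared = disease_snps[d1] & disease_snps[d2]
        if len(shared) >= min_shared:
            edges[(d1, d2)] = len(shared)
    return edges


# Hypothetical input: disease -> set of significantly associated SNPs.
# The SNP labels are placeholders, not real rsIDs.
disease_snps = {
    "type 2 diabetes": {"snp_a", "snp_b", "snp_c"},
    "heart failure": {"snp_b", "snp_c"},
    "asthma": {"snp_d"},
}

network = build_ssddn(disease_snps)
print(network)
```

Using the shared-SNP count as an edge weight is one possible design choice; an unweighted edge set would be the simpler alternative.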
Affiliation(s)
- Jakob Woerner
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
- Vivek Sriram
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
- Yonghyun Nam
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
- Anurag Verma
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, United States
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, United States

4. Zhao B, Huepenbecker S, Zhu G, Rajan SS, Fujimoto K, Luo X. Comorbidity network analysis using graphical models for electronic health records. Front Big Data 2023; 6:846202. [PMID: 37663273 PMCID: PMC10470017 DOI: 10.3389/fdata.2023.846202] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Accepted: 07/25/2023] [Indexed: 09/05/2023]
Abstract
Importance The comorbidity network represents multiple diseases and their relationships in a graph. Understanding comorbidity networks among critical care unit (CCU) patients can help doctors diagnose patients faster, minimize missed diagnoses, and potentially decrease morbidity and mortality. Objective The main objective of this study was to identify the comorbidity network among CCU patients using a novel application of a machine learning method (graphical modeling method). The second objective was to compare the machine learning method with a traditional pairwise method in simulation. Method This cross-sectional study used CCU patients' data from Medical Information Mart for the Intensive Care-3 (MIMIC-3) dataset, an electronic health record (EHR) of patients with CCU hospitalizations within Beth Israel Deaconess Hospital from 2001 to 2012. A machine learning method (graphical modeling method) was applied to identify the comorbidity network of 654 diagnosis categories among 46,511 patients. Results Out of the 654 diagnosis categories, the graphical modeling method identified a comorbidity network of 2,806 associations in 510 diagnosis categories. Two medical professionals reviewed the comorbidity network and confirmed that the associations were consistent with current medical understanding. Moreover, the strongest association in our network was between "poisoning by psychotropic agents" and "accidental poisoning by tranquilizers" (logOR 8.16), and the most connected diagnosis was "disorders of fluid, electrolyte, and acid-base balance" (63 associated diagnosis categories). Our method outperformed traditional pairwise comorbidity network methods in simulation studies. Some of the strongest associations between diagnosis categories were also identified, for example, "diagnoses of mitral and aortic valve" and "other rheumatic heart disease" (logOR: 5.15). Furthermore, our method identified diagnosis categories that were connected with most other diagnosis categories; for example, "disorders of fluid, electrolyte, and acid-base balance" was associated with 63 other diagnosis categories. Additionally, using a data-driven approach, our method partitioned the diagnosis categories into 14 modularity classes. Conclusion and relevance Our graphical modeling method inferred a logical comorbidity network whose associations were consistent with current medical understanding and outperformed traditional network methods in simulation. Our comorbidity network method can potentially assist CCU doctors in diagnosing patients faster and minimizing missed diagnoses.
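The traditional pairwise approach that this abstract uses as a baseline typically scores each pair of diagnoses with a log odds ratio from a 2x2 co-occurrence table. The sketch below shows that pairwise calculation only, not the graphical modeling method itself. The function name `pairwise_log_or`, the 0.5 continuity correction, and the toy patient records are assumptions made for illustration.

```python
from math import log


def pairwise_log_or(records, dx_a, dx_b, correction=0.5):
    """Log odds ratio of co-occurrence for two diagnoses.

    records: one set of diagnosis labels per patient.
    correction: value added to each cell to avoid division by zero
    (an assumed Haldane-Anscombe style correction).
    """
    both = sum(1 for r in records if dx_a in r and dx_b in r)
    only_a = sum(1 for r in records if dx_a in r and dx_b not in r)
    only_b = sum(1 for r in records if dx_a not in r and dx_b in r)
    neither = sum(1 for r in records if dx_a not in r and dx_b not in r)
    c = correction
    return log(((both + c) * (neither + c)) / ((only_a + c) * (only_b + c)))


# Hypothetical patients; the labels are illustrative, not MIMIC codes.
A, B = "fluid_electrolyte_disorder", "acute_kidney_failure"
records = [{A, B}, {A, B}, {A}, {B}, set(), set()]

log_or = pairwise_log_or(records, A, B)
print(round(log_or, 3))
```

A positive value means the two diagnoses co-occur more often than independence would suggest. Unlike a graphical model, this pairwise score does not adjust for the other diagnoses a patient has.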
Affiliation(s)
- Bo Zhao
  - Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Sarah Huepenbecker
  - Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Gen Zhu
  - Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Suja S. Rajan
  - Department of Management, Policy and Community Health, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Kayo Fujimoto
  - Department of Health Promotion and Behavioral Sciences, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Xi Luo
  - Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States

5. Woerner J, Sriram V, Nam Y, Verma A, Kim D. Uncovering genetic associations in the human diseasome using an endophenotype-augmented disease network. medRxiv 2023:2023.05.11.23289852. [PMID: 37293013 PMCID: PMC10246076 DOI: 10.1101/2023.05.11.23289852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Many diseases exhibit complex multimorbidities with one another. An intuitive way to model the connections between phenotypes is with a disease-disease network (DDN), where nodes represent diseases and edges represent associations, such as shared single-nucleotide polymorphisms (SNPs), between pairs of diseases. To gain further genetic understanding of molecular contributors to disease associations, we propose a novel version of the shared-SNP DDN (ssDDN), denoted as ssDDN+, which includes connections between diseases derived from genetic correlations with endophenotypes. We hypothesize that a ssDDN+ can provide complementary information to the disease connections in a ssDDN, yielding insight into the role of clinical laboratory measurements in disease interactions. Using PheWAS summary statistics from the UK Biobank, we constructed a ssDDN+ revealing hundreds of genetic correlations between disease phenotypes and quantitative traits. Our augmented network uncovers genetic associations across different disease categories, connects relevant cardiometabolic diseases, and highlights specific biomarkers that are associated with cross-phenotype associations. Out of the 31 clinical measurements under consideration, HDL-C connects the greatest number of diseases and is strongly associated with both type 2 diabetes and diabetic retinopathy. Triglycerides, another blood lipid with known genetic causes in non-mendelian diseases, also adds a substantial number of edges to the ssDDN. Our study can facilitate future network-based investigations of cross-phenotype associations involving pleiotropy and genetic heterogeneity, potentially uncovering sources of missing heritability in multimorbidities.
Affiliation(s)
- Jakob Woerner
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Vivek Sriram
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yonghyun Nam
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Anurag Verma
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA