1
Gillespie NA, Bell TR, Hearn GC, Hess JL, Tsuang MT, Lyons MJ, Franz CE, Kremen WS, Glatt SJ. A twin analysis to estimate genetic and environmental factors contributing to variation in weighted gene co-expression network module eigengenes. Am J Med Genet B Neuropsychiatr Genet 2025; 198:e33003. [PMID: 39126209 PMCID: PMC11778624 DOI: 10.1002/ajmg.b.33003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 06/18/2024] [Accepted: 07/22/2024] [Indexed: 08/12/2024]
Abstract
Multivariate network-based analytic methods such as weighted gene co-expression network analysis are frequently applied to human and animal gene-expression data to estimate the first principal component of a module, or module eigengene (ME). MEs are interpreted as multivariate summaries of correlated gene-expression patterns and network connectivity across genes within a module. As such, they have the potential to elucidate the mechanisms by which molecular genomic variation contributes to individual differences in complex traits. Although increasingly used to test for associations between modules and complex traits, the genetic and environmental etiology of MEs has not been empirically established. It is unclear if, and to what degree, individual differences in blood-derived MEs reflect random variation versus familial aggregation arising from heritable or shared environmental influences. We used biometrical genetic analyses to estimate the contribution of genetic and environmental influences on MEs derived from blood lymphocytes collected on a sample of N = 661 older male twins from the Vietnam Era Twin Study of Aging (VETSA) whose mean age at assessment was 67.7 years (SD = 2.6 years, range = 62-74 years). Of the 26 detected MEs, 14 (56%) had statistically significant additive genetic variation with an average heritability of 44% (SD = 0.08, range = 35%-64%). Despite the relatively small sample size, this demonstration of significant family aggregation including estimates of heritability in 14 of the 26 MEs suggests that blood-based MEs are reliable and merit further exploration in terms of their associations with complex traits and diseases.
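The abstract defines a module eigengene as the first principal component of a module. A minimal sketch of that computation follows, using power iteration in plain Python. The data layout (rows are samples, columns are the genes of one module), the toy values, and the function names are assumptions made for illustration. Orienting the sign toward the module's average expression and scaling to unit length are conventions chosen here, not details taken from the paper.

```python
import math

def standardize(values):
    """Center one gene's expression values and scale to unit sample variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def module_eigengene(expr, n_iter=500):
    """Return first-principal-component scores, one per sample.

    expr: list of samples, each a list of expression values for the genes
    of a single module (hypothetical layout for this sketch).
    """
    genes = [standardize(list(col)) for col in zip(*expr)]  # genes x samples
    n_samples = len(expr)
    # Power iteration on the gene loadings converges to the leading component.
    loadings = [1.0 / math.sqrt(len(genes))] * len(genes)
    for _ in range(n_iter):
        scores = [sum(w * g[i] for w, g in zip(loadings, genes))
                  for i in range(n_samples)]
        loadings = [sum(s * x for s, x in zip(scores, g)) for g in genes]
        norm = math.sqrt(sum(w * w for w in loadings))
        loadings = [w / norm for w in loadings]
    scores = [sum(w * g[i] for w, g in zip(loadings, genes))
              for i in range(n_samples)]
    # Orient the eigengene so that it tracks the module's average expression.
    avg = [sum(g[i] for g in genes) / len(genes) for i in range(n_samples)]
    if sum(s * a for s, a in zip(scores, avg)) < 0:
        scores = [-s for s in scores]
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores]

# Toy module: 6 samples x 3 strongly co-expressed genes (made-up values).
toy_module = [
    [1.0, 2.1, 0.9],
    [2.0, 3.9, 2.2],
    [3.0, 6.2, 2.8],
    [4.0, 8.1, 4.1],
    [5.0, 9.8, 5.2],
    [6.0, 12.1, 5.9],
]
me = module_eigengene(toy_module)
```

In a twin design such as the one described, one such score per participant and module would then be the phenotype entered into the biometrical model.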
Affiliation(s)
- Nathan A. Gillespie
  - Virginia Institute for Psychiatric and Behavior Genetics, Virginia Commonwealth University, Virginia, USA
  - QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia
- Tyler R. Bell
  - Department of Psychiatry, University of California San Diego, La Jolla, California, USA
  - Center for Behavior Genetics of Aging, University of California San Diego, La Jolla, California, USA
- Gentry C. Hearn
  - Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts, USA
- Jonathan L. Hess
  - Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts, USA
- Ming T. Tsuang
  - Department of Psychiatry, University of California San Diego, La Jolla, California, USA
- Michael J. Lyons
  - Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts, USA
- Carol E. Franz
  - Department of Psychiatry, University of California San Diego, La Jolla, California, USA
  - Center for Behavior Genetics of Aging, University of California San Diego, La Jolla, California, USA
- William S. Kremen
  - Department of Psychiatry, University of California San Diego, La Jolla, California, USA
  - Center for Behavior Genetics of Aging, University of California San Diego, La Jolla, California, USA
- Stephen J. Glatt
  - Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, New York, USA
2
Luo M, Trindade Pons V, Thomas NS, Drake J, Su MH, Vladimirov V, van Loo HM, Gillespie NA. The Mechanisms Underlying the Intergenerational Transmission of Substance Use and Misuse: An Integrated Research Approach. Twin Res Hum Genet 2024:1-12. [PMID: 39710930 DOI: 10.1017/thg.2024.46] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
Substance use and substance use disorders run in families. While it has long been recognized that the etiology of substance use behaviors and disorders involves a combination of genetic and environmental factors, two key questions remain largely unanswered: (1) the intergenerational transmission through which these genetic predispositions are passed from parents to children, and (2) the molecular mechanisms linking genetic variants to substance use behaviors and disorders. This article aims to provide a comprehensive conceptual framework and methodological approach for investigating the intergenerational transmission of substance use behaviors and disorders, by integrating genetic nurture analysis, gene expression imputation, and weighted gene co-expression network analysis. We also describe two longitudinal cohorts: the Brisbane Longitudinal Twin Study in Australia and the Lifelines Cohort Study in the Netherlands. By applying the methodological framework to these two unique datasets, our future research will explore the complex interplay between genetic factors, gene expression, and environmental influences on substance use behaviors and disorders across different life stages and populations.
Affiliation(s)
- Mannan Luo
  - Department of Psychiatry, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Victória Trindade Pons
  - Department of Psychiatry, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Nathaniel S Thomas
  - Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia, USA
- John Drake
  - Department of Psychiatry, College of Medicine, University of Arizona Phoenix, Phoenix, Arizona, USA
- Mei-Hsin Su
  - Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia, USA
- Vladimir Vladimirov
  - Department of Psychiatry, College of Medicine, University of Arizona Phoenix, Phoenix, Arizona, USA
  - Lieber Institute for Brain Development, Johns Hopkins University, Baltimore, Maryland, USA
- Hanna M van Loo
  - Department of Psychiatry, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Nathan A Gillespie
  - Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia, USA
3
Mohammad GI, Michoel T. Predicting the genetic component of gene expression using gene regulatory networks. Bioinformatics Advances 2024; 4:vbae180. [PMID: 39717201 PMCID: PMC11665636 DOI: 10.1093/bioadv/vbae180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Revised: 10/15/2024] [Accepted: 11/12/2024] [Indexed: 12/25/2024]
Abstract
Motivation: Gene expression prediction plays a vital role in transcriptome-wide association studies. Traditional models rely on genetic variants in close genomic proximity to the gene of interest to predict the genetic component of gene expression. Here, we propose a novel approach incorporating distal genetic variants acting through gene regulatory networks, in line with the omnigenic model of complex traits.
Results: Using causal and coexpression Bayesian networks reconstructed from genomic and transcriptomic data, inference of gene expression from genotypic data is achieved through a two-step process. Initially, the expression level of each gene is predicted using its local genetic variants. The residual differences between the observed and predicted expression levels are then modeled using the genotype information of parent and/or grandparent nodes in the network. The final predicted expression level is obtained by summing the predictions from both models, effectively incorporating both local and distal genetic influences. Using regularized regression techniques for parameter estimation, we found that gene regulatory network-based gene expression prediction outperformed the traditional approach on simulated data and real data from yeast and humans. This study provides important insights into the challenge of gene expression prediction for transcriptome-wide association studies.
Availability and implementation: The code is available on GitHub at github.com/guutama/GRN-TI.
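The two-step procedure in this abstract (a local model, then a residual model on network parents, then the sum of both predictions) can be sketched as follows. This is an illustration under stated assumptions and not the authors' GRN-TI code. Ridge regression fitted by coordinate descent stands in for the unspecified "regularized regression techniques"; the function names, the toy genotypes, and the penalty `lam` are hypothetical; and the predictions are in-sample on centered data.

```python
def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def ridge_fit(columns, y, lam=0.1, sweeps=200):
    """Ridge regression by coordinate descent on centered predictor columns."""
    beta = [0.0] * len(columns)
    resid = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Correlate column j with the residual after adding back its own fit.
            rho = sum(c * (r + beta[j] * c) for c, r in zip(col, resid))
            new_beta = rho / (sum(c * c for c in col) + lam)
            resid = [r - (new_beta - beta[j]) * c for r, c in zip(resid, col)]
            beta[j] = new_beta
    return beta

def predict(columns, beta):
    n = len(columns[0])
    return [sum(b * col[i] for b, col in zip(beta, columns)) for i in range(n)]

def two_step_predict(local_snps, parent_snps, expression, lam=0.1):
    """Step 1: local variants. Step 2: residuals on parent-gene variants."""
    y = center(expression)
    local = [center(s) for s in local_snps]
    distal = [center(s) for s in parent_snps]
    pred_local = predict(local, ridge_fit(local, y, lam))
    resid = [obs - fit for obs, fit in zip(y, pred_local)]
    pred_distal = predict(distal, ridge_fit(distal, resid, lam))
    pred_final = [a + b for a, b in zip(pred_local, pred_distal)]
    return y, pred_local, pred_final

def sse(y, pred):
    """Sum of squared errors between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y, pred))

# Toy data (made up): one local SNP and one SNP local to a parent gene.
local_snps = [[0, 1, 2, 0, 1, 2, 0, 1]]
parent_snps = [[0, 0, 1, 1, 2, 2, 1, 0]]
expression = [1.0 * l + 0.5 * p for l, p in zip(local_snps[0], parent_snps[0])]

y, pred_local, pred_final = two_step_predict(local_snps, parent_snps, expression)
```

Comparing `sse(y, pred_final)` with `sse(y, pred_local)` on held-out samples, rather than in-sample as here, would be the fairer check of whether the distal step helps.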
Affiliation(s)
- Gutama Ibrahim Mohammad
  - Computational Biology Unit, Department of Informatics, University of Bergen, 5008 Bergen, Norway
- Tom Michoel
  - Computational Biology Unit, Department of Informatics, University of Bergen, 5008 Bergen, Norway
4
Martinez-Boggio G, Monteiro HF, Lima FS, Figueiredo CC, Bisinotto RS, Santos JEP, Mion B, Schenkel FS, Ribeiro ES, Weigel KA, Rosa GJM, Peñagaricano F. Revealing host genome-microbiome networks underlying feed efficiency in dairy cows. Sci Rep 2024; 14:26060. [PMID: 39472728 PMCID: PMC11522680 DOI: 10.1038/s41598-024-77782-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Accepted: 10/25/2024] [Indexed: 11/02/2024] Open
Abstract
Ruminants have the ability to digest human-inedible plant materials, due to the symbiotic relationship with the rumen microbiota. Rumen microbes supply short chain fatty acids, amino acids, and vitamins to dairy cows that are used for maintenance, growth, and lactation functions. The main goal of this study was to investigate gene-microbiome networks underlying feed efficiency traits by integrating genotypic, microbial, and phenotypic data from lactating dairy cows. Data consisted of dry matter intake (DMI), net energy secreted in milk, and residual feed intake (RFI) records, SNP genotype, and 16S rRNA rumen microbial abundances from 448 mid-lactation Holstein cows. We first assessed marginal associations between genotypes and phenotypic and microbial traits through genomic scans, and then, in regions with multiple significant hits, we assessed gene-microbiome-phenotype networks using causal structural learning algorithms. We found significant regions co-localizing the rumen microbiome and feed efficiency traits. Interestingly, we found three types of network relationships: (1) the cow genome directly affects both rumen microbial abundances and feed efficiency traits; (2) the cow genome (Chr3: 116.5 Mb) indirectly affects RFI, mediated by the abundance of Syntrophococcus, Prevotella, and an unknown genus of Class Bacilli; and (3) the cow genome (Chr7: 52.8 Mb and Chr11: 6.1-6.2 Mb) affects the abundance of Rikenellaceae RC9 gut group mediated by DMI. Our findings shed light on how the host genome acts directly and indirectly on the rumen microbiome and feed efficiency traits and the potential benefits of the inclusion of specific microbes in selection indexes or as correlated traits in breeding programs. Overall, the multistep approach described here, combining whole-genome scans and causal network reconstruction, allows us to reveal the relationship between genome and microbiome underlying dairy cow feed efficiency.
Affiliation(s)
- Guillermo Martinez-Boggio
  - Department of Animal and Dairy Sciences, University of Wisconsin, 1675 Observatory Dr, Madison, WI, 53706, USA
- Hugo F Monteiro
  - Department of Population Health and Reproduction, University of California, Davis, 95616, USA
- Fabio S Lima
  - Department of Population Health and Reproduction, University of California, Davis, 95616, USA
- Caio C Figueiredo
  - Department of Veterinary Clinical Sciences, Washington State University, Pullman, 99163, USA
- Rafael S Bisinotto
  - Department of Large Animal Clinical Sciences, University of Florida, Gainesville, 32610, USA
- José E P Santos
  - Department of Animal Sciences, University of Florida, Gainesville, 32611, USA
- Bruna Mion
  - Department of Animal Biosciences, University of Guelph, Guelph, N1G-2W1, Canada
- Flavio S Schenkel
  - Department of Animal Biosciences, University of Guelph, Guelph, N1G-2W1, Canada
- Eduardo S Ribeiro
  - Department of Animal Biosciences, University of Guelph, Guelph, N1G-2W1, Canada
- Kent A Weigel
  - Department of Animal and Dairy Sciences, University of Wisconsin, 1675 Observatory Dr, Madison, WI, 53706, USA
- Guilherme J M Rosa
  - Department of Animal and Dairy Sciences, University of Wisconsin, 1675 Observatory Dr, Madison, WI, 53706, USA
- Francisco Peñagaricano
  - Department of Animal and Dairy Sciences, University of Wisconsin, 1675 Observatory Dr, Madison, WI, 53706, USA
5
Guo X, Song Y, Xu D, Jin X, Shang X. Genotype and Phenotype Association Analysis Based on Multi-omics Statistical Data. Curr Bioinform 2024; 19:933-942. [DOI: 10.2174/0115748936276861240109045208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/29/2023] [Accepted: 12/07/2023] [Indexed: 01/03/2025]
Abstract
Background: When using clinical data for multi-omics analysis, there are issues such as the insufficient number of omics data types and relatively small sample size due to the protection of patients' privacy, the requirements of data management by various institutions, and the relatively large number of features of each omics data. This paper describes the analysis of multi-omics pathway relationships using statistical data in the absence of clinical data.
Methods: We proposed a novel approach to exploit easily accessible statistics in public databases. This approach introduces phenotypic associations that are not included in the clinical data and uses these data to build a three-layer heterogeneous network. To simplify the analysis, we decomposed the three-layer network into double two-layer networks to predict the weights of the inter-layer associations. By adding a hyperparameter β, the weights of the two layers of the network were merged, and then k-fold cross-validation was used to evaluate the accuracy of this method. In calculating the weights of the two-layer networks, the RWR with fixed restart probability was combined with PBMDA and CIPHER to generate the PCRWR with biased weights and improved accuracy.
Results: The area under the receiver operating characteristic curve was increased by approximately 7% in the case of the RWR with initial weights.
Conclusion: Multi-omics statistical data were used to establish genotype and phenotype correlation networks for analysis, which was similar to the effect of clinical multi-omics analysis.
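The Methods build on a random walk with restart (RWR) with a fixed restart probability. The sketch below shows only that plain building block on a toy graph. It does not reproduce PCRWR or the PBMDA and CIPHER weighting described in the abstract. The adjacency matrix, the seed set, and the restart value of 0.3 are assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stops changing.

    adjacency: symmetric 0/1 matrix as a list of lists (toy input).
    seeds: set of node indices at which the walk restarts.
    W is the column-normalized adjacency matrix, so p stays a distribution.
    """
    n = len(adjacency)
    degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = list(p0)
    for _ in range(max_iter):
        walked = [
            sum(adjacency[i][j] / degree[j] * p[j] for j in range(n) if degree[j] > 0)
            for i in range(n)
        ]
        new_p = [(1 - restart) * w + restart * s for w, s in zip(walked, p0)]
        change = sum(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            break
    return p

# Toy path graph 0 - 1 - 2 - 3 with the walk seeded at node 0.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds={0})
```

The steady-state scores rank nodes by proximity to the seed. In a layered network of the kind described, such scores could serve as predicted association weights between a seed node in one layer and candidate nodes in the other.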
Affiliation(s)
- Xinpeng Guo
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
  - School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, People’s Republic of China
- Yafei Song
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Dongyan Xu
  - Department of Basic Sciences, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Xueping Jin
  - School of Air and Missile Defense, Air Force Engineering University, Xi’an, 710051, People’s Republic of China
- Xuequn Shang
  - School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, People’s Republic of China
6
Lu Y, Xu K, Maydanchik N, Kang B, Pierce BL, Yang F, Chen LS. An integrative multi-context Mendelian randomization method for identifying risk genes across human tissues. Am J Hum Genet 2024; 111:1736-1749. [PMID: 39053459 PMCID: PMC11339623 DOI: 10.1016/j.ajhg.2024.06.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 06/11/2024] [Accepted: 06/24/2024] [Indexed: 07/27/2024] Open
Abstract
Mendelian randomization (MR) provides valuable assessments of the causal effect of exposure on outcome, yet the application of conventional MR methods for mapping risk genes encounters new challenges. One of the issues is the limited availability of expression quantitative trait loci (eQTLs) as instrumental variables (IVs), hampering the estimation of sparse causal effects. Additionally, the often context- or tissue-specific eQTL effects challenge the MR assumption of consistent IV effects across eQTL and GWAS data. To address these challenges, we propose a multi-context multivariable integrative MR framework, mintMR, for mapping expression and molecular traits as joint exposures. It models the effects of molecular exposures across multiple tissues in each gene region, while simultaneously estimating across multiple gene regions. It uses eQTLs with consistent effects across more than one tissue type as IVs, improving IV consistency. A major innovation of mintMR involves employing multi-view learning methods to collectively model latent indicators of disease relevance across multiple tissues, molecular traits, and gene regions. The multi-view learning captures the major patterns of disease relevance and uses these patterns to update the estimated tissue relevance probabilities. The proposed mintMR iterates between performing a multi-tissue MR for each gene region and joint learning the disease-relevant tissue probabilities across gene regions, improving the estimation of sparse effects across genes. We apply mintMR to evaluate the causal effects of gene expression and DNA methylation for 35 complex traits using multi-tissue QTLs as IVs. The proposed mintMR controls genome-wide inflation and offers insights into disease mechanisms.
Affiliation(s)
- Yihao Lu
  - Department of Public Health Sciences, The University of Chicago, Chicago, IL, USA
- Ke Xu
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA
- Nathaniel Maydanchik
  - Department of Public Health Sciences, The University of Chicago, Chicago, IL, USA
- Bowei Kang
  - Department of Public Health Sciences, The University of Chicago, Chicago, IL, USA
- Brandon L Pierce
  - Department of Public Health Sciences, The University of Chicago, Chicago, IL, USA
- Fan Yang
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing, China; Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, China
- Lin S Chen
  - Department of Public Health Sciences, The University of Chicago, Chicago, IL, USA
7
Meng Z, Liu S, Liang S, Jani B, Meng Z. Heterogeneous biomedical entity representation learning for gene-disease association prediction. Brief Bioinform 2024; 25:bbae380. [PMID: 39154194 PMCID: PMC11330343 DOI: 10.1093/bib/bbae380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 05/29/2024] [Accepted: 07/22/2024] [Indexed: 08/19/2024] Open
Abstract
Understanding the genetic basis of disease is a fundamental aspect of medical research, as genes are the classic units of heredity and play a crucial role in biological function. Identifying associations between genes and diseases is critical for diagnosis, prevention, prognosis, and drug development. Genes that encode proteins with similar sequences are often implicated in related diseases, as proteins causing identical or similar diseases tend to show limited variation in their sequences. Predicting gene-disease association (GDA) requires time-consuming and expensive experiments on a large number of potential candidate genes. Although methods have been proposed to predict associations between genes and diseases using traditional machine learning algorithms and graph neural networks, these approaches struggle to capture the deep semantic information within the genes and diseases and are dependent on training data. To alleviate this issue, we propose a novel GDA prediction model named FusionGDA, which utilizes a pre-training phase with a fusion module to enrich the gene and disease semantic representations encoded by pre-trained language models. Multi-modal representations are generated by the fusion module, which includes rich semantic information about two heterogeneous biomedical entities: protein sequences and disease descriptions. Subsequently, the pooling aggregation strategy is adopted to compress the dimensions of the multi-modal representation. In addition, FusionGDA employs a pre-training phase leveraging a contrastive learning loss to extract potential gene and disease features by training on a large public GDA dataset. To rigorously evaluate the effectiveness of the FusionGDA model, we conduct comprehensive experiments on five datasets and compare our proposed model with five competitive baseline models on the DisGeNet-Eval dataset. Notably, our case study further demonstrates the ability of FusionGDA to discover hidden associations effectively. 
The complete code and datasets of our experiments are available at https://github.com/ZhaohanM/FusionGDA.
Affiliation(s)
- Zhaohan Meng
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
- Siwei Liu
  - School of Natural and Computing Science, University of Aberdeen King’s College, Aberdeen, AB24 3FX, UK
- Shangsong Liang
  - Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, Building 1B, Masdar City, Abu Dhabi 000000, UAE
- Bhautesh Jani
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
- Zaiqiao Meng
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
8
Lee PC, Jung IH, Thussu S, Patel V, Wagoner R, Burks KH, Amrute J, Elenbaas JS, Kang CJ, Young EP, Scherer PE, Stitziel NO. Instrumental variable and colocalization analyses identify endotrophin and HTRA1 as potential therapeutic targets for coronary artery disease. iScience 2024; 27:110104. [PMID: 38989470 PMCID: PMC11233907 DOI: 10.1016/j.isci.2024.110104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 03/26/2024] [Accepted: 05/22/2024] [Indexed: 07/12/2024] Open
Abstract
Coronary artery disease (CAD) remains a leading cause of disease burden globally, and there is a persistent need for new therapeutic targets. Instrumental variable (IV) and genetic colocalization analyses can help identify novel therapeutic targets for human disease by nominating causal genes in genome-wide association study (GWAS) loci. We conducted cis-IV analyses for 20,125 genes and 1,746 plasma proteins with CAD using molecular quantitative trait locus (QTL) variant data from three different studies. Nineteen proteins and 119 genes were significantly associated with CAD risk by IV analyses and demonstrated evidence of genetic colocalization. Notably, our analyses validated well-established targets such as PCSK9 and ANGPTL4 while also identifying HTRA1 and endotrophin (a cleavage product of COL6A3) as proteins whose levels are causally associated with CAD risk. Further experimental studies are needed to confirm the causal role of the genes and proteins identified through our multiomic cis-IV analyses on human disease.
Affiliation(s)
- Paul C. Lee
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- In-Hyuk Jung
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Shreeya Thussu
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Ved Patel
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Ryan Wagoner
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Kendall H. Burks
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Junedh Amrute
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Jared S. Elenbaas
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Chul Joo Kang
  - McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO 63108, USA
- Erica P. Young
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
  - McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO 63108, USA
- Philipp E. Scherer
  - Touchstone Diabetes Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Nathan O. Stitziel
  - Center for Cardiovascular Research, Division of Cardiology, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA
  - McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO 63108, USA
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
9
Caudal É, Loegler V, Dutreux F, Vakirlis N, Teyssonnière É, Caradec C, Friedrich A, Hou J, Schacherer J. Pan-transcriptome reveals a large accessory genome contribution to gene expression variation in yeast. Nat Genet 2024; 56:1278-1287. [PMID: 38778243 PMCID: PMC11176082 DOI: 10.1038/s41588-024-01769-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 04/24/2024] [Indexed: 05/25/2024]
Abstract
Gene expression is an essential step in the translation of genotypes into phenotypes. However, little is known about the transcriptome architecture and the underlying genetic effects at the species level. Here we generated and analyzed the pan-transcriptome of ~1,000 yeast natural isolates across 4,977 core and 1,468 accessory genes. We found that the accessory genome is an underappreciated driver of transcriptome divergence. Global gene expression patterns combined with population structure showed that variation in heritable expression mainly lies within subpopulation-specific signatures, for which accessory genes are overrepresented. Genome-wide association analyses consistently highlighted that accessory genes are associated with proportionally more variants with larger effect sizes, illustrating the critical role of the accessory genome on the transcriptional landscape within and between populations.
Affiliation(s)
- Élodie Caudal
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
- Victor Loegler
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
- Fabien Dutreux
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
- Claudia Caradec
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
- Anne Friedrich
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
- Jing Hou
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
- Joseph Schacherer
  - Université de Strasbourg, CNRS GMGM UMR 7156, Strasbourg, France
  - Institut Universitaire de France (IUF), Paris, France
10
Kakoulidou I, Piecyk RS, Meyer RC, Kuhlmann M, Gutjahr C, Altmann T, Johannes F. Mapping parental DMRs predictive of local and distal methylome remodeling in epigenetic F1 hybrids. Life Sci Alliance 2024; 7:e202402599. [PMID: 38290756 PMCID: PMC10828516 DOI: 10.26508/lsa.202402599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 01/21/2024] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
F1 hybrids derived from a cross between two inbred parental lines often display widespread changes in DNA methylation and gene expression patterns relative to their parents. An emerging challenge is to understand how parental epigenomic differences contribute to these events. Here, we generated a large mapping panel of F1 epigenetic hybrids, whose parents are isogenic but variable in their DNA methylation patterns. Using a combination of multi-omic profiling and epigenetic mapping strategies we show that differentially methylated regions in parental pericentromeres act as major reorganizers of hybrid methylomes and transcriptomes, even in the absence of genetic variation. These parental differentially methylated regions are associated with hybrid methylation remodeling events at thousands of target regions throughout the genome, both locally (in cis) and distally (in trans). Many of these distally-induced methylation changes lead to nonadditive expression of nearby genes and associate with phenotypic heterosis. Our study highlights the pleiotropic potential of parental pericentromeres in the functional remodeling of hybrid genomes and phenotypes.
Affiliation(s)
- Ioanna Kakoulidou
  - Plant Epigenomics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Robert S Piecyk
  - Plant Epigenomics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Rhonda C Meyer
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Markus Kuhlmann
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Caroline Gutjahr
  - Plant Genetics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Thomas Altmann
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Frank Johannes
  - Plant Epigenomics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
  - Institute of Advanced Studies, Technical University of Munich, Munich, Germany
11
Barra J, Taverna F, Bong F, Ahmed I, Karakach TK. Error modelled gene expression analysis (EMOGEA) provides a superior overview of time course RNA-seq measurements and low count gene expression. Brief Bioinform 2024; 25:bbae233. [PMID: 38770716 PMCID: PMC11106635 DOI: 10.1093/bib/bbae233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 04/03/2024] [Accepted: 04/30/2024] [Indexed: 05/22/2024] Open
Abstract
Temporal RNA-sequencing (RNA-seq) studies of bulk samples provide an opportunity for improved understanding of gene regulation during dynamic phenomena such as development, tumor progression or response to an incremental dose of a pharmacotherapeutic. Moreover, single-cell RNA-seq (scRNA-seq) data implicitly exhibit temporal characteristics because gene expression values recapitulate dynamic processes such as cellular transitions. Unfortunately, temporal RNA-seq data continue to be analyzed by methods that ignore this ordinal structure and yield results that are often difficult to interpret. Here, we present Error Modelled Gene Expression Analysis (EMOGEA), a framework for analyzing RNA-seq data that incorporates measurement uncertainty, while introducing a special formulation for those acquired to monitor dynamic phenomena. This method is specifically suited for RNA-seq studies in which low-count transcripts with small-fold changes lead to significant biological effects. Such transcripts include genes involved in signaling and non-coding RNAs that inherently exhibit low levels of expression. Using simulation studies, we show that this framework down-weights samples that exhibit extreme responses such as batch effects allowing them to be modeled with the rest of the samples and maintain the degrees of freedom originally envisioned for a study. Using temporal experimental data, we demonstrate the framework by extracting a cascade of gene expression waves from a well-designed RNA-seq study of zebrafish embryogenesis and an scRNA-seq study of mouse pre-implantation and provide unique biological insights into the regulation of genes in each wave. For non-ordinal measurements, we show that EMOGEA has a much higher rate of true positive calls and a vanishingly small rate of false negative discoveries compared to common approaches. Finally, we provide two packages in Python and R that are self-contained and easy to use, including test data.
Affiliation(s)
- Jasmine Barra
  - Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada
  - Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada
  - Department of Microbiology & Immunology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada
- Federico Taverna
  - Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada
  - Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada
- Fabian Bong
  - Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada
  - Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada
- Ibrahim Ahmed
  - Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada
  - Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada
- Tobias K Karakach
  - Laboratory of Integrative Multi-Omics Research, Department of Pharmacology, Dalhousie University, 5850 College Street, Halifax, NS, B3H 4R2, Canada
  - Beatrice Hunter Cancer Research Institute, 5743 University Avenue, Suite 98, Halifax, NS, B3H 0A2, Canada
|
12
|
Lu Y, Xu K, Kang B, Pierce BL, Yang F, Chen LS. An integrative multi-context Mendelian randomization method for identifying risk genes across human tissues. medRxiv 2024:2024.03.04.24303731. [PMID: 38496462 PMCID: PMC10942526 DOI: 10.1101/2024.03.04.24303731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Mendelian randomization (MR) provides valuable assessments of the causal effect of exposure on outcome, yet the application of conventional MR methods for mapping risk genes encounters new challenges. One of the issues is the limited availability of expression quantitative trait loci (eQTLs) as instrumental variables (IVs), hampering the estimation of sparse causal effects. Additionally, the often context/tissue-specific eQTL effects challenge the MR assumption of consistent IV effects across eQTL and GWAS data. To address these challenges, we propose a multi-context multivariable integrative MR framework, mintMR, for mapping expression and molecular traits as joint exposures. It models the effects of molecular exposures across multiple tissues in each gene region, while simultaneously estimating across multiple gene regions. It uses eQTLs with consistent effects across more than one tissue type as IVs, improving IV consistency. A major innovation of mintMR involves employing multi-view learning methods to collectively model latent indicators of disease relevance across multiple tissues, molecular traits, and gene regions. The multi-view learning captures the major patterns of disease relevance and uses these patterns to update the estimated tissue relevance probabilities. The proposed mintMR iterates between performing a multi-tissue MR for each gene region and jointly learning the disease-relevant tissue probabilities across gene regions, improving the estimation of sparse effects across genes. We apply mintMR to evaluate the causal effects of gene expression and DNA methylation for 35 complex traits using multi-tissue QTLs as IVs. The proposed mintMR controls genome-wide inflation and offers new insights into disease mechanisms.
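For orientation, the "conventional MR" that this abstract contrasts with can be illustrated by the standard fixed-effect inverse-variance weighted (IVW) estimator. The sketch below is not mintMR and is not taken from the paper; the function name `ivw_estimate` and the toy summary statistics are hypothetical, chosen only to show how per-variant eQTL and GWAS effects might be combined into a single causal estimate.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect of exposure on outcome.

    beta_exposure: per-variant effects on the exposure (e.g. eQTL effects)
    beta_outcome:  per-variant effects on the outcome (e.g. GWAS effects)
    se_outcome:    standard errors of the outcome effects
    Returns (estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    # Weighted regression of outcome effects on exposure effects through the origin,
    # with weights 1 / se_outcome^2.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    if denominator == 0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Hypothetical summary statistics for two instruments (illustration only).
    estimate, std_err = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
    print(f"IVW estimate: {estimate:.3f} (SE {std_err:.3f})")
```

With few instruments per gene, the denominator is dominated by one or two variants, which is one way to see why the abstract describes limited eQTL availability as a problem for this kind of estimator.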
|
13
|
Xu M, Abdullah NA, Md Sabri AQ. A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data. Comput Biol Chem 2024; 108:107997. [PMID: 38154318 DOI: 10.1016/j.compbiolchem.2023.107997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Revised: 11/03/2023] [Accepted: 12/03/2023] [Indexed: 12/30/2023]
Abstract
This work focuses on data sampling in cancer-gene association prediction. Currently, researchers are using machine learning methods to predict genes that are more likely to produce cancer-causing mutations. Several methods have been proposed to improve the performance of machine learning models, one of which is to improve the quality of the training data. Existing methods focus mainly on screening positive data, i.e. cancer driver genes. This paper proposes a method, based on gene network and graph theory algorithms, for screening genes with low cancer relevance to improve the selection of negative samples. Genetic data with low cancer correlation is used as negative training samples. Experimental verification shows that training the cancer gene classification model with negative samples screened by this method can improve prediction performance. The biggest advantage of this method is that it can be easily combined with other methods that focus on enhancing the quality of positive training samples. Significant improvement is demonstrated by combining this method with three state-of-the-art cancer gene prediction methods.
Affiliation(s)
- Mingzhe Xu
  - Faculty of Computer Science & Information Technology, Universiti Malaya, Kuala Lumpur, 50603 Malaysia; School of Energy and Intelligence Engineering, Henan University of Animal Husbandry and Economy, #6 North Longzihu Rd, Zhengzhou 450000, China.
- Nor Aniza Abdullah
  - Faculty of Computer Science & Information Technology, Universiti Malaya, Kuala Lumpur, 50603 Malaysia.
- Aznul Qalid Md Sabri
  - Faculty of Computer Science & Information Technology, Universiti Malaya, Kuala Lumpur, 50603 Malaysia.
|
14
|
Pathak RK, Kim JM. Veterinary systems biology for bridging the phenotype-genotype gap via computational modeling for disease epidemiology and animal welfare. Brief Bioinform 2024; 25:bbae025. [PMID: 38343323 PMCID: PMC10859662 DOI: 10.1093/bib/bbae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 01/02/2024] [Accepted: 01/15/2024] [Indexed: 02/15/2024] [Open Access]
Abstract
Veterinary systems biology is an innovative approach that integrates biological data at the molecular and cellular levels, allowing for a more extensive understanding of the interactions and functions of complex biological systems in livestock and veterinary science. It has tremendous potential to integrate multi-omics data with the support of vetinformatics resources for bridging the phenotype-genotype gap via computational modeling. Computational models are frequently used to understand the dynamic behaviors of complex systems. Such models facilitate a comprehensive understanding of how a host system defends itself against a pathogen attack or operates when the pathogen compromises the host's immune system. In this context, various approaches, such as systems immunology, network pharmacology, vaccinology and immunoinformatics, can be employed to effectively investigate vaccines and drugs. Applying these approaches can help safeguard the health of livestock, which benefits not only animal welfare but also human health and environmental well-being. Therefore, the current review offers a detailed summary of systems biology advancements utilized in veterinary sciences, demonstrating the potential of the holistic approach in disease epidemiology, animal welfare and productivity.
Affiliation(s)
- Rajesh Kumar Pathak
  - Department of Animal Science and Technology, Chung-Ang University, Anseong-si, Gyeonggi-do 17546, Republic of Korea
- Jun-Mo Kim
  - Department of Animal Science and Technology, Chung-Ang University, Anseong-si, Gyeonggi-do 17546, Republic of Korea
|
15
|
Tsouris A, Brach G, Schacherer J, Hou J. Non-additive genetic components contribute significantly to population-wide gene expression variation. Cell Genomics 2024; 4:100459. [PMID: 38190102 PMCID: PMC10794783 DOI: 10.1016/j.xgen.2023.100459] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/31/2023] [Revised: 09/19/2023] [Accepted: 11/09/2023] [Indexed: 01/09/2024]
Abstract
Gene expression variation, an essential step between genotype and phenotype, is collectively controlled by local (cis) and distant (trans) regulatory changes. Nevertheless, how these regulatory elements differentially influence gene expression variation remains unclear. Here, we bridge this gap by analyzing the transcriptomes of a large diallel panel consisting of 323 unique hybrids originating from genetically divergent Saccharomyces cerevisiae isolates. Our analysis across 5,087 transcript abundance traits showed that non-additive components account for 36% of the gene expression variance on average. By comparing allele-specific read counts in parent-hybrid trios, we found that trans-regulatory changes underlie the majority of gene expression variation in the population. Remarkably, most cis-regulatory variations are also exaggerated or attenuated by additional trans effects. Overall, we showed that the transcriptome is globally buffered at the genetic level mainly due to trans-regulatory variation in the population.
Affiliation(s)
- Andreas Tsouris
  - Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
- Gauthier Brach
  - Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
- Joseph Schacherer
  - Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France; Institut Universitaire de France (IUF), Paris, France.
- Jing Hou
  - Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France.
|
16
|
Chu J, Liu W, Hu X, Zhang H, Jiang J. P2RY13 is a prognostic biomarker and associated with immune infiltrates in renal clear cell carcinoma: A comprehensive bioinformatic study. Health Sci Rep 2023; 6:e1646. [PMID: 38045624 PMCID: PMC10691167 DOI: 10.1002/hsr2.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 09/03/2023] [Accepted: 10/10/2023] [Indexed: 12/05/2023] [Open Access]
Abstract
Background and Aims: Clear cell renal cell carcinoma (ccRCC) is a common and aggressive form of cancer with a high incidence globally. This study aimed to investigate the role of P2RY13 in the progression of ccRCC and elucidate its mechanism of action. Methods: Gene Expression Omnibus and The Cancer Genome Atlas databases were used to extract gene expression profiles of ccRCC. These profiles were annotated and visualized by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses, as well as Gene Set Enrichment Analysis (GSEA). The STRING database was used to establish a protein-protein interaction network and to analyze the functional similarity. The GEPIA2 database was used to predict survival associated with hub genes. Meanwhile, the TIMER2.0 database was used to assess immune cell infiltration and its link with the hub genes. Immunohistochemistry (IHC) was used to determine the difference between ccRCC and adjacent normal tissue. Results: We identified 272 differentially expressed genes (DEGs). GO and KEGG analyses suggested that DEGs were primarily involved in lymphocyte activation, inflammatory response, and immunological effector mechanism pathways. Using cytoHubba, the 20 highest-scoring hub genes were screened to identify critical genes in the protein-protein interaction network linked with ccRCC. Resting dendritic cells, CD8 T cells, and activated mast cells all showed a significant positive correlation with these hub genes. Moreover, a higher immune score was associated with increased prognostic risk scores, which in turn correlated with a poorer prognosis. IHC revealed that P2RY13 was expressed at higher levels in ccRCC compared to para-cancer tissues. Conclusion: Identifying the DEGs will aid in the understanding of the causes and molecular mechanisms involved in ccRCC. P2RY13 may play a pivotal role in the progression and prognosis of ccRCC, potentially driving carcinogenesis through immune system mechanisms.
Affiliation(s)
- Jie Chu
  - Department of Oncology, The First People's Hospital of Ziyang, Ziyang, China
- Wei Liu
  - Department of General Family Medicine, The First People's Hospital of NeiJiang, NeiJiang, China
- Xinyue Hu
  - Department of Clinical Laboratory, Kunming First People's Hospital, Kunming Medical University, Kunming, China
- Huiling Zhang
  - Department of Oncology, The First People's Hospital of Ziyang, Ziyang, China
- Jiudong Jiang
  - Department of Surgery, The First People's Hospital of ZiYang, Ziyang, China
|
17
|
Mozhui K, Kim H, Villani F, Haghani A, Sen S, Horvath S. Pleiotropic influence of DNA methylation QTLs on physiological and ageing traits. Epigenetics 2023; 18:2252631. [PMID: 37691384 PMCID: PMC10496549 DOI: 10.1080/15592294.2023.2252631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Revised: 07/31/2023] [Accepted: 08/16/2023] [Indexed: 09/12/2023] [Open Access]
Abstract
DNA methylation is influenced by genetic and non-genetic factors. Here, we chart quantitative trait loci (QTLs) that modulate levels of methylation at highly conserved CpGs using liver methylome data from mouse strains belonging to the BXD family. A regulatory hotspot on chromosome 5 had the highest density of trans-acting methylation QTLs (trans-meQTLs) associated with multiple distant CpGs. We refer to this locus as meQTL.5a. Trans-modulated CpGs showed age-dependent changes and were enriched in developmental genes, including several members of the MODY pathway (maturity onset diabetes of the young). The joint modulation by genotype and ageing resulted in a more 'aged methylome' for BXD strains that inherited the DBA/2J parental allele at meQTL.5a. Further, several gene expression traits, body weight, and lipid levels mapped to meQTL.5a, and there was a modest linkage with lifespan. DNA binding motif and protein-protein interaction enrichment analyses identified the hepatic nuclear factor, Hnf1a (MODY3 gene in humans), as a strong candidate. The pleiotropic effects of meQTL.5a could contribute to variations in body size and metabolic traits, and influence CpG methylation and epigenetic ageing that could have an impact on lifespan.
Affiliation(s)
- Khyobeni Mozhui
  - Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Hyeonju Kim
  - Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
- Flavia Villani
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Amin Haghani
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Altos Labs, San Diego, CA, USA
- Saunak Sen
  - Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
- Steve Horvath
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Altos Labs, San Diego, CA, USA
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA, USA
|
18
|
Allayee H, Farber CR, Seldin MM, Williams EG, James DE, Lusis AJ. Systems genetics approaches for understanding complex traits with relevance for human disease. eLife 2023; 12:e91004. [PMID: 37962168 PMCID: PMC10645424 DOI: 10.7554/elife.91004] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/14/2023] [Accepted: 10/16/2023] [Indexed: 11/15/2023] [Open Access]
Abstract
Quantitative traits are often complex because of the contribution of many loci, with further complexity added by environmental factors. In medical research, systems genetics is a powerful approach for the study of complex traits, as it integrates intermediate phenotypes, such as RNA, protein, and metabolite levels, to understand molecular and physiological phenotypes linking discrete DNA sequence variation to complex clinical and physiological traits. The primary purpose of this review is to describe some of the resources and tools of systems genetics in humans and rodent models, so that researchers in many areas of biology and medicine can make use of the data.
Affiliation(s)
- Hooman Allayee
  - Departments of Population & Public Health Sciences, University of Southern California, Los Angeles, United States
  - Biochemistry & Molecular Medicine, Keck School of Medicine, University of Southern California, Los Angeles, United States
- Charles R Farber
  - Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, United States
  - Departments of Biochemistry & Molecular Genetics, University of Virginia School of Medicine, Charlottesville, United States
  - Public Health Sciences, University of Virginia School of Medicine, Charlottesville, United States
- Marcus M Seldin
  - Department of Biological Chemistry, University of California, Irvine, Irvine, United States
- Evan Graehl Williams
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg, Luxembourg
- David E James
  - School of Life and Environmental Sciences, University of Sydney, Camperdown, Australia
  - Faculty of Medicine and Health, University of Sydney, Camperdown, Australia
  - Charles Perkins Centre, University of Sydney, Camperdown, Australia
- Aldons J Lusis
  - Departments of Human Genetics, University of California, Los Angeles, Los Angeles, United States
  - Medicine, University of California, Los Angeles, Los Angeles, United States
  - Microbiology, Immunology, & Molecular Genetics, David Geffen School of Medicine of UCLA, Los Angeles, United States
|
19
|
Zhu Z, Chen X, Zhang S, Yu R, Qi C, Cheng L, Zhang X. Leveraging molecular quantitative trait loci to comprehend complex diseases/traits from the omics perspective. Hum Genet 2023; 142:1543-1560. [PMID: 37755483 DOI: 10.1007/s00439-023-02602-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/15/2023] [Accepted: 09/14/2023] [Indexed: 09/28/2023]
Abstract
Comprehending the molecular basis of quantitative genetic variation is a principal goal for complex diseases or traits. Molecular quantitative trait loci (molQTLs) have made it possible to investigate the effects of genetic variants hiding behind large-scale omics data. A deeper understanding of molQTL is urgently required in light of the multi-dimensionalization of omics data to more fully elucidate the pertinent biological mechanisms. Herein, we reviewed molQTLs with the corresponding resource from the omics perspective and further discussed the integrative strategy of GWAS-molQTL to infer their causal effects. Subsequently, we described the opportunities and challenges encountered by molQTL. The case studies showed that molQTL is essential for complex diseases and traits, whether single- or multi-omics QTLs. Overall, we highlighted the functional significance of genetic variants to employ the discovery of molQTL in complex diseases and traits.
Affiliation(s)
- Zijun Zhu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Xinyu Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Sainan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Rui Yu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Changlu Qi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China.
  - NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China.
- Xue Zhang
  - NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China
  - McKusick-Zhang Center for Genetic Medicine, State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005, China
|
20
|
Nguyen TM, Craig DB, Tran D, Nguyen T, Draghici S. A novel approach for predicting upstream regulators (PURE) that affect gene expression. Sci Rep 2023; 13:18571. [PMID: 37903768 PMCID: PMC10616115 DOI: 10.1038/s41598-023-41374-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 08/25/2023] [Indexed: 11/01/2023] [Open Access]
Abstract
External factors such as exposure to a chemical, drug, or toxicant (CDT), or conversely, the lack of certain chemicals can cause many diseases. The ability to identify such causal CDTs based on changes in the gene expression profile is extremely important in many studies. Furthermore, the ability to correctly infer CDTs that can revert the gene expression changes induced by a given disease phenotype is a crucial step in drug repurposing. We present an approach for Predicting Upstream REgulators (PURE) designed to tackle this challenge. PURE can correctly infer a CDT from the measured expression changes in a given phenotype, as well as correctly identify drugs that could revert disease-induced gene expression changes. We compared the proposed approach with four classical approaches as well as with the causal analysis used in Ingenuity Pathway Analysis (IPA) on 16 data sets (1 rat, 5 mouse, and 10 human data sets), involving 8 chemicals or drugs. We assessed the results based on the ability to correctly identify the CDT as indicated by its rank. We also considered the number of false positives, i.e. CDTs other than the correct CDT that were reported to be significant by each method. The proposed approach performed best in 11 out of the 16 experiments, reporting the correct CDT at the very top 7 times. IPA was the second best, reporting the correct CDT at the top 5 times, but was unable to identify the correct CDT at all in 5 out of the 16 experiments. The validation results showed that our approach, PURE, outperformed some of the most popular methods in the field. PURE could effectively infer the true CDTs responsible for the observed gene expression changes and could also be useful in drug repurposing applications.
Affiliation(s)
- Tuan-Minh Nguyen
  - Department of Computer Science, Wayne State University, Detroit, 48202, USA
- Douglas B Craig
  - Department of Computer Science, Wayne State University, Detroit, 48202, USA
  - Department of Oncology, School of Medicine, Wayne State University, Detroit, MI, 48201, USA
- Duc Tran
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, 36849, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, 48202, USA.
  - Advaita Bioinformatics, Ann Arbor, MI, 48105, USA.
|
21
|
Michoel T, Zhang JD. Causal inference in drug discovery and development. Drug Discov Today 2023; 28:103737. [PMID: 37591410 DOI: 10.1016/j.drudis.2023.103737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Revised: 07/31/2023] [Accepted: 08/10/2023] [Indexed: 08/19/2023]
Abstract
To discover new drugs is to seek and to prove causality. As an emerging approach leveraging human knowledge and creativity, data, and machine intelligence, causal inference holds the promise of reducing cognitive bias and improving decision-making in drug discovery. Although it has been applied across the value chain, the concepts and practice of causal inference remain obscure to many practitioners. This article offers a nontechnical introduction to causal inference, reviews its recent applications, and discusses opportunities and challenges of adopting the causal language in drug discovery and development.
Affiliation(s)
- Tom Michoel
  - Computational Biology Unit, Department of Informatics, University of Bergen, Postboks 7803, 5020 Bergen, Norway
- Jitao David Zhang
  - Pharma Early Research and Development, Roche Innovation Centre Basel, F. Hoffmann-La Roche, Grenzacherstrasse 124, 4070 Basel, Switzerland; Department of Mathematics and Computer Science, University of Basel, Spiegelgasse 1, 4051 Basel, Switzerland.
|
22
|
Brown AA, Fernandez-Tajes JJ, Hong MG, Brorsson CA, Koivula RW, Davtian D, Dupuis T, Sartori A, Michalettou TD, Forgie IM, Adam J, Allin KH, Caiazzo R, Cederberg H, De Masi F, Elders PJM, Giordano GN, Haid M, Hansen T, Hansen TH, Hattersley AT, Heggie AJ, Howald C, Jones AG, Kokkola T, Laakso M, Mahajan A, Mari A, McDonald TJ, McEvoy D, Mourby M, Musholt PB, Nilsson B, Pattou F, Penet D, Raverdy V, Ridderstråle M, Romano L, Rutters F, Sharma S, Teare H, 't Hart L, Tsirigos KD, Vangipurapu J, Vestergaard H, Brunak S, Franks PW, Frost G, Grallert H, Jablonka B, McCarthy MI, Pavo I, Pedersen O, Ruetten H, Walker M, Adamski J, Schwenk JM, Pearson ER, Dermitzakis ET, Viñuela A. Genetic analysis of blood molecular phenotypes reveals common properties in the regulatory networks affecting complex traits. Nat Commun 2023; 14:5062. [PMID: 37604891 PMCID: PMC10442420 DOI: 10.1038/s41467-023-40569-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2021] [Accepted: 08/02/2023] [Indexed: 08/23/2023] [Open Access]
Abstract
We evaluate the shared genetic regulation of mRNA molecules, proteins and metabolites derived from whole blood from 3029 human donors. We find abundant allelic heterogeneity, where multiple variants regulate a particular molecular phenotype, and pleiotropy, where a single variant associates with multiple molecular phenotypes over multiple genomic regions. The highest proportion of shared genetic regulation is detected between gene expression and proteins (66.6%), with a further median shared genetic associations across 49 different tissues of 78.3% and 62.4% between plasma proteins and gene expression. We represent the genetic and molecular associations in networks including 2828 known GWAS variants, showing that GWAS variants are more often connected to gene expression in trans than other molecular phenotypes in the network. Our work provides a roadmap to understanding molecular networks and deriving the underlying mechanism of action of GWAS variants using different molecular phenotypes in an accessible tissue.
Affiliation(s)
- Andrew A Brown
  - Population Health and Genomics, Ninewells Hospital and Medical School, University of Dundee, Dundee, DD1 9SY, United Kingdom
- Juan J Fernandez-Tajes
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, United Kingdom
- Mun-Gwan Hong
  - Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, Solna, SE-171 21, Sweden
- Caroline A Brorsson
  - Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Robert W Koivula
  - Oxford Centre for Diabetes Endocrinology and Metabolism, University of Oxford, Oxford, OX3 7LJ, United Kingdom
- David Davtian
  - Population Health and Genomics, Ninewells Hospital and Medical School, University of Dundee, Dundee, DD1 9SY, United Kingdom
- Théo Dupuis
  - Population Health and Genomics, Ninewells Hospital and Medical School, University of Dundee, Dundee, DD1 9SY, United Kingdom
- Ambra Sartori
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, 1211, Switzerland
  - Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, 1211, Switzerland
  - Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
- Theodora-Dafni Michalettou
  - Biosciences Institute, Faculty of Medical Sciences, University of Newcastle, Newcastle upon Tyne, NE1 4EP, United Kingdom
- Ian M Forgie
  - Population Health and Genomics, Ninewells Hospital and Medical School, University of Dundee, Dundee, DD1 9SY, United Kingdom
- Jonathan Adam
  - German Center for Diabetes Research (DZD), Neuherberg, 85764, Germany
  - Research Unit of Molecular Epidemiology, Institute of Epidemiology, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, 85764, Germany
- Kristine H Allin
  - The Novo Nordisk Center for Basic Metabolic Research, Faculty of Health and Medical Science, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Robert Caiazzo
  - University of Lille, Inserm, Lille Pasteur Institute, Lille, France
- Henna Cederberg
  - Internal Medicine, Institute of Clinical Medicine, University of Eastern Finland, Kuopio, Finland
- Federico De Masi
  - Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Petra J M Elders
  - Department of General Practice, Amsterdam UMC- location Vumc, Amsterdam Public Health research institute, Amsterdam, The Netherlands
- Giuseppe N Giordano
  - Department of Clinical Science, Genetic and Molecular Epidemiology, Lund University Diabetes Centre, Malmö, Sweden
- Mark Haid
  - Metabolomics and Proteomics Core, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, 85764, Germany
- Torben Hansen
  - The Novo Nordisk Center for Basic Metabolic Research, Faculty of Health and Medical Science, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Tue H Hansen
  - The Novo Nordisk Center for Basic Metabolic Research, Faculty of Health and Medical Science, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Andrew T Hattersley
  - Department of Clinical and Biomedical Sciences, University of Exeter College of Medicine & Health, Exeter, EX25DW, United Kingdom
- Alison J Heggie
  - Institute of Cellular Medicine, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Cédric Howald
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, 1211, Switzerland
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, 1211, Switzerland
- Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
| | - Angus G Jones
- Department of Clinical and Biomedical Sciences, University of Exeter College of Medicine & Health, Exeter, EX25DW, United Kingdom
| | - Tarja Kokkola
- Internal Medicine, Institute of Clinical Medicine, University of Eastern Finland, Kuopio, Finland
| | - Markku Laakso
- Internal Medicine, Institute of Clinical Medicine, University of Eastern Finland, Kuopio, Finland
| | - Anubha Mahajan
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, United Kingdom
| | - Andrea Mari
- Institute of Neuroscience, National Research Council, Padova, 35127, Italy
| | - Timothy J McDonald
- Blood Sciences, Royal Devon and Exeter NHS Foundation Trust, Exeter, EX2 5DW, United Kingdom
| | - Donna McEvoy
- Diabetes Research Network, Royal Victoria Infirmary, Newcastle upon Tyne, United Kingdom
| | - Miranda Mourby
- Nuffield Department of Population Health, Centre for Health, Law and Emerging Technologies (HeLEX), University of Oxford, Oxford, OX2 7DD, United Kingdom
| | - Petra B Musholt
- Global Development, Sanofi-Aventis Deutschland GmbH, Hoechst Industrial Park, Frankfurt am Main, 65926, Germany
| | - Birgitte Nilsson
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Francois Pattou
- University of Lille, Inserm, Lille Pasteur Institute, Lille, France
| | - Deborah Penet
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, 1211, Switzerland
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, 1211, Switzerland
- Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
| | - Violeta Raverdy
- University of Lille, Inserm, Lille Pasteur Institute, Lille, France
| | | | - Luciana Romano
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, 1211, Switzerland
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, 1211, Switzerland
- Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
| | - Femke Rutters
- Epidemiology and Data Science, VUMC, Amsterdam, The Netherlands
| | - Sapna Sharma
- Research Unit of Molecular Epidemiology, Institute of Epidemiology, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, 85764, Germany
- Food Chemistry and Molecular and Sensory Science, Technical University of Munich, München, Germany
| | - Harriet Teare
- Centre for Health Law and Emerging Technologies, Department of Population Health, University of Oxford, Old Road Campus, Oxford, OX3 7DQ, United Kingdom
| | - Leen 't Hart
- Epidemiology and Data Science, VUMC, Amsterdam, The Netherlands
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, The Netherlands
- Department of Biomedical Data Sciences, Molecular Epidemiology section, Leiden University Medical Center, Leiden, The Netherlands
| | | | - Jagadish Vangipurapu
- Internal Medicine, Institute of Clinical Medicine, University of Eastern Finland, Kuopio, Finland
| | - Henrik Vestergaard
- The Novo Nordisk Center for Basic Metabolic Research, Faculty of Health and Medical Science, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Steno Diabetes Center Copenhagen, Copenhagen, Denmark
| | - Søren Brunak
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, DK-2100, Denmark
| | - Paul W Franks
- Department of Clinical Science, Genetic and Molecular Epidemiology, Lund University Diabetes Centre, Malmö, Sweden
| | - Gary Frost
- Nutrition and Dietetics Research Group, Imperial College London, London, SW7 2AZ, United Kingdom
| | - Harald Grallert
- German Center for Diabetes Research (DZD), Neuherberg, 85764, Germany
- Research Unit of Molecular Epidemiology, Institute of Epidemiology, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, 85764, Germany
| | - Bernd Jablonka
- Sanofi Partnering, Sanofi-Aventis Deutschland GmbH, Frankfurt am Main, 65926, Germany
| | - Mark I McCarthy
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, United Kingdom
- GENENTECH, 1 DNA Way, San Francisco, CA, 94080, USA
| | - Imre Pavo
- Eli Lilly Regional Operations Ges.m.b.H, Vienna, 1030, Austria
| | - Oluf Pedersen
- Center for Clinical Metabolic Research, Herlev and Gentofte University Hospital, Copenhagen, Denmark
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, DK-2100, Denmark
| | - Hartmut Ruetten
- Sanofi Partnering, Sanofi-Aventis Deutschland GmbH, Frankfurt am Main, 65926, Germany
| | - Mark Walker
- Translational and Clinical Research Institute, Faculty of Medical Sciences, University of Newcastle, Newcastle upon Tyne, United Kingdom
| | - Jerzy Adamski
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597, Singapore
- Institute of Experimental Genetics, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, 85764, Germany
- Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Jochen M Schwenk
- Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, Solna, SE-171 21, Sweden
| | - Ewan R Pearson
- Population Health and Genomics, Ninewells Hospital and Medical School, University of Dundee, Dundee, DD1 9SY, United Kingdom
| | - Emmanouil T Dermitzakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, 1211, Switzerland.
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, 1211, Switzerland.
- Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.
| | - Ana Viñuela
- Biosciences Institute, Faculty of Medical Sciences, University of Newcastle, Newcastle upon Tyne, NE1 4EP, United Kingdom.
| |
|
23
|
Tsouris A, Brach G, Schacherer J, Hou J. Non-additive genetic components contribute significantly to population-wide gene expression variation. bioRxiv 2023:2023.07.21.550013. [PMID: 37546809 PMCID: PMC10401925 DOI: 10.1101/2023.07.21.550013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/08/2023]
Abstract
Gene expression variation, an essential step between genomic variation and phenotypic landscape, is collectively controlled by local (cis) and distant (trans) regulatory changes. Nevertheless, how these regulatory elements differentially influence the heritability of expression traits remains unclear. Here, we bridge this gap by analyzing the transcriptomes of a large diallel panel consisting of 323 unique hybrids originated from genetically divergent yeast isolates. We estimated the broad- and narrow-sense heritability across 5,087 transcript abundance traits and showed that non-additive components account for 36% of the phenotypic variance on average. By comparing allelic expression ratios in the hybrid and the corresponding parental pair, we identified regulatory changes in 25% of all cases, with a majority acting in trans. We further showed that trans-regulation could underlie coordinated expression variation across highly connected genes, resulting in significantly higher non-additive variance and most likely in some of the missing heritability of gene expression traits.
Affiliation(s)
- Andreas Tsouris
- Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
| | - Gauthier Brach
- Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
| | - Joseph Schacherer
- Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
- Institut Universitaire de France (IUF), Paris, France
| | - Jing Hou
- Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
| |
|
24
|
Littman R, Cheng M, Wang N, Peng C, Yang X. SCING: Inference of robust, interpretable gene regulatory networks from single cell and spatial transcriptomics. iScience 2023; 26:107124. [PMID: 37434694 PMCID: PMC10331489 DOI: 10.1016/j.isci.2023.107124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 03/31/2023] [Accepted: 06/09/2023] [Indexed: 07/13/2023] Open
Abstract
Gene regulatory network (GRN) inference is an integral part of understanding physiology and disease. Single cell/nuclei RNA-seq (scRNA-seq/snRNA-seq) data has been used to elucidate cell-type GRNs; however, the accuracy and speed of current scRNAseq-based GRN approaches are suboptimal. Here, we present Single Cell INtegrative Gene regulatory network inference (SCING), a gradient boosting and mutual information-based approach for identifying robust GRNs from scRNA-seq, snRNA-seq, and spatial transcriptomics data. Performance evaluation using Perturb-seq datasets, held-out data, and the mouse cell atlas combined with the DisGeNET database demonstrates the improved accuracy and biological interpretability of SCING compared to existing methods. We applied SCING to the entire mouse single cell atlas, human Alzheimer's disease (AD), and mouse AD spatial transcriptomics. SCING GRNs reveal unique disease subnetwork modeling capabilities, have intrinsic capacity to correct for batch effects, retrieve disease relevant genes and pathways, and are informative on spatial specificity of disease pathogenesis.
Affiliation(s)
- Russell Littman
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
| | - Michael Cheng
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
| | - Ning Wang
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
| | - Chao Peng
- Department of Neurology, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Xia Yang
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Institute for Quantitative and Computational Biosciences (QCBio), Los Angeles, CA, USA
- Molecular Biology Institute (MBI), Los Angeles, CA, USA
- Brain Research Institute (BRI), Los Angeles, CA, USA
| |
|
25
|
Merchant JP, Zhu K, Henrion MYR, Zaidi SSA, Lau B, Moein S, Alamprese ML, Pearse RV, Bennett DA, Ertekin-Taner N, Young-Pearse TL, Chang R. Predictive network analysis identifies JMJD6 and other potential key drivers in Alzheimer's disease. Commun Biol 2023; 6:503. [PMID: 37188718 PMCID: PMC10185548 DOI: 10.1038/s42003-023-04791-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/30/2021] [Accepted: 03/31/2023] [Indexed: 05/17/2023] Open
Abstract
Despite decades of genetic studies on late-onset Alzheimer's disease, the underlying molecular mechanisms remain unclear. To better comprehend its complex etiology, we use an integrative approach to build robust predictive (causal) network models using two large human multi-omics datasets. We delineate bulk-tissue gene expression into single cell-type gene expression and integrate clinical and pathologic traits, single nucleotide variation, and deconvoluted gene expression for the construction of cell type-specific predictive network models. Here, we focus on neuron-specific network models and prioritize 19 predicted key drivers modulating Alzheimer's pathology, which we then validate by knockdown in human induced pluripotent stem cell-derived neurons. We find that neuronal knockdown of 10 of the 19 targets significantly modulates levels of amyloid-beta and/or phosphorylated tau peptides, most notably JMJD6. We also confirm our network structure by RNA sequencing in the neurons following knockdown of each of the 10 targets, which additionally predicts that they are upstream regulators of REST and VGF. Our work thus identifies robust neuronal key drivers of the Alzheimer's-associated network state which may represent therapeutic targets with relevance to both amyloid and tau pathology in Alzheimer's disease.
Affiliation(s)
- Julie P Merchant
- Ann Romney Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Neuroscience Graduate Group, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Kuixi Zhu
- The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
| | - Marc Y R Henrion
- Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
- Malawi-Liverpool-Wellcome Trust Clinical Research Programme, PO Box 30096, Blantyre, Malawi
| | - Syed S A Zaidi
- The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
| | - Branden Lau
- The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
- Arizona Research Labs, Genetics Core, University of Arizona, Tucson, AZ, USA
| | - Sara Moein
- The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
| | - Melissa L Alamprese
- The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
| | - Richard V Pearse
- Ann Romney Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - Nilüfer Ertekin-Taner
- Department of Neuroscience, Mayo Clinic Florida, Jacksonville, FL, USA
- Department of Neurology, Mayo Clinic Florida, Jacksonville, FL, USA
| | - Tracy L Young-Pearse
- Ann Romney Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Harvard Stem Cell Institute, Harvard University, Boston, MA, USA.
| | - Rui Chang
- The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA.
- Department of Neurology, University of Arizona, Tucson, AZ, USA.
- INTelico Therapeutics LLC, Tucson, AZ, USA.
- PATH Biotech LLC, Tucson, AZ, USA.
| |
|
26
|
Gastonguay MS, Keele GR, Churchill GA. The trouble with triples: Examining the impact of measurement error in mediation analysis. Genetics 2023; 224:iyad045. [PMID: 36932658 PMCID: PMC10158839 DOI: 10.1093/genetics/iyad045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Revised: 01/03/2023] [Accepted: 02/11/2023] [Indexed: 03/19/2023] Open
Abstract
Mediation analysis is used in genetic mapping studies to identify candidate gene mediators of quantitative trait loci (QTL). We consider genetic mediation analysis of triplets (sets of three variables consisting of a target trait, the genotype at a QTL for the target trait, and a candidate mediator that is the abundance of a transcript or protein whose coding gene co-locates with the QTL). We show that, in the presence of measurement error, mediation analysis can infer partial mediation even in the absence of a causal relationship between the candidate mediator and the target. We describe a measurement error model and a corresponding latent variable model with estimable parameters that are combinations of the causal effects and measurement errors across all three variables. The relative magnitudes of the latent variable correlations determine whether or not mediation analysis will tend to infer the correct causal relationship in large samples. We examine case studies that illustrate the common failure modes of genetic mediation analysis and demonstrate how to evaluate the effects of measurement error. While genetic mediation analysis is a powerful tool for identifying candidate genes, we recommend caution when interpreting mediation analysis findings.
|
27
|
Koskinen MK, Hovatta I. Genetic insights into the neurobiology of anxiety. Trends Neurosci 2023; 46:318-331. [PMID: 36828693 DOI: 10.1016/j.tins.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 01/20/2023] [Accepted: 01/30/2023] [Indexed: 02/25/2023]
Abstract
Anxiety and fear are evolutionarily conserved emotions that increase the likelihood of an organism surviving threatening situations. Anxiety and vigilance states are regulated by neural networks involving multiple brain regions. In anxiety disorders, this intricate regulatory system is disturbed, leading to excessive or prolonged anxiety or fear. Anxiety disorders have both genetic and environmental risk factors. Genetic research has the potential to identify specific genetic variants causally associated with specific phenotypes. In recent decades, genome-wide association studies (GWASs) have revealed variants predisposing to neuropsychiatric disorders, suggesting novel neurobiological pathways in the etiology of these disorders. Here, we review recent human GWASs of anxiety disorders, and genetic studies of anxiety-like behavior in rodent models. These studies are paving the way for a better understanding of the neurobiological mechanisms underlying anxiety disorders.
Affiliation(s)
- Maija-Kreetta Koskinen
- SleepWell Research Program and Department of Psychology and Logopedics, Faculty of Medicine, PO Box 21, 00014, University of Helsinki, Helsinki, Finland
| | - Iiris Hovatta
- SleepWell Research Program and Department of Psychology and Logopedics, Faculty of Medicine, PO Box 21, 00014, University of Helsinki, Helsinki, Finland.
| |
|
28
|
Zinc Finger Protein 90 Knockdown Promotes Cisplatin Sensitivity via Nrf2/HO-1 Pathway in Ovarian Cancer Cell. Cancers (Basel) 2023; 15:cancers15051586. [PMID: 36900383 PMCID: PMC10000492 DOI: 10.3390/cancers15051586] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/08/2023] [Revised: 02/24/2023] [Accepted: 03/01/2023] [Indexed: 03/08/2023] Open
Abstract
Our study discussed the role of Zfp90 in ovarian cancer (OC) cell lines' sensitivity to cisplatin. We used two OC cell lines, SK-OV-3 and ES-2, to evaluate their role in cisplatin sensitization. The protein levels of p-Akt, ERK, caspase 3, Bcl-2, Bax, E-cadherin, MMP-2, MMP-9 and other drug resistance-related molecules, including Nrf2/HO-1, were discovered in the SK-OV-3 and ES-2 cells. We also used a human ovarian surface epithelial cell to compare the effect of Zfp90. Our outcomes indicated that cisplatin treatment generates reactive oxygen species (ROS) that modulate apoptotic protein expression. The anti-oxidative signal was also stimulated, which could hinder cell migration. The intervention of Zfp90 could greatly improve the apoptosis pathway and block the migrative pathway to regulate the cisplatin sensitivity in the OC cells. This study implies that the loss of function of Zfp90 might promote cisplatin sensitization in OC cells via regulating the Nrf2/HO-1 pathway to enhance cell apoptosis and inhibit the migrative effect in both SK-OV-3 and ES-2 cells.
|
29
|
Ha D, Kong J, Kim D, Lee K, Lee J, Park M, Ahn H, Oh Y, Kim S. Development of bioinformatics and multi-omics analyses in organoids. BMB Rep 2023; 56:43-48. [PMID: 36284440 PMCID: PMC9887100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Indexed: 01/28/2023] Open
Abstract
Pre-clinical models are critical in gaining mechanistic and biological insights into disease progression. Recently, patient-derived organoid models have been developed to facilitate our understanding of disease development and to improve the discovery of therapeutic options by faithfully recapitulating in vivo tissues or organs. As technological developments of organoid models are rapidly growing, computational methods are gaining attention among organoid researchers to improve the ability to systematically analyze experimental results. In this review, we summarize the recent advances in organoid models to recapitulate human diseases and computational advancements to analyze experimental results from organoids. [BMB Reports 2023; 56(1): 43-48].
Affiliation(s)
- Doyeon Ha
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - JungHo Kong
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Donghyo Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Kwanghwan Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Juhun Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Minhyuk Park
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Hyunsoo Ahn
- Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Youngchul Oh
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Sanguk Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
- Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang 37673, Korea
- Corresponding author. Tel: +82-54-279-2348; Fax: +82-54-279-2199; E-mail:
| |
|
30
|
Xu P, Wang M, Sharma NK, Comeau ME, Wabitsch M, Langefeld CD, Civelek M, Zhang B, Das SK. Multi-omic integration reveals cell-type-specific regulatory networks of insulin resistance in distinct ancestry populations. Cell Syst 2023; 14:41-57.e8. [PMID: 36630956 PMCID: PMC9852073 DOI: 10.1016/j.cels.2022.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 09/26/2022] [Accepted: 12/13/2022] [Indexed: 01/12/2023]
Abstract
Our knowledge of the cell-type-specific mechanisms of insulin resistance remains limited. To dissect the cell-type-specific molecular signatures of insulin resistance, we performed a multiscale gene network analysis of adipose and muscle tissues in African and European ancestry populations. In adipose tissues, a comparative analysis revealed ethnically conserved cell-type signatures and two adipocyte subtype-enriched modules with opposite insulin sensitivity responses. The modules enriched for adipose stem and progenitor cells as well as immune cells showed negative correlations with insulin sensitivity. In muscle tissues, the modules enriched for stem cells and fibro-adipogenic progenitors responded to insulin sensitivity oppositely. The adipocyte and muscle fiber-enriched modules shared cellular-respiration-related genes but had tissue-specific rearrangements of gene regulations in response to insulin sensitivity. Integration of the gene co-expression and causal networks further pinpointed key drivers of insulin resistance. Together, this study revealed the cell-type-specific transcriptomic networks and signaling maps underlying insulin resistance in major glucose-responsive tissues. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Peng Xu
- Department of Genetics & Genomic Sciences, Mount Sinai Center for Transformative Disease Modeling, Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Minghui Wang
- Department of Genetics & Genomic Sciences, Mount Sinai Center for Transformative Disease Modeling, Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Neeraj K Sharma
- Department of Internal Medicine, Section of Endocrinology and Metabolism, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Mary E Comeau
- Department of Biostatistics and Data Science, Division of Public Health Sciences, and Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Martin Wabitsch
- Division of Pediatric Endocrinology and Diabetes, Department of Pediatrics and Adolescent Medicine, University Medical Center Ulm, Eythstr. 24, D-89075 Ulm, Germany
| | - Carl D Langefeld
- Department of Biostatistics and Data Science, Division of Public Health Sciences, and Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Mete Civelek
- Center for Public Health Genomics, Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA
| | - Bin Zhang
- Department of Genetics & Genomic Sciences, Mount Sinai Center for Transformative Disease Modeling, Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
| | - Swapan K Das
- Department of Internal Medicine, Section of Endocrinology and Metabolism, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA.
| |
|
31
|
Cutano V, Ferreira Mendes JM, Escudeiro-Lopes S, Machado S, Vinaixa Forner J, Gonzales-Morena JM, Prevorovsky M, Zemlianski V, Feng Y, Kralova Viziova P, Hartmanova A, Malcekova B, Jakoube P, Iyer S, Keckesova Z. LACTB exerts tumor suppressor properties in epithelial ovarian cancer through regulation of Slug. Life Sci Alliance 2023; 6:e202201510. [PMID: 36375842 PMCID: PMC9664245 DOI: 10.26508/lsa.202201510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 10/24/2022] [Accepted: 10/25/2022] [Indexed: 11/16/2022] Open
Abstract
Epithelial-mesenchymal transition (EMT) is a cellular mechanism used by cancer cells to acquire migratory and stemness properties. In this study, we show, through in vitro, in vivo, and 3D culture experiments, that the mitochondrial protein LACTB manifests tumor suppressor properties in ovarian cancer. We show that LACTB is significantly down-regulated in epithelial ovarian cancer cells and clinical tissues. Re-expression of LACTB negatively affects the growth of cancer cells but not of non-tumorigenic cells. Mechanistically, we show that LACTB leads to differentiation of ovarian cancer cells and loss of their stemness properties, which is achieved through the inhibition of the EMT program and the LACTB-dependent down-regulation of Snail2/Slug transcription factor. This study uncovers a novel role of LACTB in ovarian cancer and proposes new ways of counteracting the oncogenic EMT program in this model system.
Affiliation(s)
- Valentina Cutano
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
| | | | - Sara Escudeiro-Lopes
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Susana Machado
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
| | - Judith Vinaixa Forner
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
| | - Juan M Gonzales-Morena
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
| | - Martin Prevorovsky
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Viacheslav Zemlianski
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Yuxiong Feng
- Zhejiang Provincial Key Laboratory of Pancreatic Disease, First Affiliated Hospital, and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Petra Kralova Viziova
- The Czech Center for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, Czech Republic
| | - Andrea Hartmanova
- The Czech Center for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, Czech Republic
| | - Beata Malcekova
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
| | - Pavel Jakoube
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Sonia Iyer
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA
| | - Zuzana Keckesova
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
32
Chen M, Jia S, Xue M, Huang H, Xu Z, Yang D, Zhu W, Song Q. Dual-Stream Subspace Clustering Network for revealing gene targets in Alzheimer's disease. Comput Biol Med 2022; 151:106305. [PMID: 36401971 DOI: 10.1016/j.compbiomed.2022.106305] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/30/2022] [Revised: 11/02/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
The rapid development of scRNA-seq technology in recent years has enabled us to capture high-throughput gene expression profiles at single-cell resolution, reveal the heterogeneity of complex cell populations, and greatly advance our understanding of the underlying mechanisms in human diseases. Traditional methods for gene co-expression clustering are limited to discovering effective gene groups in scRNA-seq data. In this paper, we propose a novel gene clustering method based on convolutional neural networks called Dual-Stream Subspace Clustering Network (DS-SCNet). DS-SCNet can accurately identify important gene clusters from large scales of single-cell RNA-seq data and provide useful information for downstream analysis. Based on the simulated datasets, DS-SCNet successfully clusters genes into different groups and outperforms mainstream gene clustering methods, such as DBSCAN and DESC, across different evaluation metrics. To explore the biological insights of our proposed method, we applied it to real scRNA-seq data of patients with Alzheimer's disease (AD). DS-SCNet analyzed the single-cell RNA-seq data with 10,850 genes, and accurately identified 8 optimal clusters from 6673 cells. Enrichment analysis of these gene clusters revealed functional signaling pathways including the ILS signaling, the Rho GTPase signaling, and hemostasis pathways. Further analysis of gene regulatory networks identified new hub genes such as ELF4 as important regulators of AD, which indicates that DS-SCNet contributes to the discovery and understanding of the pathogenesis in Alzheimer's disease.
Affiliation(s)
- Minghan Chen
- Department of Computer Science, Wake Forest University, Winston-Salem, NC, USA
- Shishen Jia
- School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
- Mengfan Xue
- School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, China; Zhejiang Lab, Hangzhou, Zhejiang, China
- Ziang Xu
- Department of Computer Science, Wake Forest University, Winston-Salem, NC, USA
- Defu Yang
- Zhejiang Lab, Hangzhou, Zhejiang, China
- Wentao Zhu
- Zhejiang Lab, Hangzhou, Zhejiang, China
- Qianqian Song
- Center for Cancer Genomics and Precision Oncology, Wake Forest Baptist Comprehensive Cancer Center, Wake Forest Baptist Medical Center, Winston Salem, NC, USA; Department of Cancer Biology, Wake Forest School of Medicine, Winston Salem, NC, USA
33
Chen X, Chen L, Kürten CHL, Jabbari F, Vujanovic L, Ding Y, Lu B, Lu K, Kulkarni A, Tabib T, Lafyatis R, Cooper GF, Ferris R, Lu X. An individualized causal framework for learning intercellular communication networks that define microenvironments of individual tumors. PLoS Comput Biol 2022; 18:e1010761. [PMID: 36548438 PMCID: PMC9822106 DOI: 10.1371/journal.pcbi.1010761] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2022] [Revised: 01/06/2023] [Accepted: 11/26/2022] [Indexed: 12/24/2022] Open
Abstract
Cells within a tumor microenvironment (TME) dynamically communicate and influence each other's cellular states through an intercellular communication network (ICN). In cancers, intercellular communications underlie immune evasion mechanisms of individual tumors. We developed an individualized causal analysis framework for discovering tumor specific ICNs. Using head and neck squamous cell carcinoma (HNSCC) tumors as a testbed, we first mined single-cell RNA-sequencing data to discover gene expression modules (GEMs) that reflect the states of transcriptomic processes within tumor and stromal single cells. By deconvoluting bulk transcriptomes of HNSCC tumors profiled by The Cancer Genome Atlas (TCGA), we estimated the activation states of these transcriptomic processes in individual tumors. Finally, we applied individualized causal network learning to discover an ICN within each tumor. Our results show that cellular states of cells in TMEs are coordinated through ICNs that enable multi-way communications among epithelial, fibroblast, endothelial, and immune cells. Further analyses of individual ICNs revealed structural patterns that were shared across subsets of tumors, leading to the discovery of 4 different subtypes of networks that underlie disparate TMEs of HNSCC. Patients with distinct TMEs exhibited significantly different clinical outcomes. Our results show that the capability of estimating individual ICNs reveals heterogeneity of ICNs and sheds light on the importance of intercellular communication in impacting disease development and progression.
Affiliation(s)
- Xueer Chen
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Lujia Chen
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Cornelius H. L. Kürten
- Department of Otolaryngology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- University of Pittsburgh Hillman Cancer Center, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Essen, University Duisburg-Essen, Duisburg, Germany
- Fattaneh Jabbari
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Lazar Vujanovic
- Department of Otolaryngology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- University of Pittsburgh Hillman Cancer Center, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Ying Ding
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Binfeng Lu
- Department of Immunology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Kevin Lu
- Williamsville North High School, Williamsville, New York, United States of America
- Aditi Kulkarni
- Department of Otolaryngology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Tracy Tabib
- Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Essen, University Duisburg-Essen, Duisburg, Germany
- Robert Lafyatis
- Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Essen, University Duisburg-Essen, Duisburg, Germany
- Gregory F. Cooper
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- University of Pittsburgh Hillman Cancer Center, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Robert Ferris
- Department of Otolaryngology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- University of Pittsburgh Hillman Cancer Center, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Xinghua Lu
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- University of Pittsburgh Hillman Cancer Center, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
34
Naik S, Mohammed A. Coexpression network analysis of human candida infection reveals key modules and hub genes responsible for host-pathogen interactions. Front Genet 2022; 13:917636. [PMID: 36482897 PMCID: PMC9722774 DOI: 10.3389/fgene.2022.917636] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/11/2022] [Accepted: 11/08/2022] [Indexed: 07/30/2023] Open
Abstract
Invasive fungal infections are a significant reason for morbidity and mortality among organ transplant recipients. Therefore, it is critical to investigate the host and candida niches to understand the epidemiology of fungal infections in transplantation. Candida albicans is an opportunistic fungal pathogen that causes fatal invasive mucosal infections, particularly in solid organ transplant patients. Therefore, identifying and characterizing these genes would play a vital role in understanding the complex regulation of host-pathogen interactions. Using 32 RNA-sequencing samples of human cells infected with C. albicans, we developed WGCNA coexpression networks and performed DESeq2 differential gene expression analysis to identify the genes that positively correlate with human candida infection. Using hierarchical clustering, we identified 5 distinct modules. We studied the inter- and intramodular gene network properties in the context of sample status traits and identified the highly enriched genes in the correlated modules. We identified 52 genes that were common in the most significant WGCNA turquoise module and differentially expressed genes in human endothelial cells (HUVEC) infection vs. control samples. As a validation step, we identified the differentially expressed genes from the independent Candida-infected human oral keratinocytes (OKF6) samples and validated 30 of the 52 common genes. We then performed the functional enrichment analysis using KEGG and GO. Finally, we performed protein-protein interaction (PPI) analysis using STRING and CytoHubba from 30 validated genes. We identified 8 hub genes (JUN, ATF3, VEGFA, SLC2A1, HK2, PTGS2, PFKFB3, and KLF6) that were enriched in response to hypoxia, angiogenesis, vasculogenesis, hypoxia-induced signaling, cancer, diabetes, and transplant-related disease pathways. 
The discovery of genes and functional pathways related to the immune system and gene coexpression and differential gene expression analyses may serve as novel diagnostic markers and potential therapeutic targets.
Affiliation(s)
- Surabhi Naik
- Department of Surgery, James D. Eason Transplant Institute, College of Medicine, University of Tennessee Health Science Center, Memphis, TN, United States
- Akram Mohammed
- Center for Biomedical Informatics, College of Medicine, University of Tennessee Health Science Center, Memphis, TN, United States
35
Verweij KJH, Vink JM, Abdellaoui A, Gillespie NA, Derks EM, Treur JL. The genetic aetiology of cannabis use: from twin models to genome-wide association studies and beyond. Transl Psychiatry 2022; 12:489. [PMID: 36411281 PMCID: PMC9678872 DOI: 10.1038/s41398-022-02215-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/15/2022] [Revised: 09/26/2022] [Accepted: 10/03/2022] [Indexed: 11/22/2022] Open
Abstract
Cannabis is among the most widely consumed psychoactive substances worldwide. Individual differences in cannabis use phenotypes can partly be explained by genetic differences. Technical and methodological advances have increased our understanding of the genetic aetiology of cannabis use. This narrative review discusses the genetic literature on cannabis use, covering twin, linkage, and candidate-gene studies, and the more recent genome-wide association studies (GWASs), as well as the interplay between genetic and environmental factors. Not only do we focus on the insights that these methods have provided on the genetic aetiology of cannabis use, but also on how they have helped to clarify the relationship between cannabis use and co-occurring traits, such as the use of other substances and mental health disorders. Twin studies have shown that cannabis use is moderately heritable, with higher heritability estimates for more severe phases of use. Linkage and candidate-gene studies have been largely unsuccessful, while GWASs so far only explain a small portion of the heritability. Dozens of genetic variants predictive of cannabis use have been identified, located in genes such as CADM2, FOXP2, and CHRNA2. Studies that applied multivariate methods (twin models, genetic correlation analysis, polygenic score analysis, genomic structural equation modelling, Mendelian randomisation) indicate that there is considerable genetic overlap between cannabis use and other traits (especially other substances and externalising disorders) and some evidence for causal relationships (most convincingly for schizophrenia). We end our review by discussing implications of these findings and suggestions for future work.
Affiliation(s)
- Karin J. H. Verweij
- Department of Psychiatry, Amsterdam UMC, University of Amsterdam, Meibergdreef 5, 1105 AZ Amsterdam, The Netherlands
- Jacqueline M. Vink
- Behavioural Science Institute, Radboud University Nijmegen, Thomas van Aquinostraat 4, 6525 GD Nijmegen, The Netherlands
- Abdel Abdellaoui
- Department of Psychiatry, Amsterdam UMC, University of Amsterdam, Meibergdreef 5, 1105 AZ Amsterdam, The Netherlands
- Nathan A. Gillespie
- Virginia Institute for Psychiatric and Behavior Genetics, Virginia Commonwealth University, 800 East Leigh St, Suite 100, Richmond, VA 23219 USA
- Eske M. Derks
- Translational Neurogenomics, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, QLD 4006 Australia
- Jorien L. Treur
- Department of Psychiatry, Amsterdam UMC, University of Amsterdam, Meibergdreef 5, 1105 AZ Amsterdam, The Netherlands
36
Hawe JS, Saha A, Waldenberger M, Kunze S, Wahl S, Müller-Nurasyid M, Prokisch H, Grallert H, Herder C, Peters A, Strauch K, Theis FJ, Gieger C, Chambers J, Battle A, Heinig M. Network reconstruction for trans acting genetic loci using multi-omics data and prior information. Genome Med 2022; 14:125. [PMID: 36344995 PMCID: PMC9641770 DOI: 10.1186/s13073-022-01124-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2022] [Accepted: 10/11/2022] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view on biological systems and their integration is crucial for gaining insights in complex regulatory processes. These data can be used to explain disease related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor and use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information. METHODS: We devised a new strategy to integrate QTL with human population scale multi-omics data. State-of-the-art network inference methods including BDgraph and glasso were applied to these data. Comprehensive prior information to guide network inference was manually curated from large-scale biological databases. The inference approach was extensively benchmarked using simulated data and cross-cohort replication analyses. Best performing methods were subsequently applied to real-world human cohort data. RESULTS: Our benchmarks showed that prior-based strategies outperform methods without prior information in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass for which we generated novel functional hypotheses. CONCLUSIONS: We demonstrate that existing biological knowledge can improve the integrative analysis of networks underlying trans associations and generate novel hypotheses about regulatory mechanisms.
Affiliation(s)
- Johann S Hawe
- Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; German Heart Centre Munich, Department of Cardiology, Technical University Munich, Munich, Germany; Department of Informatics, Technical University of Munich, Garching, Germany
- Ashis Saha
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Melanie Waldenberger
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Sonja Kunze
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Simone Wahl
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Martina Müller-Nurasyid
- Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; IBE, Faculty of Medicine, LMU Munich, 81377, Munich, Germany; Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany; Department of Internal Medicine I (Cardiology), Hospital of the Ludwig-Maximilians-University (LMU) Munich, Munich, Germany
- Holger Prokisch
- Institute of Human Genetics, School of Medicine, Technische Universität München, Munich, Germany
- Harald Grallert
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany
- Christian Herder
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany; Division of Endocrinology and Diabetology, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Annette Peters
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Konstantin Strauch
- Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany; Chair of Genetic Epidemiology, IBE, Faculty of Medicine, LMU Munich, Munich, Germany
- Fabian J Theis
- Department of Informatics, Technical University of Munich, Garching, Germany; Department of Mathematics, Technical University of Munich, Garching, Germany
- Christian Gieger
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany
- John Chambers
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK; Lee Kong Chian School of Medicine, Nanyang Technological University, 308232, Singapore, Singapore
- Alexis Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Matthias Heinig
- Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany; Department of Informatics, Technical University of Munich, Garching, Germany; Munich Heart Association, Partner Site Munich, DZHK (German Centre for Cardiovascular Research), 10785, Berlin, Germany
37
Yue R, Dutta A. Computational systems biology in disease modeling and control, review and perspectives. NPJ Syst Biol Appl 2022; 8:37. [PMID: 36192551 PMCID: PMC9528884 DOI: 10.1038/s41540-022-00247-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/30/2022] [Accepted: 09/05/2022] [Indexed: 02/02/2023] Open
Abstract
Omics-based approaches have become increasingly influential in identifying disease mechanisms and drug responses. Considering that diseases and drug responses are co-expressed and regulated in the relevant omics data interactions, the traditional way of grabbing omics data from single isolated layers cannot always obtain valuable inference. Also, drugs have adverse effects that may impair patients, and launching new medicines for diseases is costly. To resolve the above difficulties, systems biology is applied to predict potential molecular interactions by integrating omics data from genomic, proteomic, transcriptional, and metabolic layers. Combined with known drug reactions, the resulting models improve medicines' therapeutical performance by re-purposing the existing drugs and combining drug molecules without off-target effects. Based on the identified computational models, drug administration control laws are designed to balance toxicity and efficacy. This review introduces biomedical applications and analyses of interactions among gene, protein and drug molecules for modeling disease mechanisms and drug responses. The therapeutical performance can be improved by combining the predictive and computational models with drug administration designed by control laws. The challenges are also discussed for its clinical uses in this work.
Affiliation(s)
- Rongting Yue
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA
- Abhishek Dutta
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA
38
Bankier S, Michoel T. eQTLs as causal instruments for the reconstruction of hormone linked gene networks. Front Endocrinol (Lausanne) 2022; 13:949061. [PMID: 36060942 PMCID: PMC9428692 DOI: 10.3389/fendo.2022.949061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 07/25/2022] [Indexed: 11/17/2022] Open
Abstract
Hormones act within highly dynamic systems and much of the phenotypic response to variation in hormone levels is mediated by changes in gene expression. The increase in the number and power of large genetic association studies has led to the identification of hormone linked genetic variants. However, the biological mechanisms underpinning the majority of these loci are poorly understood. The advent of affordable, high throughput next generation sequencing and readily available transcriptomic databases has shown that many of these genetic variants also associate with variation in gene expression levels as expression Quantitative Trait Loci (eQTLs). In addition to further dissecting complex genetic variation, eQTLs have been applied as tools for causal inference. Many hormone networks are driven by transcription factors, and many of these genes can be linked to eQTLs. In this mini-review, we demonstrate how causal inference and gene networks can be used to describe the impact of hormone linked genetic variation upon the transcriptome within an endocrinology context.
Affiliation(s)
- Sean Bankier
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
39
Salukhov VV, Lopatin YR, Minakov AA. Adipsin – summing up large-scale results: A review. CONSILIUM MEDICUM 2022. [DOI: 10.26442/20751753.2022.5.201280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
Abstract
Adipsin is one of the first discovered adipokines, hormones produced by adipose tissue. Adipsin performs the function of a regulator of carbohydrate and lipid metabolism and participates in the adaptation of metabolism to the real needs of the body, being a powerful stimulant of anabolic processes. A characteristic feature of adipsin is that it is also a complement factor D, which is necessary for the normal functioning of an alternative pathway of activation of the complement system. Due to this, adipsin is represented in the body as a link between the energy block of the endocrine system and the humoral block of the immune system. Adipsin is known as a regulator of the function of pancreatic beta cells, a stimulator of lipogenesis, a modulator of inflammation processes. Recently, there have been works indicating the effect of adipsin on the microbiota, as well as its role in non-alcoholic fatty liver disease. To date, there are a large number of publications describing the biochemical structure, functions of adipsin, mechanisms of regulation of its synthesis, as well as changes in the level of adipsin in various pathological conditions. Attempts are also described to pharmacologically influence adipsin in order to modulate its functions or use it as a biomarker for the diagnosis of diseases. However, there is currently no structured review that summarizes and systematizes all available information about this adipokine. This is exactly the task we set ourselves in this study. The paper contains the results of all available studies on adipsin. In some cases, they are contradictory in nature, which indicates the need for further research in detecting connections between the body's systems.
40
Quantifying biochemical reaction rates from static population variability within incompletely observed complex networks. PLoS Comput Biol 2022; 18:e1010183. [PMID: 35731728 PMCID: PMC9216546 DOI: 10.1371/journal.pcbi.1010183] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/21/2021] [Accepted: 05/07/2022] [Indexed: 11/19/2022] Open
Abstract
Quantifying biochemical reaction rates within complex cellular processes remains a key challenge of systems biology even as high-throughput single-cell data have become available to characterize snapshots of population variability. That is because complex systems with stochastic and non-linear interactions are difficult to analyze when not all components can be observed simultaneously and systems cannot be followed over time. Instead of using descriptive statistical models, we show that incompletely specified mechanistic models can be used to translate qualitative knowledge of interactions into reaction rate functions from covariability data between pairs of components. This promises to turn a globally intractable problem into a sequence of solvable inference problems to quantify complex interaction networks from incomplete snapshots of their stochastic fluctuations.
41
Enhancer methylation dynamics drive core transcriptional regulatory circuitry in pan-cancer. Oncogene 2022; 41:3474-3484. [PMID: 35655092 DOI: 10.1038/s41388-022-02359-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/03/2021] [Revised: 05/11/2022] [Accepted: 05/19/2022] [Indexed: 12/16/2022]
Abstract
Accumulating evidence has demonstrated that enhancer methylation has strong and dynamic regulatory effects on gene expression. Some transcription factors (TFs) can auto- and cross-regulate in a feed-forward manner, and cooperate with their enhancers to form core transcriptional regulatory circuitries (CRCs). However, the elaborated regulatory mechanism between enhancer methylation and CRC remains the tip of the iceberg. Here, we revealed that DNA methylation could drive the tissue-specific enhancer basal transcription and target gene expression in human cancers. By integrating methylome, transcriptome, and 3D genomic data, we identified enhancer methylation triplets (enhancer methylation-enhancer transcription-target gene expression) and dissected potential regulatory patterns within them. Moreover, we observed that cancer-specific core TFs regulated by enhancers were able to shape their enhancer methylation forming the enhancer methylation-driven CRCs (emCRCs). Further parsing of clinical implications showed rewired emCRCs could serve as druggable targets and prognostic risk markers. In summary, the integrative analysis of enhancer methylation regulome would facilitate portraying the cancer epigenomics landscape and developing the epigenetic anti-cancer approaches.
42
Tai AS, Lin PH, Huang YT, Lin SH. Path-specific effects in the presence of a survival outcome and causally ordered multiple mediators with application to genomic data. Stat Methods Med Res 2022; 31:1916-1933. [PMID: 35635267 DOI: 10.1177/09622802221104239] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Causal multimediation analysis (i.e. the causal mediation analysis with multiple mediators) is critical for understanding the effectiveness of interventions, especially in medical research. Deriving the path-specific effects of exposure on the outcome through a set of mediators can provide detail about the causal mechanism of interest. However, existing models are usually restricted to partial decomposition, which can only be used to evaluate the cumulative effect of several paths. In genetics studies, partial decomposition fails to reflect the real causal effects mediated by genes, especially in complex gene regulatory networks. Moreover, because of the lack of a generalized identification procedure, the current multimediation analysis cannot be applied to the estimation of path-specific effects for any number of mediators. In this study, we derive the interventional analogs of path-specific effect for complete decomposition to address the difficulty of nonidentifiability. On the basis of two survival models of the outcome, we derive the generalized analytic forms for interventional analogs of path-specific effects by assuming the normal distributions of mediators. We apply the new methodology to investigate the causal mechanism of signature genes in lung cancer based on the cell cycle pathway, and the results clarify the gene pathway in cancer.
Affiliation(s)
- An-Shun Tai
- Department of Statistics, National Cheng Kung University, Tainan; Institute of Statistics, National Yang Ming Chiao Tung University, Hsin-Chu
| | - Pei-Hsuan Lin
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsin-Chu
| | - Yen-Tsung Huang
- Institute of Statistical Science, Academia Sinica, Taipei
| | - Sheng-Hsuan Lin
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsin-Chu
| |
|
43
|
Gaynor SM, Fagny M, Lin X, Platig J, Quackenbush J. Connectivity in eQTL networks dictates reproducibility and genomic properties. Cell Rep Methods 2022; 2:100218. [PMID: 35637906 PMCID: PMC9142682 DOI: 10.1016/j.crmeth.2022.100218] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/22/2021] [Revised: 02/08/2022] [Accepted: 04/25/2022] [Indexed: 01/11/2023]
Abstract
Expression quantitative trait locus (eQTL) analysis associates SNPs with gene expression; these relationships can be represented as a bipartite network with association strength as "edge weights" between SNPs and genes. However, most eQTL networks use binary edge weights based on thresholded FDR estimates, definitions that influence reproducibility and downstream analyses. We constructed twenty-nine tissue-specific eQTL networks using GTEx data and evaluated a comprehensive set of network specifications based on false discovery rates, test statistics, and p values, focusing on degree centrality, a metric of an SNP or gene node's potential network influence. We found that a thresholded Benjamini-Hochberg q value weighted by the Z-statistic balances metric reproducibility and computational efficiency. Our estimated gene degrees positively correlate with gene degrees in gene regulatory networks, demonstrating that these networks are complementary in understanding regulation. Gene degrees also correlate with genetic diversity, and heritability analyses show that highly connected nodes are enriched for tissue-relevant traits.
Affiliation(s)
- Sheila M. Gaynor
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics and Computational Biology and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
| | - Maud Fagny
- Department of Biostatistics and Computational Biology and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190 Gif-sur-Yvette, France
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Statistics, Harvard University, Cambridge, MA 02138, USA
| | - John Platig
- Department of Biostatistics and Computational Biology and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
| | - John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics and Computational Biology and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
| |
|
44
|
Cascone A, Lalowski M, Lindholm D, Eriksson O. Unveiling the Function of the Mitochondrial Filament-Forming Protein LACTB in Lipid Metabolism and Cancer. Cells 2022; 11:1703. [PMID: 35626737 PMCID: PMC9139886 DOI: 10.3390/cells11101703] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/11/2022] [Revised: 05/18/2022] [Accepted: 05/19/2022] [Indexed: 02/04/2023]
Abstract
LACTB is a relatively unknown mitochondrial protein structurally related to the bacterial penicillin-binding and beta-lactamase superfamily of serine proteases. LACTB has recently attracted increased interest due to its potential role in lipid metabolism and tumorigenesis. To date, around ninety studies pertaining to LACTB have been published, but the exact biochemical and cell biological function of LACTB remains elusive. In this review, we summarise the current knowledge about LACTB with particular attention to the implications of the recently published study on the cryo-electron microscopy structure of the filamentous form of LACTB. From this and other studies, several specific properties of LACTB emerge, suggesting that the protein has distinct functions in different physiological settings. Resolving these issues by further research may ultimately lead to a unified model of LACTB’s function in cell and organismal physiology. LACTB is the only member of its protein family in higher animals and may, therefore, be of particular interest for future drug targeting initiatives.
Affiliation(s)
- Annunziata Cascone
- Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, FIN-00014 Helsinki, Finland
| | - Maciej Lalowski
- HiLIFE, Meilahti Clinical Proteomics Core Facility, Faculty of Medicine, University of Helsinki, FIN-00014 Helsinki, Finland
| | - Dan Lindholm
- Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, FIN-00014 Helsinki, Finland
- Minerva Foundation Institute for Medical Research, Biomedicum Helsinki 2, Tukholmankatu 8, FIN-00290 Helsinki, Finland
| | - Ove Eriksson
- Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, FIN-00014 Helsinki, Finland
- Correspondence:
| |
|
45
|
Sonawane AR, Aikawa E, Aikawa M. Connections for Matters of the Heart: Network Medicine in Cardiovascular Diseases. Front Cardiovasc Med 2022; 9:873582. [PMID: 35665246 PMCID: PMC9160390 DOI: 10.3389/fcvm.2022.873582] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/11/2022] [Accepted: 04/19/2022] [Indexed: 01/18/2023]
Abstract
Cardiovascular diseases (CVD) are diverse disorders affecting the heart and vasculature in millions of people worldwide. Like other fields, CVD research has benefitted from the deluge of multiomics biomedical data. Current CVD research focuses on disease etiologies and mechanisms, identifying disease biomarkers, developing appropriate therapies and drugs, and stratifying patients into correct disease endotypes. Systems biology offers an alternative to traditional reductionist approaches and provides impetus for a comprehensive outlook toward diseases. As a focus area, network medicine specifically aids the translational aspect of in silico research. This review discusses the approach of network medicine and its application to CVD research.
Affiliation(s)
- Abhijeet Rajendra Sonawane
- Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Center for Excellence in Vascular Biology, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Elena Aikawa
- Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Center for Excellence in Vascular Biology, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Masanori Aikawa
- Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Center for Excellence in Vascular Biology, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| |
|
46
|
Bing X, Lovelace T, Bunea F, Wegkamp M, Kasturi SP, Singh H, Benos PV, Das J. Essential Regression: A generalizable framework for inferring causal latent factors from multi-omic datasets. Patterns (N Y) 2022; 3:100473. [PMID: 35607614 PMCID: PMC9122954 DOI: 10.1016/j.patter.2022.100473] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/28/2021] [Revised: 09/17/2021] [Accepted: 03/01/2022] [Indexed: 01/19/2023]
Abstract
High-dimensional cellular and molecular profiling of biological samples highlights the need for analytical approaches that can integrate multi-omic datasets to generate prioritized causal inferences. Current methods are limited by high dimensionality of the combined datasets, the differences in their data distributions, and their integration to infer causal relationships. Here, we present Essential Regression (ER), a novel latent-factor-regression-based interpretable machine-learning approach that addresses these problems by identifying latent factors and their likely cause-effect relationships with system-wide outcomes/properties of interest. ER can integrate many multi-omic datasets without structural or distributional assumptions regarding the data. It outperforms a range of state-of-the-art methods in terms of prediction. ER can be coupled with probabilistic graphical modeling, thereby strengthening the causal inferences. The utility of ER is demonstrated using multi-omic system immunology datasets to generate and validate novel cellular and molecular inferences in a wide range of contexts including immunosenescence and immune dysregulation.
Affiliation(s)
- Xin Bing
- Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
| | - Tyler Lovelace
- Department of Computational & Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
- Joint CMU-Pitt PhD Program in Computational Biology, Carnegie Mellon – University of Pittsburgh, Pittsburgh, PA, USA
| | - Florentina Bunea
- Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
| | - Marten Wegkamp
- Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
- Department of Mathematics, Cornell University, Ithaca, NY, USA
| | - Sudhir Pai Kasturi
- Division of Microbiology and Immunology, Yerkes National Primate Research Center, Emory University, Atlanta, GA, USA
| | - Harinder Singh
- Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Panayiotis V. Benos
- Department of Computational & Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Jishnu Das
- Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| |
|
47
|
Gilliland DG, Regev A, Schadt EE, Tung J. Traversing industry and academia in biomedicine: the best of both worlds? Nat Rev Genet 2022; 23:461-466. [DOI: 10.1038/s41576-022-00486-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/31/2022] [Indexed: 11/09/2022]
|
48
|
Olayinka OA, O'Neill NK, Farrer LA, Wang G, Zhang X. Molecular Quantitative Trait Locus Mapping in Human Complex Diseases. Curr Protoc 2022; 2:e426. [PMID: 35587224 PMCID: PMC9186089 DOI: 10.1002/cpz1.426] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
Mapping quantitative trait loci (QTLs) for molecular traits from chromatin to metabolites (i.e., xQTLs) provides insight into the locations and effect modes of genetic variants that influence these molecular phenotypes and the propagation of functional consequences of each variant. xQTL studies indirectly interrogate the functional landscape of the molecular basis of complex diseases, including the impact of non-coding regulatory variants, the tissue specificity of regulatory elements, and their contribution to disease by integrating with genome-wide association studies (GWAS). We summarize a variety of molecular xQTL studies in human tissues and cells. In addition, using the Alzheimer's Disease Sequencing Project (ADSP) as an example, we describe the ADSP xQTL project, a collaborative effort across the ADSP Functional Genomics Consortium (ADSP-FGC). The project's ultimate goal is a reference map of Alzheimer's-related QTLs using existing datasets from multiple omics layers to help us study the consequences of genetic variants identified in the ADSP. xQTL studies enable the identification of the causal genes and pathways in GWAS loci, which will likely aid in the discovery of novel biomarkers and therapeutic targets for complex diseases. © 2022 Wiley Periodicals LLC.
Affiliation(s)
- Oluwatosin A Olayinka
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
| | - Nicholas K O'Neill
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
| | - Lindsay A Farrer
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts
- Department of Ophthalmology, Boston University School of Medicine, Boston, Massachusetts
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
- Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts
| | - Gao Wang
- Department of Neurology, Columbia University, New York, New York
- Gertrude H. Sergievsky Center, Columbia University, New York, New York
| | - Xiaoling Zhang
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
| |
|
49
|
Nguyen QH, Nguyen T, Le DH. DrGA: cancer driver gene analysis in a simpler manner. BMC Bioinformatics 2022; 23:86. [PMID: 35247965 PMCID: PMC8897886 DOI: 10.1186/s12859-022-04606-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Accepted: 02/08/2022] [Indexed: 11/10/2022]
Abstract
Background
Cancer remains one of the leading causes of death worldwide, and the accumulation of genes carrying mutations is thought to be chiefly responsible for its establishment and development. The identification and analysis of driver genes are therefore vital. Our previous study noted the lack of agreement on a unifying pipeline for these tasks and introduced a complete one. However, that pipeline gradually showed its weaknesses: it was unfamiliar to non-technical users, time-consuming, and inconvenient.
Results
This study presents an R package named DrGA, developed from our previous pipeline, to tackle the problems mentioned above. It fully automates four widely used downstream analyses for predicted driver genes and offers additional improvements. We describe the usage of DrGA on driver genes of human breast cancer. We also show another potential application of DrGA: analyzing genomic biomarkers of a complex disease in another organism.
Conclusions
DrGA helps users with limited IT backgrounds and rapidly produces consistent and reproducible results. DrGA and its applications, along with example data, are freely available at https://github.com/huynguyen250896/DrGA.
|
50
|
Zhang W, Shen J, Wang Y, Cai K, Zhang Q, Cao M. Blood SSR1: A Possible Biomarker for Early Prediction of Parkinson’s Disease. Front Mol Neurosci 2022; 15:762544. [PMID: 35310885 PMCID: PMC8924528 DOI: 10.3389/fnmol.2022.762544] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/22/2021] [Accepted: 01/14/2022] [Indexed: 01/31/2023]
Abstract
Parkinson’s disease (PD) is the second most common neurodegenerative disease associated with age. Early diagnosis of PD is key to preventing the loss of dopamine neurons. Peripheral-blood biomarkers have shown their value in recent years because of their easy access and long-term monitoring advantages. However, few peripheral-blood biomarkers have proven useful. This study aims to explore potential peripheral-blood biomarkers for the early diagnosis of PD. Three substantia nigra (SN) transcriptome datasets from the Gene Expression Omnibus (GEO) database were divided into a training cohort and a test cohort. We constructed a protein–protein interaction (PPI) network and a weighted gene co-expression network analysis (WGCNA) network, found their overlapping differentially expressed genes and studied them as the key genes. Analysis of the peripheral-blood transcriptome datasets of PD patients from GEO showed that three key genes were upregulated in PD over healthy participants. Analysis of the relationship between their expression and survival and analysis of their brain expression suggested that these key genes could become biomarkers. Then, animal models were studied to validate the expression of the key genes, and only SSR1 (the signal sequence receptor subunit1) was significantly upregulated in both animal models in peripheral blood. Correlation analysis and logistic regression analysis were used to analyze the correlation between brain dopaminergic neurons and SSR1 expression, and it was found that SSR1 expression was negatively correlated with dopaminergic neuron survival. The upregulation of SSR1 expression in peripheral blood was also found to precede the abnormal behavior of animals. In addition, the application of artificial intelligence technology further showed the value of SSR1 in clinical PD prediction. The three classifiers all showed that SSR1 had high predictability for PD. 
The classifier with the best prediction accuracy was selected through AUC and MCC to construct a prediction model. In short, this research not only provides potential biomarkers for the early diagnosis of PD but also establishes a possible artificial intelligence model for predicting PD.
Affiliation(s)
- Wen Zhang
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
| | - Jiabing Shen
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
| | - Yuhui Wang
- Department of Microelectrics, Peking University, Peking, China
| | - Kefu Cai
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
| | - Qi Zhang
- Key Laboratory of Neuroregeneration of Jiangsu and Ministry of Education, Co-innovation Center of Neuroregeneration, Nantong University, Nantong, China
- *Correspondence: Maohong Cao; Qi Zhang
| | - Maohong Cao
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
- *Correspondence: Maohong Cao; Qi Zhang
| |
|