1. Melton HJ, Zhang Z, Deng HW, Wu L, Wu C. MIMOSA: a resource consisting of improved methylome prediction models increases power to identify DNA methylation-phenotype associations. Epigenetics 2024; 19:2370542. [PMID: 38963888 PMCID: PMC11225927 DOI: 10.1080/15592294.2024.2370542] [Received: 10/06/2023] [Accepted: 06/12/2024] [Indexed: 07/06/2024] [Open Access]
Abstract
Although DNA methylation (DNAm) has been implicated in the pathogenesis of numerous complex diseases, from cancer to cardiovascular disease to autoimmune disease, the exact methylation sites that play key roles in these processes remain elusive. One strategy to identify putative causal CpG sites and enhance disease etiology understanding is to conduct methylome-wide association studies (MWASs), in which predicted DNA methylation that is associated with complex diseases can be identified. However, current MWAS models are primarily trained using the data from single studies, thereby limiting the methylation prediction accuracy and the power of subsequent association studies. Here, we introduce a new resource, MWAS Imputing Methylome Obliging Summary-level mQTLs and Associated LD matrices (MIMOSA), a set of models that substantially improve the prediction accuracy of DNA methylation and subsequent MWAS power through the use of a large summary-level mQTL dataset provided by the Genetics of DNA Methylation Consortium (GoDMC). Through the analyses of GWAS (genome-wide association study) summary statistics for 28 complex traits and diseases, we demonstrate that MIMOSA considerably increases the accuracy of DNA methylation prediction in whole blood, crafts fruitful prediction models for low heritability CpG sites, and determines markedly more CpG site-phenotype associations than preceding methods. Finally, we use MIMOSA to conduct a case study on high cholesterol, pinpointing 146 putatively causal CpG sites.
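For orientation, summary-statistics MWAS and TWAS tests are commonly written as a weighted sum of GWAS z-scores. The form below is the generic statistic from that literature and is shown as an assumption: the abstract does not state the exact test MIMOSA uses, and the symbols are introduced only for this sketch. Here w holds the SNP weights of a CpG prediction model, z the GWAS z-scores for the same SNPs, and Σ the LD (SNP correlation) matrix.

```latex
Z_{\mathrm{CpG}} \;=\; \frac{\mathbf{w}^{\top}\mathbf{z}}{\sqrt{\mathbf{w}^{\top}\boldsymbol{\Sigma}\,\mathbf{w}}}
```

Under the null hypothesis of no association, this statistic is approximately standard normal, which is why more accurate weights w can translate into higher power.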
Affiliation(s)
- Hunter J. Melton
  - Department of Statistics, Florida State University, Tallahassee, FL, USA
- Zichen Zhang
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Hong-Wen Deng
  - Center of Bioinformatics and Genomics, Tulane University, New Orleans, LA, USA
- Lang Wu
  - Center of Bioinformatics and Genomics, Tulane University, New Orleans, LA, USA
- Chong Wu
  - Cancer Epidemiology Division, University of Hawaii Cancer Center, Honolulu, HI, USA
  - Institute for Data Science in Oncology, The UT MD Anderson Cancer Center
2. Ružičková N, Hledík M, Tkačik G. Quantitative omnigenic model discovers interpretable genome-wide associations. Proc Natl Acad Sci U S A 2024; 121:e2402340121. [PMID: 39441639 DOI: 10.1073/pnas.2402340121] [Received: 02/02/2024] [Accepted: 09/20/2024] [Indexed: 10/25/2024] [Open Access]
Abstract
As their statistical power grows, genome-wide association studies (GWAS) have identified an increasing number of loci underlying quantitative traits of interest. These loci are scattered throughout the genome and are individually responsible only for small fractions of the total heritable trait variance. The recently proposed omnigenic model provides a conceptual framework to explain these observations by postulating that numerous distant loci contribute to each complex trait via effect propagation through intracellular regulatory networks. We formalize this conceptual framework by proposing the "quantitative omnigenic model" (QOM), a statistical model that combines prior knowledge of the regulatory network topology with genomic data. By applying our model to gene expression traits in yeast, we demonstrate that QOM achieves similar gene expression prediction performance to traditional GWAS with hundreds of times less parameters, while simultaneously extracting candidate causal and quantitative chains of effect propagation through the regulatory network for every individual gene. We estimate the fraction of heritable trait variance in cis- and in trans-, break the latter down by effect propagation order, assess the trans- variance not attributable to transcriptional regulation, and show that QOM correctly accounts for the low-dimensional structure of gene expression covariance. We furthermore demonstrate the relevance of QOM for systems biology, by employing it as a statistical test for the quality of regulatory network reconstructions, and linking it to the propagation of nontranscriptional (including environmental) effects.
Affiliation(s)
- Natália Ružičková
  - Institute of Science and Technology Austria, Klosterneuburg AT-3400, Austria
- Michal Hledík
  - Institute of Science and Technology Austria, Klosterneuburg AT-3400, Austria
- Gašper Tkačik
  - Institute of Science and Technology Austria, Klosterneuburg AT-3400, Austria
3. King A, Wu C. Integrative Multi-Omics Approach for Improving Causal Gene Identification. Genet Epidemiol 2024. [PMID: 39444114 DOI: 10.1002/gepi.22601] [Received: 04/14/2024] [Revised: 10/01/2024] [Accepted: 10/04/2024] [Indexed: 10/25/2024]
Abstract
Transcriptome-wide association studies (TWAS) have been widely used to identify thousands of likely causal genes for diseases and complex traits using predicted expression models. However, most existing TWAS methods rely on gene expression alone and overlook other regulatory mechanisms of gene expression, including DNA methylation and splicing, that contribute to the genetic basis of these complex traits and diseases. Here we introduce a multi-omics method that integrates gene expression, DNA methylation, and splicing data to improve the identification of associated genes with our traits of interest. Through simulations and by analyzing genome-wide association study (GWAS) summary statistics for 24 complex traits, we show that our integrated method, which leverages these complementary omics biomarkers, achieves higher statistical power, and improves the accuracy of likely causal gene identification in blood tissues over individual omics methods. Finally, we apply our integrated model to a lung cancer GWAS data set, demonstrating the integrated models improved identification of prioritized genes for lung cancer risk.
Affiliation(s)
- Austin King
  - Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Chong Wu
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
4. Ko BS, Lee SB, Kim TK. A brief guide to analyzing expression quantitative trait loci. Mol Cells 2024:100139. [PMID: 39447874 DOI: 10.1016/j.mocell.2024.100139] [Received: 07/23/2024] [Revised: 10/14/2024] [Accepted: 10/17/2024] [Indexed: 10/26/2024] [Open Access]
Abstract
Molecular quantitative trait locus (molQTL) mapping has emerged as an important approach for elucidating the functional consequences of genetic variants and unraveling the causal mechanisms underlying diseases or complex traits. However, the variety of analysis tools and sophisticated methodologies available for molQTL studies can be overwhelming for researchers with limited computational expertise. Here, we provide a brief guideline with a curated list of methods and software tools for analyzing expression quantitative trait loci (eQTL), the most widely studied type of molQTL.
Affiliation(s)
- Byung Su Ko
  - Department of Brain Sciences, DGIST, Daegu 42988, Republic of Korea
- Sung Bae Lee
  - Department of Brain Sciences, DGIST, Daegu 42988, Republic of Korea
- Tae-Kyung Kim
  - Department of Life Sciences, Pohang University of Science and Technology (POSTECH), Pohang 37673, Republic of Korea
  - Institute for Convergence Research and Education in Advanced Technology, Yonsei University, Seoul 03722, Republic of Korea
5. Weinstock JS, Arce MM, Freimer JW, Ota M, Marson A, Battle A, Pritchard JK. Gene regulatory network inference from CRISPR perturbations in primary CD4+ T cells elucidates the genomic basis of immune disease. Cell Genomics 2024:100671. [PMID: 39395408 DOI: 10.1016/j.xgen.2024.100671] [Received: 10/24/2023] [Revised: 06/04/2024] [Accepted: 09/16/2024] [Indexed: 10/14/2024]
Abstract
The effects of genetic variation on complex traits act mainly through changes in gene regulation. Although many genetic variants have been linked to target genes in cis, the trans-regulatory cascade mediating their effects remains largely uncharacterized. Mapping trans-regulators based on natural genetic variation has been challenging due to small effects, but experimental perturbations offer a complementary approach. Using CRISPR, we knocked out 84 genes in primary CD4+ T cells, targeting inborn error of immunity (IEI) disease transcription factors (TFs) and TFs without immune disease association. We developed a novel gene network inference method called linear latent causal Bayes (LLCB) to estimate the network from perturbation data and observed 211 regulatory connections between genes. We characterized programs affected by the TFs, which we associated with immune genome-wide association study (GWAS) genes, finding that JAK-STAT family members are regulated by KMT2A, an epigenetic regulator. These analyses reveal the trans-regulatory cascades linking GWAS genes to signaling pathways.
Affiliation(s)
- Joshua S Weinstock
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Maya M Arce
  - Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA 94158, USA
  - Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
- Jacob W Freimer
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA 94158, USA
- Mineto Ota
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA 94158, USA
- Alexander Marson
  - Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA 94158, USA
  - Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
  - Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA 94720, USA
  - Institute for Human Genetics (IHG), University of California, San Francisco, San Francisco, CA 94143, USA
  - Parker Institute for Cancer Immunotherapy, University of California, San Francisco, San Francisco, CA 94129, USA
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94143, USA
  - UCSF Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Alexis Battle
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Jonathan K Pritchard
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Biology, Stanford University, Stanford, CA, USA
6. Kusmec A, Yeh CT'E, Schnable PS. Data-driven identification of environmental variables influencing phenotypic plasticity to facilitate breeding for future climates. The New Phytologist 2024; 244:618-634. [PMID: 39183371 DOI: 10.1111/nph.19937] [Received: 11/10/2023] [Accepted: 05/20/2024] [Indexed: 08/27/2024]
Abstract
Phenotypic plasticity describes a genotype's ability to produce different phenotypes in response to different environments. Breeding crops that exhibit appropriate levels of plasticity for future climates will be crucial to meeting global demand, but knowledge of the critical environmental factors is limited to a handful of well-studied major crops. Using 727 maize (Zea mays L.) hybrids phenotyped for grain yield in 45 environments, we investigated the ability of a genetic algorithm and two other methods to identify environmental determinants of grain yield from a large set of candidate environmental variables constructed using minimal assumptions. The genetic algorithm identified pre- and postanthesis maximum temperature, mid-season solar radiation, and whole season net evapotranspiration as the four most important variables from a candidate set of 9150. Importantly, these four variables are supported by previous literature. After calculating reaction norms for each environmental variable, candidate genes were identified and gene annotations investigated to demonstrate how this method can generate insights into phenotypic plasticity. The genetic algorithm successfully identified known environmental determinants of hybrid maize grain yield. This demonstrates that the methodology could be applied to other less well-studied phenotypes and crops to improve understanding of phenotypic plasticity and facilitate breeding crops for future climates.
Affiliation(s)
- Aaron Kusmec
  - Department of Agronomy, Iowa State University, Ames, IA, 50011-3650, USA
- Patrick S Schnable
  - Department of Agronomy, Iowa State University, Ames, IA, 50011-3650, USA
  - Plant Sciences Institute, Iowa State University, Ames, IA, 50011-3650, USA
7. Feitosa MF, Lin SJ, Acharya S, Thyagarajan B, Wojczynski MK, Kuipers AL, Kulminski A, Christensen K, Zmuda JM, Brent MR, Province MA. Discovery of genomic and transcriptomic pleiotropy between kidney function and soluble receptor for advanced glycation end products using correlated meta-analyses: The Long Life Family Study. Aging Cell 2024; 23:e14261. [PMID: 38932496 PMCID: PMC11464144 DOI: 10.1111/acel.14261] [Received: 01/19/2024] [Revised: 06/11/2024] [Accepted: 06/14/2024] [Indexed: 06/28/2024] [Open Access]
Abstract
Patients with chronic kidney disease (CKD) have increased oxidative stress and chronic inflammation, which may escalate the production of advanced glycation end-products (AGEs). High soluble receptor for AGE (sRAGE) and low estimated glomerular filtration rate (eGFR) levels are associated with CKD and aging. We evaluated whether eGFR calculated from creatinine and cystatin C share pleiotropic genetic factors with sRAGE. We employed whole-genome sequencing and correlated meta-analyses on combined genome-wide association study (GWAS) p-values in 4182 individuals (age range: 24-110) from the Long Life Family Study (LLFS). We also conducted transcriptome-wide association studies (TWAS) on whole blood in a subset of 1209 individuals. We identified 59 pleiotropic GWAS loci (p < 5 × 10⁻⁸) and 17 TWAS genes (Bonferroni-p < 2.73 × 10⁻⁶) for eGFR traits and sRAGE. TWAS genes, LSP1 and MIR23AHG, were associated with eGFR and sRAGE located within GWAS loci, lncRNA-KCNQ1OT1 and CACNA1A/CCDC130, respectively. GWAS variants were eQTLs in the kidney glomeruli and tubules, and GWAS genes predicted kidney carcinoma. TWAS genes harbored eQTLs in the kidney, predicted kidney carcinoma, and connected enhancer-promoter variants with kidney function-related phenotypes at p < 5 × 10⁻⁸. Additionally, higher allele frequencies of protective variants for eGFR traits were detected in LLFS than in ALFA-Europeans and TOPMed, suggesting better kidney function in healthy-aging LLFS than in general populations. Integrating genomic annotation and transcriptional gene activity revealed the enrichment of genetic elements in kidney function and aging-related processes. The identified pleiotropic loci and gene expressions for eGFR and sRAGE suggest their underlying shared genetic effects and highlight their roles in kidney- and aging-related signaling pathways.
Affiliation(s)
- Mary F. Feitosa
  - Division of Statistical Genomics, Department of Genetics, Washington University in St Louis School of Medicine, St. Louis, Missouri, USA
- Shiow J. Lin
  - Division of Statistical Genomics, Department of Genetics, Washington University in St Louis School of Medicine, St. Louis, Missouri, USA
- Sandeep Acharya
  - Department of Computer Science and Engineering, Washington University, St. Louis, Missouri, USA
- Bharat Thyagarajan
  - Department of Laboratory Medicine and Pathology, School of Medicine, University of Minnesota, Minneapolis, Minnesota, USA
- Mary K. Wojczynski
  - Division of Statistical Genomics, Department of Genetics, Washington University in St Louis School of Medicine, St. Louis, Missouri, USA
- Allison L. Kuipers
  - Department of Epidemiology, School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Alexander Kulminski
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina, USA
- Kaare Christensen
  - Unit of Epidemiology, Biostatistics and Biodemography, Department of Public Health, Southern Denmark University, Odense, Denmark
- Joseph M. Zmuda
  - Department of Epidemiology, School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Michael R. Brent
  - Department of Computer Science and Engineering, Washington University, St. Louis, Missouri, USA
- Michael A. Province
  - Division of Statistical Genomics, Department of Genetics, Washington University in St Louis School of Medicine, St. Louis, Missouri, USA
8. Tyler AL, Mahoney JM, Keller MP, Baker CN, Gaca M, Srivastava A, Gerdes Gyuricza I, Braun MJ, Rosenthal NA, Attie AD, Churchill GA, Carter GW. Transcripts with high distal heritability mediate genetic effects on complex metabolic traits. bioRxiv: the preprint server for biology 2024:2024.09.26.613931. [PMID: 39386475 PMCID: PMC11463413 DOI: 10.1101/2024.09.26.613931] [Indexed: 10/12/2024]
Abstract
Although many genes are subject to local regulation, recent evidence suggests that complex distal regulation may be more important in mediating phenotypic variability. To assess the role of distal gene regulation in complex traits, we combined multi-tissue transcriptomes with physiological outcomes to model diet-induced obesity and metabolic disease in a population of Diversity Outbred mice. Using a novel high-dimensional mediation analysis, we identified a composite transcriptome signature that summarized genetic effects on gene expression and explained 30% of the variation across all metabolic traits. The signature was heritable, interpretable in biological terms, and predicted obesity status from gene expression in an independently derived mouse cohort and multiple human studies. Transcripts contributing most strongly to this composite mediator frequently had complex, distal regulation distributed throughout the genome. These results suggest that trait-relevant variation in transcription is largely distally regulated, but is nonetheless identifiable, interpretable, and translatable across species.
9. Tibbs-Cortes LE, Guo T, Andorf CM, Li X, Yu J. Comprehensive identification of genomic and environmental determinants of phenotypic plasticity in maize. Genome Res 2024; 34:1253-1263. [PMID: 39271292 PMCID: PMC11444181 DOI: 10.1101/gr.279027.124] [Received: 01/24/2024] [Accepted: 08/06/2024] [Indexed: 09/15/2024]
Abstract
Maize phenotypes are plastic, determined by the complex interplay of genetics and environmental variables. Uncovering the genes responsible and understanding how their effects change across a large geographic region are challenging. In this study, we conducted systematic analysis to identify environmental indices that strongly influence 19 traits (including flowering time, plant architecture, and yield component traits) measured in the maize nested association mapping (NAM) population grown in 11 environments. Identified environmental indices based on day length, temperature, moisture, and combinations of these are biologically meaningful. Next, we leveraged a total of more than 20 million SNP and SV markers derived from recent de novo sequencing of the NAM founders for trait prediction and dissection. When combined with identified environmental indices, genomic prediction enables accurate performance predictions. Genome-wide association studies (GWASs) detected genetic loci associated with the plastic response to the identified environmental indices for all examined traits. By systematically uncovering the major environmental and genomic factors underlying phenotypic plasticity in a wide variety of traits and depositing our results as a track on the MaizeGDB genome browser, we provide a community resource as well as a comprehensive analytical framework to facilitate continuing complex trait dissection and prediction in maize and other crops. Our findings also provide a conceptual framework for the genetic architecture of phenotypic plasticity by accommodating two alternative models, regulatory gene model and allelic sensitivity model, as special cases of a continuum.
Affiliation(s)
- Laura E Tibbs-Cortes
  - Department of Agronomy, Iowa State University, Ames, Iowa 50011, USA
  - USDA-ARS, Wheat Health, Genetics, and Quality Research Unit, Pullman, Washington 99164, USA
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, Iowa 50011, USA
- Tingting Guo
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, Iowa 50011, USA
  - Department of Computer Science, Iowa State University, Ames, Iowa 50011, USA
- Xianran Li
  - USDA-ARS, Wheat Health, Genetics, and Quality Research Unit, Pullman, Washington 99164, USA
- Jianming Yu
  - Department of Agronomy, Iowa State University, Ames, Iowa 50011, USA
10. Khan M, Ludl AA, Bankier S, Björkegren JLM, Michoel T. Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables. arXiv 2024:arXiv:2401.06261v3. [PMID: 38259344 PMCID: PMC10802687] [Indexed: 01/24/2024]
Abstract
Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated with multiple nearby genes, MVMR can potentially be used to predict candidate causal genes. However, consensus in the field dictates that the genetic instruments in MVMR must be independent (not in linkage disequilibrium), which is usually not possible when considering a group of candidate genes from the same locus. Here we used causal inference theory to show that MVMR with correlated instruments satisfies the instrumental set condition. This is a classical result by Brito and Pearl (2002) for structural equation models that guarantees the identifiability of individual causal effects in situations where multiple exposures collectively, but not individually, separate a set of instrumental variables from an outcome variable. Extensive simulations confirmed the validity and usefulness of these theoretical results. Importantly, the causal effect estimates remained unbiased and their variance small even when instruments are highly correlated, while bias introduced by horizontal pleiotropy or LD matrix sampling error was comparable to standard MR. We applied MVMR with correlated instrumental variable sets at genome-wide significant loci for coronary artery disease (CAD) risk using expression Quantitative Trait Loci (eQTL) data from seven vascular and metabolic tissues in the STARNET study. Our method predicts causal genes at twelve loci, each associated with multiple colocated genes in multiple tissues. We confirm causal roles for PHACTR1 and ADAMTS7 in arterial tissues, among others.
However, the extensive degree of regulatory pleiotropy across tissues and the limited number of causal variants in each locus still require that MVMR is run on a tissue-by-tissue basis, and testing all gene-tissue pairs with cis-eQTL associations at a given locus in a single model to predict causal gene-tissue combinations remains infeasible. Our results show that within tissues, MVMR with dependent, as opposed to independent, sets of instrumental variables significantly expands the scope for predicting causal genes in disease risk loci with pleiotropic regulatory effects. However, considering risk loci with regulatory pleiotropy that also spans across tissues remains an unsolved problem.
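As a point of reference, summary-data MR with correlated variants is often written as a generalized least-squares fit that weights by the LD structure. The expression below is that generic form, given as an assumption rather than as the estimator used in this preprint; the abstract does not specify it, and all symbols are introduced only for this sketch. B_X is the J × K matrix of variant-exposure (here, variant-gene expression) associations, β̂_Y the J-vector of variant-outcome associations with standard errors σ_{Y,j}, and ρ_{jk} the LD correlation between variants j and k.

```latex
\hat{\boldsymbol{\beta}}
  = \left(\mathbf{B}_X^{\top}\boldsymbol{\Omega}^{-1}\mathbf{B}_X\right)^{-1}
    \mathbf{B}_X^{\top}\boldsymbol{\Omega}^{-1}\hat{\boldsymbol{\beta}}_Y,
\qquad
\Omega_{jk} = \sigma_{Y,j}\,\sigma_{Y,k}\,\rho_{jk}
```

With independent instruments, Ω is diagonal and the expression reduces to ordinary inverse-variance weighting; the off-diagonal ρ_{jk} terms are what allow instruments in LD to be used together.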
Affiliation(s)
- Mariyam Khan
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Adriaan-Alexander Ludl
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Sean Bankier
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Johan LM Björkegren
  - Department of Medicine (Huddinge), Karolinska Institutet, Huddinge, Sweden
  - Department of Genetics & Genomic Sciences/Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tom Michoel
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
11. Pividori M, Ritchie MD, Milone DH, Greene CS. An efficient, not-only-linear correlation coefficient based on clustering. Cell Syst 2024; 15:854-868.e3. [PMID: 39243756 DOI: 10.1016/j.cels.2024.08.005] [Received: 09/09/2022] [Revised: 06/18/2024] [Accepted: 08/15/2024] [Indexed: 09/09/2024]
Abstract
Identifying meaningful patterns in data is crucial for understanding complex biological processes, particularly in transcriptomics, where genes with correlated expression often share functions or contribute to disease mechanisms. Traditional correlation coefficients, which primarily capture linear relationships, may overlook important nonlinear patterns. We introduce the clustermatch correlation coefficient (CCC), a not-only-linear coefficient that utilizes clustering to efficiently detect both linear and nonlinear associations. CCC outperforms standard methods by revealing biologically meaningful patterns that linear-only coefficients miss and is faster than state-of-the-art coefficients such as the maximal information coefficient. When applied to human gene expression data from genotype-tissue expression (GTEx), CCC identified robust linear relationships and nonlinear patterns, such as sex-specific differences, that are undetectable by standard methods. Highly ranked gene pairs were enriched for interactions in integrated networks built from protein-protein interactions, transcription factor regulation, and chemical and genetic perturbations, suggesting that CCC can detect functional relationships missed by linear-only approaches. CCC is a highly efficient, next-generation, not-only-linear correlation coefficient for genome-scale data. A record of this paper's transparent peer review process is included in the supplemental information.
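To make the idea of a clustering-based coefficient concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract only says that CCC "utilizes clustering", so the quantile-based partitions, the adjusted Rand index (ARI) used to compare them, and the names `quantile_partition`, `ccc_sketch`, and `k_max` are all assumptions of this example.

```python
from collections import Counter
from math import comb


def quantile_partition(values, k):
    """Split values into k roughly equal-sized clusters by rank (assumed scheme)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * k // n
    return labels


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(a)
    sum_ab = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster
        return 0.0
    return (sum_ab - expected) / (max_index - expected)


def ccc_sketch(x, y, k_max=5):
    """Largest ARI over all pairs of quantile partitions, floored at 0."""
    ks = range(2, k_max + 1)
    parts_x = [quantile_partition(x, k) for k in ks]
    parts_y = [quantile_partition(y, k) for k in ks]
    best = max(adjusted_rand_index(px, py) for px in parts_x for py in parts_y)
    return max(best, 0.0)


def pearson(x, y):
    """Pearson correlation, for comparison with the clustering-based score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# A symmetric quadratic relationship: linear correlation vanishes,
# while the clustering-based score stays clearly above zero.
x = list(range(-10, 11))
y = [v * v for v in x]
print(pearson(x, y), ccc_sketch(x, y))
```

Comparing partitions rather than raw values is what lets a coefficient of this kind respond to non-monotonic patterns; the cost is a dependence on how the partitions are chosen, which is why the sketch scans several cluster counts and keeps the best match.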
Affiliation(s)
- Milton Pividori
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Marylyn D Ritchie
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Diego H Milone
  - Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), Universidad Nacional del Litoral, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Santa Fe CP3000, Argentina
- Casey S Greene
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA
  - Center for Health AI, University of Colorado School of Medicine, Aurora, CO 80045, USA
12. Smail C, Ge B, Keever-Keigher MR, Schwendinger-Schreck C, Cheung WA, Johnston JJ, Barrett C, Feldman K, Cohen ASA, Farrow EG, Thiffault I, Grundberg E, Pastinen T. Complex trait associations in rare diseases and impacts on Mendelian variant interpretation. Nat Commun 2024; 15:8196. [PMID: 39294130 PMCID: PMC11411080 DOI: 10.1038/s41467-024-52407-1] [Received: 02/01/2024] [Accepted: 09/05/2024] [Indexed: 09/20/2024] [Open Access]
Abstract
Emerging evidence implicates common genetic variation - aggregated into polygenic scores (PGS) - in the onset and phenotypic presentation of rare diseases. Here, we comprehensively map individual polygenic liability for 1102 open-source PGS in a cohort of 3059 probands enrolled in the Genomic Answers for Kids (GA4K) rare disease study, revealing widespread associations between rare disease phenotypes and PGSs for common complex diseases and traits, blood protein levels, and brain and other organ morphological measurements. Using this resource, we demonstrate increased polygenic liability in probands with an inherited candidate disease variant (VUS) compared to unaffected carrier parents. Further, we show an enrichment for large-effect rare variants in putative core PGS genes for associated complex traits. Overall, our study supports and expands on previous findings of complex trait associations in rare diseases, implicates polygenic liability as a potential mechanism underlying variable penetrance of candidate causal variants, and provides a framework for identifying novel candidate rare disease genes.
Affiliation(s)
- Craig Smail
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA.
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA.
| | - Bing Ge
- Department of Human Genetics, McGill University, Montreal, Canada
| | - Marissa R Keever-Keigher
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
| | | | - Warren A Cheung
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
| | - Jeffrey J Johnston
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
| | - Cassandra Barrett
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
| | - Keith Feldman
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA
- Health Outcomes and Health Services Research, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
| | - Ana S A Cohen
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA
- Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, USA
| | - Emily G Farrow
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
| | - Isabelle Thiffault
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA
- Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, USA
| | - Elin Grundberg
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA
| | - Tomi Pastinen
- Genomic Medicine Center, Department of Pediatrics, Children's Mercy Kansas City, Kansas City, USA.
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, USA.
|
13
|
Vattathil SM, Gerasimov ES, Canon SM, Lori A, Tan SSM, Kim PJ, Liu Y, Lai EC, Bennett DA, Wingo TS, Wingo AP. Genetic regulation of microRNAs in the older adult brain and their contribution to neuropsychiatric conditions. bioRxiv 2024:2024.09.10.610174. [PMID: 39314369 PMCID: PMC11419020 DOI: 10.1101/2024.09.10.610174] [Indexed: 09/25/2024]
Abstract
MicroRNAs are essential post-transcriptional regulators of gene expression and involved in many biological processes; however, our understanding of their genetic regulation and role in brain illnesses is limited. Here, we mapped brain microRNA expression quantitative trait loci (miR-QTLs) using genome-wide small RNA sequencing profiles from dorsolateral prefrontal cortex (dlPFC) samples of 604 older adult donors of European ancestry. miR-QTLs were identified for 224 miRNAs (48% of 470 tested miRNAs) at false discovery rate < 1%. We found that miR-QTLs were enriched in brain promoters and enhancers, and that intragenic miRNAs often did not share QTLs with their host gene. Additionally, we integrated the brain miR-QTLs with results from 16 GWAS of psychiatric and neurodegenerative diseases using multiple independent integration approaches and identified four miRNAs that contribute to the pathogenesis of bipolar disorder, major depression, post-traumatic stress disorder, schizophrenia, and Parkinson's disease. This study provides novel insights into the contribution of miRNAs to the complex biological networks that link genetic variation to disease.
Affiliation(s)
- Selina M Vattathil
- Department of Neurology, University of California, Davis, Sacramento, CA, USA
| | | | - Se Min Canon
- Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
| | - Adriana Lori
- Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
| | - Sarah Sze Min Tan
- Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
| | - Paul J Kim
- Department of Psychiatry, Emory University School of Medicine, Atlanta, GA, USA
| | - Yue Liu
- Department of Neurology, University of California, Davis, Sacramento, CA, USA
| | - Eric C Lai
- Developmental Biology Program, Sloan Kettering Institute, New York, NY, USA
| | - David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, Illinois, USA
| | - Thomas S Wingo
- Department of Neurology, University of California, Davis, Sacramento, CA, USA
- Alzheimer's Disease Research Center, University of California, Davis, Sacramento, CA, USA
| | - Aliza P Wingo
- Department of Psychiatry, University of California, Davis, Sacramento, CA, USA
- Veterans Affairs Northern California Health Care System, Sacramento, CA, USA
|
14
|
Yang L, Qin W, Wei X, Liu R, Yang J, Wang Z, Yan Q, Zhang Y, Hu W, Han X, Gao C, Zhan J, Gao B, Ge X, Li F, Yang Z. Regulatory networks of coresident subgenomes during rapid fiber cell elongation in upland cotton. Plant Communications 2024:101130. [PMID: 39257006 DOI: 10.1016/j.xplc.2024.101130] [Received: 04/19/2024] [Revised: 08/09/2024] [Accepted: 09/05/2024] [Indexed: 09/12/2024]
Abstract
Cotton, an intriguing plant species shaped by polyploidization, evolution, and domestication, holds particular interest due to the complex mechanisms governing fiber traits across its two subgenomes. However, the regulatory elements or transcriptional networks between subgenomes during fiber elongation remain to be fully clarified. Here, we analyzed 1462 cotton fiber samples to reconstruct the gene-expression regulatory networks that influence fiber cell elongation. Inter-subgenome expression quantitative trait loci (eQTLs) largely dictate gene transcription, with a notable tendency for the D subgenome to regulate A-subgenome eGenes. This regulation reveals synchronized homoeologous gene expression driven by co-localized eQTLs and divergent patterns that diminish genetic correlations, thus leading to preferential expression in the A and D subgenomes. Hotspot456 emerged as a key regulator of fiber initiation and elongation, and artificial selection of trans-eQTLs in hotspot456 that positively regulate KCS1 has facilitated cell elongation. Experiments designed to clarify the roles of trans-eQTLs in improved fiber breeding confirmed the inhibition of GhTOL9 by a specific trans-eQTL via GhWRKY28, which negatively affects fiber elongation. We propose a model in which the GhWRKY28-GhTOL9 module regulates this process through the ESCRT (endosomal sorting complex required for transport) pathway. This research significantly advances our understanding of cotton's evolutionary and domestication processes and the intricate regulatory mechanisms that underlie significant plant traits.
Affiliation(s)
- Lan Yang
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Wenqiang Qin
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Xi Wei
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Rui Liu
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Jiaxiang Yang
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China
| | - Zhi Wang
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Qingdi Yan
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Yihao Zhang
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China; National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China
| | - Wei Hu
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China
| | - Xiao Han
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Chenxu Gao
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China
| | - Jingjing Zhan
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Baibai Gao
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China
| | - Xiaoyang Ge
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China; National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China.
| | - Fuguang Li
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China; National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China.
| | - Zhaoen Yang
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research Chinese Academy of Agricultural Sciences, Anyang 455000, China; National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450000, China.
|
15
|
Stone K, Platig J, Quackenbush J, Fagny M. The Importance of Regulatory Network Structure for Complex Trait Heritability and Evolution. bioRxiv 2024:2024.02.27.582063. [PMID: 38464142 PMCID: PMC10925220 DOI: 10.1101/2024.02.27.582063] [Indexed: 03/12/2024]
Abstract
Complex traits are determined by many loci (mostly regulatory elements) that, through combinatorial interactions, can affect multiple traits. Such high levels of epistasis and pleiotropy have been proposed in the omnigenic model and may explain why such a large part of complex trait heritability is usually missed by genome-wide association studies while raising questions about the possibility for such traits to evolve in response to environmental constraints. To explore the molecular bases of complex traits and understand how they can adapt, we systematically analyzed the distribution of SNP heritability for ten traits across 29 tissue-specific Expression Quantitative Trait Locus (eQTL) networks. We find that heritability is clustered in a small number of tissue-specific, functionally relevant SNP-gene modules and that the greatest heritability occurs in local "hubs" that are both the cornerstone of the network's modules and tissue-specific regulatory elements. The network structure could thus both amplify the genotype-phenotype connection and buffer the deleterious effect of the genetic variations on other traits. We confirm that this structure has allowed complex traits to evolve in response to environmental constraints, with the local "hubs" being the preferential targets of past and ongoing directional selection. Together, these results provide a conceptual framework for understanding complex trait architecture and evolution.
Affiliation(s)
- Katherine Stone
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Department of Data Science and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
| | - John Platig
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, USA
- Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia, USA
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, USA
| | - John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Department of Data Science and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts, United States
| | - Maud Fagny
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Department of Data Science and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Genetique Quantitative et Evolution - Le Moulon, Gif-sur-Yvette 91190 France
|
16
|
Battlay P, Yeaman S, Hodgins KA. Impacts of pleiotropy and migration on repeated genetic adaptation. Genetics 2024; 228:iyae111. [PMID: 38996046 PMCID: PMC11373517 DOI: 10.1093/genetics/iyae111] [Received: 05/09/2024] [Revised: 05/09/2024] [Accepted: 06/11/2024] [Indexed: 07/14/2024]
Abstract
Observations of genetically repeated evolution (repeatability) in complex organisms are incongruent with the Fisher-Orr model, which implies that repeated use of the same gene should be rare when mutations are pleiotropic (i.e. affect multiple traits). When spatially divergent selection occurs in the presence of migration, mutations of large effect are more strongly favored, and hence, repeatability is more likely, but it is unclear whether this observation is limited by pleiotropy. Here, we explore this question using individual-based simulations of a two-patch model incorporating multiple quantitative traits governed by mutations with pleiotropic effects. We explore the relationship between fitness trade-offs and repeatability by varying the alignment between mutation effect and spatial variation in trait optima. While repeatability decreases with increasing trait dimensionality, trade-offs in mutation effects on traits do not strongly limit the contribution of a locus of large effect to repeated adaptation, particularly under increased migration. These results suggest that repeatability will be more pronounced for local rather than global adaptation. Whereas pleiotropy limits repeatability in a single-population model, when there is local adaptation with gene flow, repeatability can occur if some loci are able to produce alleles of large effect, even when there are pleiotropic trade-offs.
Affiliation(s)
- Paul Battlay
- School of Biological Sciences, Monash University, 25 Rainforest Walk, Clayton, Victoria 3800, Australia
| | - Sam Yeaman
- Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, Alberta, Canada T2N 1N4
| | - Kathryn A Hodgins
- School of Biological Sciences, Monash University, 25 Rainforest Walk, Clayton, Victoria 3800, Australia
|
17
|
Chu YH, Lee YS, Gomez-Cano F, Gomez-Cano L, Zhou P, Doseff AI, Springer N, Grotewold E. Molecular mechanisms underlying gene regulatory variation of maize metabolic traits. The Plant Cell 2024; 36:3709-3728. [PMID: 38922302 PMCID: PMC11371180 DOI: 10.1093/plcell/koae180] [Received: 11/29/2023] [Revised: 04/17/2024] [Accepted: 06/17/2024] [Indexed: 06/27/2024]
Abstract
Variation in gene expression levels is pervasive among individuals and races or varieties, and has substantial agronomic consequences, for example, by contributing to hybrid vigor. Gene expression level variation results from mutations in regulatory sequences (cis) and/or transcription factor (TF) activity (trans), but the mechanisms underlying cis- and/or trans-regulatory variation of complex phenotypes remain largely unknown. Here, we investigated gene expression variation mechanisms underlying the differential accumulation of the insecticidal compounds maysin and chlorogenic acid in silks of widely used maize (Zea mays) inbreds, B73 and A632. By combining transcriptomics and cistromics, we identified 1,338 silk direct targets of the maize R2R3-MYB TF Pericarp color1 (P1), consistent with it being a regulator of maysin and chlorogenic acid biosynthesis. Among these P1 targets, 464 showed allele-specific expression (ASE) between B73 and A632 silks. Allelic DNA-affinity purification sequencing identified 34 examples in which P1 allelic specific binding (ASB) correlated with cis-expression variation. From previous yeast one-hybrid studies, we identified 9 TFs potentially implicated in the control of P1 targets, with ASB to 83 out of 464 ASE genes (cis) and differential expression of 4 out of 9 TFs between B73 and A632 silks (trans). These results provide a molecular framework for understanding universal mechanisms underlying natural variation of gene expression levels, and how the regulation of metabolic diversity is established.
Affiliation(s)
- Yi-Hsuan Chu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
| | - Yun Sun Lee
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
| | - Fabio Gomez-Cano
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
| | - Lina Gomez-Cano
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
| | - Peng Zhou
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
| | - Andrea I Doseff
- Department of Physiology and Department of Pharmacology & Toxicology, Michigan State University, East Lansing, MI 48824, USA
| | - Nathan Springer
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
| | - Erich Grotewold
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
|
18
|
Mackay TFC, Anholt RRH. Pleiotropy, epistasis and the genetic architecture of quantitative traits. Nat Rev Genet 2024; 25:639-657. [PMID: 38565962 PMCID: PMC11330371 DOI: 10.1038/s41576-024-00711-3] [Accepted: 02/14/2024] [Indexed: 04/04/2024]
Abstract
Pleiotropy (whereby one genetic polymorphism affects multiple traits) and epistasis (whereby non-linear interactions between genetic polymorphisms affect the same trait) are fundamental aspects of the genetic architecture of quantitative traits. Recent advances in the ability to characterize the effects of polymorphic variants on molecular and organismal phenotypes in human and model organism populations have revealed the prevalence of pleiotropy and unexpected shared molecular genetic bases among quantitative traits, including diseases. By contrast, epistasis is common between polymorphic loci associated with quantitative traits in model organisms, such that alleles at one locus have different effects in different genetic backgrounds, but is rarely observed for human quantitative traits and common diseases. Here, we review the concepts and recent inferences about pleiotropy and epistasis, and discuss factors that contribute to similarities and differences between the genetic architecture of quantitative traits in model organisms and humans.
Affiliation(s)
- Trudy F C Mackay
- Center for Human Genetics, Clemson University, Greenwood, SC, USA.
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA.
| | - Robert R H Anholt
- Center for Human Genetics, Clemson University, Greenwood, SC, USA.
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA.
|
19
|
Borcuk C, Parihar M, Sportelli L, Kleinman JE, Shin JH, Hyde TM, Bertolino A, Weinberger DR, Pergola G. Network-wide risk convergence in gene co-expression identifies reproducible genetic hubs of schizophrenia risk. Neuron 2024:S0896-6273(24)00575-0. [PMID: 39236717 DOI: 10.1016/j.neuron.2024.08.005] [Received: 11/08/2023] [Revised: 04/03/2024] [Accepted: 08/07/2024] [Indexed: 09/07/2024]
Abstract
The omnigenic model posits that genetic risk for traits with complex heritability involves cumulative effects of peripheral genes on mechanistic "core genes," suggesting that in a network of genes, those closer to clusters including core genes should have higher GWAS signals. In gene co-expression networks, we confirmed that GWAS signals accumulate in genes more connected to risk-enriched gene clusters, highlighting across-network risk convergence. This was strongest in adult psychiatric disorders, especially schizophrenia (SCZ), spanning 70% of network genes, suggestive of super-polygenic architecture. In snRNA-seq cell type networks, SCZ risk convergence was strongest in L2/L3 excitatory neurons. We prioritized genes most connected to SCZ-GWAS genes, which showed robust association to a CRISPRa measure of PGC3 regulation and were consistently identified across several brain regions. Several genes, including dopamine-associated ones, were prioritized specifically in the striatum. This strategy thus retrieves current drug targets and can be used to prioritize other potential drug targets.
Affiliation(s)
- Christopher Borcuk
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA; Group of Psychiatric Neuroscience, Department of Translational Biomedicine and Neuroscience, University of Bari Aldo Moro, Bari, Italy
| | - Madhur Parihar
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
| | - Leonardo Sportelli
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA; Group of Psychiatric Neuroscience, Department of Translational Biomedicine and Neuroscience, University of Bari Aldo Moro, Bari, Italy
| | - Joel E Kleinman
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Joo Heon Shin
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
| | - Thomas M Hyde
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Alessandro Bertolino
- Group of Psychiatric Neuroscience, Department of Translational Biomedicine and Neuroscience, University of Bari Aldo Moro, Bari, Italy; Azienda Ospedaliero-Universitaria Consorziale Policlinico, Bari, Italy
| | - Daniel R Weinberger
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| | - Giulio Pergola
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA; Group of Psychiatric Neuroscience, Department of Translational Biomedicine and Neuroscience, University of Bari Aldo Moro, Bari, Italy; Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
|
20
|
Johnson OD, Paul S, Gutierrez JA, Russell WK, Ward MC. DNA damage-associated protein co-expression network in cardiomyocytes informs on tolerance to genetic variation and disease. bioRxiv 2024:2024.08.14.607863. [PMID: 39185220 PMCID: PMC11343126 DOI: 10.1101/2024.08.14.607863] [Indexed: 08/27/2024]
Abstract
Cardiovascular disease (CVD) is associated with both genetic variants and environmental factors. One unifying consequence of the molecular risk factors in CVD is DNA damage, which must be repaired by DNA damage response proteins. However, the impact of DNA damage on global cardiomyocyte protein abundance and its relationship to CVD risk remain unclear. We therefore treated induced pluripotent stem cell-derived cardiomyocytes with the DNA-damaging agent Doxorubicin (DOX) and a vehicle control, and identified 4,178 proteins that contribute to a network comprising 12 co-expressed modules and 403 hub proteins with high intramodular connectivity. Five modules correlate with DOX and represent distinct biological processes including RNA processing, chromatin regulation and metabolism. DOX-correlated hub proteins are depleted for proteins that vary in expression across individuals due to genetic variation but are enriched for proteins encoded by loss-of-function intolerant genes. While proteins associated with genetic risk for CVD, such as arrhythmia, are enriched in specific DOX-correlated modules, DOX-correlated hub proteins are not enriched for known CVD risk proteins. Instead, they are enriched among proteins that physically interact with CVD risk proteins. Our data demonstrate that DNA damage in cardiomyocytes induces diverse effects on biological processes through protein co-expression modules that are relevant for CVD, and that the level of protein connectivity in DNA damage-associated modules influences the tolerance to genetic variation.
Affiliation(s)
- Omar D. Johnson
- Biochemistry, Cellular and Molecular Biology Graduate Program, University of Texas Medical Branch, Galveston, Texas, USA
- MD-PhD Combined Degree Program, University of Texas Medical Branch, Galveston, Texas, USA
| | - Sayan Paul
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Jose A. Gutierrez
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
| | - William K. Russell
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Michelle C. Ward
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
|
21
|
Qi G, Chhetri SB, Ray D, Dutta D, Battle A, Bhattacharjee S, Chatterjee N. Genome-wide large-scale multi-trait analysis characterizes global patterns of pleiotropy and unique trait-specific variants. Nat Commun 2024; 15:6985. [PMID: 39143063 PMCID: PMC11324957 DOI: 10.1038/s41467-024-51075-5] [Received: 04/23/2023] [Accepted: 07/29/2024] [Indexed: 08/16/2024]
Abstract
Genome-wide association studies (GWAS) have found widespread evidence of pleiotropy, but characterization of global patterns of pleiotropy remains highly incomplete due to insufficient power of current approaches. We develop fastASSET, a method that allows efficient detection of variant-level pleiotropic association across many traits. We analyze GWAS summary statistics of 116 complex traits of diverse types collected from the GRASP repository and large GWAS Consortia. We identify 2293 independent loci and find that the lead variants in nearly all these loci (~99%) are associated with ≥ 2 traits (median = 6). We observe that the degree of pleiotropy estimated from our study predicts that observed in the UK Biobank for a much larger number of traits (K = 4114) (correlation = 0.43, p-value < 2.2 × 10⁻¹⁶). Follow-up analyses of 21 trait-specific variants indicate their link to the expression in trait-related tissues for a small number of genes involved in relevant biological processes. Our findings provide deeper insight into the nature of pleiotropy and lead to identification of highly trait-specific susceptibility variants.
Affiliation(s)
- Guanghao Qi
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Surya B Chhetri
- Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Debashree Ray
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
| | - Diptavo Dutta
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Genetic Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Samsiddhi Bhattacharjee
- Biotechnology Research and Innovation Council-National Institute of Biomedical Genomics (BRIC-NIBMG), Kalyani, India.
| | - Nilanjan Chatterjee
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA.
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD, USA.
|
22
|
Sargurupremraj M. Genetic Architecture of Neurological Disorders and Their Endophenotypes: Insights from Genetic Association Studies. Curr Top Behav Neurosci 2024. [PMID: 39138743 DOI: 10.1007/7854_2024_513] [Indexed: 08/15/2024]
Abstract
Population-scale genetic association studies of complex neurologic diseases have identified the underlying genetic architecture as multifactorial. Despite the study sample sizes reaching the millions, the identified disease-related genes explain only a small fraction of the phenotypic variance. Notable advancements in statistical methods now enable researchers to gain insights even from genomic regions where genotype-phenotype associations do not reach statistical significance. Such studies confirm a highly interconnected molecular network comprising a core group of genes directly involved in the disease process, alongside an expanded peripheral network, each contributing a small but potentially important (modulatory) effect. Additionally, causal inference methods, utilizing genetic instruments, have shed light on putative causal links between risk factors and clinical endpoints. In light of the pervasive genetic overlap or pleiotropy, however, caution is warranted in interpreting causal relationships inferred from these analyses. In this chapter, I will introduce the genetic association model, provide insights into the current state of genetic association studies, and discuss potential future directions.
Affiliation(s)
- Muralidharan Sargurupremraj
- Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, University of Texas Health Sciences Center, San Antonio, TX, USA.
23
Starr AL, Nishimura T, Igarashi KJ, Funamoto C, Nakauchi H, Fraser HB. Disentangling cell-intrinsic and extrinsic factors underlying evolution. bioRxiv 2024:2024.05.06.592777. [PMID: 38798687 PMCID: PMC11118348 DOI: 10.1101/2024.05.06.592777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
A key goal of developmental biology is to determine the extent to which cells and organs develop autonomously, as opposed to requiring interactions with other cells or environmental factors. Chimeras have played a foundational role in this by enabling qualitative classification of cell-intrinsically vs. extrinsically driven processes. Here, we extend this framework to precisely decompose evolutionary divergence in any quantitative trait into cell-intrinsic, extrinsic, and intrinsic-extrinsic interaction components. Applying this framework to thousands of gene expression levels in reciprocal rat-mouse chimeras, we found that the majority of their divergence is attributable to cell-intrinsic factors, though extrinsic factors also play an integral role. For example, a rat-like extracellular environment extrinsically up-regulates the expression of a key transcriptional regulator of the endoplasmic reticulum (ER) stress response in some but not all cell types, which in turn strongly predicts extrinsic up-regulation of its target genes and of the ER stress response pathway as a whole. This effect is also seen at the protein level, suggesting propagation through multiple regulatory levels. Applying our framework to a cellular trait, neuronal differentiation, revealed a complex interaction of intrinsic and extrinsic factors. Finally, we show that imprinted genes are dramatically mis-expressed in species-mismatched environments, suggesting that mismatch between rapidly evolving intrinsic and extrinsic mechanisms controlling gene imprinting may contribute to barriers to interspecies chimerism. Overall, our conceptual framework opens new avenues to investigate the mechanistic basis of developmental processes and evolutionary divergence across myriad quantitative traits in any multicellular organism.
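One way to picture the decomposition described in this abstract is a 2x2 factorial layout (cell species by host environment). The sketch below is an illustration under that assumption only, not the authors' implementation; the function name `decompose_divergence` and the four argument names are hypothetical.

```python
from statistics import mean


def decompose_divergence(rat_in_rat, rat_in_mouse, mouse_in_rat, mouse_in_mouse):
    """Split a trait difference into intrinsic, extrinsic and interaction parts.

    Each argument is a list of replicate measurements for one
    (cell species, host environment) combination. Hypothetical layout.
    """
    rr, rm = mean(rat_in_rat), mean(rat_in_mouse)
    mr, mm = mean(mouse_in_rat), mean(mouse_in_mouse)

    # Effect of cell species, averaged over the two host environments.
    intrinsic = ((rr - mr) + (rm - mm)) / 2
    # Effect of host environment, averaged over the two cell species.
    extrinsic = ((rr - rm) + (mr - mm)) / 2
    # Difference between the environment effect in rat cells and in mouse cells.
    interaction = (rr - rm) - (mr - mm)

    return {
        "intrinsic": intrinsic,
        "extrinsic": extrinsic,
        "interaction": interaction,
        "total": rr - mm,  # divergence between the two matched conditions
    }


result = decompose_divergence([10, 12], [8, 8], [5, 5], [4, 4])
```

With this parameterization the two main effects sum to the total divergence between the matched conditions, and the interaction term reports how much the environment effect differs between the two cell species.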
Affiliation(s)
- Toshiya Nishimura
- Institute for Stem Cell Biology and Regenerative Medicine, Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- WPI Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Osaka, 565-0871, Japan (current address for T.N.)
- Division of Stem Cell and Organoid Medicine, Department of Genome Biology, Graduate School of Medicine, Osaka University, Osaka, 565-0871, Japan
- Kyomi J. Igarashi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Chihiro Funamoto
- Division of Stem Cell and Organoid Medicine, Department of Genome Biology, Graduate School of Medicine, Osaka University, Osaka, 565-0871, Japan
- Hiromitsu Nakauchi
- Institute for Stem Cell Biology and Regenerative Medicine, Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Division of Stem Cell Therapy, Distinguished Professor Unit, The Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo 108-8639, Japan
- Hunter B. Fraser
- Department of Biology, Stanford University, Stanford, CA 94305, USA
24
Leask MP, Crișan TO, Ji A, Matsuo H, Köttgen A, Merriman TR. The pathogenesis of gout: molecular insights from genetic, epigenomic and transcriptomic studies. Nat Rev Rheumatol 2024; 20:510-523. [PMID: 38992217 DOI: 10.1038/s41584-024-01137-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/11/2024] [Indexed: 07/13/2024]
Abstract
The pathogenesis of gout involves a series of steps beginning with hyperuricaemia, followed by the deposition of monosodium urate crystal in articular structures and culminating in an innate immune response, mediated by the NLRP3 inflammasome, to the deposited crystals. Large genome-wide association studies (GWAS) of serum urate levels initially identified the genetic variants with the strongest effects, mapping mainly to genes that encode urate transporters in the kidney and gut. Other GWAS highlighted the importance of uncommon genetic variants. More recently, genetic and epigenetic genome-wide studies have revealed new pathways in the inflammatory process of gout, including genetic associations with epigenomic modifiers. Epigenome-wide association studies are also implicating epigenomic remodelling in gout, which perhaps regulates the responsiveness of the innate immune system to monosodium urate crystals. Notably, genes implicated in gout GWAS do not include those encoding components of the NLRP3 inflammasome itself, but instead include genes encoding molecules involved in its regulation. Knowledge of the molecular mechanisms underlying gout has advanced through the translation of genetic associations into specific molecular mechanisms. Notable examples include ABCG2, HNF4A, PDZK1, MAF and IL37. Current genetic studies are dominated by participants of European ancestry; however, studies focusing on other population groups are discovering informative population-specific variants associated with gout.
Affiliation(s)
- Megan P Leask
- Department of Physiology, University of Otago, Dunedin, Aotearoa, New Zealand
- Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, AL, USA
- Tania O Crișan
- Department of Medical Genetics, "Iuliu Haţieganu" University of Medicine and Pharmacy, Cluj-Napoca, Romania
- Aichang Ji
- Affiliated Hospital of Qingdao University, Qingdao University, Qingdao, China
- Hirotaka Matsuo
- Department of Integrative Physiology and Bio-Nano Medicine, National Defense Medical College, Saitama, Japan
- Anna Köttgen
- Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg, Germany
- Tony R Merriman
- Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, AL, USA.
- Department of Microbiology and Immunology, University of Otago, Dunedin, Aotearoa, New Zealand.
25
Yao D, Binan L, Bezney J, Simonton B, Freedman J, Frangieh CJ, Dey K, Geiger-Schuller K, Eraslan B, Gusev A, Regev A, Cleary B. Scalable genetic screening for regulatory circuits using compressed Perturb-seq. Nat Biotechnol 2024; 42:1282-1295. [PMID: 37872410 PMCID: PMC11035494 DOI: 10.1038/s41587-023-01964-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/05/2023] [Accepted: 08/22/2023] [Indexed: 10/25/2023]
Abstract
Pooled CRISPR screens with single-cell RNA sequencing readout (Perturb-seq) have emerged as a key technique in functional genomics, but they are limited in scale by cost and combinatorial complexity. In this study, we modified the design of Perturb-seq by incorporating algorithms applied to random, low-dimensional observations. Compressed Perturb-seq measures multiple random perturbations per cell or multiple cells per droplet and computationally decompresses these measurements by leveraging the sparse structure of regulatory circuits. Applied to 598 genes in the immune response to bacterial lipopolysaccharide, compressed Perturb-seq achieves the same accuracy as conventional Perturb-seq with an order of magnitude cost reduction and greater power to learn genetic interactions. We identified known and novel regulators of immune responses and uncovered evolutionarily constrained genes with downstream targets enriched for immune disease heritability, including many missed by existing genome-wide association studies. Our framework enables new scales of interrogation for a foundational method in functional genomics.
Affiliation(s)
- Douglas Yao
- Program in Systems, Synthetic, and Quantitative Biology, Harvard University, Cambridge, MA, USA
- Loic Binan
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Jon Bezney
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Brooke Simonton
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Jahanara Freedman
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Chris J Frangieh
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Kushal Dey
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Alexander Gusev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
- Aviv Regev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Genentech, South San Francisco, CA, USA
- Brian Cleary
- Faculty of Computing and Data Sciences, Boston University, Boston, MA, USA.
- Department of Biology, Boston University, Boston, MA, USA.
- Department of Biomedical Engineering, Boston University, Boston, MA, USA.
- Program in Bioinformatics, Boston University, Boston, MA, USA.
- Biological Design Center, Boston University, Boston, MA, USA.
26
Qi T, Song L, Guo Y, Chen C, Yang J. From genetic associations to genes: methods, applications, and challenges. Trends Genet 2024; 40:642-667. [PMID: 38734482 DOI: 10.1016/j.tig.2024.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 05/13/2024]
Abstract
Genome-wide association studies (GWASs) have identified numerous genetic loci associated with human traits and diseases. However, pinpointing the causal genes remains a challenge, which impedes the translation of GWAS findings into biological insights and medical applications. In this review, we provide an in-depth overview of the methods and technologies used for prioritizing genes from GWAS loci, including gene-based association tests, integrative analysis of GWAS and molecular quantitative trait loci (xQTL) data, linking GWAS variants to target genes through enhancer-gene connection maps, and network-based prioritization. We also outline strategies for generating context-dependent xQTL data and their applications in gene prioritization. We further highlight the potential of gene prioritization in drug repurposing. Lastly, we discuss future challenges and opportunities in this field.
Affiliation(s)
- Ting Qi
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
- Liyang Song
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
- Yazhou Guo
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
- Chang Chen
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
- Jian Yang
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
27
Li J, Wang B, Ma X. Non-Coding RNAs Extended Omnigenic Module of Cancers. Entropy (Basel) 2024; 26:640. [PMID: 39202109 PMCID: PMC11353529 DOI: 10.3390/e26080640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 07/24/2024] [Accepted: 07/25/2024] [Indexed: 09/03/2024]
Abstract
The emergence of cancers involves numerous coding and non-coding genes. Understanding the contribution of non-coding RNAs (ncRNAs) to the cancer neighborhood is crucial for interpreting the interaction between molecular markers of cancer. However, there is a lack of systematic studies on the involvement of ncRNAs in the cancer neighborhood. In this paper, we construct an interaction network which encompasses multiple genes. We focus on the fundamental topological indicator, namely connectivity, and evaluate its performance when applied to cancer-affected genes using statistical indices. Our findings reveal that ncRNAs significantly enhance the connectivity of affected genes and mediate the inclusion of more genes in the cancer module. To further explore the role of ncRNAs in the network, we propose a connectivity-based method which leverages the bridging function of ncRNAs across cancer-affected genes and reveals the non-coding RNAs extended omnigenic module (NeOModule). Topologically, this module promotes the formation of cancer patterns involving ncRNAs. Biologically, it is enriched with cancer pathways and treatment targets, providing valuable insights into disease relationships.
Affiliation(s)
- Bingbo Wang
- School of Computer Science and Technology, Xidian University, Xi’an 710119, China; (J.L.); (X.M.)
28
Cote AC, Young HE, Huckins LM. Critical reasoning on the co-expression module QTL in the dorsolateral prefrontal cortex. HGG Adv 2024; 5:100311. [PMID: 38773772 PMCID: PMC11214266 DOI: 10.1016/j.xhgg.2024.100311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 05/16/2024] [Accepted: 05/16/2024] [Indexed: 05/24/2024] Open
Abstract
Expression quantitative trait locus (eQTL) analysis is a popular method of gaining insight into the function of regulatory variation. While cis-eQTL resources have been instrumental in linking genome-wide association study variants to gene function, complex trait heritability may be additionally mediated by other forms of gene regulation. Toward this end, novel eQTL methods leverage gene co-expression (module-QTL) to investigate joint regulation of gene modules by single genetic variants. Here we broadly define a "module-QTL" as the association of a genetic variant with a summary measure of gene co-expression. This approach aims to reduce the multiple testing burden of a trans-eQTL search through the consolidation of gene-based testing and provide biological context to eQTLs shared between genes. In this article we provide an in-depth examination of the co-expression module eQTL (module-QTL) through literature review, theoretical investigation, and real-data application of the module-QTL to three large prefrontal cortex genotype-RNA sequencing datasets. We find module-QTLs in our study that are disease associated and reproducible are not additionally informative beyond cis- or trans-eQTLs for module genes. Through comparison to prior studies, we highlight promises and limitations of the module-QTL across study designs and provide recommendations for further investigation of the module-QTL framework.
Affiliation(s)
- Alanna C Cote
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
- Hannah E Young
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Laura M Huckins
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06511, USA.
29
Akinbiyi T, McPeek MS, Abney M. ADELLE: A global testing method for Trans-eQTL mapping. bioRxiv 2024:2024.02.24.581871. [PMID: 38464248 PMCID: PMC10925110 DOI: 10.1101/2024.02.24.581871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Understanding the genetic regulatory mechanisms of gene expression is a challenging and ongoing problem. Genetic variants that are associated with expression levels are readily identified when they are proximal to the gene (i.e., cis-eQTLs), but SNPs distant from the gene whose expression levels they are associated with (i.e., trans-eQTLs) have been much more difficult to discover, even though they account for a majority of the heritability in gene expression levels. A major impediment to the identification of more trans-eQTLs is the lack of statistical methods that are powerful enough to overcome the obstacles of small effect sizes and the large multiple testing burden of trans-eQTL mapping. Here, we propose ADELLE, a powerful statistical testing framework that requires only summary statistics and is designed to be most sensitive to SNPs that are associated with multiple gene expression levels, a characteristic of many trans-eQTLs. In simulations, we show that for detecting SNPs that are associated with 0.1%-2% of 10,000 traits, among the 7 methods we consider ADELLE is clearly the most powerful overall, with either the highest power or power not significantly different from the highest for all settings in that range. We apply ADELLE to a mouse advanced intercross line data set and show its ability to find trans-eQTLs that were not significant under a standard analysis. This demonstrates that ADELLE is a powerful tool for uncovering trans regulators of genetic expression.
30
He J, Perera D, Wen W, Ping J, Li Q, Lyu L, Chen Z, Shu X, Long J, Cai Q, Shu XO, Zheng W, Long Q, Guo X. Enhancing Disease Risk Gene Discovery by Integrating Transcription Factor-Linked Trans-located Variants into Transcriptome-Wide Association Analyses. medRxiv 2024:2023.10.10.23295443. [PMID: 37873299 PMCID: PMC10593059 DOI: 10.1101/2023.10.10.23295443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying disease susceptibility genes by integrating gene expression predicted from cis-variants with genome-wide association study (GWAS) data. However, trans-located variants for predicting gene expression remain largely unexplored. Here, we introduce transTF-TWAS, which incorporates transcription factor (TF)-linked trans-located variants to enhance model building. Using data from the Genotype-Tissue Expression project, we predict gene expression and alternative splicing and apply these models to large GWAS datasets for breast, prostate, and lung cancers. We demonstrate that transTF-TWAS outperforms other existing TWAS approaches in both constructing gene prediction models and identifying disease-associated genes, as evidenced by simulations and real data analysis. Our transTF-TWAS approach significantly contributes to the discovery of disease risk genes. Findings from this study have shed new light on several genetically driven key regulators and their associated regulatory networks underlying disease susceptibility.
31
Fazel-Najafabadi M, Looger LL, Rallabandi HR, Nath SK. A Multilayered Post-Genome-Wide Association Study Analysis Pipeline Defines Functional Variants and Target Genes for Systemic Lupus Erythematosus. Arthritis Rheumatol 2024; 76:1071-1084. [PMID: 38369936 PMCID: PMC11213670 DOI: 10.1002/art.42829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 01/31/2024] [Accepted: 02/14/2024] [Indexed: 02/20/2024]
Abstract
OBJECTIVE: Systemic lupus erythematosus (SLE), an autoimmune disease with incompletely understood etiology, has a strong genetic component. Although genome-wide association studies (GWASs) have revealed multiple SLE susceptibility loci and associated single-nucleotide polymorphisms (SNPs), the precise causal variants, target genes, cell types, tissues, and mechanisms of action remain largely unknown.
METHODS: Here, we report a comprehensive post-GWAS analysis using extensive bioinformatics, molecular modeling, and integrative functional genomic and epigenomic analyses to optimize fine-mapping. We compile and cross-reference immune cell-specific expression quantitative trait loci (cis- and trans-expression quantitative trait loci) with promoter capture high-throughput capture chromatin conformation (PCHi-C), allele-specific chromatin accessibility, and massively parallel reporter assay data to define predisposing variants and target genes. We experimentally validate a predicted locus using CRISPR/Cas9 genome editing, quantitative polymerase chain reaction, and Western blot.
RESULTS: Anchoring on 452 index SNPs, we selected 9,931 high linkage disequilibrium (r² > 0.8) SNPs and defined 182 independent non-human leukocyte antigen (HLA) SLE loci. The 3,746 SNPs from 143 loci were identified as regulating 564 unique genes. Target genes are enriched in lupus-related tissues and associated with other autoimmune diseases. Of these, 329 SNPs (106 loci) showed significant allele-specific chromatin accessibility and/or enhancer activity, indicating regulatory potential. Using CRISPR/Cas9, we validated reference SNP identifier 57668933 (rs57668933) as a functional variant regulating multiple targets, including SLE-risk gene ELF1 in B cells.
CONCLUSION: We demonstrate and validate post-GWAS strategies for using multidimensional data to prioritize likely causal variants with cognate gene targets underlying SLE pathogenesis. Our results provide a catalog of significantly SLE-associated SNPs and loci, target genes, and likely biochemical mechanisms to guide experimental characterization.
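The linkage disequilibrium filter mentioned in the results (r² > 0.8 around index SNPs) can be illustrated with a small sketch. It assumes genotypes coded as 0/1/2 allele dosages and uses the squared Pearson correlation as the LD measure, a common approximation; the abstract does not say how the authors computed r², and the names `ld_r2`, `high_ld_partners`, `rsA` and `rsB` are hypothetical.

```python
from statistics import mean


def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    ma, mb = mean(geno_a), mean(geno_b)
    cov = sum((a - ma) * (b - mb) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - ma) ** 2 for a in geno_a)
    var_b = sum((b - mb) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        # Monomorphic SNP: correlation is undefined, treated here as no LD.
        return 0.0
    return cov * cov / (var_a * var_b)


def high_ld_partners(index_geno, candidates, threshold=0.8):
    """Return names of candidate SNPs whose r2 with the index SNP exceeds the threshold."""
    return [name for name, geno in candidates.items()
            if ld_r2(index_geno, geno) > threshold]


# Toy genotypes for six individuals (illustrative values only).
index_snp = [0, 1, 2, 0, 1, 2]
candidates = {
    "rsA": [0, 1, 2, 0, 1, 2],  # identical pattern to the index SNP
    "rsB": [2, 0, 1, 1, 2, 0],  # weakly correlated pattern
}
partners = high_ld_partners(index_snp, candidates)
```

In practice LD is usually taken from a reference panel with dedicated tools rather than recomputed by hand; the sketch only shows what the threshold means.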
Affiliation(s)
- Mehdi Fazel-Najafabadi
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Loren L. Looger
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92121, USA
- Howard Hughes Medical Institute, University of California, San Diego, La Jolla, CA 92121, USA
- Harikrishna Reddy Rallabandi
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Swapan K. Nath
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
32
Kozlowska J, Humphryes-Kirilov N, Pavlovets A, Connolly M, Kuncheva Z, Horner J, Manso AS, Murray C, Fox JC, McCarthy A. Unveiling new genetic insights in rheumatoid arthritis for drug discovery through Taxonomy3 analysis. Sci Rep 2024; 14:14153. [PMID: 38898196 PMCID: PMC11186831 DOI: 10.1038/s41598-024-64970-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 06/14/2024] [Indexed: 06/21/2024] Open
Abstract
Genetic support for a drug target has been shown to increase the probability of success in drug development, with the potential to reduce attrition in the pharmaceutical industry alongside discovering novel therapeutic targets. It is therefore important to maximise the detection of genetic associations that affect disease susceptibility. Conventional statistical methods such as genome-wide association studies (GWAS) only identify some of the genetic contribution to disease, so novel analytical approaches are required to extract additional insights. C4X Discovery has developed Taxonomy3, a unique method for analysing genetic datasets based on mathematics that is novel in drug discovery. When applied to a previously published rheumatoid arthritis GWAS dataset, Taxonomy3 identified many additional novel genetic signals associated with this autoimmune disease. Follow-up studies using tool compounds support the utility of the method in identifying novel biology and tractable drug targets with genetic support for further investigation.
Affiliation(s)
- Justyna Kozlowska
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK.
- Anastasia Pavlovets
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- Martin Connolly
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- Zhana Kuncheva
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- Jonathan Horner
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- Ana Sousa Manso
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- Clare Murray
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- J Craig Fox
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
- Alun McCarthy
- C4X Discovery Ltd, Manchester One, 53 Portland Street, Manchester, M1 3LD, UK
33
Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704 DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Zachary A Gardell
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
34
Benjamin KJM, Chen Q, Eagles NJ, Huuki-Myers LA, Collado-Torres L, Stolz JM, Pertea G, Shin JH, Paquola ACM, Hyde TM, Kleinman JE, Jaffe AE, Han S, Weinberger DR. Analysis of gene expression in the postmortem brain of neurotypical Black Americans reveals contributions of genetic ancestry. Nat Neurosci 2024; 27:1064-1074. [PMID: 38769152 PMCID: PMC11156587 DOI: 10.1038/s41593-024-01636-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/26/2023] [Accepted: 03/29/2024] [Indexed: 05/22/2024]
Abstract
Ancestral differences in genomic variation affect the regulation of gene expression; however, most gene expression studies have been limited to European ancestry samples or adjusted to identify ancestry-independent associations. Here, we instead examined the impact of genetic ancestry on gene expression and DNA methylation in the postmortem brain tissue of admixed Black American neurotypical individuals to identify ancestry-dependent and ancestry-independent contributions. Ancestry-associated differentially expressed genes (DEGs), transcripts and gene networks, while notably not implicating neurons, are enriched for genes related to the immune response and vascular tissue and explain up to 26% of heritability for ischemic stroke, 27% of heritability for Parkinson disease and 30% of heritability for Alzheimer's disease. Ancestry-associated DEGs also show general enrichment for the heritability of diverse immune-related traits but depletion for psychiatric-related traits. We also compared Black and non-Hispanic white Americans, confirming most ancestry-associated DEGs. Our results delineate the extent to which genetic ancestry affects differences in gene expression in the human brain and the implications for brain illness risk.
Affiliation(s)
- Kynon J M Benjamin
- Lieber Institute for Brain Development, Baltimore, MD, USA.
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Qiang Chen
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Leonardo Collado-Torres
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Joshua M Stolz
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Geo Pertea
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Joo Heon Shin
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Apuã C M Paquola
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Thomas M Hyde
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Joel E Kleinman
- Lieber Institute for Brain Development, Baltimore, MD, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Andrew E Jaffe
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Neumora Therapeutics, Watertown, MA, USA
| | - Shizhong Han
- Lieber Institute for Brain Development, Baltimore, MD, USA.
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| | - Daniel R Weinberger
- Lieber Institute for Brain Development, Baltimore, MD, USA.
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| |
Collapse
|
35
|
Chakraborty S, Sharma G, Karmakar S, Banerjee S. Multi-OMICS approaches in cancer biology: New era in cancer therapy. Biochim Biophys Acta Mol Basis Dis 2024; 1870:167120. [PMID: 38484941 DOI: 10.1016/j.bbadis.2024.167120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/01/2024]
Abstract
Innovative multi-omics frameworks integrate diverse datasets from the same patients to enhance our understanding of the molecular and clinical aspects of cancers. Advanced omics and multi-view clustering algorithms present unprecedented opportunities for classifying cancers into subtypes, refining survival predictions and treatment outcomes, and unravelling key pathophysiological processes across various molecular layers. However, with the increasing availability of cost-effective high-throughput technologies (HTT) that generate vast amounts of data, analyzing single layers often falls short of establishing causal relations. Integrating multi-omics data spanning genomes, epigenomes, transcriptomes, proteomes, metabolomes, and microbiomes offers unique prospects to comprehend the underlying biology of complex diseases like cancer. This discussion explores algorithmic frameworks designed to uncover cancer subtypes, disease mechanisms, and methods for identifying pivotal genomic alterations. It also underscores the significance of multi-omics in tumor classifications, diagnostics, and prognostications. Despite its unparalleled advantages, the integration of multi-omics data has been slow to find its way into everyday clinics. A major hurdle is the uneven maturity of different omics approaches and the widening gap between the generation of large datasets and the capacity to process this data. Initiatives promoting the standardization of sample processing and analytical pipelines, as well as multidisciplinary training for experts in data analysis and interpretation, are crucial for translating theoretical findings into practical applications.
Affiliation(s)
- Sohini Chakraborty
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Gaurav Sharma
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Sricheta Karmakar
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Satarupa Banerjee
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
|
36
|
Li XC, Gandara L, Ekelöf M, Richter K, Alexandrov T, Crocker J. Rapid response of fly populations to gene dosage across development and generations. Nat Commun 2024; 15:4551. [PMID: 38811562 PMCID: PMC11137061 DOI: 10.1038/s41467-024-48960-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 05/17/2024] [Indexed: 05/31/2024] Open
Abstract
Although the effects of genetic and environmental perturbations on multicellular organisms are rarely restricted to single phenotypic layers, our current understanding of how developmental programs react to these challenges remains limited. Here, we have examined the phenotypic consequences of disturbing the bicoid regulatory network in early Drosophila embryos. We generated flies with two extra copies of bicoid, which causes a posterior shift of the network's regulatory outputs and a decrease in fitness. We subjected these flies to EMS mutagenesis, followed by experimental evolution. After only 8-15 generations, experimental populations have normalized patterns of gene expression and increased survival. Using a phenomics approach, we find that populations were normalized through rapid increases in embryo size driven by maternal changes in metabolism and ovariole development. We extend our results to additional populations of flies, demonstrating predictability. Together, our results necessitate a broader view of regulatory network evolution at the systems level.
Affiliation(s)
- Xueying C Li
  - European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
  - College of Life Sciences, Beijing Normal University, Beijing, China
- Lautaro Gandara
  - European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Måns Ekelöf
  - European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Kerstin Richter
  - European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Theodore Alexandrov
  - European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
  - Molecular Medicine Partnership Unit between EMBL and Heidelberg University, Heidelberg, Germany
  - BioInnovation Institute, Copenhagen, Denmark
- Justin Crocker
  - European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
|
37
|
Ruzicka WB, Mohammadi S, Fullard JF, Davila-Velderrain J, Subburaju S, Tso DR, Hourihan M, Jiang S, Lee HC, Bendl J, Voloudakis G, Haroutunian V, Hoffman GE, Roussos P, Kellis M. Single-cell multi-cohort dissection of the schizophrenia transcriptome. Science 2024; 384:eadg5136. [PMID: 38781388 DOI: 10.1126/science.adg5136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 07/21/2023] [Indexed: 05/25/2024]
Abstract
The complexity and heterogeneity of schizophrenia have hindered mechanistic elucidation and the development of more effective therapies. Here, we performed single-cell dissection of schizophrenia-associated transcriptomic changes in the human prefrontal cortex across 140 individuals in two independent cohorts. Excitatory neurons were the most affected cell group, with transcriptional changes converging on neurodevelopment and synapse-related molecular pathways. Transcriptional alterations included known genetic risk factors, suggesting convergence of rare and common genomic variants on neuronal population-specific alterations in schizophrenia. Based on the magnitude of schizophrenia-associated transcriptional change, we identified two populations of individuals with schizophrenia marked by expression of specific excitatory and inhibitory neuronal cell states. This single-cell atlas links transcriptomic changes to etiological genetic risk factors, contextualizing established knowledge within the human cortical cytoarchitecture and facilitating mechanistic understanding of schizophrenia pathophysiology and heterogeneity.
Affiliation(s)
- W Brad Ruzicka
  - Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02478, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Shahin Mohammadi
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- John F Fullard
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jose Davila-Velderrain
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Neurogenomics Research Center, Human Technopole, 20157 Milan, Italy
- Sivan Subburaju
  - Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02478, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA 02115, USA
- Daniel Reed Tso
  - Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02478, USA
- Makayla Hourihan
  - Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02478, USA
- Shan Jiang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Hao-Chih Lee
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jaroslav Bendl
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Georgios Voloudakis
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Vahram Haroutunian
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY 10468, USA
- Gabriel E Hoffman
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Panos Roussos
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Neurogenomics Research Center, Human Technopole, 20157 Milan, Italy
- Manolis Kellis
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
38
|
Weine E, Smith SP, Knowlton RK, Harpak A. Tradeoffs in Modeling Context Dependency in Complex Trait Genetics. bioRxiv: the preprint server for biology 2024:2023.06.21.545998. [PMID: 38370664 PMCID: PMC10871201 DOI: 10.1101/2023.06.21.545998] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/20/2024]
Abstract
Genetic effects on complex traits may depend on context, such as age, sex, environmental exposures or social settings. However, it is often unclear if the extent of context dependency, or Gene-by-Environment interaction (GxE), merits more involved models than the additive model typically used to analyze data from genome-wide association studies (GWAS). Here, we suggest considering the utility of GxE models in GWAS as a tradeoff between bias and variance parameters. In particular, we derive a decision rule for choosing between competing models for the estimation of allelic effects. The rule weighs the increased estimation noise when context is considered against the potential bias when context dependency is ignored. In the empirical example of GxSex in human physiology, the increased noise of context-specific estimation often outweighs the bias reduction, rendering GxE models less useful when variants are considered independently. However, we argue that for complex traits, the joint consideration of context dependency across many variants mitigates both noise and bias. As a result, polygenic GxE models can improve both estimation and trait prediction. Finally, we exemplify (using GxDiet effects on longevity in fruit flies) how analyses based on independently ascertained "top hits" alone can be misleading, and that considering polygenic patterns of GxE can improve interpretation.
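The bias-variance tradeoff described in this abstract can be illustrated with a deliberately simplified sketch. It is not the decision rule derived by the authors. It assumes two equally sized contexts, independent context-specific estimates with a common standard error `se`, and a pooled (additive) estimate equal to the average of the two context-specific estimates. All function and parameter names are hypothetical. Under these assumptions the pooled estimate has variance `se**2 / 2` and bias `(beta_b - beta_a) / 2` as an estimate of `beta_a`, so the context-specific estimate has the lower mean squared error only when `|beta_a - beta_b| > sqrt(2) * se`.

```python
def mse_context_specific(se: float) -> float:
    """MSE of the unbiased context-specific estimate: its sampling variance only."""
    return se ** 2


def mse_additive(beta_a: float, beta_b: float, se: float) -> float:
    """MSE of the pooled (additive) estimate when used to estimate beta_a.

    Assumption: two equally sized contexts, so the pooled estimate is the
    average of two independent context-specific estimates.
    """
    variance = se ** 2 / 2          # averaging halves the sampling variance
    bias = (beta_b - beta_a) / 2    # pooled estimate targets the mean effect
    return variance + bias ** 2


def prefer_context_specific(beta_a: float, beta_b: float, se: float) -> bool:
    """True when the context-specific model has lower MSE than the additive one."""
    return mse_context_specific(se) < mse_additive(beta_a, beta_b, se)


if __name__ == "__main__":
    # Large effect difference relative to noise: the context-specific model wins.
    print(prefer_context_specific(0.1, 0.3, se=0.1))
    # Small effect difference relative to noise: the additive model wins.
    print(prefer_context_specific(0.1, 0.15, se=0.1))
```

In practice the true effects are unknown, so any such comparison would have to rely on estimates of the effect difference, which is part of what makes the choice between models nontrivial.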
|
39
|
O’Brien NLV, Holland B, Engelstädter J, Ortiz-Barrientos D. The distribution of fitness effects during adaptive walks using a simple genetic network. PLoS Genet 2024; 20:e1011289. [PMID: 38787919 PMCID: PMC11156440 DOI: 10.1371/journal.pgen.1011289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 06/06/2024] [Accepted: 05/04/2024] [Indexed: 05/26/2024] Open
Abstract
The tempo and mode of adaptation depend on the availability of beneficial alleles. Genetic interactions arising from gene networks can restrict this availability. However, the extent to which networks affect adaptation remains largely unknown. Current models of evolution consider additive genotype-phenotype relationships while often ignoring the contribution of gene interactions to phenotypic variance. In this study, we model a quantitative trait as the product of a simple gene regulatory network, the negative autoregulation motif. Using forward-time genetic simulations, we measure adaptive walks towards a phenotypic optimum in both additive and network models. A key expectation from adaptive walk theory is that the distribution of fitness effects of new beneficial mutations is exponential. We found that both models instead harbored distributions with fewer large-effect beneficial alleles than expected. The network model also had a complex and bimodal distribution of fitness effects among all mutations, with a considerable density at deleterious selection coefficients. This behavior is reminiscent of the cost of complexity, where correlations among traits constrain adaptation. Our results suggest that the interactions emerging from genetic networks can generate complex and multimodal distributions of fitness effects.
Affiliation(s)
- Nicholas L. V. O’Brien
  - School of the Environment, The University of Queensland, Brisbane, Queensland, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, The University of Queensland, Brisbane, QLD, Australia
- Barbara Holland
  - School of Natural Sciences, University of Tasmania, Hobart, Tasmania, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, University of Tasmania, Hobart, Tasmania, Australia
- Jan Engelstädter
  - School of the Environment, The University of Queensland, Brisbane, Queensland, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, The University of Queensland, Brisbane, QLD, Australia
- Daniel Ortiz-Barrientos
  - School of the Environment, The University of Queensland, Brisbane, Queensland, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, The University of Queensland, Brisbane, QLD, Australia
|
40
|
Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and Blood Transcriptome-Wide Association Studies Identify Five Novel Genes Associated with Alzheimer's Disease. medRxiv: the preprint server for health sciences 2024:2024.04.17.24305737. [PMID: 38699333 PMCID: PMC11065015 DOI: 10.1101/2024.04.17.24305737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
INTRODUCTION: Transcriptome-wide Association Studies (TWAS) extend genome-wide association studies (GWAS) by integrating genetically-regulated gene expression models. We performed the most powerful AD-TWAS to date, using summary statistics from cis-eQTL meta-analyses and the largest clinically-adjudicated Alzheimer's Disease (AD) GWAS. METHODS: We implemented the OTTERS TWAS pipeline, leveraging cis-eQTL data from cortical brain tissue (MetaBrain; N=2,683) and blood (eQTLGen; N=31,684) to predict gene expression, then applied these models to AD-GWAS data (Cases=21,982; Controls=44,944). RESULTS: We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3. DISCUSSION: Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants.
|
41
|
Dagostino R, Gottlieb A. Tissue-specific atlas of trans-models for gene regulation elucidates complex regulation patterns. BMC Genomics 2024; 25:377. [PMID: 38632500 PMCID: PMC11022497 DOI: 10.1186/s12864-024-10317-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 04/16/2024] [Indexed: 04/19/2024] Open
Abstract
BACKGROUND: Deciphering gene regulation is essential for understanding the underlying mechanisms of healthy and disease states. While the regulatory networks formed by transcription factors (TFs) and their target genes have been mostly studied with relation to cis effects such as in TF binding sites, we focused on trans effects of TFs on the expression of their transcribed genes and their potential mechanisms. RESULTS: We provide a comprehensive tissue-specific atlas, spanning 49 tissues, of TF variations affecting gene expression through computational models considering two potential mechanisms: combinatorial regulation by the expression of the TFs, and regulation by genetic variants within the TF. We demonstrate that similarity between tissues based on our discovered genes corresponds to other types of tissue similarity. The genes affected by complex TF regulation, and their modelled TFs, were highly enriched for pharmacogenomic functions, while the TFs themselves were also enriched in several cancer and metabolic pathways. Additionally, genes that appear in multiple clusters are enriched for regulation of the immune system, while tissue clusters include cluster-specific genes that are enriched for biological functions and diseases previously associated with the tissues forming the cluster. Finally, our atlas exposes multilevel regulation across multiple tissues, where TFs regulate other TFs through the two tested mechanisms. CONCLUSIONS: Our tissue-specific atlas provides hierarchical tissue-specific trans genetic regulations that can be further studied for association with human phenotypes.
Affiliation(s)
- Robert Dagostino
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Assaf Gottlieb
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
|
42
|
Hansen TJ, Fong SL, Day JK, Capra JA, Hodges E. Human gene regulatory evolution is driven by the divergence of regulatory element function in both cis and trans. Cell Genomics 2024; 4:100536. [PMID: 38604126 PMCID: PMC11019363 DOI: 10.1016/j.xgen.2024.100536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 01/03/2024] [Accepted: 03/10/2024] [Indexed: 04/13/2024]
Abstract
Gene regulatory divergence between species can result from cis-acting local changes to regulatory element DNA sequences or global trans-acting changes to the regulatory environment. Understanding how these mechanisms drive regulatory evolution has been limited by challenges in identifying trans-acting changes. We present a comprehensive approach to directly identify cis- and trans-divergent regulatory elements between human and rhesus macaque lymphoblastoid cells using assay for transposase-accessible chromatin coupled to self-transcribing active regulatory region (ATAC-STARR) sequencing. In addition to thousands of cis changes, we discover an unexpected number (∼10,000) of trans changes and show that cis and trans elements exhibit distinct patterns of sequence divergence and function. We further identify differentially expressed transcription factors that underlie ∼37% of trans differences and trace how cis changes can produce cascades of trans changes. Overall, we find that most divergent elements (67%) experienced changes in both cis and trans, revealing a substantial role for trans divergence (alone and together with cis changes) in regulatory differences between species.
Affiliation(s)
- Tyler J Hansen
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Sarah L Fong
  - Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Jessica K Day
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- John A Capra
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94143, USA
- Emily Hodges
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
  - Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
  - Vanderbilt Ingram Cancer Center, Nashville, TN 37232, USA
|
43
|
Wang L, Babushkin N, Liu Z, Liu X. Trans-eQTL mapping in gene sets identifies network effects of genetic variants. Cell Genomics 2024; 4:100538. [PMID: 38565144 PMCID: PMC11019359 DOI: 10.1016/j.xgen.2024.100538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 12/08/2023] [Accepted: 03/13/2024] [Indexed: 04/04/2024]
Abstract
Nearly all trait-associated variants identified in genome-wide association studies (GWASs) are noncoding. The cis regulatory effects of these variants have been extensively characterized, but how they affect gene regulation in trans has been the subject of fewer studies because of the difficulty in detecting trans-expression quantitative trait loci (eQTLs). We developed trans-PCO for detecting trans effects of genetic variants on gene networks. Our simulations demonstrate that trans-PCO substantially outperforms existing trans-eQTL mapping methods. We applied trans-PCO to two gene expression datasets from whole blood, DGN (N = 913) and eQTLGen (N = 31,684), and identified 14,985 high-quality trans-eSNP-module pairs associated with 197 co-expression gene modules and biological processes. We performed colocalization analyses between GWAS loci of 46 complex traits and the trans-eQTLs. We demonstrated that the identified trans effects can help us understand how trait-associated variants affect gene regulatory networks and biological pathways.
Affiliation(s)
- Lili Wang
  - The Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
- Nikita Babushkin
  - Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
- Zhonghua Liu
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA
- Xuanyao Liu
  - The Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
|
44
|
Lai WY, Nolte V, Jakšić AM, Schlötterer C. Evolution of Phenotypic Variance Provides Insights into the Genetic Basis of Adaptation. Genome Biol Evol 2024; 16:evae077. [PMID: 38620076 PMCID: PMC11057206 DOI: 10.1093/gbe/evae077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 03/27/2024] [Accepted: 04/02/2024] [Indexed: 04/17/2024] Open
Abstract
Most traits are polygenic, and the contributing loci can be identified by genome-wide association studies. The genetic basis of adaptation (adaptive architecture) is, however, difficult to characterize. Here, we propose to study the adaptive architecture of traits by monitoring the evolution of their phenotypic variance during adaptation to a new environment in well-defined laboratory conditions. Extensive computer simulations show that the evolution of phenotypic variance in a replicated experimental evolution setting can distinguish between oligogenic and polygenic adaptive architectures. We compared gene expression variance in male Drosophila simulans before and after 100 generations of adaptation to a novel hot environment. The variance change in gene expression was indistinguishable for genes with and without a significant change in mean expression after 100 generations of evolution. We suggest that the majority of adaptive gene expression evolution can be explained by a polygenic architecture. We propose that tracking the evolution of phenotypic variance across generations can provide an approach to characterize the adaptive architecture.
Affiliation(s)
- Wei-Yun Lai
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- Viola Nolte
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Ana Marija Jakšić
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
  - Present address: École polytechnique fédérale de Lausanne, Lausanne, Switzerland
|
45
|
Veltsos P, Kelly JK. The quantitative genetics of gene expression in Mimulus guttatus. PLoS Genet 2024; 20:e1011072. [PMID: 38603726 PMCID: PMC11060551 DOI: 10.1371/journal.pgen.1011072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 04/30/2024] [Accepted: 03/23/2024] [Indexed: 04/13/2024] Open
Abstract
Gene expression can be influenced by genetic variants that are closely linked to the expressed gene (cis eQTLs) and variants in other parts of the genome (trans eQTLs). We created a multiparental mapping population by sampling genotypes from a single natural population of Mimulus guttatus and scored gene expression in the leaves of 1,588 plants. We find that nearly every measured gene exhibits cis regulatory variation (91% have FDR < 0.05). cis eQTLs are usually allelic series with three or more functionally distinct alleles. The cis locus explains about two thirds of the standing genetic variance (on average) but varies among genes and tends to be greatest when there is high indel variation in the upstream regulatory region and high nucleotide diversity in the coding sequence. Despite mapping over 10,000 trans eQTL / affected gene pairs, most of the genetic variance generated by trans acting loci remains unexplained. This implies a large reservoir of trans acting genes with subtle or diffuse effects. Mapped trans eQTLs show lower allelic diversity but much higher genetic dominance than cis eQTLs. Several analyses also indicate that trans eQTLs make a substantial contribution to the genetic correlations in expression among different genes. They may thus be essential determinants of "gene expression modules," which has important implications for the evolution of gene expression and how it is studied by geneticists.
Affiliation(s)
- Paris Veltsos
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, United States of America
- John K. Kelly
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, United States of America
|
46
|
Melton HJ, Zhang Z, Wu C. SUMMIT-FA: a new resource for improved transcriptome imputation using functional annotations. Hum Mol Genet 2024; 33:624-635. [PMID: 38129112 PMCID: PMC10954367 DOI: 10.1093/hmg/ddad205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 10/24/2023] [Accepted: 11/30/2023] [Indexed: 12/23/2023] Open
Abstract
Transcriptome-wide association studies (TWAS) integrate gene expression prediction models and genome-wide association studies (GWAS) to identify gene-trait associations. The power of TWAS is determined by the sample size of GWAS and the accuracy of the expression prediction model. Here, we present a new method, the Summary-level Unified Method for Modeling Integrated Transcriptome using Functional Annotations (SUMMIT-FA), which improves gene expression prediction accuracy by leveraging functional annotation resources and a large expression quantitative trait loci (eQTL) summary-level dataset. We build gene expression prediction models in whole blood using SUMMIT-FA with the comprehensive functional database MACIE and eQTL summary-level data from the eQTLGen consortium. We apply these models to GWAS for 24 complex traits and show that SUMMIT-FA identifies significantly more gene-trait associations and improves predictive power for identifying "silver standard" genes compared to several benchmark methods. We further conduct a simulation study to demonstrate the effectiveness of SUMMIT-FA.
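As a generic illustration of how summary-level TWAS combine an expression prediction model with GWAS results, the sketch below computes the widely used association statistic z = wᵀz_GWAS / sqrt(wᵀRw), where w holds the SNP weights of the prediction model (for standardized genotypes), z_GWAS holds the GWAS z-scores of the same SNPs, and R is their LD correlation matrix. This is a minimal sketch of that standard form, not SUMMIT-FA's implementation. The function name and the toy inputs are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-statistic TWAS z-score for one gene.

    weights: SNP weights from an expression prediction model
    gwas_z:  GWAS z-scores for the same SNPs, in the same order
    ld:      LD correlation matrix for those SNPs (list of lists)
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    # Numerator: weighted sum of GWAS z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Denominator: variance of predicted expression implied by the LD matrix.
    variance = sum(
        weights[j] * ld[j][k] * weights[k] for j in range(n) for k in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted expression has non-positive variance")
    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    # Toy inputs (made up for illustration): two SNPs in moderate LD.
    toy_weights = [0.4, -0.2]
    toy_gwas_z = [3.1, -1.0]
    toy_ld = [[1.0, 0.3], [0.3, 1.0]]
    print(twas_z(toy_weights, toy_gwas_z, toy_ld))
```

The statistic depends on the weights only up to a positive scale factor, which is why improvements in the prediction model matter through the relative weighting of SNPs and not through their overall magnitude.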
Affiliation(s)
- Hunter J Melton
  - Department of Statistics, Florida State University, 214 Rogers Building, 117 N. Woodward Avenue, Tallahassee, FL 32306, United States
- Zichen Zhang
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 7007 Bertner Avenue, Unit 1689, Houston, TX 77030, United States
- Chong Wu
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 7007 Bertner Avenue, Unit 1689, Houston, TX 77030, United States
|
47
|
Wittich H, Ardlie K, Taylor KD, Durda P, Liu Y, Mikhaylova A, Gignoux CR, Cho MH, Rich SS, Rotter JI, Manichaikul A, Im HK, Wheeler HE. Transcriptome-wide association study of the plasma proteome reveals cis and trans regulatory mechanisms underlying complex traits. Am J Hum Genet 2024; 111:445-455. [PMID: 38320554 PMCID: PMC10940016 DOI: 10.1016/j.ajhg.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 01/12/2024] [Accepted: 01/12/2024] [Indexed: 02/08/2024] Open
Abstract
Regulation of transcription and translation are mechanisms through which genetic variants affect complex traits. Expression quantitative trait locus (eQTL) studies have been more successful at identifying cis-eQTL (within 1 Mb of the transcription start site) than trans-eQTL. Here, we tested the cis component of gene expression for association with observed plasma protein levels to identify cis- and trans-acting genes that regulate protein levels. We used transcriptome prediction models from 49 Genotype-Tissue Expression (GTEx) Project tissues to predict the cis component of gene expression and tested the predicted expression of every gene in every tissue for association with the observed abundance of 3,622 plasma proteins measured in 3,301 individuals from the INTERVAL study. We tested significant results for replication in 971 individuals from the Trans-omics for Precision Medicine (TOPMed) Multi-Ethnic Study of Atherosclerosis (MESA). We found 1,168 and 1,210 cis- and trans-acting associations that replicated in TOPMed (FDR < 0.05) with a median expected true positive rate (π₁) across tissues of 0.806 and 0.390, respectively. The target proteins of trans-acting genes were enriched for transcription factor binding sites and autoimmune diseases in the GWAS catalog. Furthermore, we found a higher correlation between predicted expression and protein levels of the same underlying gene (R = 0.17) than observed expression (R = 0.10, p = 7.50 × 10⁻¹¹). This indicates the cis-acting genetically regulated (heritable) component of gene expression is more consistent across tissues than total observed expression (genetics + environment) and is useful in uncovering the function of SNPs associated with complex traits.
Affiliation(s)
- Henry Wittich
  - Program in Bioinformatics, Loyola University Chicago, Chicago, IL 60660, USA
- Kristin Ardlie
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Kent D Taylor
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Peter Durda
  - Laboratory for Clinical Biochemistry Research, University of Vermont, Colchester, VT 05446, USA
- Yongmei Liu
  - Department of Medicine, Duke University School of Medicine, Durham, NC 27710, USA
- Anna Mikhaylova
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Chris R Gignoux
  - Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, CO 80045, USA
- Michael H Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA
- Stephen S Rich
  - Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- Jerome I Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Ani Manichaikul
  - Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- Hae Kyung Im
  - Section of Genetic Medicine, The University of Chicago, Chicago, IL 60637, USA
- Heather E Wheeler
  - Program in Bioinformatics, Loyola University Chicago, Chicago, IL 60660, USA; Department of Biology, Loyola University Chicago, Chicago, IL 60660, USA.
|
48
|
Mize TJ, Evans LM. Examination of a novel expression-based gene-SNP annotation strategy to identify tissue-specific contributions to heritability in multiple traits. Eur J Hum Genet 2024; 32:263-269. [PMID: 36446896 PMCID: PMC10924090 DOI: 10.1038/s41431-022-01244-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 10/20/2022] [Accepted: 11/15/2022] [Indexed: 12/03/2022] Open
Abstract
Complex traits show clear patterns of tissue-specific expression influenced by single nucleotide polymorphisms (SNPs), yet current strategies aggregate SNP effects to genes by employing simple physical proximity-based windows. Here, we examined whether incorporating SNPs with effects on tissue-specific cis-expression would improve our ability to detect trait-relevant tissues across 31 complex traits using stratified linkage disequilibrium score regression (S-LDSC). We found that a physical proximity annotation produced more significant tissue enrichments and larger S-LDSC regression coefficients, as compared to an expression-based annotation. Furthermore, we showed that our expression-based annotation did not outperform an annotation strategy in which an equal number of randomly chosen SNPs were annotated to genes within the same genomic window, suggesting extensive redundancy among SNP effect estimates due to linkage disequilibrium. That said, current sample sizes limit estimation of cis-genetic SNP effects; therefore, we recommend reexamination of the expression-based annotation when larger tissue-specific expression datasets become available. To examine the influence of sample size, we used a large whole blood eQTL reference panel (N = 31,684) applying a similar expression-based annotation strategy. We found that significant cis-expression QTLs in whole blood did not outperform the physical proximity annotation when estimating tissue-specific SNP heritability enrichment for either high- or low-density lipoprotein phenotypes but performed similarly for inflammatory bowel disease. Finally, we report new and updated tissue enrichment estimates across 31 complex traits, such as significant heritability enrichment of the frontal cortex for cognitive performance, educational attainment, and intelligence, providing further evidence of this structure's importance in higher cognitive function.
Affiliation(s)
- Travis J Mize
  - Institute for Behavioral Genetics, University of Colorado, Boulder, CO, USA.
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA.
- Luke M Evans
  - Institute for Behavioral Genetics, University of Colorado, Boulder, CO, USA.
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA.
|
49
|
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
- Yang I Li
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Sohini Ramachandran
  - Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 02912, USA
- Alexander Gusev
  - Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
|
50
|
Antón-Galindo E, Adel MR, García-González J, Leggieri A, López-Blanch L, Irimia M, Norton WHJ, Brennan CH, Fernàndez-Castillo N, Cormand B. Pleiotropic contribution of rbfox1 to psychiatric and neurodevelopmental phenotypes in two zebrafish models. Transl Psychiatry 2024; 14:99. [PMID: 38374212 PMCID: PMC10876957 DOI: 10.1038/s41398-024-02801-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 02/21/2024] Open
Abstract
RBFOX1 is a highly pleiotropic gene that contributes to several psychiatric and neurodevelopmental disorders. Both rare and common variants in RBFOX1 have been associated with several psychiatric conditions, but the mechanisms underlying the pleiotropic effects of RBFOX1 are not yet understood. Here we found that, in zebrafish, rbfox1 is expressed in spinal cord, mid- and hindbrain during developmental stages. In adults, expression is restricted to specific areas of the brain, including telencephalic and diencephalic regions with an important role in receiving and processing sensory information and in directing behaviour. To investigate the contribution of rbfox1 to behaviour, we used rbfox1^sa15940, a zebrafish mutant line with TL background. We found that rbfox1^sa15940 mutants present hyperactivity, thigmotaxis, decreased freezing behaviour and altered social behaviour. We repeated these behavioural tests in a second rbfox1 mutant line with a different genetic background (TU), rbfox1^del19, and found that rbfox1 deficiency affects behaviour similarly in this line, although there were some differences. rbfox1^del19 mutants present similar thigmotaxis, but stronger alterations in social behaviour and lower levels of hyperactivity than rbfox1^sa15940 fish. Taken together, these results suggest that mutations in rbfox1 lead to multiple behavioural changes in zebrafish that might be modulated by environmental, epigenetic and genetic background effects, and that resemble phenotypic alterations present in Rbfox1-deficient mice and in patients with different psychiatric conditions. Our study, thus, highlights the evolutionary conservation of rbfox1 function in behaviour and paves the way to further investigate the mechanisms underlying rbfox1 pleiotropy on the onset of neurodevelopmental and psychiatric disorders.
Affiliation(s)
- Ester Antón-Galindo
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalunya, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
  - Institut de Biomedicina de la Universitat de Barcelona (IBUB), Barcelona, Catalunya, Spain
  - Institut de Recerca Sant Joan de Déu (IRSJD), Esplugues de Llobregat, Catalunya, Spain
- Maja R Adel
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalunya, Spain
  - Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital Frankfurt, Goethe University, Frankfurt am Main, Germany
  - Faculty of Biological Sciences, Goethe University Frankfurt, Frankfurt am Main, Germany
- Judit García-González
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York, NY, NYC 10029, USA
- Adele Leggieri
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
- Laura López-Blanch
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Catalunya, Spain
- Manuel Irimia
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Catalunya, Spain
  - Universitat Pompeu Fabra, Barcelona, Catalunya, Spain
  - ICREA, Barcelona, Catalunya, Spain
- William H J Norton
  - Department of Genetics and Genome Biology, College of Life Sciences, University of Leicester, Leicester, UK
- Caroline H Brennan
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
- Noèlia Fernàndez-Castillo
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalunya, Spain.
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, Madrid, Spain.
  - Institut de Biomedicina de la Universitat de Barcelona (IBUB), Barcelona, Catalunya, Spain.
  - Institut de Recerca Sant Joan de Déu (IRSJD), Esplugues de Llobregat, Catalunya, Spain.
- Bru Cormand
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalunya, Spain.
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, Madrid, Spain.
  - Institut de Biomedicina de la Universitat de Barcelona (IBUB), Barcelona, Catalunya, Spain.
  - Institut de Recerca Sant Joan de Déu (IRSJD), Esplugues de Llobregat, Catalunya, Spain.
|