1
Yaacov O, Mathiyalagan P, Berk-Rauch HE, Ganesh SK, Zhu L, Hoffmann TJ, Iribarren C, Risch N, Lee D, Chakravarti A. Identification of the Molecular Components of Enhancer-Mediated Gene Expression Variation in Multiple Tissues Regulating Blood Pressure. Hypertension 2024; 81:1500-1510. [PMID: 38747164 PMCID: PMC11168860 DOI: 10.1161/hypertensionaha.123.22538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 04/24/2024] [Indexed: 06/14/2024]
Abstract
BACKGROUND Inter-individual variation in blood pressure (BP) arises in part from sequence variants within enhancers modulating the expression of causal genes. We propose that these genes, active in tissues relevant to BP physiology, can be identified from tissue-level epigenomic data and genotypes of BP-phenotyped individuals. METHODS We used chromatin accessibility data from the heart, adrenal, kidney, and artery to identify cis-regulatory elements (CREs) in these tissues and estimate the impact of common human single-nucleotide variants within these CREs on gene expression, using machine learning methods. To identify causal genes, we performed a gene-wise association test. We conducted analyses in 2 separate large-scale cohorts: 77 822 individuals from the Genetic Epidemiology Research on Adult Health and Aging and 315 270 individuals from the UK Biobank. RESULTS We identified 309, 259, 331, and 367 genes (false discovery rate <0.05) for diastolic BP and 191, 184, 204, and 204 genes for systolic BP in the artery, kidney, heart, and adrenal, respectively, in Genetic Epidemiology Research on Adult Health and Aging; 50% to 70% of these genes were replicated in the UK Biobank, significantly higher than the 12% to 15% expected by chance (P<0.0001). These results enabled tissue expression prediction of these 988 to 2875 putative BP genes in individuals of both cohorts to construct an expression polygenic score. This score explained ≈27% of the reported single-nucleotide variant heritability, substantially higher than expected from prior studies. CONCLUSIONS Our work demonstrates the power of tissue-restricted comprehensive CRE analysis, followed by CRE-based expression prediction, for understanding BP regulation in relevant tissues and provides dual-modality supporting evidence, CRE and expression, for the causal genes.
Affiliation(s)
- Or Yaacov
  - Center for Human Genetics and Genomics, NYU Grossman School of Medicine, New York, NY, USA
- Prabhu Mathiyalagan
  - Center for Human Genetics and Genomics, NYU Grossman School of Medicine, New York, NY, USA
  - Benthos Prime Central, Houston, TX, USA
- Hanna E. Berk-Rauch
  - Center for Human Genetics and Genomics, NYU Grossman School of Medicine, New York, NY, USA
- Santhi K. Ganesh
  - Department of Internal Medicine & Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Luke Zhu
  - Center for Human Genetics and Genomics, NYU Grossman School of Medicine, New York, NY, USA
- Thomas J. Hoffmann
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Carlos Iribarren
  - Kaiser Permanente Northern California Division of Research, Oakland, CA, USA
- Neil Risch
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
  - Kaiser Permanente Northern California Division of Research, Oakland, CA, USA
- Dongwon Lee
  - Department of Pediatrics, Division of Nephrology, Boston Children’s Hospital, Boston & Harvard Medical School, Boston, MA, USA
- Aravinda Chakravarti
  - Center for Human Genetics and Genomics, NYU Grossman School of Medicine, New York, NY, USA
2
Ghoreishifar M, Chamberlain AJ, Xiang R, Prowse-Wilkins CP, Lopdell TJ, Littlejohn MD, Pryce JE, Goddard ME. Allele-specific binding variants causing ChIP-seq peak height of histone modification are not enriched in expression QTL annotations. Genet Sel Evol 2024; 56:50. [PMID: 38937662 PMCID: PMC11212393 DOI: 10.1186/s12711-024-00916-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 06/04/2024] [Indexed: 06/29/2024] Open
Abstract
BACKGROUND Genome sequence variants affecting complex traits (quantitative trait loci, QTL) are enriched in functional regions of the genome, such as those marked by certain histone modifications. These variants are believed to influence gene expression. However, due to the linkage disequilibrium among nearby variants, pinpointing the precise location of QTL is challenging. We aimed to identify allele-specific binding (ASB) QTL (asbQTL) that cause variation in the level of histone modification, as measured by the height of peaks assayed by ChIP-seq (chromatin immunoprecipitation sequencing). We identified DNA sequences that predict the difference between alleles in ChIP-seq peak height in H3K4me3 and H3K27ac histone modifications in the mammary glands of cows. RESULTS We used a gapped k-mer support vector machine, a novel best linear unbiased prediction model, and a multiple linear regression model that combines the other two approaches to predict variant impacts on peak height. For each method, a subset of 1000 sites with the highest magnitude of predicted ASB was considered as candidate asbQTL. The accuracy of this prediction was measured by the proportion where the predicted direction matched the observed direction. Prediction accuracy ranged between 0.59 and 0.74, suggesting that these 1000 sites are enriched for asbQTL. Using independent data, we investigated functional enrichment in the candidate asbQTL set and three control groups, including non-causal ASB sites, non-ASB variants under a peak, and SNPs (single nucleotide polymorphisms) not under a peak. For H3K4me3, a higher proportion of the candidate asbQTL were confirmed as ASB when compared to the non-causal ASB sites (P < 0.01). However, these candidate asbQTL did not enrich for the other annotations, including expression QTL (eQTL), allele-specific expression QTL (aseQTL) and sites conserved across mammals (P > 0.05). 
CONCLUSIONS We identified putatively causal sites for asbQTL using the DNA sequence surrounding these sites. Our results suggest that many sites influencing histone modifications may not directly affect gene expression. However, it is important to acknowledge that distinguishing between putative causal ASB sites and other non-causal ASB sites in high linkage disequilibrium with the causal sites regarding their impact on gene expression may be challenging due to limitations in statistical power.
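The accuracy measure described in this abstract, the proportion of sites where the predicted direction of the allelic effect matches the observed direction, can be illustrated with a minimal sketch. The function name and the effect values below are hypothetical and are not taken from the study; sites with a zero effect on either side are simply skipped, which is an assumption of this sketch.

```python
def direction_match_accuracy(predicted, observed):
    """Fraction of sites whose predicted and observed effects share a sign."""
    # Keep only sites where both effects have a defined direction.
    pairs = [(p, o) for p, o in zip(predicted, observed) if p != 0 and o != 0]
    if not pairs:
        raise ValueError("no sites with non-zero predicted and observed effects")
    matches = sum(1 for p, o in pairs if (p > 0) == (o > 0))
    return matches / len(pairs)


# Hypothetical allelic effects on peak height (made-up numbers).
predicted = [0.8, -0.5, 0.3, -0.9]
observed = [0.4, -0.2, -0.1, -0.7]

accuracy = direction_match_accuracy(predicted, observed)
print(accuracy)  # 3 of the 4 sites agree in sign
```

Under this definition a value near 0.5 corresponds to chance-level agreement, which is why the reported range of 0.59 to 0.74 is read as enrichment.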
Affiliation(s)
- Mohammad Ghoreishifar
  - Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
- Amanda J Chamberlain
  - Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
- Ruidong Xiang
  - Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, VIC, 3010, Australia
- Claire P Prowse-Wilkins
  - Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, VIC, 3010, Australia
- Thomas J Lopdell
  - Research and Development, Livestock Improvement Corporation, Private Bag 3016, Hamilton, 3240, New Zealand
- Mathew D Littlejohn
  - Research and Development, Livestock Improvement Corporation, Private Bag 3016, Hamilton, 3240, New Zealand
- Jennie E Pryce
  - Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
- Michael E Goddard
  - Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, VIC, 3010, Australia
3
Kadagandla S, Kapoor A. Identification of candidate causal cis-regulatory variants underlying electrocardiographic QT interval GWAS loci. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.13.584880. [PMID: 38585875 PMCID: PMC10996567 DOI: 10.1101/2024.03.13.584880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Identifying causal variants among tens or hundreds of associated variants at each locus mapped by genome-wide association studies (GWAS) of complex traits is a challenge. As the vast majority of GWAS variants are noncoding, sequence variation at cis-regulatory elements affecting transcriptional expression of specific genes is a widely accepted molecular hypothesis. Following this cis-regulatory hypothesis and combining it with the observation that nucleosome-free open chromatin is a universal hallmark of all types of cis-regulatory elements, we aimed to identify candidate causal regulatory variants underlying electrocardiographic QT interval GWAS loci. At a dozen loci, selected for higher effect sizes and a better understanding of the likely causal gene, we identified and included all common variants in high linkage disequilibrium with the GWAS variants as candidate variants. Using ENCODE DNase-seq and ATAC-seq from multiple human adult cardiac left ventricle tissue samples, we generated genome-wide maps of open chromatin regions marking putative regulatory elements. QT interval-associated candidate variants were filtered for overlap with cardiac left ventricle open chromatin regions to identify candidate causal cis-regulatory variants, which were further assessed for colocalization with a known cardiac GTEx expression quantitative trait locus variant as additional evidence for their causal role. Together, these efforts have generated a comprehensive set of candidate causal variants that are expected to be enriched for cis-regulatory potential and thereby to explain the observed genetic associations.
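The filtering step described in this abstract, keeping only candidate variants that fall inside open chromatin regions, can be sketched as an interval lookup. This is a minimal illustration under stated assumptions, not the authors' pipeline: the region coordinates, variant names, and the function `overlaps_open_region` are all hypothetical, and regions are assumed to be sorted, non-overlapping, half-open intervals as in a BED file.

```python
from bisect import bisect_right

# Hypothetical open chromatin regions per chromosome: sorted, non-overlapping,
# half-open (start, end) intervals, as in a BED file.
open_regions = {
    "chr1": [(100, 200), (500, 650)],
    "chr2": [(1000, 1200)],
}

# Hypothetical candidate variants: (chromosome, 0-based position, identifier).
candidates = [
    ("chr1", 150, "var_a"),
    ("chr1", 200, "var_b"),  # at the half-open end, so outside the region
    ("chr1", 640, "var_c"),
    ("chr2", 999, "var_d"),
    ("chr3", 50, "var_e"),   # chromosome with no open regions
]


def overlaps_open_region(chrom, pos, regions_by_chrom):
    """Return True if pos lies inside any open region on chrom."""
    regions = regions_by_chrom.get(chrom, [])
    starts = [start for start, _ in regions]
    # Index of the last region that starts at or before pos, or -1 if none.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and regions[i][0] <= pos < regions[i][1]


kept = [
    name
    for chrom, pos, name in candidates
    if overlaps_open_region(chrom, pos, open_regions)
]
print(kept)  # ['var_a', 'var_c']
```

The binary search keeps each lookup logarithmic in the number of regions per chromosome; for genome-scale data a dedicated interval tool would be the more usual choice.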
4
Mitina A, Khan M, Lesurf R, Yin Y, Engchuan W, Hamdan O, Pellecchia G, Trost B, Backstrom I, Guo K, Pallotto LM, Lam Doong PH, Wang Z, Nalpathamkalam T, Thiruvahindrapuram B, Papaz T, Pearson CE, Ragoussis J, Subbarao P, Azad MB, Turvey SE, Mandhane P, Moraes TJ, Simons E, Scherer SW, Lougheed J, Mondal T, Smythe J, Altamirano-Diaz L, Oechslin E, Mital S, Yuen RKC. Genome-wide enhancer-associated tandem repeats are expanded in cardiomyopathy. EBioMedicine 2024; 101:105027. [PMID: 38418263 PMCID: PMC10944212 DOI: 10.1016/j.ebiom.2024.105027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 02/05/2024] [Accepted: 02/06/2024] [Indexed: 03/01/2024] Open
Abstract
BACKGROUND Cardiomyopathy is a clinically and genetically heterogeneous heart condition that can lead to heart failure and sudden cardiac death in childhood. While it has a strong genetic basis, the genetic aetiology for over 50% of cardiomyopathy cases remains unknown. METHODS In this study, we analyse the characteristics of tandem repeats from genome sequence data of unrelated individuals diagnosed with cardiomyopathy from Canada and the United Kingdom (n = 1216) and compare them to those found in the general population. We perform burden analysis to identify genomic and epigenomic features that are impacted by rare tandem repeat expansions (TREs), and enrichment analysis to identify functional pathways that are involved in the TRE-associated genes in cardiomyopathy. We use Oxford Nanopore targeted long-read sequencing to validate repeat size and methylation status of one of the most recurrent TREs. We also compare the TRE-associated genes to those that are dysregulated in the heart tissues of individuals with cardiomyopathy. FINDINGS We demonstrate that tandem repeats that are rarely expanded in the general population are predominantly expanded in cardiomyopathy. We find that rare TREs are disproportionately present in constrained genes near transcriptional start sites, have high GC content, and frequently overlap active enhancer H3K27ac marks, where expansion-related DNA methylation may reduce gene expression. We demonstrate the gene silencing effect of expanded CGG tandem repeats in DIP2B through promoter hypermethylation. We show that the enhancer-associated loci are found in genes that are highly expressed in human cardiomyocytes and are differentially expressed in the left ventricle of the heart in individuals with cardiomyopathy. INTERPRETATION Our findings highlight the underrecognized contribution of rare tandem repeat expansions to the risk of cardiomyopathy and suggest that rare TREs contribute to ∼4% of cardiomyopathy risk. 
FUNDING Government of Ontario (RKCY), The Canadian Institutes of Health Research PJT 175329 (RKCY), The Azrieli Foundation (RKCY), SickKids Catalyst Scholar in Genetics (RKCY), The University of Toronto McLaughlin Centre (RKCY, SM), Ted Rogers Centre for Heart Research (SM), Data Sciences Institute at the University of Toronto (SM), The Canadian Institutes of Health Research PJT 175034 (SM), The Canadian Institutes of Health Research ENP 161429 under the frame of ERA PerMed (SM, RL), Heart and Stroke Foundation of Ontario & Robert M Freedom Chair in Cardiovascular Science (SM), Bitove Family Professorship of Adult Congenital Heart Disease (EO), Canada Foundation for Innovation (SWS, JR), Canada Research Chair (PS), Genome Canada (PS, JR), The Canadian Institutes of Health Research (PS).
Affiliation(s)
- Aleksandra Mitina
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Mahreen Khan
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto; Toronto, Ontario, Canada
- Robert Lesurf
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Yue Yin
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Worrawat Engchuan
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Omar Hamdan
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Giovanna Pellecchia
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Brett Trost
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Ian Backstrom
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Keyi Guo
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Linda M Pallotto
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Phoenix Hoi Lam Doong
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada
- Zhuozhi Wang
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Thomas Nalpathamkalam
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Bhooma Thiruvahindrapuram
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
- Tanya Papaz
  - Ted Rogers Centre for Heart Research; Toronto, Ontario, Canada; Division of Cardiology, Department of Pediatrics, The Hospital for Sick Children, University of Toronto; Toronto, Ontario, Canada
- Christopher E Pearson
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto; Toronto, Ontario, Canada
- Jiannis Ragoussis
  - McGill Genome Centre, Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Quebec, Canada
- Padmaja Subbarao
  - Department of Paediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada; Program in Translation Medicine & Division of Respiratory Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
- Meghan B Azad
  - Department of Pediatrics and Child Health, University of Manitoba, Winnipeg, Manitoba, Canada
- Stuart E Turvey
  - Department of Pediatrics, BC Children's Hospital, University of British Columbia, Vancouver, British Columbia, Canada
- Piushkumar Mandhane
  - Department of Pediatrics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Theo J Moraes
  - Department of Paediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Program in Translation Medicine & Division of Respiratory Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
- Elinor Simons
  - Department of Pediatrics and Child Health, Section of Allergy and Clinical Immunology, University of Manitoba, Winnipeg, Manitoba, Canada; Children's Hospital Research Institute of Manitoba, Winnipeg, Manitoba, Canada
- Stephen W Scherer
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada; Department of Molecular Genetics and McLaughlin Centre, University of Toronto, Toronto, Ontario, Canada
- Jane Lougheed
  - Division of Cardiology, Children's Hospital of Eastern Ontario, Ottawa, Ontario, Canada
- Tapas Mondal
  - Division of Cardiology, Department of Pediatrics, McMaster Children's Hospital, Hamilton, Ontario, Canada
- John Smythe
  - Division of Cardiology, Department of Pediatrics, Kingston General Hospital, Kingston, Ontario, Canada
- Luis Altamirano-Diaz
  - Division of Cardiology, Department of Pediatrics, London Health Sciences Centre, London, Ontario, Canada
- Erwin Oechslin
  - Division of Cardiology, Toronto Adult Congenital Heart Disease Program at Peter Munk Cardiac Centre, Department of Medicine, University Health Network, and University of Toronto, Toronto, Ontario, Canada
- Seema Mital
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; Ted Rogers Centre for Heart Research; Toronto, Ontario, Canada; Division of Cardiology, Department of Pediatrics, The Hospital for Sick Children, University of Toronto; Toronto, Ontario, Canada
- Ryan K C Yuen
  - Genetics and Genome Biology, The Hospital for Sick Children; Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto; Toronto, Ontario, Canada; The Centre for Applied Genomics, The Hospital for Sick Children; Toronto, Ontario, Canada
5
Lee D, Han SK, Yaacov O, Berk-Rauch H, Mathiyalagan P, Ganesh SK, Chakravarti A. Tissue-specific and tissue-agnostic effects of genome sequence variation modulating blood pressure. Cell Rep 2023; 42:113351. [PMID: 37910504 PMCID: PMC10726310 DOI: 10.1016/j.celrep.2023.113351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 09/21/2023] [Accepted: 10/11/2023] [Indexed: 11/03/2023] Open
Abstract
Genome-wide association studies (GWASs) have identified numerous variants associated with polygenic traits and diseases. However, with few exceptions, a mechanistic understanding of which variants affect which genes in which tissues to modulate trait variation is lacking. Here, we present genomic analyses to explain trait heritability of blood pressure (BP) through the genetics of transcriptional regulation using GWASs, multiomics data from different tissues, and machine learning approaches. Approximately 500,000 predicted regulatory variants across four tissues explain 33.4% of variant heritability: 2.5%, 5.3%, 7.7%, and 11.8% for kidney-, adrenal-, heart-, and artery-specific variants, respectively. Variation in the enhancers involved shows greater tissue specificity than in the genes they regulate, suggesting that gene regulatory networks perturbed by enhancer variants in a tissue relevant to a phenotype are the major source of interindividual variation in BP. Thus, our study provides an approach to scan human tissue and cell types for their physiological contribution to any trait.
Affiliation(s)
- Dongwon Lee
  - Department of Pediatrics, Division of Nephrology, Boston Children's Hospital, Boston & Harvard Medical School, Boston, MA, USA
- Seong Kyu Han
  - Department of Pediatrics, Division of Nephrology, Boston Children's Hospital, Boston & Harvard Medical School, Boston, MA, USA
- Or Yaacov
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Hanna Berk-Rauch
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Prabhu Mathiyalagan
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Santhi K Ganesh
  - Department of Internal Medicine & Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Aravinda Chakravarti
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
6
Kim EE, Shekhar A, Ramachandran J, Khodadadi-Jamayran A, Liu FY, Zhang J, Fishman GI. The transcription factor EBF1 non-cell-autonomously regulates cardiac growth and differentiation. Development 2023; 150:dev202054. [PMID: 37787076 PMCID: PMC10652039 DOI: 10.1242/dev.202054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 09/18/2023] [Indexed: 10/04/2023]
Abstract
Reciprocal interactions between non-myocytes and cardiomyocytes regulate cardiac growth and differentiation. Here, we report that the transcription factor Ebf1 is highly expressed in non-myocytes and potently regulates heart development. Ebf1-deficient hearts display myocardial hypercellularity and reduced cardiomyocyte size, ventricular conduction system hypoplasia, and conduction system disease. Growth abnormalities in Ebf1 knockout hearts are observed as early as embryonic day 13.5. Transcriptional profiling of Ebf1-deficient embryonic cardiac non-myocytes demonstrates dysregulation of Polycomb repressive complex 2 targets, and ATAC-Seq reveals altered chromatin accessibility near many of these same genes. Gene set enrichment analysis of differentially expressed genes in cardiomyocytes isolated from E13.5 hearts of wild-type and mutant mice reveals significant enrichment of MYC targets and, consistent with this finding, we observe increased abundance of MYC in mutant hearts. EBF1-deficient non-myocytes, but not wild-type non-myocytes, are sufficient to induce excessive accumulation of MYC in co-cultured wild-type cardiomyocytes. Finally, we demonstrate that BMP signaling induces Ebf1 expression in embryonic heart cultures and controls a gene program enriched in EBF1 targets. These data reveal a previously unreported non-cell-autonomous pathway controlling cardiac growth and differentiation.
Affiliation(s)
- Eugene E. Kim
  - Leon H. Charney Division of Cardiology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Akshay Shekhar
  - Leon H. Charney Division of Cardiology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Jayalakshmi Ramachandran
  - Leon H. Charney Division of Cardiology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Fang-Yu Liu
  - Leon H. Charney Division of Cardiology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Jie Zhang
  - Leon H. Charney Division of Cardiology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Glenn I. Fishman
  - Leon H. Charney Division of Cardiology, NYU Grossman School of Medicine, New York, NY 10016, USA
7
Han SK, McNulty MT, Benway CJ, Wen P, Greenberg A, Onuchic-Whitford AC, Jang D, Flannick J, Burtt NP, Wilson PC, Humphreys BD, Wen X, Han Z, Lee D, Sampson MG. Mapping genomic regulation of kidney disease and traits through high-resolution and interpretable eQTLs. Nat Commun 2023; 14:2229. [PMID: 37076491 PMCID: PMC10115815 DOI: 10.1038/s41467-023-37691-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 06/14/2022] [Accepted: 03/27/2023] [Indexed: 04/21/2023] Open
Abstract
Expression quantitative trait locus (eQTL) studies illuminate genomic variants that regulate specific genes and contribute to fine-mapped loci discovered via genome-wide association studies (GWAS). Efforts to maximize their accuracy are ongoing. Using 240 glomerular (GLOM) and 311 tubulointerstitial (TUBE) micro-dissected samples from human kidney biopsies, we discovered 5371 GLOM and 9787 TUBE genes with at least one variant significantly associated with expression (eGene) by incorporating kidney single-nucleus open chromatin data and transcription start site distance as an "integrative prior" for Bayesian statistical fine-mapping. The use of an integrative prior resulted in higher resolution eQTLs illustrated by (1) smaller numbers of variants in credible sets with greater confidence, (2) increased enrichment of partitioned heritability for GWAS of two kidney traits, (3) an increased number of variants colocalized with the GWAS loci, and (4) enrichment of computationally predicted functional regulatory variants. A subset of variants and genes were validated experimentally in vitro and using a Drosophila nephrocyte model. More broadly, this study demonstrates that tissue-specific eQTL maps informed by single-nucleus open chromatin data have enhanced utility for diverse downstream analyses.
Affiliation(s)
- Seong Kyu Han
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
- Michelle T McNulty
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
- Christopher J Benway
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
- Pei Wen
  - Center for Precision Disease Modeling, University of Maryland, School of Medicine, Baltimore, MD, USA
- Anya Greenberg
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
- Ana C Onuchic-Whitford
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
  - Division of Renal Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Dongkeun Jang
  - Programs in Metabolism and Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
- Jason Flannick
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Programs in Metabolism and Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Noël P Burtt
  - Programs in Metabolism and Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
- Parker C Wilson
  - Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, MO, USA
- Benjamin D Humphreys
  - Division of Nephrology, Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
  - Department of Developmental Biology, Washington University in St. Louis, St. Louis, MO, USA
- Xiaoquan Wen
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Zhe Han
  - Center for Precision Disease Modeling, University of Maryland, School of Medicine, Baltimore, MD, USA
- Dongwon Lee
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
  - Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Matthew G Sampson
  - Division of Pediatric Nephrology, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Kidney Disease Initiative, Broad Institute, Cambridge, MA, USA
  - Division of Renal Medicine, Brigham and Women's Hospital, Boston, MA, USA
8
Han SK, Muto Y, Wilson PC, Humphreys BD, Sampson MG, Chakravarti A, Lee D. Quality assessment and refinement of chromatin accessibility data using a sequence-based predictive model. Proc Natl Acad Sci U S A 2022; 119:e2212810119. [PMID: 36508674 PMCID: PMC9907136 DOI: 10.1073/pnas.2212810119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Accepted: 10/28/2022] [Indexed: 12/15/2022] Open
Abstract
Chromatin accessibility assays are central to the genome-wide identification of gene regulatory elements associated with transcriptional regulation. However, the data have highly variable quality arising from several biological and technical factors. To surmount this problem, we developed a sequence-based machine learning method to evaluate and refine chromatin accessibility data. Our framework, gapped k-mer SVM quality check (gkmQC), provides the quality metrics for a sample based on the prediction accuracy of the trained models. We tested 886 DNase-seq samples from the ENCODE/Roadmap projects to demonstrate that gkmQC can effectively identify "high-quality" (HQ) samples with low conventional quality scores owing to marginal read depths. Peaks identified in HQ samples are more accurately aligned at functional regulatory elements, show greater enrichment of regulatory elements harboring functional variants, and explain greater heritability of phenotypes from their relevant tissues. Moreover, gkmQC can optimize the peak-calling threshold to identify additional peaks, especially for rare cell types in single-cell chromatin accessibility data.
Affiliation(s)
- Seong Kyu Han
- Department of Pediatrics, Division of Nephrology, Boston Children’s Hospital, Boston & Harvard Medical School, Boston, MA 02115
- Kidney Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Yoshiharu Muto
- Division of Nephrology, Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130
- Parker C. Wilson
- Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, MO 63130
- Benjamin D. Humphreys
- Division of Nephrology, Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130
- Department of Developmental Biology, Washington University in St. Louis, St. Louis, MO 63130
- Matthew G. Sampson
- Department of Pediatrics, Division of Nephrology, Boston Children’s Hospital, Boston & Harvard Medical School, Boston, MA 02115
- Kidney Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Aravinda Chakravarti
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY 10016
- Dongwon Lee
- Department of Pediatrics, Division of Nephrology, Boston Children’s Hospital, Boston & Harvard Medical School, Boston, MA 02115
- Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA 02115
9
Could routine forensic STR genotyping data leak personal phenotypic information? Forensic Sci Int 2022; 335:111311. [PMID: 35468577 DOI: 10.1016/j.forsciint.2022.111311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2022] [Revised: 03/19/2022] [Accepted: 04/13/2022] [Indexed: 11/22/2022]
Abstract
The application of forensic genetic markers must comply with privacy rights and legal policies on the premise that the markers do not expose phenotypic information. The most widely used short tandem repeats (STRs) are generally viewed as 'junk' DNA because most STRs are located in non-coding regions and are therefore assumed not to reveal phenotypic traits. But with a deepening understanding of phenotypes and their underlying genetic structure, whether STRs could reflect any phenotypic information may need re-examining. Therefore, we performed the following analyses. First, we analyzed the association between 15 STRs and three facial characteristics (single or double eyelid, with or without epicanthus, unattached or attached earlobe) in 721 unrelated Han Chinese individuals. Then, we collected STR and geographic data for 27,199 individuals from the literature to investigate the association between STRs and bio-geographic information, and predicted geographic information from STRs for an additional 1993 unrelated individuals. We found scarcely any association between the STRs and the studied facial characteristics. Although allele 19 in D2S1338 and allele 18 in FGA (P = 0.0032, P = 0.0030, respectively after Bonferroni correction) showed statistical significance, the prediction effectiveness was very low. For the STRs and bio-geographic information, the principal component analysis showed the first three components could explain 87.7% of the variance, but the prediction accuracy only reached 25.2%. We demonstrated that, because forensic phenotypes are usually complex traits, it is hardly possible to uncover phenotypic information by testing only dozens of STR loci.
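The Bonferroni correction mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state how many tests were corrected for, so the raw p-values and the number of tests below are hypothetical, and the function name `bonferroni` is an illustrative choice rather than anything taken from the study.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: each raw p-value is multiplied
    by the number of tests and capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three allele-trait tests (not study data).
raw = [0.0002, 0.01, 0.2]
adjusted = bonferroni(raw)

# An association is called significant if its adjusted p-value is below 0.05.
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

An adjusted p-value below the chosen threshold only indicates statistical association; as the abstract notes, it does not by itself imply useful predictive power.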
10
Abstract
The Human Genome Project marked a major milestone in the scientific community as it unravelled the ~3 billion bases that are central to crucial aspects of human life. Despite this achievement, it only scratched the surface of understanding how each nucleotide matters, both individually and as part of a larger unit. Beyond the coding genome, which comprises only ~2% of the whole genome, scientists have realized that large portions of the genome, not known to code for any protein, were crucial for regulating the coding genes. These large portions of the genome comprise the 'non-coding genome'. The history of gene regulation mediated by proteins that bind to the regulatory non-coding genome dates back many decades to the 1960s. However, the original definition of 'enhancers' was first used in the early 1980s. In this Review, we summarize benchmark studies that have mapped the role of cardiac enhancers in disease and development. We highlight instances in which enhancer-localized genetic variants explain the missing link to cardiac pathogenesis. Finally, we inspire readers to consider the next phase of exploring enhancer-based gene therapy for cardiovascular disease.
11
Jiang X, Li T, Liu S, Fu Q, Li F, Chen S, Sun K, Xu R, Xu Y. Variants in a cis-regulatory element of TBX1 in conotruncal heart defect patients impair GATA6-mediated transactivation. Orphanet J Rare Dis 2021; 16:334. [PMID: 34332615 PMCID: PMC8325851 DOI: 10.1186/s13023-021-01981-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/17/2021] [Accepted: 07/25/2021] [Indexed: 12/14/2022]
Abstract
Background TBX1 (T-box transcription factor 1) is a major candidate gene that likely contributes to the etiology of velo-cardio-facial syndrome/DiGeorge syndrome (VCFS/DGS). Although the haploinsufficiency of TBX1 in both mice and humans results in congenital cardiac malformations, little has been elucidated about its upstream regulation. We aimed to explore the transcriptional regulation and dysregulation of TBX1. Methods Different TBX1 promoter reporters were constructed. Luciferase assays and electrophoretic mobility shift assays (EMSAs) were used to identify a cis-regulatory element within the TBX1 promoter region and its trans-acting factor. The expression of proteins was identified by immunohistochemistry and immunofluorescence. Variants in the cis-regulatory element were screened in conotruncal defect (CTD) patients. In vitro functional assays were performed to show the effects of the variants found in CTD patients on the transactivation of TBX1. Results We identified a cis-regulatory element within intron 1 of TBX1 that was found to be responsive to GATA6 (GATA binding protein 6), a transcription factor crucial for cardiogenesis. The expression patterns of GATA6 and TBX1 overlapped in the pharyngeal arches of human embryos. Transfection experiments and EMSA indicated that GATA6 could activate the transcription of TBX1 by directly binding with its GATA cis-regulatory element in vitro. Furthermore, sequencing analyses of 195 sporadic CTD patients without the 22q11.2 deletion or duplication identified 3 variants (NC_000022.11:g.19756832C > G, NC_000022.11:g.19756845C > T, and NC_000022.11:g.19756902G > T) in the non-coding cis-regulatory element of TBX1. Luciferase assays showed that all 3 variants led to reduced transcription of TBX1 when incubated with GATA6. Conclusions Our findings showed that TBX1 might be a direct transcriptional target of GATA6, and variants in the non-coding cis-regulatory element of TBX1 disrupted GATA6-mediated transactivation.
Supplementary Information The online version contains supplementary material available at 10.1186/s13023-021-01981-4.
Affiliation(s)
- Xuechao Jiang
- Scientific Research Center, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Tingting Li
- Department of Pediatric Cardiology, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Sijie Liu
- Department of Pediatric Cardiology, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Qihua Fu
- Medical Laboratory, Shanghai Children's Medical Center, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200127, China
- Fen Li
- Department of Pediatric Cardiology, Shanghai Children's Medical Center, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200127, China
- Sun Chen
- Department of Pediatric Cardiology, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Kun Sun
- Department of Pediatric Cardiology, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Rang Xu
- Scientific Research Center, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Yuejuan Xu
- Department of Pediatric Cardiology, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
12
Lee D, Kapoor A, Lee C, Mudgett M, Beer MA, Chakravarti A. Sequence-based correction of barcode bias in massively parallel reporter assays. Genome Res 2021; 31:1638-1645. [PMID: 34285053 PMCID: PMC8415370 DOI: 10.1101/gr.268599.120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/18/2020] [Accepted: 07/07/2021] [Indexed: 11/24/2022]
Abstract
Massively parallel reporter assays (MPRAs) are a high-throughput method for evaluating in vitro activities of thousands of candidate cis-regulatory elements (CREs). In these assays, candidate sequences are cloned upstream or downstream from a reporter gene tagged by unique DNA sequences. However, tag sequences may themselves affect reporter gene expression and lead to major potential biases in the measured cis-regulatory activity. Here, we present a sequence-based method for correcting tag-sequence-specific effects and show that our method can significantly reduce this source of variation and improve the identification of functional regulatory variants by MPRAs. We also show that our model captures sequence features associated with post-transcriptional regulation of mRNA. Thus, this new method helps not only to improve detection of regulatory signals in MPRA experiments but also to design better MPRA protocols.
Affiliation(s)
- Ashish Kapoor
- University of Texas Health Science Center at Houston
13
Benway CJ, Liu J, Guo F, Du F, Randell SH, Cho MH, Silverman EK, Zhou X. Chromatin Landscapes of Human Lung Cells Predict Potentially Functional Chronic Obstructive Pulmonary Disease Genome-Wide Association Study Variants. Am J Respir Cell Mol Biol 2021; 65:92-102. [PMID: 33788674 DOI: 10.1165/rcmb.2020-0475oc] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/10/2023]
Abstract
Genome-wide association studies (GWASs) have identified dozens of loci associated with risk of chronic obstructive pulmonary disease (COPD). However, identifying the causal variants and their functional role in the appropriate cell type remains a major challenge. We aimed to identify putative causal variants in 82 GWAS loci associated with COPD susceptibility and predict the regulatory impact of these variants in lung-cell types. We used an integrated approach featuring statistical fine mapping, open chromatin profiling, and machine learning to identify functional variants. We generated chromatin accessibility data using the Assay for Transposase-Accessible Chromatin with High-Throughput Sequencing (ATAC-seq) for human primary lung-cell types implicated in COPD pathobiology. We then evaluated the enrichment of COPD risk variants in lung-specific open chromatin regions and generated cell type-specific regulatory predictions for >6,500 variants corresponding to 82 COPD GWAS loci. Integration of the fine-mapped variants with lung open chromatin regions helped prioritize 22 variants in putative regulatory elements with potential functional effects. Comparison with functional predictions from 222 Encyclopedia of DNA Elements (ENCODE) cell samples revealed cell type-specific regulatory effects of COPD variants in the lung epithelium, endothelium, and immune cells. We identified potential causal variants for COPD risk by integrating fine mapping in GWAS loci with cell-specific regulatory profiling, highlighting the importance of leveraging the chromatin status in relevant cell types to predict the molecular effects of risk variants in lung disease.
Affiliation(s)
- Feng Guo
- Channing Division of Network Medicine
- Fei Du
- Channing Division of Network Medicine
- Scott H Randell
- Department of Cell Biology and Physiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Michael H Cho
- Channing Division of Network Medicine
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Edwin K Silverman
- Channing Division of Network Medicine
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Xiaobo Zhou
- Channing Division of Network Medicine
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
14
Sharma K, Mishra A, Singh HN, Prashar D, Alam P, Thinlas T, Mohammad G, Kukreti R, Syed MA, Pasha MAQ. High-altitude pulmonary edema is aggravated by risk-loci and associated transcription factors in HIF-prolyl hydroxylases. Hum Mol Genet 2021; 30:1734-1749. [PMID: 34007987 DOI: 10.1093/hmg/ddab139] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/12/2021] [Revised: 05/10/2021] [Accepted: 05/11/2021] [Indexed: 11/15/2022]
Abstract
High-altitude (HA, > 2500 meters) hypoxic exposure evokes several physiological processes that may be abetted by differential genetic distribution in sojourners, who are susceptible to various HA disorders, such as high-altitude pulmonary edema (HAPE). The genetic variants in hypoxia-sensing genes influence the transcriptional output, however the functional role has not been investigated in HAPE. This study explored the two hypoxia-sensing genes, prolyl hydroxylase domain protein 2 (EGLN1) and factor inhibiting HIF-1α (HIF1AN) in HA adaptation and maladaptation in three well-characterized groups: highland natives, HAPE-free controls and HAPE-patients. The two genes were sequenced and subsequently validated through genotyping of significant SNPs, haplotyping and MDR. Three EGLN1 SNPs rs1538664, rs479200 and rs480902 and their haplotypes emerged significant in HAPE. Blood gene expression and protein levels also differed significantly (P < 0.05) and correlated with clinical parameters and respective alleles. The RegulomeDB annotation exercises of the loci corroborated regulatory role. Allele-specific differential expression was evidenced by luciferase assay followed by electrophoretic mobility shift assay, LC-MS/MS and supershift assays, which confirmed allele-specific transcription factor (TF) binding of FUS RNA binding protein (FUS) with rs1538664A, Rho GDP dissociation inhibitor 1 (RhoGDH1) with rs479200T and Hypoxia up-regulated protein 1 (HYOU1) with rs480902C. Docking simulation studies were in sync for the DNA-TF structural variations. There was strong networking among the TFs that revealed physiological consequences through relevant pathways. The two hydroxylases appear crucial in the regulation of hypoxia-inducible responses.
Affiliation(s)
- Kavita Sharma
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Department of Biotechnology, Jamia Millia Islamia, New Delhi, 110025, India
- Aastha Mishra
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Himanshu N Singh
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Deepak Prashar
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Perwez Alam
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Department of Pathology and Laboratory Medicine, College of Medicine, University of Cincinnati, OH, USA
- Ritushree Kukreti
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Mansoor Ali Syed
- Department of Biotechnology, Jamia Millia Islamia, New Delhi, 110025, India
- M A Qadar Pasha
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Delhi, 110007, India
- Indian Council of Medical Research, New Delhi, 110029, India
15
Yuan X, Scott IC, Wilson MD. Heart Enhancers: Development and Disease Control at a Distance. Front Genet 2021; 12:642975. [PMID: 33777110 PMCID: PMC7987942 DOI: 10.3389/fgene.2021.642975] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/17/2020] [Accepted: 01/29/2021] [Indexed: 12/14/2022]
Abstract
Bound by lineage-determining transcription factors and signaling effectors, enhancers play essential roles in controlling spatiotemporal gene expression profiles during development, homeostasis and disease. Recent synergistic advances in functional genomic technologies, combined with the developmental biology toolbox, have resulted in unprecedented genome-wide annotation of heart enhancers and their target genes. Starting with early studies of vertebrate heart enhancers and ending with state-of-the-art genome-wide enhancer discovery and testing, we will review how studying heart enhancers in metazoan species has helped inform our understanding of cardiac development and disease.
Affiliation(s)
- Xuefei Yuan
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Program in Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Ian C. Scott
- Program in Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Michael D. Wilson
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
16
Abstract
Spatiotemporal control of gene expression during development requires orchestrated activities of numerous enhancers, which are cis-regulatory DNA sequences that, when bound by transcription factors, support selective activation or repression of associated genes. Proper activation of enhancers is critical during embryonic development, adult tissue homeostasis, and regeneration, and inappropriate enhancer activity is often associated with pathological conditions such as cancer. Multiple consortia [e.g., the Encyclopedia of DNA Elements (ENCODE) Consortium and National Institutes of Health Roadmap Epigenomics Mapping Consortium] and independent investigators have mapped putative regulatory regions in a large number of cell types and tissues, but the sequence determinants of cell-specific enhancers are not yet fully understood. Machine learning approaches trained on large sets of these regulatory regions can identify core transcription factor binding sites and generate quantitative predictions of enhancer activity and the impact of sequence variants on activity. Here, we review these computational methods in the context of enhancer prediction and gene regulatory network models specifying cell fate.
Affiliation(s)
- Michael A Beer
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA
- Dustin Shigaki
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA
17
Nandakumar P, Lee D, Hoffmann TJ, Ehret GB, Arking D, Ranatunga D, Li M, Grove ML, Boerwinkle E, Schaefer C, Kwok PY, Iribarren C, Risch N, Chakravarti A. Analysis of putative cis-regulatory elements regulating blood pressure variation. Hum Mol Genet 2020; 29:1922-1932. [PMID: 32436959 PMCID: PMC7372556 DOI: 10.1093/hmg/ddaa098] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/23/2019] [Revised: 03/29/2020] [Accepted: 05/06/2020] [Indexed: 12/21/2022]
Abstract
Hundreds of loci have been associated with blood pressure (BP) traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ~100 000 Genetic Epidemiology Research on Aging study participants. In the present study, we sought to fine-map known loci and identify novel genes by determining putative regulatory regions for these and other tissues relevant to BP. We constructed maps of putative cis-regulatory elements (CREs) using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. We aggregate variants within these putative CREs within 50 Kb of the start or end of 'expressed' genes in these tissues or cell types using public expression data and use deltaSVM scores as weights in the group-wise sequence kernel association test to identify candidates. We test for association with both BP traits and expression within these tissues or cell types of interest and identify the candidates MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B and PPCDC. Additionally, we examined two known QT interval genes, SCN5A and NOS1AP, in the Atherosclerosis Risk in Communities Study, as a positive control, and observed the expected heart-specific effect. Thus, our method identifies variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information.
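The variant-weighted, gene-wise test described in this abstract can be sketched as follows. This is only a simplified illustration of the unscaled linear-kernel SKAT statistic, Q = Σ_j (w_j S_j)², where S_j is the score of variant j against mean-centered phenotypes. It assumes no covariates, omits the p-value computation, and uses made-up genotypes, phenotypes, and weights; how deltaSVM scores are converted into weights is not specified here, so the weights are simply taken as given. Some presentations fold the square into the weight itself.

```python
def skat_q(genotypes, phenotypes, weights):
    """Unscaled weighted SKAT statistic for one gene.

    genotypes:  list of per-individual rows, one allele count per variant
    phenotypes: list of continuous trait values, one per individual
    weights:    list of per-variant weights (hypothetical stand-ins for
                functional scores such as deltaSVM)
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    # Residuals under a null model with an intercept only (an assumption).
    residuals = [y - mean_y for y in phenotypes]
    q = 0.0
    for j, w in enumerate(weights):
        # Score for variant j: genotype-weighted sum of residuals.
        score = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += (w * score) ** 2
    return q


# Hypothetical data: 4 individuals, 2 variants in CREs near one gene.
G = [[0, 1], [1, 0], [2, 1], [0, 0]]
y = [1.0, 2.0, 4.0, 1.0]
w = [0.5, 2.0]
q = skat_q(G, y, w)
print(q)
```

In practice the statistic is compared against a mixture of chi-square distributions to obtain a p-value, and covariates are included in the null model; established SKAT implementations handle both.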
Affiliation(s)
- Priyanka Nandakumar
- Department of Genetic Medicine, McKusick-Nathans Institute, Baltimore, MD 21205, USA
- Dongwon Lee
- Department of Genetic Medicine, McKusick-Nathans Institute, Baltimore, MD 21205, USA
- Center for Human Genetics and Genomics, NYU School of Medicine, New York, NY 10016, USA
- Division of Nephrology, Boston Children’s Hospital, Boston, MA 02115, USA
- Thomas J Hoffmann
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA
- Georg B Ehret
- Department of Genetic Medicine, McKusick-Nathans Institute, Baltimore, MD 21205, USA
- Center for Human Genetics and Genomics, NYU School of Medicine, New York, NY 10016, USA
- Cardiology, Department of Specialties of Internal Medicine, University of Geneva, Geneva 1211, Switzerland
- Dan Arking
- Department of Genetic Medicine, McKusick-Nathans Institute, Baltimore, MD 21205, USA
- Dilrini Ranatunga
- Kaiser Permanente Northern California Division of Research, Oakland, California 94612, USA
- Man Li
- Division of Nephrology, Department of Human Genetics, University of Utah, Salt Lake City, Utah 84132, USA
- Megan L Grove
- Human Genetics Center, University of Texas Health Science Center, Houston, Texas 77030, USA
- Eric Boerwinkle
- Human Genetics Center, University of Texas Health Science Center, Houston, Texas 77030, USA
- Catherine Schaefer
- Kaiser Permanente Northern California Division of Research, Oakland, California 94612, USA
- Pui-Yan Kwok
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA
- Carlos Iribarren
- Kaiser Permanente Northern California Division of Research, Oakland, California 94612, USA
- Neil Risch
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA
- Kaiser Permanente Northern California Division of Research, Oakland, California 94612, USA
- Aravinda Chakravarti
- Department of Genetic Medicine, McKusick-Nathans Institute, Baltimore, MD 21205, USA
- Center for Human Genetics and Genomics, NYU School of Medicine, New York, NY 10016, USA
18
de Marvao A, Dawes TJW, O'Regan DP. Artificial Intelligence for Cardiac Imaging-Genetics Research. Front Cardiovasc Med 2020; 6:195. [PMID: 32039240 PMCID: PMC6985036 DOI: 10.3389/fcvm.2019.00195] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/30/2019] [Accepted: 12/27/2019] [Indexed: 12/18/2022]
Abstract
Cardiovascular conditions remain the leading cause of mortality and morbidity worldwide, with genotype being a significant influence on disease risk. Cardiac imaging-genetics aims to identify and characterize the genetic variants that influence functional, physiological, and anatomical phenotypes derived from cardiovascular imaging. High-throughput DNA sequencing and genotyping have greatly accelerated genetic discovery, making variant interpretation one of the key challenges in contemporary clinical genetics. Heterogeneous, low-fidelity phenotyping and difficulties integrating and then analyzing large-scale genetic, imaging and clinical datasets using traditional statistical approaches have impeded progress. Artificial intelligence (AI) methods, such as deep learning, are particularly suited to tackle the challenges of scalability and high dimensionality of data and show promise in the field of cardiac imaging-genetics. Here we review the current state of AI as applied to imaging-genetics research and discuss outstanding methodological challenges, as the field moves from pilot studies to mainstream applications, from one-dimensional global descriptors to high-resolution models of whole-organ shape and function, from univariate to multivariate analysis and from candidate gene to genome-wide approaches. Finally, we consider the future directions and prospects of AI imaging-genetics for ultimately helping understand the genetic and environmental underpinnings of cardiovascular health and disease.
Affiliation(s)
- Declan P. O'Regan
- MRC London Institute of Medical Sciences, Imperial College London, London, United Kingdom
19
Leon-Mimila P, Wang J, Huertas-Vazquez A. Relevance of Multi-Omics Studies in Cardiovascular Diseases. Front Cardiovasc Med 2019; 6:91. [PMID: 31380393 PMCID: PMC6656333 DOI: 10.3389/fcvm.2019.00091] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 03/02/2019] [Accepted: 06/19/2019] [Indexed: 12/21/2022]
Abstract
Cardiovascular diseases are the leading cause of death around the world. Despite the large number of genes and loci identified, the precise mechanisms by which these genes influence risk of cardiovascular disease are not well understood. Recent advances in the development and optimization of high-throughput technologies for the generation of “omics data” have provided a deeper understanding of the processes and dynamic interactions involved in human diseases. However, the integrative analysis of “omics” data is not straightforward and presents several logistic and computational challenges. In spite of these difficulties, several studies have successfully applied integrative genomics approaches for the investigation of novel mechanisms and plasma biomarkers involved in cardiovascular diseases. In this review, we summarize recent studies that aim to understand the molecular framework of these diseases using multi-omics data from mice and humans. We discuss examples of omics studies for cardiovascular diseases focused on the integration of genomics, epigenomics, transcriptomics, and proteomics. This review also describes current gaps in the study of complex diseases using systems genetics approaches as well as potential limitations and future directions of this emerging field.
Affiliation(s)
- Paola Leon-Mimila
- Division of Cardiology, David Geffen School of Medicine, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
- Jessica Wang
- Division of Cardiology, David Geffen School of Medicine, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
- Adriana Huertas-Vazquez
- Division of Cardiology, David Geffen School of Medicine, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
20
Kapoor A, Lee D, Zhu L, Soliman EZ, Grove ML, Boerwinkle E, Arking DE, Chakravarti A. Multiple SCN5A variant enhancers modulate its cardiac gene expression and the QT interval. Proc Natl Acad Sci U S A 2019; 116:10636-10645. [PMID: 31068470 PMCID: PMC6561183 DOI: 10.1073/pnas.1808734116] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/19/2022]
Abstract
The rationale for genome-wide association study (GWAS) results is sequence variation in cis-regulatory elements (CREs) modulating a target gene's expression as the major cause of trait variation. To understand the complete molecular landscape of one of these GWAS loci, we performed in vitro reporter screens in cardiomyocyte cell lines for CREs overlapping nearly all common variants associated with any of five independent QT interval (QTi)-associated GWAS hits at the SCN5A-SCN10A locus. We identified 13 causal CRE variants using allelic reporter activity, cardiomyocyte nuclear extract-based binding assays, overlap with human cardiac tissue DNaseI hypersensitive regions, and predicted impact of sequence variants on DNaseI sensitivity. Our analyses identified at least one high-confidence causal CRE variant for each of the five sentinel hits that could collectively predict SCN5A cardiac gene expression and QTi association. Although all 13 variants could explain SCN5A gene expression, the highest statistical significance was obtained with seven variants (inclusive of the five above). Thus, multiple, causal, mutually associated CRE variants can underlie GWAS signals.
Affiliation(s)
- Ashish Kapoor
- Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030
- Dongwon Lee
- Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY 10016
- Luke Zhu
- Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY 10016
- Elsayed Z Soliman
- Epidemiological Cardiology Research Center, Department of Epidemiology and Prevention, Division of Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, NC 27101
- Megan L Grove
- Division of Epidemiology, Human Genetics and Environmental Sciences, University of Texas Health Science Center at Houston, Houston, TX 77030
- Eric Boerwinkle
- Division of Epidemiology, Human Genetics and Environmental Sciences, University of Texas Health Science Center at Houston, Houston, TX 77030
- Dan E Arking
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205
- Aravinda Chakravarti
- Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY 10016
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205