1. Aracena KA, Lin YL, Luo K, Pacis A, Gona S, Mu Z, Yotova V, Sindeaux R, Pramatarova A, Simon MM, Chen X, Groza C, Lougheed D, Gregoire R, Brownlee D, Boye C, Pique-Regi R, Li Y, He X, Bujold D, Pastinen T, Bourque G, Barreiro LB. Epigenetic variation impacts individual differences in the transcriptional response to influenza infection. Nat Genet 2024; 56:408-419. [PMID: 38424460 DOI: 10.1038/s41588-024-01668-z] [Received: 05/27/2023] [Accepted: 01/16/2024]
Abstract
Humans display remarkable interindividual variation in their immune response to identical challenges. Yet, our understanding of the genetic and epigenetic factors contributing to such variation remains limited. Here we performed in-depth genetic, epigenetic and transcriptional profiling on primary macrophages derived from individuals of European and African ancestry before and after infection with influenza A virus. We show that baseline epigenetic profiles are strongly predictive of the transcriptional response to influenza A virus across individuals. Quantitative trait locus (QTL) mapping revealed highly coordinated genetic effects on gene regulation, with many cis-acting genetic variants impacting concomitantly gene expression and multiple epigenetic marks. These data reveal that ancestry-associated differences in the epigenetic landscape can be genetically controlled, even more than gene expression. Lastly, among QTL variants that colocalized with immune-disease loci, only 7% were gene expression QTL, while the remaining genetic variants impact epigenetic marks, stressing the importance of considering molecular phenotypes beyond gene expression in disease-focused studies.
Affiliation(s)
- Yen-Lung Lin
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Kaixuan Luo
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Alain Pacis
- Canadian Centre for Computational Genomics, McGill University, Montreal, Quebec, Canada
- Saideep Gona
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Zepeng Mu
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Vania Yotova
- Department of Genetics, CHU Sainte-Justine Research Center, Montreal, Quebec, Canada
- Renata Sindeaux
- Department of Genetics, CHU Sainte-Justine Research Center, Montreal, Quebec, Canada
- Xun Chen
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Cristian Groza
- Quantitative Life Sciences, McGill University, Montreal, Quebec, Canada
- David Lougheed
- Canadian Centre for Computational Genomics, McGill University, Montreal, Quebec, Canada
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada
- Romain Gregoire
- Canadian Centre for Computational Genomics, McGill University, Montreal, Quebec, Canada
- David Brownlee
- Canadian Centre for Computational Genomics, McGill University, Montreal, Quebec, Canada
- Carly Boye
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA
- Yang Li
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Xin He
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- David Bujold
- Canadian Centre for Computational Genomics, McGill University, Montreal, Quebec, Canada
- McGill Genome Centre, Montreal, Quebec, Canada
- Tomi Pastinen
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada
- Genomic Medicine Center, Children's Mercy, Kansas City, MO, USA
- Guillaume Bourque
- Canadian Centre for Computational Genomics, McGill University, Montreal, Quebec, Canada.
- McGill Genome Centre, Montreal, Quebec, Canada.
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan.
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
- Luis B Barreiro
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA.
- Committee on Immunology, University of Chicago, Chicago, IL, USA.
2. Pettie KP, Mumbach M, Lea AJ, Ayroles J, Chang HY, Kasowski M, Fraser HB. Chromatin activity identifies differential gene regulation across human ancestries. Genome Biol 2024; 25:21. [PMID: 38225662 PMCID: PMC10789071 DOI: 10.1186/s13059-024-03165-2] [Received: 11/11/2022] [Accepted: 01/04/2024]
Abstract
BACKGROUND: Current evidence suggests that cis-regulatory elements controlling gene expression may be the predominant target of natural selection in humans and other species. Detecting selection acting on these elements is critical to understanding evolution but remains challenging because we do not know which mutations will affect gene regulation.
RESULTS: To address this, we devise an approach to search for lineage-specific selection on three critical steps in transcriptional regulation: chromatin activity, transcription factor binding, and chromosomal looping. Applying this approach to lymphoblastoid cells from 831 individuals of either European or African descent, we find strong signals of differential chromatin activity linked to gene expression differences between ancestries in numerous contexts, but no evidence of functional differences in chromosomal looping. Moreover, we show that enhancers rather than promoters display the strongest signs of selection associated with sites of differential transcription factor binding.
CONCLUSIONS: Overall, our study indicates that some cis-regulatory adaptation may be more easily detected at the level of chromatin than DNA sequence. This work provides a vast resource of genomic interaction data from diverse human populations and establishes a novel selection test that will benefit future study of regulatory evolution in humans and other species.
Affiliation(s)
- Kade P Pettie
- Department of Biology, Stanford University, Stanford, CA, USA
- Maxwell Mumbach
- Department of Genetics, Stanford University, Stanford, CA, USA
- Amanda J Lea
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Julien Ayroles
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
- Howard Y Chang
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Maya Kasowski
- Sean N. Parker Center for Allergy and Asthma Research, Stanford University School of Medicine, Stanford, CA, USA
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Hunter B Fraser
- Department of Biology, Stanford University, Stanford, CA, USA.
3. Jeong R, Bulyk ML. Blood cell traits' GWAS loci colocalization with variation in PU.1 genomic occupancy prioritizes causal noncoding regulatory variants. Cell Genomics 2023; 3:100327. [PMID: 37492098 PMCID: PMC10363807 DOI: 10.1016/j.xgen.2023.100327] [Received: 10/19/2022] [Revised: 02/10/2023] [Accepted: 04/25/2023]
Abstract
Genome-wide association studies (GWASs) have uncovered numerous trait-associated loci across the human genome, most of which are located in noncoding regions, making interpretation difficult. Moreover, causal variants are hard to statistically fine-map at many loci because of widespread linkage disequilibrium. To address this challenge, we present a strategy utilizing transcription factor (TF) binding quantitative trait loci (bQTLs) for colocalization analysis to identify trait associations likely mediated by TF occupancy variation and to pinpoint likely causal variants using motif scores. We applied this approach to PU.1 bQTLs in lymphoblastoid cell lines and blood cell trait GWAS data. Colocalization analysis revealed 69 blood cell trait GWAS loci putatively driven by PU.1 occupancy variation. We nominate PU.1 motif-altering variants as the likely shared causal variants at 51 loci. Such integration of TF bQTL data with other GWAS data may reveal transcriptional regulatory mechanisms and causal noncoding variants underlying additional complex traits.
Affiliation(s)
- Raehoon Jeong
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
4. Yang MG, Ling E, Cowley CJ, Greenberg ME, Vierbuchen T. Characterization of sequence determinants of enhancer function using natural genetic variation. eLife 2022; 11:76500. [PMID: 36043696 PMCID: PMC9662815 DOI: 10.7554/elife.76500] [Received: 12/18/2021] [Accepted: 08/30/2022]
Abstract
Sequence variation in enhancers that control cell-type-specific gene transcription contributes significantly to phenotypic variation within human populations. However, it remains difficult to predict precisely the effect of any given sequence variant on enhancer function due to the complexity of DNA sequence motifs that determine transcription factor (TF) binding to enhancers in their native genomic context. Using F1-hybrid cells derived from crosses between distantly related inbred strains of mice, we identified thousands of enhancers with allele-specific TF binding and/or activity. We find that genetic variants located within the central region of enhancers are most likely to alter TF binding and enhancer activity. We observe that the AP-1 family of TFs (Fos/Jun) are frequently required for binding of TEAD TFs and for enhancer function. However, many sequence variants outside of core motifs for AP-1 and TEAD also impact enhancer function, including sequences flanking core TF motifs and AP-1 half sites. Taken together, these data represent one of the most comprehensive assessments of allele-specific TF binding and enhancer function to date and reveal how sequence changes at enhancers alter their function across evolutionary timescales.
Affiliation(s)
- Marty G Yang
- Department of Neurobiology, Harvard Medical School, Boston, United States
- Program in Neuroscience, Harvard Medical School, Boston, United States
- Emi Ling
- Department of Neurobiology, Harvard Medical School, Boston, United States
- Thomas Vierbuchen
- Developmental Biology Program, Sloan Kettering Institute for Cancer Research, New York, United States
- Center for Stem Cell Biology, Sloan Kettering Institute for Cancer Research, New York, United States
5. Cechova M, Miga KH. Satellite DNAs and human sex chromosome variation. Semin Cell Dev Biol 2022; 128:15-25. [PMID: 35644878 PMCID: PMC9233459 DOI: 10.1016/j.semcdb.2022.04.022] [Received: 03/15/2022] [Revised: 04/26/2022] [Accepted: 04/27/2022]
Abstract
Satellite DNAs are present on every chromosome in the cell and are typically enriched in repetitive, heterochromatic parts of the human genome. Sex chromosomes represent a unique genomic and epigenetic context. In this review, we first report what is known about satellite DNA biology on human X and Y chromosomes, including repeat content and organization, as well as satellite variation in typical euploid individuals. Then, we review sex chromosome aneuploidies that are among the most common types of aneuploidies in the general population, and are better tolerated than autosomal aneuploidies. This is demonstrated also by the fact that aging is associated with the loss of the X, and especially the Y chromosome. In addition, supernumerary sex chromosomes enable us to study general processes in a cell, such as analyzing heterochromatin dosage (i.e. additional Barr bodies and long heterochromatin arrays on Yq) and their downstream consequences. Finally, genomic and epigenetic organization and regulation of satellite DNA could influence chromosome stability and lead to aneuploidy. In this review, we argue that the complete annotation of satellite DNA on sex chromosomes in human, and especially in centromeric regions, will aid in explaining the prevalence and the consequences of sex chromosome aneuploidies.
Affiliation(s)
- Monika Cechova
- Faculty of Informatics, Masaryk University, Czech Republic
- Karen H Miga
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, CA 95064, USA
6. Flynn E, Lappalainen T. Functional Characterization of Genetic Variant Effects on Expression. Annu Rev Biomed Data Sci 2022; 5:119-139. [PMID: 35483347 DOI: 10.1146/annurev-biodatasci-122120-010010]
Abstract
Thousands of common genetic variants in the human population have been associated with disease risk and phenotypic variation by genome-wide association studies (GWAS). However, the majority of GWAS variants fall into noncoding regions of the genome, complicating our understanding of their regulatory functions, and few molecular mechanisms of GWAS variant effects have been clearly elucidated. Here, we set out to review genetic variant effects, focusing on expression quantitative trait loci (eQTLs), including their utility in interpreting GWAS variant mechanisms. We discuss the interrelated challenges and opportunities for eQTL analysis, covering determining causal variants, elucidating molecular mechanisms of action, and understanding context variability. Addressing these questions can enable better functional characterization of disease-associated loci and provide insights into fundamental biological questions of the noncoding genetic regulatory code and its control of gene expression. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Elise Flynn
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
7. Llimos G, Gardeux V, Koch U, Kribelbauer JF, Hafner A, Alpern D, Pezoldt J, Litovchenko M, Russeil J, Dainese R, Moia R, Mahmoud AM, Rossi D, Gaidano G, Plass C, Lutsik P, Gerhauser C, Waszak SM, Boettiger A, Radtke F, Deplancke B. A leukemia-protective germline variant mediates chromatin module formation via transcription factor nucleation. Nat Commun 2022; 13:2042. [PMID: 35440565 PMCID: PMC9018852 DOI: 10.1038/s41467-022-29625-6] [Received: 05/11/2021] [Accepted: 03/24/2022]
Abstract
Non-coding variants coordinate transcription factor (TF) binding and chromatin mark enrichment changes over regions spanning >100 kb. These molecularly coordinated regions are named "variable chromatin modules" (VCMs), providing a conceptual framework of how regulatory variation might shape complex traits. To better understand the molecular mechanisms underlying VCM formation, here, we mechanistically dissect a VCM-modulating noncoding variant that is associated with reduced chronic lymphocytic leukemia (CLL) predisposition and disease progression. This common, germline variant constitutes a 5-bp indel that controls the activity of an AXIN2 gene-linked VCM by creating a MEF2 binding site, which, upon binding, activates a super-enhancer-like regulatory element. This triggers a large change in TF binding activity and chromatin state at an enhancer cluster spanning >150 kb, coinciding with subtle, long-range chromatin compaction and robust AXIN2 up-regulation. Our results support a model in which the indel acts as an AXIN2 VCM-activating TF nucleation event, which modulates CLL pathology.
Affiliation(s)
- Gerard Llimos
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Vincent Gardeux
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Ute Koch
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Judith F Kribelbauer
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Antonina Hafner
- Department of Developmental Biology, Stanford University, Stanford, CA, USA
- Daniel Alpern
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Joern Pezoldt
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Maria Litovchenko
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Cancer Research UK Lung Cancer Centre of Excellence, University College London (UCL) Cancer Institute, Cancer Genome Evolution Research Group, London, UK
- Julie Russeil
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Riccardo Dainese
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Riccardo Moia
- Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, Novara, Italy
- Abdurraouf Mokhtar Mahmoud
- Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, Novara, Italy
- Davide Rossi
- Oncology Institute of Southern Switzerland, Università della Svizzera italiana, Bellinzona, Switzerland
- Institute of Oncology Research, Università della Svizzera italiana, Bellinzona, Switzerland
- Gianluca Gaidano
- Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, Novara, Italy
- Christoph Plass
- Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Pavlo Lutsik
- Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Clarissa Gerhauser
- Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Sebastian M Waszak
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo and Oslo University Hospital, Oslo, Norway
- Department of Pediatric Research, Division of Paediatric and Adolescent Medicine, Rikshospitalet, Oslo University Hospital, Oslo, Norway
- Alistair Boettiger
- Department of Developmental Biology, Stanford University, Stanford, CA, USA
- Freddy Radtke
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Bart Deplancke
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, Lausanne, Switzerland.
8. Novikova G, Kapoor M, Tcw J, Abud EM, Efthymiou AG, Chen SX, Cheng H, Fullard JF, Bendl J, Liu Y, Roussos P, Björkegren JL, Liu Y, Poon WW, Hao K, Marcora E, Goate AM. Integration of Alzheimer's disease genetics and myeloid genomics identifies disease risk regulatory elements and genes. Nat Commun 2021; 12:1610. [PMID: 33712570 PMCID: PMC7955030 DOI: 10.1038/s41467-021-21823-y] [Received: 09/16/2020] [Accepted: 02/10/2021]
Abstract
Genome-wide association studies (GWAS) have identified more than 40 loci associated with Alzheimer's disease (AD), but the causal variants, regulatory elements, genes and pathways remain largely unknown, impeding a mechanistic understanding of AD pathogenesis. Previously, we showed that AD risk alleles are enriched in myeloid-specific epigenomic annotations. Here, we show that they are specifically enriched in active enhancers of monocytes, macrophages and microglia. We integrated AD GWAS with myeloid epigenomic and transcriptomic datasets using analytical approaches to link myeloid enhancer activity to target gene expression regulation and AD risk modification. We identify AD risk enhancers and nominate candidate causal genes among their likely targets (including AP4E1, AP4M1, APBB3, BIN1, MS4A4A, MS4A6A, PILRA, RABEP1, SPI1, TP53INP1, and ZYX) in twenty loci. Fine-mapping of these enhancers nominates candidate functional variants that likely modify AD risk by regulating gene expression in myeloid cells. In the MS4A locus we identified a single candidate functional variant and validated it in human induced pluripotent stem cell (hiPSC)-derived microglia and brain. Taken together, this study integrates AD GWAS with multiple myeloid genomic datasets to investigate the mechanisms of AD risk alleles and nominates candidate functional variants, regulatory elements and genes that likely modulate disease susceptibility.
Affiliation(s)
- Gloriia Novikova
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Manav Kapoor
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Julia Tcw
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Edsel M Abud
- Department of Neurobiology & Behavior, University of California Irvine, Irvine, CA, USA
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, CA, USA
- Anastasia G Efthymiou
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Steven X Chen
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Haoxiang Cheng
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- John F Fullard
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jaroslav Bendl
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Yiyuan Liu
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Panos Roussos
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Johan LM Björkegren
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset, Huddinge, Sweden
- Yunlong Liu
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Wayne W Poon
- Institute for Memory Impairments and Neurological Disorders, University of California Irvine, Irvine, CA, USA
- Ke Hao
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Edoardo Marcora
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Alison M Goate
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
9. Mitchelmore J, Grinberg NF, Wallace C, Spivakov M. Functional effects of variation in transcription factor binding highlight long-range gene regulation by epromoters. Nucleic Acids Res 2020; 48:2866-2879. [PMID: 32112106 PMCID: PMC7102942 DOI: 10.1093/nar/gkaa123] [Received: 04/28/2019] [Revised: 02/14/2020] [Accepted: 02/17/2020]
Abstract
Identifying DNA cis-regulatory modules (CRMs) that control the expression of specific genes is crucial for deciphering the logic of transcriptional control. Natural genetic variation can point to the possible gene regulatory function of specific sequences through their allelic associations with gene expression. However, comprehensive identification of causal regulatory sequences in brute-force association testing without incorporating prior knowledge is challenging due to limited statistical power and effects of linkage disequilibrium. Sequence variants affecting transcription factor (TF) binding at CRMs have a strong potential to influence gene regulatory function, which provides a motivation for prioritizing such variants in association testing. Here, we generate an atlas of CRMs showing predicted allelic variation in TF binding affinity in human lymphoblastoid cell lines and test their association with the expression of their putative target genes inferred from Promoter Capture Hi-C and immediate linear proximity. We reveal >1300 CRM TF-binding variants associated with target gene expression, the majority of them undetected with standard association testing. A large proportion of CRMs showing associations with the expression of genes they contact in 3D localize to the promoter regions of other genes, supporting the notion of 'epromoters': dual-action CRMs with promoter and distal enhancer activity.
Affiliation(s)
- Joanna Mitchelmore
- Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- Nastasiya F Grinberg
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
- Chris Wallace
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
- MRC Biostatistics Unit, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0SR, UK
- Mikhail Spivakov
- Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College, Du Cane Road, London W12 0NN, UK
10. Bansal P, Kondaveeti Y, Pinter SF. Forged by DXZ4, FIRRE, and ICCE: How Tandem Repeats Shape the Active and Inactive X Chromosome. Front Cell Dev Biol 2020; 7:328. [PMID: 32076600 PMCID: PMC6985041 DOI: 10.3389/fcell.2019.00328] [Received: 09/20/2019] [Accepted: 11/26/2019]
Abstract
Recent efforts in mapping spatial genome organization have revealed three evocative and conserved structural features of the inactive X in female mammals. First, the chromosomal conformation of the inactive X reveals a loss of topologically associated domains (TADs) present on the active X. Second, the macrosatellite DXZ4 emerges as a singular boundary that suppresses physical interactions between two large TAD-depleted "megadomains." Third, DXZ4 reaches across several megabases to form "superloops" with two other X-linked tandem repeats, FIRRE and ICCE, which also loop to each other. Although all three structural features are conserved across rodents and primates, deletion of mouse and human orthologs of DXZ4 and FIRRE from the inactive X have revealed limited impact on X chromosome inactivation (XCI) and escape in vitro. In contrast, loss of Xist or SMCHD1 have been shown to impair TAD erasure and gene silencing on the inactive X. In this perspective, we summarize these results in the context of new research describing disruption of X-linked tandem repeats in vivo, and discuss their possible molecular roles through the lens of evolutionary conservation and clinical genetics. As a null hypothesis, we consider whether the conservation of some structural features on the inactive X may reflect selection for X-linked tandem repeats on account of necessary cis- and trans-regulatory roles they may play on the active X, rather than the inactive X. Additional hypotheses invoking a role for X-linked tandem repeats on X reactivation, for example in the germline or totipotency, remain to be assessed in multiple developmental models spanning mammalian evolution.
Affiliation(s)
- Prakhar Bansal
  - Department of Genetics and Genome Sciences, School of Medicine, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States
- Yuvabharath Kondaveeti
  - Department of Genetics and Genome Sciences, School of Medicine, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States
- Stefan F. Pinter
  - Department of Genetics and Genome Sciences, School of Medicine, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States
|
11
|
Gorkin DU, Qiu Y, Hu M, Fletez-Brant K, Liu T, Schmitt AD, Noor A, Chiou J, Gaulton KJ, Sebat J, Li Y, Hansen KD, Ren B. Common DNA sequence variation influences 3-dimensional conformation of the human genome. Genome Biol 2019; 20:255. [PMID: 31779666 PMCID: PMC6883528 DOI: 10.1186/s13059-019-1855-4] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 04/04/2019] [Accepted: 10/10/2019] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND The 3-dimensional (3D) conformation of chromatin inside the nucleus is integral to a variety of nuclear processes including transcriptional regulation, DNA replication, and DNA damage repair. Aberrations in 3D chromatin conformation have been implicated in developmental abnormalities and cancer. Despite the importance of 3D chromatin conformation to cellular function and human health, little is known about how 3D chromatin conformation varies in the human population, or whether DNA sequence variation between individuals influences 3D chromatin conformation. RESULTS To address these questions, we perform Hi-C on lymphoblastoid cell lines from 20 individuals. We identify thousands of regions across the genome where 3D chromatin conformation varies between individuals and find that this variation is often accompanied by variation in gene expression, histone modifications, and transcription factor binding. Moreover, we find that DNA sequence variation influences several features of 3D chromatin conformation including loop strength, contact insulation, contact directionality, and density of local cis contacts. We map hundreds of quantitative trait loci associated with 3D chromatin features and find evidence that some of these same variants are associated at modest levels with other molecular phenotypes as well as complex disease risk. CONCLUSION Our results demonstrate that common DNA sequence variants can influence 3D chromatin conformation, pointing to a more pervasive role for 3D chromatin conformation in human phenotypic variation than previously recognized.
Affiliation(s)
- David U Gorkin
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
- Yunjiang Qiu
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- Ming Hu
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
- Kipper Fletez-Brant
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Tristin Liu
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Anthony D Schmitt
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
  - Current address: Arima Genomics, San Diego, CA, 92121, USA
- Amina Noor
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
- Joshua Chiou
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
  - Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, USA
- Kyle J Gaulton
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
- Jonathan Sebat
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
  - Department of Psychiatry, University of California, San Diego, La Jolla, CA, 92093, USA
- Yun Li
  - Department of Genetics, Department of Biostatistics, and Department of Computer Science, University of North Carolina, Chapel Hill, Chapel Hill, NC, USA
- Kasper D Hansen
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Bing Ren
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
  - Institute of Genomic Medicine and Moores Cancer Center, University of California San Diego, La Jolla, CA, USA
|
12
|
Posynick BJ, Brown CJ. Escape From X-Chromosome Inactivation: An Evolutionary Perspective. Front Cell Dev Biol 2019; 7:241. [PMID: 31696116 PMCID: PMC6817483 DOI: 10.3389/fcell.2019.00241] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 06/28/2019] [Accepted: 10/02/2019] [Indexed: 12/14/2022] Open
Abstract
Sex chromosomes originate as a pair of homologous autosomes that then follow a general pattern of divergence. This is evident in mammalian sex chromosomes, which have undergone stepwise recombination suppression events that left footprints of evolutionary strata on the X chromosome. The loss of genes on the Y chromosome led to Ohno’s hypothesis of dosage equivalence between XY males and XX females, which is achieved through X-chromosome inactivation (XCI). This process transcriptionally silences all but one X chromosome in each female cell, although 15–30% of human X-linked genes still escape inactivation. There are multiple evolutionary pathways that may lead to a gene escaping XCI, including remaining Y chromosome homology, or female advantage to escape. The conservation of some escape genes across multiple species and the ability of the mouse inactive X to recapitulate human escape status both suggest that escape from XCI is controlled by conserved processes. Evolutionary pressures to minimize dosage imbalances have led to the accumulation of genetic elements that favor either silencing or escape; lack of dosage sensitivity might also allow for the escape of flanking genes near another escapee, if a boundary element is not present between them. Delineation of the elements involved in escape is progressing, but mechanistic understanding of how they interact to allow escape from XCI is still lacking. Although increasingly well-studied in humans and mice, non-trivial challenges to studying escape have impeded progress in other species. Mouse models that can dissect the role of the sex chromosomes distinct from sex of the organism reveal an important contribution for escape genes to multiple diseases. In humans, with their elevated number of escape genes, the phenotypic consequences of sex chromosome aneuploidies and sexual dimorphism in disease both highlight the importance of escape genes.
Affiliation(s)
- Bronwyn J Posynick
  - Department of Medical Genetics, Molecular Epigenetics Group, Life Sciences Institute, The University of British Columbia, Vancouver, BC, Canada
- Carolyn J Brown
  - Department of Medical Genetics, Molecular Epigenetics Group, Life Sciences Institute, The University of British Columbia, Vancouver, BC, Canada
|
13
|
Balaton BP, Dixon-McDougall T, Peeters SB, Brown CJ. The eXceptional nature of the X chromosome. Hum Mol Genet 2019; 27:R242-R249. [PMID: 29701779 DOI: 10.1093/hmg/ddy148] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 04/18/2018] [Accepted: 04/20/2018] [Indexed: 12/16/2022] Open
Abstract
The X chromosome is unique in the genome. In this review we discuss recent advances in our understanding of the genetics and epigenetics of the X chromosome. The X chromosome shares limited conservation with its ancestral homologue the Y chromosome and the resulting difference in X-chromosome dosage between males and females is largely compensated for by X-chromosome inactivation. The process of inactivation is initiated by the long non-coding RNA X-inactive specific transcript (XIST) and achieved through interaction with multiple synergistic silencing pathways. Identification of Xist-interacting proteins has given insight into these processes yet the cascade of events from initiation to maintenance has still to be resolved. In particular, the initiation of inactivation in humans has been challenging to study as: it occurs very early in development; most human embryonic stem cell lines already have an inactive X; and the process seems to differ from mouse. Another difference between human and mouse X inactivation is the larger number of human genes that escape silencing. In humans over 20% of X-linked genes continue to be expressed from the otherwise inactive X chromosome. We are only beginning to understand how such escape occurs but there is growing recognition that escapees contribute to sexually dimorphic traits. The unique biology and epigenetics of the X chromosome have often led to its exclusion from disease studies, yet the X constitutes 5% of the genome and is an important contributor to disease, often in a sex-specific manner.
Affiliation(s)
- Bradley P Balaton
  - Molecular Epigenetics Group, Department of Medical Genetics, Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Thomas Dixon-McDougall
  - Molecular Epigenetics Group, Department of Medical Genetics, Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Samantha B Peeters
  - Molecular Epigenetics Group, Department of Medical Genetics, Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Carolyn J Brown
  - Molecular Epigenetics Group, Department of Medical Genetics, Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
|
14
|
Tehranchi A, Hie B, Dacre M, Kaplow I, Pettie K, Combs P, Fraser HB. Fine-mapping cis-regulatory variants in diverse human populations. eLife 2019; 8:39595. [PMID: 30650056 PMCID: PMC6335058 DOI: 10.7554/elife.39595] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 06/26/2018] [Accepted: 12/30/2018] [Indexed: 12/19/2022] Open
Abstract
Genome-wide association studies (GWAS) are a powerful approach for connecting genotype to phenotype. Most GWAS hits are located in cis-regulatory regions, but the underlying causal variants and their molecular mechanisms remain unknown. To better understand human cis-regulatory variation, we mapped quantitative trait loci for chromatin accessibility (caQTLs)—a key step in cis-regulation—in 1000 individuals from 10 diverse populations. Most caQTLs were shared across populations, allowing us to leverage the genetic diversity to fine-map candidate causal regulatory variants, several thousand of which have been previously implicated in GWAS. In addition, many caQTLs that affect the expression of distal genes also alter the landscape of long-range chromosomal interactions, suggesting a mechanism for long-range expression QTLs. In sum, our results show that molecular QTL mapping integrated across diverse populations provides a high-resolution view of how worldwide human genetic variation affects chromatin accessibility, gene expression, and phenotype. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that minor issues remain unresolved (see decision letter).
Affiliation(s)
- Ashley Tehranchi
  - Department of Biology, Stanford University, Stanford, United States
- Brian Hie
  - Department of Computer Science, Stanford University, Stanford, United States
- Michael Dacre
  - Department of Biology, Stanford University, Stanford, United States
- Irene Kaplow
  - Department of Computer Science, Stanford University, Stanford, United States
- Kade Pettie
  - Department of Biology, Stanford University, Stanford, United States
- Peter Combs
  - Department of Biology, Stanford University, Stanford, United States
- Hunter B Fraser
  - Department of Biology, Stanford University, Stanford, United States
|
15
|
High-resolution genetic mapping of putative causal interactions between regions of open chromatin. Nat Genet 2018; 51:128-137. [PMID: 30478436 PMCID: PMC6330062 DOI: 10.1038/s41588-018-0278-6] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 11/30/2017] [Accepted: 10/15/2018] [Indexed: 01/04/2023]
Abstract
Physical interaction of regulatory elements in three-dimensional space poses a challenge for studies of disease because non-coding risk variants may be great distances from the genes they regulate. Experimental methods to capture these interactions, such as chromosome conformation capture, usually cannot assign causal direction of effect between regulatory elements, an important component of fine-mapping studies. We developed a Bayesian hierarchical approach that uses two-stage least squares and applied it to an ATAC-seq (assay for transposase-accessible chromatin using sequencing) data set from 100 individuals, to identify over 15,000 high-confidence causal interactions. Most (60%) interactions occurred over <20 kb, where chromosome conformation capture-based methods perform poorly. For a fraction of loci, we identified a single variant that alters accessibility across multiple regions, and experimentally validated the BLK locus, which is associated with multiple autoimmune diseases, using CRISPR genome editing. Our study highlights how association genetics of chromatin state is a powerful approach for identifying interactions between regulatory elements.
|
16
|
Guo Y, Perez AA, Hazelett DJ, Coetzee GA, Rhie SK, Farnham PJ. CRISPR-mediated deletion of prostate cancer risk-associated CTCF loop anchors identifies repressive chromatin loops. Genome Biol 2018; 19:160. [PMID: 30296942 PMCID: PMC6176514 DOI: 10.1186/s13059-018-1531-0] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 05/17/2018] [Accepted: 09/09/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Recent genome-wide association studies (GWAS) have identified more than 100 loci associated with increased risk of prostate cancer, most of which are in non-coding regions of the genome. Understanding the function of these non-coding risk loci is critical to elucidate the genetic susceptibility to prostate cancer. RESULTS We generate genome-wide regulatory element maps and perform genome-wide chromosome conformation capture assays (in situ Hi-C) in normal and tumorigenic prostate cells. Using this information, we annotate the regulatory potential of 2,181 fine-mapped prostate cancer risk-associated SNPs and predict a set of target genes that are regulated by prostate cancer risk-related H3K27Ac-mediated loops. We next identify prostate cancer risk-associated CTCF sites involved in long-range chromatin loops. We use CRISPR-mediated deletion to remove prostate cancer risk-associated CTCF anchor regions and the CTCF anchor regions looped to the prostate cancer risk-associated CTCF sites, and we observe up to 100-fold increases in expression of genes within the loops when the prostate cancer risk-associated CTCF anchor regions are deleted. CONCLUSIONS We identify GWAS risk loci involved in long-range loops that function to repress gene expression within chromatin loops. Our studies provide new insights into the genetic susceptibility to prostate cancer.
Affiliation(s)
- Yu Guo
  - Department of Biochemistry and Molecular Medicine and the Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, 1450 Biggy Street, NRT 6503, Los Angeles, CA 90089-9601 USA
- Andrew A. Perez
  - Department of Biochemistry and Molecular Medicine and the Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, 1450 Biggy Street, NRT 6503, Los Angeles, CA 90089-9601 USA
- Dennis J. Hazelett
  - Department of Biomedical Sciences and the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048 USA
- Suhn Kyong Rhie
  - Department of Biochemistry and Molecular Medicine and the Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, 1450 Biggy Street, NRT 6503, Los Angeles, CA 90089-9601 USA
- Peggy J. Farnham
  - Department of Biochemistry and Molecular Medicine and the Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, 1450 Biggy Street, NRT 6503, Los Angeles, CA 90089-9601 USA
  - Department of Biochemistry and Molecular Medicine and the Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, 1450 Biggy Street, NRT G511B, Los Angeles, CA 90089-9601 USA
|
17
|
Suzuki M, Liao W, Wos F, Johnston AD, DeGrazia J, Ishii J, Bloom T, Zody MC, Germer S, Greally JM. Whole-genome bisulfite sequencing with improved accuracy and cost. Genome Res 2018; 28:1364-1371. [PMID: 30093547 PMCID: PMC6120621 DOI: 10.1101/gr.232587.117] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 11/18/2017] [Accepted: 07/14/2018] [Indexed: 12/17/2022]
Abstract
DNA methylation patterns in the genome both reflect and help to mediate transcriptional regulatory processes. The digital nature of DNA methylation, present or absent on each allele, makes this assay capable of quantifying events in subpopulations of cells, whereas genome-wide chromatin studies lack the same quantitative capacity. Testing DNA methylation throughout the genome is possible using whole-genome bisulfite sequencing (WGBS), but the high costs associated with the assay have made it impractical for studies involving more than limited numbers of samples. We have optimized a new transposase-based library preparation assay for the Illumina HiSeq X platform suitable for limited amounts of DNA and providing a major cost reduction for WGBS. By incorporating methylated cytosines during fragment end repair, we reveal an end-repair artifact affecting 1%-2% of reads that we can remove analytically. We show that the use of a high (G + C) content spike-in performs better than PhiX in terms of bisulfite sequencing quality. As expected, the loci with transposase-accessible chromatin are DNA hypomethylated and enriched in flanking regions by post-translational modifications of histones usually associated with positive effects on gene expression. Using these transposase-accessible loci to represent the cis-regulatory loci in the genome, we compared the representation of these loci between WGBS and other genome-wide DNA methylation assays, showing WGBS to outperform substantially all of the alternatives. We conclude that it is now technologically and financially feasible to perform WGBS in larger numbers of samples with greater accuracy than previously possible.
Affiliation(s)
- Masako Suzuki
  - Center for Epigenomics and Department of Genetics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
- Will Liao
  - New York Genome Center, New York, New York 10013, USA
- Frank Wos
  - New York Genome Center, New York, New York 10013, USA
- Andrew D Johnston
  - Center for Epigenomics and Department of Genetics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
- Toby Bloom
  - New York Genome Center, New York, New York 10013, USA
- Soren Germer
  - New York Genome Center, New York, New York 10013, USA
- John M Greally
  - Center for Epigenomics and Department of Genetics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
|
18
|
Gallagher MD, Chen-Plotkin AS. The Post-GWAS Era: From Association to Function. Am J Hum Genet 2018; 102:717-730. [PMID: 29727686 DOI: 10.1016/j.ajhg.2018.04.002] [Citation(s) in RCA: 495] [Impact Index Per Article: 82.5] [Received: 12/06/2017] [Accepted: 04/04/2018] [Indexed: 12/13/2022] Open
Abstract
During the past 12 years, genome-wide association studies (GWASs) have uncovered thousands of genetic variants that influence risk for complex human traits and diseases. Yet functional studies aimed at delineating the causal genetic variants and biological mechanisms underlying the observed statistical associations with disease risk have lagged. In this review, we highlight key advances in the field of functional genomics that may facilitate the derivation of biological meaning post-GWAS. We highlight the evidence suggesting that causal variants underlying disease risk often function through regulatory effects on the expression of target genes and that these expression effects might be modest and cell-type specific. We moreover discuss specific studies as proof-of-principle examples for current statistical, bioinformatic, and empirical bench-based approaches to downstream elucidation of GWAS-identified disease risk loci.
|
19
|
Abstract
Transcription is regulated by transcription factor (TF) binding at promoters and distal regulatory elements and histone modifications that control the accessibility of these elements. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) has become the standard assay for identifying genome-wide protein-DNA interactions in vitro and in vivo. As large-scale ChIP-seq data sets have been collected for different TFs and histone modifications, their potential to predict gene expression can be used to test hypotheses about the mechanisms of gene regulation. In addition, complementary functional genomics assays provide a global view of chromatin accessibility and long-range cis-regulatory interactions that are being combined with TF binding and histone remodeling to study the regulation of gene expression. Thus, ChIP-seq analysis is now widely integrated with other functional genomics assays to better understand gene regulatory mechanisms. In this review, we discuss advances and challenges in integrating ChIP-seq data to identify context-specific chromatin states associated with gene activity. We describe the overall computational design of integrating ChIP-seq data with other functional genomics assays. We also discuss the challenges of extending these methods to low-input ChIP-seq assays and related single-cell assays.
Affiliation(s)
- Ali Mortazavi
  - Corresponding author: Ali Mortazavi, Department of Developmental and Cell Biology, 2300 Biological Sciences 3, University of California, Irvine, CA 92697, USA. Tel: (949)824-6762; E-mail:
|
20
|
Bell CG, Gao F, Yuan W, Roos L, Acton RJ, Xia Y, Bell J, Ward K, Mangino M, Hysi PG, Wang J, Spector TD. Obligatory and facilitative allelic variation in the DNA methylome within common disease-associated loci. Nat Commun 2018; 9:8. [PMID: 29295990 PMCID: PMC5750212 DOI: 10.1038/s41467-017-01586-1] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 11/25/2016] [Accepted: 09/29/2017] [Indexed: 12/16/2022] Open
Abstract
Integrating epigenetic data with genome-wide association study (GWAS) results can reveal disease mechanisms. The genome sequence itself also shapes the epigenome, with CpG density and transcription factor binding sites (TFBSs) strongly encoding the DNA methylome. Therefore, genetic polymorphism impacts on the observed epigenome. Furthermore, large genetic variants alter epigenetic signal dosage. Here, we identify DNA methylation variability between GWAS-SNP risk and non-risk haplotypes. In three subsets comprising 3128 MeDIP-seq peripheral-blood DNA methylomes, we find 7173 consistent and functionally enriched Differentially Methylated Regions. 36.8% can be attributed to common non-SNP genetic variants. CpG-SNPs, as well as facilitative TFBS-motifs, are also enriched. Highlighting their functional potential, CpG-SNPs strongly associate with allele-specific DNase-I hypersensitivity sites. Our results demonstrate strong DNA methylation allelic differences driven by obligatory or facilitative genetic effects, with potential direct or regional disease-related repercussions. These allelic variations require disentangling from pure tissue-specific modifications, may influence array studies, and imply underestimated population variability in current reference epigenomes. Genomic polymorphisms affect the epigenome, which in turn influences how epigenome- and genome-wide analysis are interpreted. Here, the authors characterise allelic differences in DNA methylation driven by obligatory or facilitative genetic effects, which may affect disease-related loci.
Affiliation(s)
- Christopher G Bell
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
  - MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, SO16 6YD, UK
  - Epigenomic Medicine, Biological Sciences, Faculty of Environmental and Natural Sciences, University of Southampton, Southampton, SO17 1BJ, UK
  - Human Development and Health Academic Unit, Institute of Developmental Sciences, University of Southampton, Southampton, SO16 6YD, UK
- Fei Gao
  - BGI-Shenzhen, Shenzhen, 518083, China
- Wei Yuan
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
  - Institute of Cancer Research, Sutton, SM2 5NG, UK
- Leonie Roos
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
  - MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- Richard J Acton
  - MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, SO16 6YD, UK
  - Epigenomic Medicine, Biological Sciences, Faculty of Environmental and Natural Sciences, University of Southampton, Southampton, SO17 1BJ, UK
  - Human Development and Health Academic Unit, Institute of Developmental Sciences, University of Southampton, Southampton, SO16 6YD, UK
- Jordana Bell
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
- Kirsten Ward
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
- Massimo Mangino
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
- Pirro G Hysi
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
- Jun Wang
  - MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, SO16 6YD, UK
- Timothy D Spector
  - Department of Twin Research & Genetic Epidemiology, King's College London, London, SE1 7EH, UK
|
21
|
Carrel L, Brown CJ. When the Lyon(ized chromosome) roars: ongoing expression from an inactive X chromosome. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160355. [PMID: 28947654 PMCID: PMC5627157 DOI: 10.1098/rstb.2016.0355] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Accepted: 04/24/2017] [Indexed: 12/21/2022] Open
Abstract
A tribute to Mary Lyon was held in October 2016. Many remarked about Lyon's foresight regarding many intricacies of the X-chromosome inactivation process. One such example is that a year after her original 1961 hypothesis she proposed that genes with Y homologues should escape from X inactivation to achieve dosage compensation between males and females. Fifty-five years later we have learned many details about these escapees that we attempt to summarize in this review, with a particular focus on recent findings. We now know that escapees are not rare, particularly on the human X, and that most lack functionally equivalent Y homologues, leading to their increasingly recognized role in sexually dimorphic traits. Newer sequencing technologies have expanded profiling of primary tissues that will better enable connections to sex-biased disorders as well as provide additional insights into the X-inactivation process. Chromosome organization, nuclear location and chromatin environments distinguish escapees from other X-inactivated genes. Nevertheless, several big questions remain, including what dictates their distinct epigenetic environment, the underlying basis of species differences in escapee regulation, how different classes of escapees are distinguished, and the roles that local sequences and chromosome ultrastructure play in escapee regulation. This article is part of the themed issue 'X-chromosome inactivation: a tribute to Mary Lyon'.
Affiliation(s)
- Laura Carrel
  - Department of Biochemistry and Molecular Biology, Penn State College of Medicine, 500 University Drive, Mail code H171, Hershey, PA 17033, USA
- Carolyn J Brown
  - Department of Medical Genetics, Molecular Epigenetics Group, Life Sciences Institute, University of British Columbia, 2350 Health Sciences Mall, Vancouver, Canada BC V6T 1Z3
|
22
|
Wong ES, Schmitt BM, Kazachenka A, Thybert D, Redmond A, Connor F, Rayner TF, Feig C, Ferguson-Smith AC, Marioni JC, Odom DT, Flicek P. Interplay of cis and trans mechanisms driving transcription factor binding and gene expression evolution. Nat Commun 2017; 8:1092. [PMID: 29061983 PMCID: PMC5653656 DOI: 10.1038/s41467-017-01037-x] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 11/22/2016] [Accepted: 08/09/2017] [Indexed: 12/23/2022] Open
Abstract
Noncoding regulatory variants play a central role in the genetics of human diseases and in evolution. Here we measure allele-specific transcription factor binding occupancy of three liver-specific transcription factors between crosses of two inbred mouse strains to elucidate the regulatory mechanisms underlying transcription factor binding variations in mammals. Our results highlight the pre-eminence of cis-acting variants on transcription factor occupancy divergence. Transcription factor binding differences linked to cis-acting variants generally exhibit additive inheritance, while those linked to trans-acting variants are most often dominantly inherited. Cis-acting variants lead to local coordination of transcription factor occupancies that decay with distance; distal coordination is also observed and may be modulated by long-range chromatin contacts. Our results reveal the regulatory mechanisms that interplay to drive transcription factor occupancy, chromatin state, and gene expression in complex mammalian cell states.
Affiliation(s)
- Emily S Wong
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Bianca M Schmitt
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | | | - David Thybert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Aisling Redmond
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | - Frances Connor
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | - Tim F Rayner
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | - Christine Feig
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | | | - John C Marioni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | - Duncan T Odom
- University of Cambridge, Cancer Research UK-Cambridge Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK.
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
| |
|
23
|
Abstract
Extensive 3D folding is required to package a genome into the tiny nuclear space, and this packaging must be compatible with proper gene expression. Thus, in the well-hierarchized nucleus, chromosomes occupy discrete territories and adopt specific 3D organizational structures that facilitate interactions between regulatory elements for gene expression. The mammalian X chromosome exemplifies this structure-function relationship. Recent studies have shown that, upon X-chromosome inactivation, active and inactive X chromosomes localize to different subnuclear positions and adopt distinct chromosomal architectures that reflect their activity states. Here, we review the roles of long non-coding RNAs, chromosomal organizational structures and the subnuclear localization of chromosomes as they relate to X-linked gene expression.
|
24
|
Ruiz-Velasco M, Zaugg JB. Structure meets function: How chromatin organisation conveys functionality. Curr Opin Syst Biol 2017. [DOI: 10.1016/j.coisb.2017.01.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]
|
25
|
Deplancke B, Alpern D, Gardeux V. The Genetics of Transcription Factor DNA Binding Variation. Cell 2016; 166:538-554. [PMID: 27471964 DOI: 10.1016/j.cell.2016.07.012] [Citation(s) in RCA: 244] [Impact Index Per Article: 30.5] [Received: 02/05/2016] [Indexed: 12/23/2022]
Abstract
Most complex trait-associated variants are located in non-coding regulatory regions of the genome, where they have been shown to disrupt transcription factor (TF)-DNA binding motifs. Variable TF-DNA interactions are therefore increasingly considered as key drivers of phenotypic variation. However, recent genome-wide studies revealed that the majority of variable TF-DNA binding events are not driven by sequence alterations in the motif of the studied TF. This observation implies that the molecular mechanisms underlying TF-DNA binding variation and, by extrapolation, inter-individual phenotypic variation are more complex than originally anticipated. Here, we summarize the findings that led to this important paradigm shift and review proposed mechanisms for local, proximal, or distal genetic variation-driven variable TF-DNA binding. In addition, we discuss the biomedical implications of these findings for our ability to dissect the molecular role(s) of non-coding genetic variants in complex traits, including disease susceptibility.
Affiliation(s)
- Bart Deplancke
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
| | - Daniel Alpern
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Vincent Gardeux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
|
26
|
Chen CY, Shi W, Balaton BP, Matthews AM, Li Y, Arenillas DJ, Mathelier A, Itoh M, Kawaji H, Lassmann T, Hayashizaki Y, Carninci P, Forrest ARR, Brown CJ, Wasserman WW. YY1 binding association with sex-biased transcription revealed through X-linked transcript levels and allelic binding analyses. Sci Rep 2016; 6:37324. [PMID: 27857184 PMCID: PMC5114649 DOI: 10.1038/srep37324] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/04/2016] [Accepted: 10/24/2016] [Indexed: 12/27/2022]
Abstract
Sex differences in susceptibility and progression have been reported in numerous diseases. Female cells have two copies of the X chromosome with X-chromosome inactivation imparting mono-allelic gene silencing for dosage compensation. However, a subset of genes, named escapees, escape silencing and are transcribed bi-allelically resulting in sexual dimorphism. Here we conducted in silico analyses of the sexes using human datasets to gain perspectives into such regulation. We identified transcription start sites of escapees (escTSSs) based on higher transcription levels in female cells using FANTOM5 CAGE data. Significant over-representations of YY1 transcription factor binding motif and ChIP-seq peaks around escTSSs highlighted its positive association with escapees. Furthermore, YY1 occupancy is significantly biased towards the inactive X (Xi) at long non-coding RNA loci that are frequent contacts of Xi-specific superloops. Our study suggests a role for YY1 in transcriptional activity on Xi in general through sequence-specific binding, and its involvement at superloop anchors.
Affiliation(s)
- Chih-Yu Chen
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada.,Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Wenqiang Shi
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada.,Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Bradley P Balaton
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Allison M Matthews
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada
| | - Yifeng Li
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada
| | - David J Arenillas
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada
| | - Anthony Mathelier
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada
| | - Masayoshi Itoh
- RIKEN Omics Science Center, Yokohama, Japan.,RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Japan.,RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama, Japan
| | - Hideya Kawaji
- RIKEN Omics Science Center, Yokohama, Japan.,RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Japan.,RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama, Japan
| | - Timo Lassmann
- RIKEN Omics Science Center, Yokohama, Japan.,RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Japan
| | - Yoshihide Hayashizaki
- RIKEN Omics Science Center, Yokohama, Japan.,RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama, Japan
| | - Piero Carninci
- RIKEN Omics Science Center, Yokohama, Japan.,RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Japan
| | - Alistair R R Forrest
- RIKEN Omics Science Center, Yokohama, Japan.,RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Japan.,Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, the University of Western Australia, Nedlands, Western Australia, Australia
| | - Carolyn J Brown
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia, Canada.,Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| |
|
27
|
Wen X. Molecular QTL discovery incorporating genomic annotations using Bayesian false discovery rate control. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas952] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 11/19/2022]
|
28
|
Shi W, Fornes O, Mathelier A, Wasserman WW. Evaluating the impact of single nucleotide variants on transcription factor binding. Nucleic Acids Res 2016; 44:10106-10116. [PMID: 27492288 PMCID: PMC5137422 DOI: 10.1093/nar/gkw691] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/25/2016] [Revised: 07/25/2016] [Accepted: 07/26/2016] [Indexed: 12/21/2022]
Abstract
Diseases and phenotypes caused by disrupted transcription factor (TF) binding are being identified, but progress is hampered by our limited capacity to predict such functional alterations. Improving predictions may be dependent on expanding the set of bona fide TF binding alterations. Allele-specific binding (ASB) events, where TFs preferentially bind to one of the two alleles at heterozygous sites, reveal the impact of sequence variations in altered TF binding. Here, we present the largest ASB compilation to our knowledge, 10 765 ASB events retrieved from 45 ENCODE ChIP-Seq data sets. Our analysis showed that ASB events were frequently associated with motif alterations of the ChIP'ed TF and potential partner TFs, allelic difference of DNase I hypersensitivity and allelic difference of histone modifications. For TF dimers bound symmetrically to DNA, ASB data revealed that central positions of the TF binding motifs were disproportionately important for binding. Lastly, the impact of variation on TF binding was predicted by a classification model incorporating all the investigated features of ASB events. Classification models using only DNase I hypersensitivity and sequence data exhibited predictive accuracy approaching the models with substantially more features. Taken together, the combination of ASB data and the classification model represents an important step toward elucidating regulatory variants across the human genome.
Affiliation(s)
- Wenqiang Shi
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, Child & Family Research Institute, University of British Columbia, 950 28th Ave W, Vancouver, BC V5Z 4H4, Canada.,Bioinformatics Graduate Program, University of British Columbia, 2329 W Mall, Vancouver, BC V6T 1Z4, Canada
| | - Oriol Fornes
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, Child & Family Research Institute, University of British Columbia, 950 28th Ave W, Vancouver, BC V5Z 4H4, Canada
| | - Anthony Mathelier
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, Child & Family Research Institute, University of British Columbia, 950 28th Ave W, Vancouver, BC V5Z 4H4, Canada.,Centre for Molecular Medicine Norway (NCMM), Nordic EMBL partnership, University of Oslo and Oslo University Hospital, Norway
| | - Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, Child & Family Research Institute, University of British Columbia, 950 28th Ave W, Vancouver, BC V5Z 4H4, Canada
| |
|
29
|
Abstract
Epigenome-wide association studies represent one means of applying genome-wide assays to identify molecular events that could be associated with human phenotypes. The epigenome is especially intriguing as a target for study, as epigenetic regulatory processes are, by definition, heritable from parent to daughter cells and are found to have transcriptional regulatory properties. As such, the epigenome is an attractive candidate for mediating long-term responses to cellular stimuli, such as environmental effects modifying disease risk. Such epigenomic studies represent a broader category of disease -omics, which suffer from multiple problems in design and execution that severely limit their interpretability. Here we define many of the problems with current epigenomic studies and propose solutions that can be applied to allow this and other disease -omics studies to achieve their potential for generating valuable insights.
Affiliation(s)
- Ewan Birney
- European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - George Davey Smith
- University of Bristol, School of Social and Community Medicine, Oakfield House, Oakfield Grove, United Kingdom
| | - John M. Greally
- Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, United States of America
| |
|
30
|
Li YI, van de Geijn B, Raj A, Knowles DA, Petti AA, Golan D, Gilad Y, Pritchard JK. RNA splicing is a primary link between genetic variation and disease. Science 2016; 352:600-4. [PMID: 27126046 PMCID: PMC5182069 DOI: 10.1126/science.aad9417] [Citation(s) in RCA: 411] [Impact Index Per Article: 51.4] [Received: 11/24/2015] [Accepted: 03/25/2016] [Indexed: 12/14/2022]
Abstract
Noncoding variants play a central role in the genetics of complex traits, but we still lack a full understanding of the molecular pathways through which they act. We quantified the contribution of cis-acting genetic effects at all major stages of gene regulation from chromatin to proteins, in Yoruba lymphoblastoid cell lines (LCLs). About 65% of expression quantitative trait loci (eQTLs) have primary effects on chromatin, whereas the remaining eQTLs are enriched in transcribed regions. Using a novel method, we also detected 2893 splicing QTLs, most of which have little or no effect on gene-level expression. These splicing QTLs are major contributors to complex traits, roughly on a par with variants that affect gene expression levels. Our study provides a comprehensive view of the mechanisms linking genetic variation to variation in human gene regulation.
Affiliation(s)
- Yang I Li
- Department of Genetics, Stanford University, Stanford, CA, USA
| | | | - Anil Raj
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - David A Knowles
- Department of Computer Science, Stanford University, Stanford, CA, USA. Department of Radiology, Stanford University, Stanford, CA, USA
| | - Allegra A Petti
- Genome Institute, Washington University in St. Louis, St. Louis, MO, USA
| | - David Golan
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA. Department of Biology, Stanford University, Stanford, CA, USA. Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA.
| |
|
31
|
Tehranchi AK, Myrthil M, Martin T, Hie BL, Golan D, Fraser HB. Pooled ChIP-Seq Links Variation in Transcription Factor Binding to Complex Disease Risk. Cell 2016; 165:730-41. [PMID: 27087447 DOI: 10.1016/j.cell.2016.03.041] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 10/09/2015] [Revised: 12/30/2015] [Accepted: 03/23/2016] [Indexed: 01/08/2023]
Abstract
Cis-regulatory elements such as transcription factor (TF) binding sites can be identified genome-wide, but it remains far more challenging to pinpoint genetic variants affecting TF binding. Here, we introduce a pooling-based approach to mapping quantitative trait loci (QTLs) for molecular-level traits. Applying this to five TFs and a histone modification, we mapped thousands of cis-acting QTLs, with over 25-fold lower cost compared to standard QTL mapping. We found that single genetic variants frequently affect binding of multiple TFs, and CTCF can recruit all five TFs to its binding sites. These QTLs often affect local chromatin and transcription but can also influence long-range chromosomal contacts, demonstrating a role for natural genetic variation in chromosomal architecture. Thousands of these QTLs have been implicated in genome-wide association studies, providing candidate molecular mechanisms for many disease risk loci and suggesting that TF binding variation may underlie a large fraction of human phenotypic variation.
Affiliation(s)
| | - Marsha Myrthil
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Trevor Martin
- Department of Biology, Stanford University, Stanford, CA 94305, USA
| | - Brian L Hie
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
| | - David Golan
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Statistics, Stanford University, Stanford, CA 94305, USA
| | - Hunter B Fraser
- Department of Biology, Stanford University, Stanford, CA 94305, USA.
| |
|
32
|
Moyerbrailean GA, Kalita CA, Harvey CT, Wen X, Luca F, Pique-Regi R. Which Genetics Variants in DNase-Seq Footprints Are More Likely to Alter Binding? PLoS Genet 2016; 12:e1005875. [PMID: 26901046 PMCID: PMC4764260 DOI: 10.1371/journal.pgen.1005875] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 07/22/2015] [Accepted: 01/26/2016] [Indexed: 01/08/2023]
Abstract
Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs. Each annotation we derived is specific for a cell-type condition or assay and is locally motif-driven. We found 5.8 million genetic variants in footprints, 66% of which are predicted by our model to affect TF binding. Comprehensive examination using allele-specific hypersensitivity (ASH) reveals that only the latter group consistently shows evidence for ASH (3,217 SNPs at 20% FDR), suggesting that most (97%) genetic variants in footprinted regulatory regions are indeed silent. Combining this information with GWAS data reveals that our annotation helps in computationally fine-mapping 86 SNPs in GWAS hit regions with at least a 2-fold increase in the posterior odds of picking the causal SNP. The rich meta information provided by the tissue-specificity and the identity of the putative TF binding site being affected also helps in identifying the underlying mechanism supporting the association. As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation.
Affiliation(s)
- Gregory A. Moyerbrailean
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
| | - Cynthia A. Kalita
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
| | - Chris T. Harvey
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
| | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, Michigan, United States of America
| | - Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, Michigan, United States of America
| |
|
33
|
Tak YG, Farnham PJ. Making sense of GWAS: using epigenomics and genome engineering to understand the functional relevance of SNPs in non-coding regions of the human genome. Epigenetics Chromatin 2015; 8:57. [PMID: 26719772 PMCID: PMC4696349 DOI: 10.1186/s13072-015-0050-4] [Citation(s) in RCA: 206] [Impact Index Per Article: 22.9] [Received: 11/02/2015] [Accepted: 12/09/2015] [Indexed: 12/13/2022]
Abstract
Considerable progress towards an understanding of complex diseases has been made in recent years due to the development of high-throughput genotyping technologies. Using microarrays that contain millions of single-nucleotide polymorphisms (SNPs), Genome Wide Association Studies (GWASs) have identified SNPs that are associated with many complex diseases or traits. For example, as of February 2015, 2111 association studies have identified 15,396 SNPs for various diseases and traits, with the number of identified SNP-disease/trait associations increasing rapidly in recent years. However, it has been difficult for researchers to understand disease risk from GWAS results. This is because most GWAS-identified SNPs are located in non-coding regions of the genome. It is important to consider that the GWAS-identified SNPs serve only as representatives for all SNPs in the same haplotype block, and it is equally likely that other SNPs in high linkage disequilibrium (LD) with the array-identified SNPs are causal for the disease. Because it was hoped that disease-associated coding variants would be identified if the true causal SNPs were known, investigators have expanded their analyses using LD calculation and fine-mapping. However, such analyses also identified risk-associated SNPs located in non-coding regions. Thus, the GWAS field has been left with the conundrum as to how a single-nucleotide change in a non-coding region could confer increased risk for a specific disease. One possible answer to this puzzle is that the variant SNPs cause changes in gene expression levels rather than causing changes in protein function.
This review provides a description of (1) advances in genomic and epigenomic approaches that incorporate functional annotation of regulatory elements to prioritize the disease risk-associated SNPs that are located in non-coding regions of the genome for follow-up studies, (2) various computational tools that aid in identifying gene expression changes caused by the non-coding disease-associated SNPs, and (3) experimental approaches to identify target genes of, and study the biological phenotypes conferred by, non-coding disease-associated SNPs.
Affiliation(s)
- Yu Gyoung Tak
- Department of Biochemistry and Molecular Biology, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089 USA
| | - Peggy J Farnham
- Department of Biochemistry and Molecular Biology, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089 USA
| |
|
34
|
Kumasaka N, Knights AJ, Gaffney DJ. Fine-mapping cellular QTLs with RASQUAL and ATAC-seq. Nat Genet 2015; 48:206-13. [PMID: 26656845 PMCID: PMC5098600 DOI: 10.1038/ng.3467] [Citation(s) in RCA: 149] [Impact Index Per Article: 16.6] [Received: 04/24/2015] [Accepted: 11/13/2015] [Indexed: 12/13/2022]
Abstract
When cellular traits are measured using high-throughput DNA sequencing, quantitative trait loci (QTLs) manifest as fragment count differences between individuals and allelic differences within individuals. We present RASQUAL (Robust Allele-Specific Quantitation and Quality Control), a new statistical approach for association mapping that models genetic effects and accounts for biases in sequencing data using a single, probabilistic framework. RASQUAL substantially improves fine-mapping accuracy and sensitivity relative to existing methods in RNA-seq, DNase-seq and ChIP-seq data. We illustrate how RASQUAL can be used to maximize association detection by generating the first map of chromatin accessibility QTLs (caQTLs) in a European population using ATAC-seq. Despite a modest sample size, we identified 2,707 independent caQTLs (at a false discovery rate of 10%) and demonstrated how RASQUAL and ATAC-seq can provide powerful information for fine-mapping gene-regulatory variants and for linking distal regulatory elements with gene promoters. Our results highlight how combining between-individual and allele-specific genetic signals improves the functional interpretation of noncoding variation.
Affiliation(s)
| | - Andrew J Knights
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Daniel J Gaffney
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| |
|
35
|
Maurano MT, Wang H, John S, Shafer A, Canfield T, Lee K, Stamatoyannopoulos JA. Role of DNA Methylation in Modulating Transcription Factor Occupancy. Cell Rep 2015; 12:1184-95. [PMID: 26257180 DOI: 10.1016/j.celrep.2015.07.024] [Citation(s) in RCA: 195] [Impact Index Per Article: 21.7] [Received: 11/03/2014] [Revised: 06/14/2015] [Accepted: 07/10/2015] [Indexed: 02/07/2023]
Abstract
Although DNA methylation is commonly invoked as a mechanism for transcriptional repression, the extent to which it actively silences transcription factor (TF) occupancy sites in vivo is unknown. To study the role of DNA methylation in the active modulation of TF binding, we quantified the effect of DNA methylation depletion on the genomic occupancy patterns of CTCF, an abundant TF with known methylation sensitivity that is capable of autonomous binding to its target sites in chromatin. Here, we show that the vast majority (>98.5%) of the tens of thousands of unoccupied, methylated CTCF recognition sequences remain unbound upon abrogation of DNA methylation. The small fraction of sites that show methylation-dependent binding in vivo are in turn characterized by highly variable CTCF occupancy across cell types. Our results suggest that DNA methylation is not a primary groundskeeper of genomic TF landscapes, but rather a specialized mechanism for stabilizing intrinsically labile sites.
Affiliation(s)
- Matthew T Maurano
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
| | - Hao Wang
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Sam John
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Anthony Shafer
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Theresa Canfield
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Kristen Lee
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - John A Stamatoyannopoulos
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Division of Oncology, Department of Medicine, University of Washington, Seattle, WA 98195, USA.
| |
|
36
|
Kaplow IM, MacIsaac JL, Mah SM, McEwen LM, Kobor MS, Fraser HB. A pooling-based approach to mapping genetic variants associated with DNA methylation. Genome Res 2015; 25:907-17. [PMID: 25910490 PMCID: PMC4448686 DOI: 10.1101/gr.183749.114] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 08/29/2014] [Accepted: 04/17/2015] [Indexed: 12/23/2022]
Abstract
DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
Affiliation(s)
- Irene M Kaplow
- Department of Computer Science, Stanford University, Stanford, California 94305, USA; Department of Biology, Stanford University, Stanford, California 94305, USA
- Julia L MacIsaac
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada
- Sarah M Mah
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada
- Lisa M McEwen
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada
- Michael S Kobor
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada
- Hunter B Fraser
- Department of Biology, Stanford University, Stanford, California 94305, USA
|
37
|
Berletch JB, Ma W, Yang F, Shendure J, Noble WS, Disteche CM, Deng X. Escape from X inactivation varies in mouse tissues. PLoS Genet 2015; 11:e1005079. [PMID: 25785854 PMCID: PMC4364777 DOI: 10.1371/journal.pgen.1005079] [Citation(s) in RCA: 185] [Impact Index Per Article: 20.6] [Received: 09/02/2014] [Accepted: 02/17/2015] [Indexed: 12/22/2022]
Abstract
X chromosome inactivation (XCI) silences most genes on one X chromosome in female mammals, but some genes escape XCI. To identify escape genes in vivo and to explore molecular mechanisms that regulate this process we analyzed the allele-specific expression and chromatin structure of X-linked genes in mouse tissues and cells with skewed XCI and distinguishable alleles based on single nucleotide polymorphisms. Using a binomial model to assess allelic expression, we demonstrate a continuum between complete silencing and expression from the inactive X (Xi). The validity of the RNA-seq approach was verified using RT-PCR with species-specific primers or Sanger sequencing. Both common escape genes and genes with significant differences in XCI status between tissues were identified. Such genes may be candidates for tissue-specific sex differences. Overall, few genes (3-7%) escape XCI in any of the mouse tissues examined, suggesting stringent silencing and escape controls. In contrast, an in vitro system represented by the embryonic-kidney-derived Patski cell line showed a higher density of escape genes (21%), representing both kidney-specific escape genes and cell-line specific escape genes. Allele-specific RNA polymerase II occupancy and DNase I hypersensitivity at the promoter of genes on the Xi correlated well with levels of escape, consistent with an open chromatin structure at escape genes. Allele-specific CTCF binding on the Xi clustered at escape genes and was denser in brain compared to the Patski cell line, possibly contributing to a more compartmentalized structure of the Xi and fewer escape genes in brain compared to the cell line where larger domains of escape were observed.
Affiliation(s)
- Joel B. Berletch
- Department of Pathology, University of Washington, Seattle, Washington, United States of America
- Wenxiu Ma
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Fan Yang
- Department of Pathology, University of Washington, Seattle, Washington, United States of America
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- William S. Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Christine M. Disteche
- Department of Pathology, University of Washington, Seattle, Washington, United States of America
- Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Xinxian Deng
- Department of Pathology, University of Washington, Seattle, Washington, United States of America
|
38
|
Albert FW, Kruglyak L. The role of regulatory variation in complex traits and disease. Nat Rev Genet 2015; 16:197-212. [DOI: 10.1038/nrg3891] [Citation(s) in RCA: 684] [Impact Index Per Article: 76.0] [Indexed: 02/07/2023]
|