1
Cai YM, Lu ZQ, Li B, Huang JY, Zhang M, Chen C, Fan LY, Ma QY, He CY, Chen SN, Jiang Y, Li YM, Ning CB, Zhang FW, Wang WZ, Liu YZ, Zhang H, Jin M, Wang XY, Han JX, Xiong Z, Cai M, Huang CQ, Yang XJ, Zhu X, Zhu Y, Miao XP, Zhang SK, Wei YC, Tian JB. Genome-wide enhancer RNA profiling adds molecular links between genetic variation and human cancers. Mil Med Res 2024; 11:36. [PMID: 38863031 PMCID: PMC11165858 DOI: 10.1186/s40779-024-00539-2] [Received: 07/21/2023] [Accepted: 05/17/2024] [Indexed: 06/13/2024] [Open access]
Abstract
BACKGROUND Dysregulation of enhancer transcription occurs in multiple cancers. Enhancer RNAs (eRNAs) are transcribed products from enhancers that play critical roles in transcriptional control. Characterizing the genetic basis of eRNA expression may elucidate the molecular mechanisms underlying cancers. METHODS Initially, a comprehensive analysis of eRNA quantitative trait loci (eRNAQTLs) was performed in The Cancer Genome Atlas (TCGA), and functional features were characterized using multi-omics data. To establish the first eRNAQTL profiles for colorectal cancer (CRC) in China, epigenomic data were used to define active enhancers, which were subsequently integrated with transcription and genotyping data from 154 paired CRC samples. Finally, large-scale case-control studies (34,585 cases and 69,544 controls) were conducted along with multipronged experiments to investigate the potential mechanisms by which candidate eRNAQTLs affect CRC risk. RESULTS A total of 300,112 eRNAQTLs were identified across 30 different cancer types, which exert their influence on eRNA transcription by modulating chromatin status, binding affinity to transcription factors and RNA-binding proteins. These eRNAQTLs were found to be significantly enriched in cancer risk loci, explaining a substantial proportion of cancer heritability. Additionally, tumor-specific eRNAQTLs exhibited high responsiveness to the development of cancer. Moreover, the target genes of these eRNAs were associated with dysregulated signaling pathways and immune cell infiltration in cancer, highlighting their potential as therapeutic targets. Furthermore, multiple ethnic population studies have confirmed that an eRNAQTL rs3094296-T variant decreases the risk of CRC in populations from China (OR = 0.91, 95% CI 0.88-0.95, P = 2.92 × 10⁻⁷) and Europe (OR = 0.92, 95% CI 0.88-0.95, P = 4.61 × 10⁻⁶).
Mechanistically, rs3094296 had an allele-specific effect on the transcription of the eRNA ENSR00000155786, which functioned as a transcriptional activator promoting the expression of its target gene SENP7. These two genes synergistically suppressed tumor cell proliferation. Our curated list of variants, genes, and drugs has been made available in CancereRNAQTL ( http://canernaqtl.whu.edu.cn/#/ ) to serve as an informative resource for advancing this field. CONCLUSION Our findings underscore the significance of eRNAQTLs in transcriptional regulation and disease heritability, pinpointing the potential of eRNA-based therapeutic strategies in cancers.
Affiliation(s)
- Yi-Min Cai
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
- Ze-Qun Lu
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
- Bin Li
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
- Jin-Yu Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Ming Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Can Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Lin-Yun Fan
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Qian-Ying Ma
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Chun-Yi He
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Shuo-Ni Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Yuan Jiang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Yan-Min Li
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Cai-Bo Ning
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Fu-Wei Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Wen-Zhuo Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Yi-Zhuo Liu
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Heng Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Meng Jin
- Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Xiao-Yang Wang
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
- Jin-Xin Han
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Zhen Xiong
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Ming Cai
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Chao-Qun Huang
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Xiao-Jun Yang
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Xu Zhu
- Department of Gastrointestinal Surgery, Renmin Hospital of Wuhan University, Wuhan, 430060, China
- Ying Zhu
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Xiao-Ping Miao
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China.
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China.
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Jiangsu Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, 211166, China.
- Shao-Kai Zhang
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China.
- Yong-Chang Wei
- Department of Gastrointestinal Oncology, Hubei Cancer Clinical Study Center, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China.
- Jian-Bo Tian
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China.
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China.
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China.
2
DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A. Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation. bioRxiv 2024:2024.05.22.595375. [PMID: 38826254 PMCID: PMC11142193 DOI: 10.1101/2024.05.22.595375] [Indexed: 06/04/2024]
Abstract
Background Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of regulatory programs this variation affects can shed light on the apparatuses of human diseases. Results We collected epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we constructed networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks served as the base for a rich series of analyses, through which we demonstrated their temporal dynamics and enrichment for various disease-associated variants. We applied the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrated methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. Conclusions Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes. This includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
Affiliation(s)
- William DeGroat
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
- Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Tal Ashuach
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, 387 Soda Hall, Berkeley, CA 94720, USA
- Nir Yosef
- Department of Systems Immunology, Weizmann Institute of Science, 234 Herzl Street, Rehovot 7610001, Israel
- Chan-Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
- Department of Systems Immunology, Ragon Institute of MGH, MIT, and Harvard Institute of Science, 400 Technology Square, Cambridge, MA 02139, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 513 Parnassus Ave, CA 94143, USA
- Institute for Human Genetics, University of California, San Francisco, 513 Parnassus Ave, CA 94143, USA
- Anat Kreimer
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ 08854, USA
3
Siraj L, Castro RI, Dewey H, Kales S, Nguyen TTL, Kanai M, Berenzy D, Mouri K, Wang QS, McCaw ZR, Gosai SJ, Aguet F, Cui R, Vockley CM, Lareau CA, Okada Y, Gusev A, Jones TR, Lander ES, Sabeti PC, Finucane HK, Reilly SK, Ulirsch JC, Tewhey R. Functional dissection of complex and molecular trait variants at single nucleotide resolution. bioRxiv 2024:2024.05.05.592437. [PMID: 38766054 PMCID: PMC11100724 DOI: 10.1101/2024.05.05.592437] [Indexed: 05/22/2024]
Abstract
Identifying the causal variants and mechanisms that drive complex traits and diseases remains a core problem in human genetics. The majority of these variants have individually weak effects and lie in non-coding gene-regulatory elements where we lack a complete understanding of how single nucleotide alterations modulate transcriptional processes to affect human phenotypes. To address this, we measured the activity of 221,412 trait-associated variants that had been statistically fine-mapped using a Massively Parallel Reporter Assay (MPRA) in 5 diverse cell-types. We show that MPRA is able to discriminate between likely causal variants and controls, identifying 12,025 regulatory variants with high precision. Although the effects of these variants largely agree with orthogonal measures of function, only 69% can plausibly be explained by the disruption of a known transcription factor (TF) binding motif. We dissect the mechanisms of 136 variants using saturation mutagenesis and assign impacted TFs for 91% of variants without a clear canonical mechanism. Finally, we provide evidence that epistasis is prevalent for variants in close proximity and identify multiple functional variants on the same haplotype at a small, but important, subset of trait-associated loci. Overall, our study provides a systematic functional characterization of likely causal common variants underlying complex and molecular human traits, enabling new insights into the regulatory grammar underlying disease risk.
Affiliation(s)
- Layla Siraj
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Program in Biophysics, Harvard Graduate School of Arts and Sciences, Boston, MA, USA
- Harvard-Massachusetts Institute of Technology MD/PhD Program, Harvard Medical School, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
- Qingbo S. Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
- Sager J. Gosai
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- François Aguet
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Ran Cui
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Caleb A. Lareau
- Program in Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
- Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
- Thouis R. Jones
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric S. Lander
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Pardis C. Sabeti
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Hilary K. Finucane
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
- Jacob C. Ulirsch
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Ryan Tewhey
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
4
Lu Z, Wang X, Carr M, Kim A, Gazal S, Mohammadi P, Wu L, Gusev A, Pirruccello J, Kachuri L, Mancuso N. Improved multi-ancestry fine-mapping identifies cis-regulatory variants underlying molecular traits and disease risk. medRxiv 2024:2024.04.15.24305836. [PMID: 38699369 PMCID: PMC11065034 DOI: 10.1101/2024.04.15.24305836] [Indexed: 05/05/2024]
Abstract
Multi-ancestry statistical fine-mapping of cis-molecular quantitative trait loci (cis-molQTL) aims to improve the precision of distinguishing causal cis-molQTLs from tagging variants. However, existing approaches fail to reflect shared genetic architectures. To solve this limitation, we present the Sum of Shared Single Effects (SuShiE) model, which leverages LD heterogeneity to improve fine-mapping precision, infer cross-ancestry effect size correlations, and estimate ancestry-specific expression prediction weights. We apply SuShiE to mRNA expression measured in PBMCs (n=956) and LCLs (n=814) together with plasma protein levels (n=854) from individuals of diverse ancestries in the TOPMed MESA and GENOA studies. We find SuShiE fine-maps cis-molQTLs for 16% more genes compared with baselines while prioritizing fewer variants with greater functional enrichment. SuShiE infers highly consistent cis-molQTL architectures across ancestries on average; however, we also find evidence of heterogeneity at genes with predicted loss-of-function intolerance, suggesting that environmental interactions may partially explain differences in cis-molQTL effect sizes across ancestries. Lastly, we leverage estimated cis-molQTL effect-sizes to perform individual-level TWAS and PWAS on six white blood cell-related traits in AOU Biobank individuals (n=86k), and identify 44 more genes compared with baselines, further highlighting its benefits in identifying genes relevant for complex disease risk. Overall, SuShiE provides new insights into the cis-genetic architecture of molecular traits.
Affiliation(s)
- Zeyun Lu
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Xinran Wang
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Matthew Carr
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Artem Kim
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Steven Gazal
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA
- Pejman Mohammadi
- Center for Immunity and Immunotherapies, Seattle Children’s Research Institute, Seattle, WA, USA
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaiʻi Cancer Center, University of Hawaiʻi at Mānoa, Honolulu, HI, USA
- Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
- James Pirruccello
- Division of Cardiology, University of California San Francisco, San Francisco, CA, USA
- Linda Kachuri
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
- Nicholas Mancuso
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA
5
Zeng T, Spence JP, Mostafavi H, Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. bioRxiv 2024:2023.05.19.541520. [PMID: 37292653 PMCID: PMC10245655 DOI: 10.1101/2023.05.19.541520] [Indexed: 06/10/2023]
Abstract
Measures of selective constraint on genes have been used for many applications including clinical interpretation of rare coding variants, disease gene discovery, and studies of genome evolution. However, widely-used metrics are severely underpowered at detecting constraint for the shortest ∼25% of genes, potentially causing important pathogenic mutations to be overlooked. We developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease, and other phenotypes, especially for short genes. Our new estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve estimation of many gene-level properties, such as rare variant burden or gene expression differences.
Affiliation(s)
- Tony Zeng
- Department of Genetics, Stanford University, Stanford CA
- Jonathan K. Pritchard
- Department of Genetics, Stanford University, Stanford CA
- Department of Biology, Stanford University, Stanford CA
6
Sakaue S, Weinand K, Isaac S, Dey KK, Jagadeesh K, Kanai M, Watts GFM, Zhu Z, Brenner MB, McDavid A, Donlin LT, Wei K, Price AL, Raychaudhuri S. Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet 2024; 56:615-626. [PMID: 38594305 DOI: 10.1038/s41588-024-01682-1] [Received: 03/07/2023] [Accepted: 02/07/2024] [Indexed: 04/11/2024]
Abstract
Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.
Affiliation(s)
- Saori Sakaue
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kathryn Weinand
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Shakson Isaac
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Kushal K Dey
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Karthik Jagadeesh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Masahiro Kanai
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
| | - Gerald F M Watts
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Zhu Zhu
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael B Brenner
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Andrew McDavid
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
| | - Laura T Donlin
- Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medicine, New York, NY, USA
| | - Kevin Wei
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Alkes L Price
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| |
|
7
|
Houzelstein D, Eozenou C, Lagos CF, Elzaiat M, Bignon-Topalovic J, Gonzalez I, Laville V, Schlick L, Wankanit S, Madon P, Kirtane J, Athalye A, Buonocore F, Bigou S, Conway GS, Bohl D, Achermann JC, Bashamboo A, McElreavey K. A conserved NR5A1-responsive enhancer regulates SRY in testis-determination. Nat Commun 2024; 15:2796. [PMID: 38555298 PMCID: PMC10981742 DOI: 10.1038/s41467-024-47162-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2022] [Accepted: 03/21/2024] [Indexed: 04/02/2024]
Abstract
The Y-linked SRY gene initiates mammalian testis-determination. However, how the expression of SRY is regulated remains elusive. Here, we demonstrate that a conserved steroidogenic factor-1 (SF-1)/NR5A1 binding enhancer is required for appropriate SRY expression to initiate testis-determination in humans. Comparative sequence analysis of SRY 5' regions in mammals identified an evolutionarily conserved SF-1/NR5A1-binding motif within a 250 bp region of open chromatin located 5 kilobases upstream of the SRY transcription start site. Genomic analysis of 46,XY individuals with disrupted testis-determination, including a large multigenerational family, identified unique single-base substitutions of highly conserved residues within the SF-1/NR5A1-binding element. In silico modelling and in vitro assays demonstrate the enhancer properties of the NR5A1 motif. Deletion of this hemizygous element by genome-editing, in a novel in vitro cellular model recapitulating human Sertoli cell formation, resulted in a significant reduction in expression of SRY. Therefore, human NR5A1 acts as a regulatory switch between testis and ovary development by upregulating SRY expression, a role that may predate the eutherian radiation. We show that disruption of an enhancer can phenocopy variants in the coding regions of SRY that cause human testis dysgenesis. Since disease-causing variants in enhancers are currently rare, the regulation of gene expression in testis-determination offers a paradigm to define enhancer activity in a key developmental process.
Affiliation(s)
- Denis Houzelstein
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France.
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France.
| | - Caroline Eozenou
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Institut Cochin, Université Paris Cité, INSERM, CNRS, Paris, France
| | - Carlos F Lagos
- Chemical Biology & Drug Discovery Lab, Escuela de Química y Farmacia, Facultad de Medicina y Ciencia, Universidad San Sebastián, Campus Los Leones, Lota 2465 Providencia, 7510157, Santiago, Chile
- Centro Ciencia & Vida, Fundación Ciencia & Vida, Av. del Valle Norte 725, Huechuraba, 8580702, Santiago, Chile
| | - Maëva Elzaiat
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Joelle Bignon-Topalovic
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Inma Gonzalez
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Institut Pasteur, Université Paris Cité, Epigenomics, Proliferation, and the Identity of Cells Unit, F-75015, Paris, France
| | - Vincent Laville
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Institut Pasteur, Université Paris Cité, Stem Cells and Development Unit, F-75015, Paris, France
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
| | - Laurène Schlick
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Somboon Wankanit
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Department of Pediatrics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
| | - Prochi Madon
- Department of Assisted Reproduction and Genetics, Jaslok Hospital and Research Centre, Mumbai, India
| | - Jyotsna Kirtane
- Department of Pediatric Surgery, Jaslok Hospital and Research Centre, Mumbai, India
| | - Arundhati Athalye
- Department of Assisted Reproduction and Genetics, Jaslok Hospital and Research Centre, Mumbai, India
| | - Federica Buonocore
- Genetics and Genomic Medicine Research & Teaching Department, UCL GOS Institute of Child Health, University College London, London, United Kingdom
| | - Stéphanie Bigou
- ICV-iPS core facility, Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
| | - Gerard S Conway
- Institute for Women's Health, University College London, London, United Kingdom
| | - Delphine Bohl
- ICV-iPS core facility, Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
| | - John C Achermann
- Genetics and Genomic Medicine Research & Teaching Department, UCL GOS Institute of Child Health, University College London, London, United Kingdom
| | - Anu Bashamboo
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Ken McElreavey
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France.
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France.
| |
|
8
|
Lin W, Wall JD, Li G, Newman D, Yang Y, Abney M, VandeBerg JL, Olivier M, Gilad Y, Cox LA. Genetic regulatory effects in response to a high-cholesterol, high-fat diet in baboons. Cell Genom 2024; 4:100509. [PMID: 38430910 PMCID: PMC10943580 DOI: 10.1016/j.xgen.2024.100509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 11/20/2023] [Accepted: 02/05/2024] [Indexed: 03/05/2024]
Abstract
Steady-state expression quantitative trait loci (eQTLs) explain only a fraction of disease-associated loci identified through genome-wide association studies (GWASs), while eQTLs involved in gene-by-environment (GxE) interactions have rarely been characterized in humans due to experimental challenges. Using a baboon model, we found hundreds of eQTLs that emerge in adipose, liver, and muscle after prolonged exposure to high dietary fat and cholesterol. Diet-responsive eQTLs exhibit genomic localization and genic features that are distinct from steady-state eQTLs. Furthermore, the human orthologs associated with diet-responsive eQTLs are enriched for GWAS genes associated with human metabolic traits, suggesting that context-responsive eQTLs with more complex regulatory effects are likely to explain GWAS hits that do not seem to overlap with standard eQTLs. Our results highlight the complexity of genetic regulatory effects and the potential of eQTLs with disease-relevant GxE interactions in enhancing the understanding of GWAS signals for human complex disease using non-human primate models.
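One simple way to picture a diet-responsive eQTL is as a genotype effect on expression that is absent at baseline and appears after the diet challenge. The sketch below is only an illustration under stated assumptions, not the analysis used in the study: it fits an ordinary least-squares slope of expression on allele dosage separately for each condition and reports the change in slope. The function names `ols_slope` and `diet_response`, the toy values, and the assumption that the same animals are measured in both conditions are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (single predictor)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("genotype dosage must vary across samples")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def diet_response(dosage, expr_baseline, expr_diet):
    """Return (baseline slope, post-diet slope, change in slope).

    A change in slope far from zero is the pattern expected for a
    genotype-by-diet interaction; significance testing is omitted here.
    """
    slope_baseline = ols_slope(dosage, expr_baseline)
    slope_diet = ols_slope(dosage, expr_diet)
    return slope_baseline, slope_diet, slope_diet - slope_baseline


# Toy data (hypothetical): six animals, alternate-allele dosage 0, 1 or 2.
toy_dosage = [0, 0, 1, 1, 2, 2]
toy_baseline = [5, 5, 5, 5, 5, 5]  # no genotype effect before the diet
toy_diet = [5, 5, 7, 7, 9, 9]      # effect emerges after the diet
print(diet_response(toy_dosage, toy_baseline, toy_diet))
```

Fitting the two conditions separately keeps the example short; a joint model with an explicit genotype-by-diet interaction term and per-animal effects would be the more usual choice for paired samples.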
Affiliation(s)
- Wenhe Lin
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA.
| | - Jeffrey D Wall
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Ge Li
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Deborah Newman
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX 78229, USA
| | - Yunqi Yang
- Committee on Genetics, Genomics and System Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Mark Abney
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - John L VandeBerg
- Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, TX 78520, USA
| | - Michael Olivier
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Yoav Gilad
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA; Department of Medicine, Section of Genetic Medicine, The University of Chicago, Chicago, IL 60637, USA.
| | - Laura A Cox
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA; Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX 78229, USA.
| |
|
9
|
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
- New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
| | - Yang I Li
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Sohini Ramachandran
- Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 02912, USA
| | - Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
| |
|
10
|
Lin W, Wall JD, Li G, Newman D, Yang Y, Abney M, VandeBerg JL, Olivier M, Gilad Y, Cox LA. Genetic regulatory effects in response to a high cholesterol, high fat diet in baboons. bioRxiv 2023:2023.08.01.551489. [PMID: 37577666 PMCID: PMC10418186 DOI: 10.1101/2023.08.01.551489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Steady-state expression quantitative trait loci (eQTLs) explain only a fraction of disease-associated loci identified through genome-wide association studies (GWAS), while eQTLs involved in gene-by-environment (GxE) interactions have rarely been characterized in humans due to experimental challenges. Using a baboon model, we found hundreds of eQTLs that emerge in adipose, liver, and muscle after prolonged exposure to high dietary fat and cholesterol. Diet-responsive eQTLs exhibit genomic localization and genic features that are distinct from steady-state eQTLs. Furthermore, the human orthologs associated with diet-responsive eQTLs are enriched for GWAS genes associated with human metabolic traits, suggesting that context-responsive eQTLs with more complex regulatory effects are likely to explain GWAS hits that do not seem to overlap with standard eQTLs. Our results highlight the complexity of genetic regulatory effects and the potential of eQTLs with disease-relevant GxE interactions in enhancing the understanding of GWAS signals for human complex disease using nonhuman primate models.
Affiliation(s)
- Wenhe Lin
- Department of Human Genetics, The University of Chicago, Chicago, USA
| | - Jeffrey D. Wall
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Present address: Galatea Bio, Hialeah, FL, USA
| | - Ge Li
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Deborah Newman
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
| | - Yunqi Yang
- Committee on Genetics, Genomics and System Biology, The University of Chicago, Chicago, USA
| | - Mark Abney
- Department of Human Genetics, The University of Chicago, Chicago, USA
| | - John L. VandeBerg
- Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Michael Olivier
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Yoav Gilad
- Department of Human Genetics, The University of Chicago, Chicago, USA
- Department of Medicine, Section of Genetic Medicine, The University of Chicago, Chicago, IL, USA
- Lead contact
| | - Laura A. Cox
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
| |
|
11
|
Gschwind AR, Mualim KS, Karbalayghareh A, Sheth MU, Dey KK, Jagoda E, Nurtdinov RN, Xi W, Tan AS, Jones H, Ma XR, Yao D, Nasser J, Avsec Ž, James BT, Shamim MS, Durand NC, Rao SSP, Mahajan R, Doughty BR, Andreeva K, Ulirsch JC, Fan K, Perez EM, Nguyen TC, Kelley DR, Finucane HK, Moore JE, Weng Z, Kellis M, Bassik MC, Price AL, Beer MA, Guigó R, Stamatoyannopoulos JA, Lieberman Aiden E, Greenleaf WJ, Leslie CS, Steinmetz LM, Kundaje A, Engreitz JM. An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv 2023:2023.11.09.563812. [PMID: 38014075 PMCID: PMC10680627 DOI: 10.1101/2023.11.09.563812] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/29/2023]
Abstract
Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
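Benchmarking a predictive model against perturbation data, as described above, comes down to comparing scored element-gene pairs with a set of experimentally supported pairs. The following is a minimal sketch of that comparison using precision and recall at a single score threshold. It is not the ENCODE-rE2G benchmarking software; the function name `precision_recall`, the element and gene identifiers, the scores, and the threshold are all hypothetical.

```python
def precision_recall(scores, validated, threshold):
    """Precision and recall of thresholded element-gene predictions.

    scores: dict mapping (element, gene) pairs to a model score.
    validated: set of (element, gene) pairs supported by perturbation data.
    threshold: pairs scoring at or above this value count as predictions.
    """
    predicted = {pair for pair, score in scores.items() if score >= threshold}
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


# Toy inputs (hypothetical identifiers and scores).
toy_scores = {
    ("e1", "GENE_A"): 0.9,
    ("e2", "GENE_A"): 0.2,
    ("e3", "GENE_B"): 0.7,
    ("e4", "GENE_B"): 0.6,
}
toy_validated = {("e1", "GENE_A"), ("e4", "GENE_B"), ("e5", "GENE_C")}
print(precision_recall(toy_scores, toy_validated, threshold=0.5))
```

A validated pair that the model never scores, such as the toy pair for GENE_C, still counts against recall. Sweeping the threshold over all observed scores would give a full precision-recall curve for comparing models.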
Affiliation(s)
- Andreas R. Gschwind
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - Kristy S. Mualim
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Plant Biology, Carnegie Institute of Science, Stanford, CA, USA
| | - Alireza Karbalayghareh
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Maya U. Sheth
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Kushal K. Dey
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Evelyn Jagoda
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ramil N. Nurtdinov
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - Wang Xi
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Anthony S. Tan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - X. Rosa Ma
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - David Yao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | | | - Benjamin T. James
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Muhammad S. Shamim
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, Texas, USA
| | - Neva C. Durand
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Suhas S. P. Rao
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- Department of Structural Biology, Stanford University, Stanford, CA, USA
| | - Ragini Mahajan
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Biosciences, Rice University, Houston, TX, USA
| | - Benjamin R. Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Kalina Andreeva
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Jacob C. Ulirsch
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Present Address: Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA
| | | | - Tri C. Nguyen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | | | - Hilary K. Finucane
- Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jill E. Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Michael C. Bassik
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Alkes L. Price
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Michael A. Beer
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - John A. Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Clinical Research Division, Fred Hutch Cancer Center, Seattle, WA, USA
| | - Erez Lieberman Aiden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | - William J. Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
| | | | - Lars M. Steinmetz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Genome Technology Center, Palo Alto, CA, USA
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
| |
|
12
|
You J, Liu Z, Qi Z, Ma Y, Sun M, Su L, Niu H, Peng Y, Luo X, Zhu M, Huang Y, Chang X, Hu X, Zhang Y, Pi R, Liu Y, Meng Q, Li J, Zhang Q, Zhu L, Lin Z, Min L, Yuan D, Grover CE, Fang DD, Lindsey K, Wendel JF, Tu L, Zhang X, Wang M. Regulatory controls of duplicated gene expression during fiber development in allotetraploid cotton. Nat Genet 2023; 55:1987-1997. [PMID: 37845354 PMCID: PMC10632151 DOI: 10.1038/s41588-023-01530-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Accepted: 09/14/2023] [Indexed: 10/18/2023]
Abstract
Polyploidy complicates transcriptional regulation and increases phenotypic diversity in organisms. The dynamics of genetic regulation of gene expression between coresident subgenomes in polyploids remains to be understood. Here we document the genetic regulation of fiber development in allotetraploid cotton Gossypium hirsutum by sequencing 376 genomes and 2,215 time-series transcriptomes. We characterize 1,258 genes comprising 36 genetic modules that control staged fiber development and uncover genetic components governing their partitioned expression relative to subgenomic duplicated genes (homoeologs). Only about 30% of fiber quality-related homoeologs show phenotypically favorable allele aggregation in cultivars, highlighting the potential for subgenome additivity in fiber improvement. We envision a genome-enabled breeding strategy, with particular attention to 48 favorable alleles related to fiber phenotypes that have been subjected to purifying selection during domestication. Our work delineates the dynamics of gene regulation during fiber development and highlights the potential of subgenomic coordination underpinning phenotypes in polyploid plants.
Affiliation(s)
- Jiaqi You
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Zhenping Liu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Zhengyang Qi
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yizan Ma
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Mengling Sun
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Ling Su
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Hao Niu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yabing Peng
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Xuanxuan Luo
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Mengmeng Zhu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yuefan Huang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Xing Chang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Xiubao Hu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yuqi Zhang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Ruizhen Pi
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yuqi Liu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Qingying Meng
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Jianying Li
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Qinghua Zhang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Longfu Zhu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Zhongxu Lin
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Ling Min
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Daojun Yuan
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Corrinne E Grover
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - David D Fang
- Cotton Fiber Bioscience Research Unit, USDA-ARS, Southern Regional Research Center, New Orleans, LA, USA
| | - Keith Lindsey
- Department of Biosciences, Durham University, Durham, UK
| | - Jonathan F Wendel
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - Lili Tu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
| | - Xianlong Zhang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
| | - Maojun Wang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
|
13
|
Mostafavi H, Spence JP, Naqvi S, Pritchard JK. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nat Genet 2023; 55:1866-1875. [PMID: 37857933 DOI: 10.1038/s41588-023-01529-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 05/15/2022] [Accepted: 09/14/2023] [Indexed: 10/21/2023]
Abstract
Most signals in genome-wide association studies (GWAS) of complex traits implicate noncoding genetic variants with putative gene regulatory effects. However, currently identified regulatory variants, notably expression quantitative trait loci (eQTLs), explain only a small fraction of GWAS signals. Here, we show that GWAS and cis-eQTL hits are systematically different: eQTLs cluster strongly near transcription start sites, whereas GWAS hits do not. Genes near GWAS hits are enriched in key functional annotations, are under strong selective constraint and have complex regulatory landscapes across different tissue/cell types, whereas genes near eQTLs are depleted of most functional annotations, show relaxed constraint, and have simpler regulatory landscapes. We describe a model to understand these observations, including how natural selection on complex traits hinders discovery of functionally relevant eQTLs. Our results imply that GWAS and eQTL studies are systematically biased toward different types of variant, and support the use of complementary functional approaches alongside the next generation of eQTL studies.
Affiliation(s)
| | | | - Sahin Naqvi
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Biology, Stanford University, Stanford, CA, USA.
|
14
|
Brown BC, Morris JA, Lappalainen T, Knowles DA. Large-scale causal discovery using interventional data sheds light on the regulatory network architecture of blood traits. bioRxiv 2023:2023.10.13.562293. [PMID: 37905013 PMCID: PMC10614812 DOI: 10.1101/2023.10.13.562293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Inference of directed biological networks is an important but notoriously challenging problem. We introduce inverse sparse regression (inspre), an approach to learning causal networks that leverages large-scale intervention-response data. Applied to 788 genes from the genome-wide perturb-seq dataset, inspre helps elucidate the network architecture of blood traits.
Affiliation(s)
- Brielin C. Brown
- New York Genome Center, New York, NY, USA
- Data Science Institute, Columbia University, New York, NY, USA
| | | | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Systems Biology, Columbia University, New York, NY
| | - David A. Knowles
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY
- Department of Computer Science, Columbia University, New York, NY
|
15
|
McAfee JC, Lee S, Lee J, Bell JL, Krupa O, Davis J, Insigne K, Bond ML, Zhao N, Boyle AP, Phanstiel DH, Love MI, Stein JL, Ruzicka WB, Davila-Velderrain J, Kosuri S, Won H. Systematic investigation of allelic regulatory activity of schizophrenia-associated common variants. Cell Genomics 2023; 3:100404. [PMID: 37868037 PMCID: PMC10589626 DOI: 10.1016/j.xgen.2023.100404] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/22/2022] [Revised: 02/23/2023] [Accepted: 08/21/2023] [Indexed: 10/24/2023]
Abstract
Genome-wide association studies (GWASs) have successfully identified 145 genomic regions that contribute to schizophrenia risk, but linkage disequilibrium makes it challenging to discern causal variants. We performed a massively parallel reporter assay (MPRA) on 5,173 fine-mapped schizophrenia GWAS variants in primary human neural progenitors and identified 439 variants with allelic regulatory effects (MPRA-positive variants). Transcription factor binding had modest predictive power, while fine-map posterior probability, enhancer overlap, and evolutionary conservation failed to predict MPRA-positive variants. Furthermore, 64% of MPRA-positive variants did not exhibit an expression quantitative trait locus signature, suggesting that MPRA could identify as yet unexplored variants with regulatory potential. To predict the combinatorial effect of MPRA-positive variants on gene regulation, we propose an accessibility-by-contact model that combines MPRA-measured allelic activity with neuronal chromatin architecture.
Affiliation(s)
- Jessica C. McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Sool Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jiseok Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica L. Bell
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Oleh Krupa
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kimberly Insigne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Marielle L. Bond
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Nanxiang Zhao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alan P. Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Douglas H. Phanstiel
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Michael I. Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - W. Brad Ruzicka
- Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02141, USA
- Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
16
|
Liu Z, Huang YF. Deep multiple-instance learning accurately predicts gene haploinsufficiency and deletion pathogenicity. bioRxiv 2023:2023.08.29.555384. [PMID: 37693607 PMCID: PMC10491176 DOI: 10.1101/2023.08.29.555384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Copy number losses (deletions) are a major contributor to the etiology of severe genetic disorders. Although haploinsufficient genes play a critical role in deletion pathogenicity, current methods for deletion pathogenicity prediction fail to integrate multiple lines of evidence for haploinsufficiency at the gene level, limiting their power to pinpoint deleterious deletions associated with genetic disorders. Here we introduce DosaCNV, a deep multiple-instance learning framework that, for the first time, models deletion pathogenicity jointly with gene haploinsufficiency. By integrating over 30 gene-level features potentially predictive of haploinsufficiency, DosaCNV shows unmatched performance in prioritizing pathogenic deletions associated with a broad spectrum of genetic disorders. Furthermore, DosaCNV outperforms existing methods in predicting gene haploinsufficiency even though it is not trained on known haploinsufficient genes. Finally, DosaCNV leverages a state-of-the-art technique to quantify the contributions of individual gene-level features to haploinsufficiency, allowing for human-understandable explanations of model predictions. Altogether, DosaCNV is a powerful computational tool for both fundamental and translational research.
Affiliation(s)
- Zhihan Liu
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Molecular, Cellular, and Integrative Biosciences Program, Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
| | - Yi-Fei Huang
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
|
17
|
Ziyani C, Delaneau O, Ribeiro DM. Multimodal single cell analysis infers widespread enhancer co-activity in a lymphoblastoid cell line. Commun Biol 2023; 6:563. [PMID: 37237005 DOI: 10.1038/s42003-023-04954-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
Non-coding regulatory elements such as enhancers are key in controlling the cell-type specificity and spatio-temporal expression of genes. To drive stable and precise gene transcription robust to genetic variation and environmental stress, genes are often targeted by multiple enhancers with redundant action. However, it is unknown whether enhancers targeting the same gene display simultaneous activity or whether some enhancer combinations are more often co-active than others. Here, we take advantage of recent developments in single cell technology that permit assessing chromatin status (scATAC-seq) and gene expression (scRNA-seq) in the same single cells to correlate gene expression to the activity of multiple enhancers. Measuring activity patterns across 24,844 human lymphoblastoid single cells, we find that the majority of enhancers associated with the same gene display significant correlation in their chromatin profiles. For 6944 expressed genes associated with enhancers, we predict 89,885 significant enhancer-enhancer associations between nearby enhancers. We find that associated enhancers share similar transcription factor binding profiles and that gene essentiality is linked with higher enhancer co-activity. We provide a set of predicted enhancer-enhancer associations based on correlation derived from a single cell line, which can be further investigated for functional relevance.
Affiliation(s)
- Chaymae Ziyani
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Diogo M Ribeiro
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
|
18
|
Romero IG. Seeing humans through an evolutionary lens. Science 2023; 380:360-361. [PMID: 37104588 DOI: 10.1126/science.adh0745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
A collection of mammalian genomes provides insights into human biology and evolution.
Affiliation(s)
- Irene Gallego Romero
- Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC, Australia
- School of BioSciences, University of Melbourne, Parkville, VIC, Australia
|
19
|
Chan WF, Coughlan HD, Ruhle M, Iannarella N, Alvarado C, Groom JR, Keenan CR, Kueh AJ, Wheatley AK, Smyth GK, Allan RS, Johanson TM. Survey of activation-induced genome architecture reveals a novel enhancer of Myc. Immunol Cell Biol 2023; 101:345-357. [PMID: 36710659 PMCID: PMC10952581 DOI: 10.1111/imcb.12626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 01/25/2023] [Accepted: 01/27/2023] [Indexed: 01/31/2023]
Abstract
The transcription factor Myc is critically important in driving cell proliferation, a function that is frequently dysregulated in cancer. To avoid this dysregulation, Myc is tightly controlled by numerous layers of regulation. One such layer is the use of distal regulatory enhancers to drive Myc expression. Here, using chromosome conformation capture to examine B cells of the immune system in the first hours after their activation, we reveal a previously unidentified enhancer of Myc. The interactivity of this enhancer coincides with a dramatic, but discrete, spike in Myc expression 3 h post-activation. However, genetic deletion of this region has little impact on Myc expression, Myc protein level, or in vitro and in vivo cell proliferation. Examination of the enhancer-deleted regulatory landscape suggests that enhancer redundancy likely sustains Myc expression. This work highlights not only the importance of temporally examining enhancers, but also the complexity and dynamics of the regulation of critical genes such as Myc.
Affiliation(s)
- Wing Fuk Chan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Hannah D Coughlan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Michelle Ruhle
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Nadia Iannarella
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Carolina Alvarado
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Joanna R Groom
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Christine R Keenan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Andrew J Kueh
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Adam K Wheatley
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC, Australia
| | - Gordon K Smyth
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC, Australia
| | - Rhys S Allan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Timothy M Johanson
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
|
20
|
de Klein N, Tsai EA, Vochteloo M, Baird D, Huang Y, Chen CY, van Dam S, Oelen R, Deelen P, Bakker OB, El Garwany O, Ouyang Z, Marshall EE, Zavodszky MI, van Rheenen W, Bakker MK, Veldink J, Gaunt TR, Runz H, Franke L, Westra HJ. Brain expression quantitative trait locus and network analyses reveal downstream effects and putative drivers for brain-related diseases. Nat Genet 2023; 55:377-388. [PMID: 36823318 PMCID: PMC10011140 DOI: 10.1038/s41588-023-01300-6] [Citation(s) in RCA: 44] [Impact Index Per Article: 44.0] [Received: 03/12/2021] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
Identification of therapeutic targets from genome-wide association studies (GWAS) requires insights into downstream functional consequences. We harmonized 8,613 RNA-sequencing samples from 14 brain datasets to create the MetaBrain resource and performed cis- and trans-expression quantitative trait locus (eQTL) meta-analyses in multiple brain region- and ancestry-specific datasets (n ≤ 2,759). Many of the 16,169 cortex cis-eQTLs were tissue-dependent when compared with blood cis-eQTLs. We inferred brain cell types for 3,549 cis-eQTLs by interaction analysis. We prioritized 186 cis-eQTLs for 31 brain-related traits using Mendelian randomization and co-localization including 40 cis-eQTLs with an inferred cell type, such as a neuron-specific cis-eQTL (CYP24A1) for multiple sclerosis. We further describe 737 trans-eQTLs for 526 unique variants and 108 unique genes. We used brain-specific gene-co-regulation networks to link GWAS loci and prioritize additional genes for five central nervous system diseases. This study represents a valuable resource for post-GWAS research on central nervous system diseases.
Affiliation(s)
- Niek de Klein
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Wellcome Sanger Institute, Hinxton, UK
| | - Ellen A Tsai
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Martijn Vochteloo
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Institute for Life Science and Technology, Hanze University of Applied Sciences, Groningen, The Netherlands
- Oncode Institute, Groningen, The Netherlands
| | - Denis Baird
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Yunfeng Huang
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Chia-Yen Chen
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Sipko van Dam
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Ancora Health, Groningen, The Netherlands
| | - Roy Oelen
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Oncode Institute, Groningen, The Netherlands
| | - Patrick Deelen
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Oncode Institute, Groningen, The Netherlands
| | - Olivier B Bakker
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Wellcome Sanger Institute, Hinxton, UK
| | - Omar El Garwany
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Wellcome Sanger Institute, Hinxton, UK
| | | | - Eric E Marshall
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Maria I Zavodszky
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Wouter van Rheenen
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Mark K Bakker
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Jan Veldink
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Tom R Gaunt
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK
| | - Heiko Runz
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA.
| | - Lude Franke
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
- Oncode Institute, Groningen, The Netherlands.
| | - Harm-Jan Westra
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
- Oncode Institute, Groningen, The Netherlands.
|
21
|
Morova T, Ding Y, Huang CCF, Sar F, Schwarz T, Giambartolomei C, Baca S, Grishin D, Hach F, Gusev A, Freedman M, Pasaniuc B, Lack N. Optimized high-throughput screening of non-coding variants identified from genome-wide association studies. Nucleic Acids Res 2022; 51:e18. [PMID: 36546757 PMCID: PMC9943666 DOI: 10.1093/nar/gkac1198] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/26/2022] [Revised: 11/19/2022] [Accepted: 12/06/2022] [Indexed: 12/24/2022] Open
Abstract
The vast majority of disease-associated single nucleotide polymorphisms (SNPs) identified from genome-wide association studies (GWAS) are localized in non-coding regions. A significant fraction of these variants impact transcription factor binding to enhancer elements and alter gene expression. To functionally interrogate the activity of such variants we developed snpSTARRseq, a high-throughput experimental method that can interrogate the functional impact of hundreds to thousands of non-coding variants on enhancer activity. snpSTARRseq dramatically improves signal-to-noise by utilizing a novel sequencing and bioinformatic approach that increases both insert size and the number of variants tested per locus. Using this strategy, we interrogated known prostate cancer (PCa) risk-associated loci and demonstrated that 35% of them harbor SNPs that significantly altered enhancer activity. Combining these results with chromosomal looping data we could identify interacting genes and provide a mechanism of action for 20 PCa GWAS risk regions. When benchmarked against orthogonal methods, snpSTARRseq showed a strong correlation with in vivo experimental allelic-imbalance studies whereas there was no correlation with predictive in silico approaches. Overall, snpSTARRseq provides an integrated experimental and computational framework to functionally test non-coding genetic variants.
Affiliation(s)
- Tunc Morova
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Yi Ding
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | | | - Funda Sar
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Tommer Schwarz
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Claudia Giambartolomei
- Central RNA Lab, Istituto Italiano di Tecnologia, Genova 16163, Italy
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Sylvan C Baca
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Dennis Grishin
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Faraz Hach
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
- Department of Urologic Science, University of British Columbia, Vancouver, BC V5Z 1M9, Canada
| | - Alexander Gusev
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Matthew L Freedman
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
- The Center for Cancer Genome Discovery, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nathan A Lack
- To whom correspondence should be addressed. Tel: +1 604 875 4411;
|
22
|
Leyhr J, Waldmann L, Filipek-Górniok B, Zhang H, Allalou A, Haitina T. A novel cis-regulatory element drives early expression of Nkx3.2 in the gnathostome primary jaw joint. eLife 2022; 11:75749. [PMCID: PMC9665848 DOI: 10.7554/elife.75749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Accepted: 09/30/2022] [Indexed: 11/16/2022] Open
Abstract
The acquisition of movable jaws was a major event during vertebrate evolution. The role of the NK3 homeobox 2 (Nkx3.2) transcription factor in patterning the primary jaw joint of gnathostomes (jawed vertebrates) is well known; however, knowledge about its regulatory mechanism is lacking. In this study, we report a proximal enhancer element of Nkx3.2 that is deeply conserved in most gnathostomes but undetectable in the jawless hagfish and lamprey. This enhancer is active in the developing jaw joint region of the zebrafish Danio rerio, and was thus designated as jaw joint regulatory sequence 1 (JRS1). We further show that JRS1 enhancer sequences from a range of gnathostome species, including a chondrichthyan and mammals, have the same activity in the jaw joint as the native zebrafish enhancer, indicating a high degree of functional conservation despite the divergence of cartilaginous and bony fish lineages or the transition of the primary jaw joint into the middle ear of mammals. Finally, we show that deletion of JRS1 from the zebrafish genome using CRISPR/Cas9 results in a significant reduction of early gene expression of nkx3.2 and leads to a transient jaw joint deformation and partial fusion. Emergence of this Nkx3.2 enhancer in early gnathostomes may have contributed to the origin and shaping of the articulating surfaces of vertebrate jaws.
Collapse
Affiliation(s)
- Jake Leyhr
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University
| | - Laura Waldmann
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University
| | - Beata Filipek-Górniok
- Science for Life Laboratory Genome Engineering Zebrafish Facility, Department of Organismal Biology, Uppsala University
| | - Hanqing Zhang
- Division of Visual Information and Interaction, Department of Information Technology, Uppsala University
- Science for Life Laboratory BioImage Informatics Facility
| | - Amin Allalou
- Division of Visual Information and Interaction, Department of Information Technology, Uppsala University
- Science for Life Laboratory BioImage Informatics Facility
| | - Tatjana Haitina
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University
| |
|
23
|
Dong P, Hoffman GE, Apontes P, Bendl J, Rahman S, Fernando MB, Zeng B, Vicari JM, Zhang W, Girdhar K, Townsley KG, Misir R, Brennand KJ, Haroutunian V, Voloudakis G, Fullard JF, Roussos P. Population-level variation in enhancer expression identifies disease mechanisms in the human brain. Nat Genet 2022; 54:1493-1503. [PMID: 36163279 PMCID: PMC9547946 DOI: 10.1038/s41588-022-01170-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/11/2021] [Accepted: 07/25/2022] [Indexed: 02/06/2023]
Abstract
Identification of risk variants for neuropsychiatric diseases within enhancers underscores the importance of understanding population-level variation in enhancer function in the human brain. Besides regulating tissue-specific and cell-type-specific transcription of target genes, enhancers themselves can be transcribed. By jointly analyzing large-scale cell-type-specific transcriptome and regulome data, we cataloged 30,795 neuronal and 23,265 non-neuronal candidate transcribed enhancers. Examination of the transcriptome in 1,382 brain samples identified robust expression of transcribed enhancers. We explored gene-enhancer coordination and found that enhancer-linked genes are strongly implicated in neuropsychiatric disease. We identified expression quantitative trait loci (eQTLs) for both genes and enhancers and found that enhancer eQTLs mediate a substantial fraction of neuropsychiatric trait heritability. Inclusion of enhancer eQTLs in transcriptome-wide association studies enhanced functional interpretation of disease loci. Overall, our study characterizes the gene-enhancer regulome and genetic mechanisms in the human cortex in both healthy and diseased states.
Affiliation(s)
- Pengfei Dong
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gabriel E Hoffman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Pasha Apontes
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research Education and Clinical Center (MIRECC), James J. Peters VA Medical Center, New York, NY, USA
| | - Jaroslav Bendl
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Samir Rahman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael B Fernando
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Biao Zeng
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - James M Vicari
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Wen Zhang
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kiran Girdhar
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kayla G Townsley
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ruth Misir
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kristen J Brennand
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Yale University, New Haven, CT, USA
| | - Vahram Haroutunian
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research Education and Clinical Center (MIRECC), James J. Peters VA Medical Center, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Georgios Voloudakis
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - John F Fullard
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Mental Illness Research Education and Clinical Center (MIRECC), James J. Peters VA Medical Center, New York, NY, USA.
- Center for Dementia Research, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA.
| |
|
24
|
Sharma SP, Peterson T. Complex chromosomal rearrangements induced by transposons in maize. Genetics 2022; 223:6702042. [PMID: 36111993 PMCID: PMC9910405 DOI: 10.1093/genetics/iyac124] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/31/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
Eukaryotic genomes are large and complex, and gene expression can be affected by multiple regulatory elements and their positions within the dynamic chromatin architecture. Transposable elements are known to play important roles in genome evolution, yet questions remain as to how transposable elements alter genome structure and affect gene expression. Previous studies have shown that genome rearrangements can be induced by Reversed Ends Transposition involving termini of Activator and related transposable elements in maize and other plants. Here, we show that complex alleles can be formed by the rapid and progressive accumulation of Activator-induced duplications and rearrangements. The p1 gene enhancer in maize can induce ectopic expression of the nearby p2 gene in pericarp tissue when placed near it via different structural rearrangements. By screening for p2 expression, we identified and studied 5 cases in which multiple sequential transposition events occurred and increased the p1 enhancer copy number. We see active p2 expression due to multiple copies of the p1 enhancer present near p2 in all 5 cases. The p1 enhancer effects are confirmed by the observation that loss of p2 expression is correlated with transposition-induced excision of the p1 enhancers. We also performed a targeted Chromosome Conformation Capture experiment to test the physical interaction between the p1 enhancer and p2 promoter region. Together, our results show that transposon-induced rearrangements can accumulate rapidly and progressively increase genetic variation important for genomic evolution.
Affiliation(s)
- Sharu Paul Sharma
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011, USA
| | - Thomas Peterson
- Corresponding author: Department of Genetics, Development and Cell Biology, Iowa State University, 2258 Molecular Biology, Iowa State University, Ames, IA 50011, USA.
| |
|
25
|
Baca SC, Singler C, Zacharia S, Seo JH, Morova T, Hach F, Ding Y, Schwarz T, Huang CCF, Anderson J, Fay AP, Kalita C, Groha S, Pomerantz MM, Wang V, Linder S, Sweeney CJ, Zwart W, Lack NA, Pasaniuc B, Takeda DY, Gusev A, Freedman ML. Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation. Nat Genet 2022; 54:1364-1375. [PMID: 36071171 PMCID: PMC9784646 DOI: 10.1038/s41588-022-01168-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/09/2022] [Accepted: 07/19/2022] [Indexed: 12/25/2022]
Abstract
Many genetic variants affect disease risk by altering context-dependent gene regulation. Such variants are difficult to study mechanistically using current methods that link genetic variation to steady-state gene expression levels, such as expression quantitative trait loci (eQTLs). To address this challenge, we developed the cistrome-wide association study (CWAS), a framework for identifying genotypic and allele-specific effects on chromatin that are also associated with disease. In prostate cancer, CWAS identified regulatory elements and androgen receptor-binding sites that explained the association at 52 of 98 known prostate cancer risk loci and discovered 17 additional risk loci. CWAS implicated key developmental transcription factors in prostate cancer risk that are overlooked by eQTL-based approaches due to context-dependent gene regulation. We experimentally validated associations and demonstrated the extensibility of CWAS to additional epigenomic datasets and phenotypes, including response to prostate cancer treatment. CWAS is a powerful and biologically interpretable paradigm for studying variants that influence traits by affecting transcriptional regulation.
Affiliation(s)
- Sylvan C. Baca
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Cassandra Singler
- Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
| | - Soumya Zacharia
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ji-Heui Seo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tunc Morova
- Vancouver Prostate Centre University of British Columbia, Vancouver, BC, Canada
| | - Faraz Hach
- Vancouver Prostate Centre University of British Columbia, Vancouver, BC, Canada
| | - Yi Ding
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA
| | - Tommer Schwarz
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA
| | | | - Jacob Anderson
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - André P. Fay
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Cynthia Kalita
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA
| | - Stefan Groha
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Mark M. Pomerantz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Victoria Wang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
| | - Simon Linder
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| | | | - Wilbert Zwart
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| | - Nathan A. Lack
- Vancouver Prostate Centre University of British Columbia, Vancouver, BC, Canada; School of Medicine, Koç University, Istanbul, Turkey
| | - Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA; Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
| | - David Y. Takeda
- Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA. These authors jointly supervised this work. Correspondence should be directed to M.L.F. or A.G.
| | - Matthew L. Freedman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA. These authors jointly supervised this work. Correspondence should be directed to M.L.F. or A.G.
| |
|
26
|
Lin X, Liu Y, Liu S, Zhu X, Wu L, Zhu Y, Zhao D, Xu X, Chemparathy A, Wang H, Cao Y, Nakamura M, Noordermeer JN, La Russa M, Wong WH, Zhao K, Qi LS. Nested epistasis enhancer networks for robust genome regulation. Science 2022; 377:1077-1085. [PMID: 35951677 DOI: 10.1126/science.abk3512] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Indexed: 12/11/2022]
Abstract
Mammalian genomes possess multiple enhancers spanning an ultralong distance (>megabases) to modulate important genes, yet it is unclear how these enhancers coordinate to achieve this task. Here, we combine multiplexed CRISPRi screening with machine learning to define quantitative enhancer-enhancer interactions. We find that the ultralong distance enhancer network possesses a nested multi-layer architecture that confers functional robustness of gene expression. Experimental characterization reveals that enhancer epistasis is maintained by three-dimensional chromosomal interactions and BRD4 condensation. Machine learning prediction of synergistic enhancers provides an effective strategy to identify non-coding variant pairs associated with pathogenic genes in diseases beyond Genome-Wide Association Studies (GWAS) analysis. Our work unveils nested epistasis enhancer networks, which can better explain enhancer functions within cells and in diseases.
Affiliation(s)
- Xueqiu Lin
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Yanxia Liu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Shuai Liu
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
| | - Xiang Zhu
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA; Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
| | - Lingling Wu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Yanyu Zhu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Dehua Zhao
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Xiaoshu Xu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | | | - Haifeng Wang
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Yaqiang Cao
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
| | - Muneaki Nakamura
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | | | - Marie La Russa
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Wing Hung Wong
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
| | - Keji Zhao
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
| | - Lei S Qi
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; ChEM-H, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg BioHub, San Francisco, CA 94158, USA
| |
|
27
|
Song W, Yuan K, Liu Z, Cai W, Chen J, Yu S, Zhao M, Lin GN. Locus-level antagonistic selection shaped the polygenic architecture of human complex diseases. Hum Genet 2022; 141:1935-1947. [PMID: 35943608 DOI: 10.1007/s00439-022-02471-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Accepted: 07/11/2022] [Indexed: 12/01/2022]
Abstract
BACKGROUND We aimed to evaluate the potential role of antagonistic selection in polygenic diseases: if one variant increases the risk of one disease and decreases the risk of another disease, the signals of genetic risk elimination by natural selection will be distorted, which leads to a higher frequency of risk alleles. METHODS We applied local genetic correlations and transcriptome-wide association studies to identify genomic loci and genes adversely associated with at least two diseases. Then, we used different population genetic metrics to measure the signals of natural selection for these loci and genes. RESULTS First, we identified 2120 cases of antagonistic pleiotropy (negative local genetic correlation) among 87 diseases in 716 genomic loci (antagonistic loci). Next, by comparing with non-antagonistic loci, we observed that antagonistic loci explained an excess proportion of disease heritability (median 6%), showed enhanced signals of balancing selection, and reduced signals of directional polygenic adaptation. Then, at the gene expression level, we identified 31,991 cases of antagonistic pleiotropy among 98 diseases at 4368 genes. However, evidence of altered signals of selection pressure and heritability distribution at the gene expression level is limited. CONCLUSION We conclude that antagonistic pleiotropy is widespread among human polygenic diseases, and it has distorted the evolutionary signal and genetic architecture of diseases at the locus level.
Affiliation(s)
- Weichen Song
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| | - Kai Yuan
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Zhe Liu
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Wenxiang Cai
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Jue Chen
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Shunying Yu
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| | - Min Zhao
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| | - Guan Ning Lin
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| |
|
28
|
Collins RL, Glessner JT, Porcu E, Lepamets M, Brandon R, Lauricella C, Han L, Morley T, Niestroj LM, Ulirsch J, Everett S, Howrigan DP, Boone PM, Fu J, Karczewski KJ, Kellaris G, Lowther C, Lucente D, Mohajeri K, Nõukas M, Nuttle X, Samocha KE, Trinh M, Ullah F, Võsa U, Hurles ME, Aradhya S, Davis EE, Finucane H, Gusella JF, Janze A, Katsanis N, Matyakhina L, Neale BM, Sanders D, Warren S, Hodge JC, Lal D, Ruderfer DM, Meck J, Mägi R, Esko T, Reymond A, Kutalik Z, Hakonarson H, Sunyaev S, Brand H, Talkowski ME. A cross-disorder dosage sensitivity map of the human genome. Cell 2022; 185:3041-3055.e25. [PMID: 35917817 PMCID: PMC9742861 DOI: 10.1016/j.cell.2022.06.036] [Citation(s) in RCA: 110] [Impact Index Per Article: 55.0] [Received: 12/22/2020] [Revised: 03/17/2022] [Accepted: 06/20/2022] [Indexed: 02/06/2023]
Abstract
Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.
Affiliation(s)
- Ryan L Collins
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA.
| | - Joseph T Glessner
- Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, Division of Human Genetics, Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Eleonora Porcu
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Maarja Lepamets
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia; Institute of Molecular and Cell Biology, University of Tartu, 51010 Tartu, Estonia
| | | | | | - Lide Han
- Division of Genetic Medicine, Department of Medicine, and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Theodore Morley
- Division of Genetic Medicine, Department of Medicine, and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | | | - Jacob Ulirsch
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Selin Everett
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Daniel P Howrigan
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Philip M Boone
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA 02115, USA
| | - Jack Fu
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Konrad J Karczewski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Georgios Kellaris
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | - Chelsea Lowther
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Diane Lucente
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Kiana Mohajeri
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
| | - Margit Nõukas
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia; Institute of Molecular and Cell Biology, University of Tartu, 51010 Tartu, Estonia
| | - Xander Nuttle
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Kaitlin E Samocha
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10, UK
| | - Mi Trinh
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10, UK
| | - Farid Ullah
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | - Urmo Võsa
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia
| | | | | | - Matthew E Hurles
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10, UK
| | | | - Erica E Davis
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | - Hilary Finucane
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - James F Gusella
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | | | - Nicholas Katsanis
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | | | - Benjamin M Neale
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | | | | | - Jennelle C Hodge
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Dennis Lal
- Cologne Center for Genomics, University of Cologne, 51149 Cologne, Germany; Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Douglas M Ruderfer
- Division of Genetic Medicine, Department of Medicine, and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Center for Precision Medicine, Department of Biomedical Informatics, and Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | | | - Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia
| | - Tõnu Esko
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia
| | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
| | - Zoltán Kutalik
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Center for Primary Care and Public Health, University of Lausanne, 1015 Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
| | - Hakon Hakonarson
- Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, Division of Human Genetics, Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Shamil Sunyaev
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA 02114, USA.
| | - Michael E Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
| |
|
29
|
Dey KK, Gazal S, van de Geijn B, Kim SS, Nasser J, Engreitz JM, Price AL. SNP-to-gene linking strategies reveal contributions of enhancer-related and candidate master-regulator genes to autoimmune disease. Cell Genomics 2022; 2:100145. [PMID: 35873673 PMCID: PMC9306342 DOI: 10.1016/j.xgen.2022.100145] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022]
Abstract
We assess contributions to autoimmune disease of genes whose regulation is driven by enhancer regions (enhancer-related) and genes that regulate other genes in trans (candidate master-regulator). We link these genes to SNPs using several SNP-to-gene (S2G) strategies and apply heritability analyses to draw three conclusions about 11 autoimmune/blood-related diseases/traits. First, several characterizations of enhancer-related genes using functional genomics data are informative for autoimmune disease heritability after conditioning on a broad set of regulatory annotations. Second, candidate master-regulator genes defined using trans-eQTL in blood are also conditionally informative for autoimmune disease heritability. Third, integrating enhancer-related and master-regulator gene sets with protein-protein interaction (PPI) network information magnified their disease signal. The resulting PPI-enhancer gene score produced >2-fold stronger heritability signal and >2-fold stronger enrichment for drug targets, compared with the recently proposed enhancer domain score. In each case, functionally informed S2G strategies produced 4.1- to 13-fold stronger disease signals than conventional window-based strategies.
Affiliation(s)
- Kushal K. Dey
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Corresponding author
| | - Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Bryce van de Geijn
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Genentech, South San Francisco, CA 94080, USA
| | - Samuel Sungil Kim
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- BASE Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford University School of Medicine, Stanford, CA 94304, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Alkes L. Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| |
|
30
|
Gazal S, Weissbrod O, Hormozdiari F, Dey KK, Nasser J, Jagadeesh KA, Weiner DJ, Shi H, Fulco CP, O'Connor LJ, Pasaniuc B, Engreitz JM, Price AL. Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity. Nat Genet 2022; 54:827-836. [PMID: 35668300 PMCID: PMC9894581 DOI: 10.1038/s41588-022-01087-y] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 07/29/2021] [Accepted: 04/27/2022] [Indexed: 02/04/2023]
Abstract
Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.
Affiliation(s)
- Steven Gazal
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | - Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Farhad Hormozdiari
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kushal K Dey
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Karthik A Jagadeesh
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Huwenbo Shi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Bristol Myers Squibb, Cambridge, MA, USA
| | | | - Bogdan Pasaniuc
- Departments of Computational Medicine, Human Genetics, Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Jesse M Engreitz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford University School of Medicine, Stanford, CA, USA
| | - Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| |
|
31
|
Siewert-Rocks KM, Kim SS, Yao DW, Shi H, Price AL. Leveraging gene co-regulation to identify gene sets enriched for disease heritability. Am J Hum Genet 2022; 109:393-404. [PMID: 35108496 PMCID: PMC8948163 DOI: 10.1016/j.ajhg.2022.01.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/23/2021] [Accepted: 01/04/2022] [Indexed: 12/15/2022] Open
Abstract
Identifying gene sets that are associated with disease can provide valuable biological knowledge, but a fundamental challenge of gene set analyses of GWAS data is linking disease-associated SNPs to genes. Transcriptome-wide association studies (TWASs) detect associations between the genetically predicted expression of a gene and disease risk, thus implicating candidate disease genes. However, causal disease genes at TWAS-associated loci generally remain unknown due to gene co-regulation, which leads to correlations across genes in predicted expression. We developed a method, gene co-regulation score (GCSC) regression, to identify gene sets that are enriched for disease heritability explained by predicted expression. GCSC regresses TWAS chi-square statistics on gene co-regulation scores reflecting correlations in predicted gene expression; a gene set is enriched for heritability if genes with high co-regulation to the set have higher TWAS chi-square statistics than genes with low co-regulation to the set, beyond what is expected based on co-regulation to all genes. We verified via simulations that GCSC is well calibrated and well powered. We applied GCSC to gene expression data from GTEx (48 tissues) and GWAS summary statistics for 43 independent diseases and complex traits, analyzing a broad set of biological pathways and specifically expressed gene sets. We identified many enriched sets, recapitulating known biology. For Alzheimer disease, we detected evidence of an immune basis, and specifically a role for antigen presentation, in analyses of both biological pathways and specifically expressed gene sets. Our results highlight the advantages of leveraging gene co-regulation within the TWAS framework to identify enriched gene sets.
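The regression idea described in this abstract can be illustrated with a toy sketch. It is not the authors' GCSC implementation: it is a plain, unweighted least-squares fit on made-up numbers, and the names `fit_coregulation_regression`, `score_set`, and `score_all` are hypothetical. It regresses per-gene chi-square statistics on two covariates, co-regulation to the gene set and co-regulation to all genes, so that a positive slope on the first covariate corresponds to signal beyond what co-regulation to all genes explains.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_coregulation_regression(
    chi2: Sequence[float],
    score_set: Sequence[float],
    score_all: Sequence[float],
) -> List[float]:
    """Ordinary least squares of chi2 on [1, score_set, score_all].

    Returns [intercept, slope_set, slope_all]. Assumption: no weighting and
    no standard errors, which a real analysis would need.
    """
    rows = [[1.0, s, t] for s, t in zip(score_set, score_all)]
    k = 3
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, chi2)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up per-gene scores; chi2 is generated exactly from a known linear model
# so the fit can be checked against the coefficients used to build it.
score_set = [0.0, 1.0, 2.0, 3.0, 0.5, 2.5]
score_all = [1.0, 2.0, 1.5, 3.0, 2.5, 0.5]
chi2 = [1.0 + 0.8 * s + 0.2 * t for s, t in zip(score_set, score_all)]

intercept, slope_set, slope_all = fit_coregulation_regression(chi2, score_set, score_all)
print(f"intercept={intercept:.3f} slope_set={slope_set:.3f} slope_all={slope_all:.3f}")
```

In this toy setting the slope on `score_set` plays the role of the enrichment signal; interpreting it on real data would additionally require the calibration the abstract refers to.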
Affiliation(s)
- Katherine M Siewert-Rocks
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| | - Samuel S Kim
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Douglas W Yao
- Program in Systems, Synthetic, and Quantitative Biology, Harvard University, Cambridge, MA 02138, USA
| | - Huwenbo Shi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
| |
|
32
|
Improving genetic diagnosis of Mendelian disease with RNA sequencing: a narrative review. Journal of Bio-X Research 2022. [DOI: 10.1097/jbr.0000000000000100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
|
33
|
Wang X, Glubb DM, O'Mara TA. 10 Years of GWAS discovery in endometrial cancer: Aetiology, function and translation. EBioMedicine 2022; 77:103895. [PMID: 35219087 PMCID: PMC8881374 DOI: 10.1016/j.ebiom.2022.103895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 02/07/2022] [Accepted: 02/08/2022] [Indexed: 12/24/2022] Open
Abstract
Endometrial cancer is a common gynaecological cancer with increasing incidence and mortality. In the last decade, endometrial cancer genome-wide association studies (GWAS) have provided a resource to explore aetiology and for functional interpretation of heritable risk variation, informing endometrial cancer biology. Indeed, GWAS data have been used to assess relationships with other traits through correlation and Mendelian randomisation analyses, establishing genetic relationships and potential risk factors. Cross-trait GWAS analyses have increased statistical power and identified novel endometrial cancer risk variation related to other traits. Functional analysis of risk loci has helped prioritise candidate susceptibility genes, revealing molecular mechanisms and networks. Lastly, risk scores generated using endometrial cancer GWAS data may allow for clinical translation through identification of patients at high risk of disease. In the next decade, this knowledge base should enable substantial progress in our understanding of endometrial cancer and, potentially, new approaches for its screening and treatment.
|
34
|
Hoskins JW, Chung CC, O’Brien A, Zhong J, Connelly K, Collins I, Shi J, Amundadottir LT. Inferred expression regulator activities suggest genes mediating cardiometabolic genetic signals. PLoS Comput Biol 2021; 17:e1009563. [PMID: 34793442 PMCID: PMC8639061 DOI: 10.1371/journal.pcbi.1009563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Revised: 12/02/2021] [Accepted: 10/15/2021] [Indexed: 12/12/2022] Open
Abstract
Expression QTL (eQTL) analyses have suggested many genes mediating genome-wide association study (GWAS) signals but most GWAS signals still lack compelling explanatory genes. We have leveraged an adipose-specific gene regulatory network to infer expression regulator activities and phenotypic master regulators (MRs), which were used to detect activity QTLs (aQTLs) at cardiometabolic trait GWAS loci. Regulator activities were inferred with the VIPER algorithm that integrates enrichment of expected expression changes among a regulator's target genes with confidence in their regulator-target network interactions and target overlap between different regulators (i.e., pleiotropy). Phenotypic MRs were identified as those regulators whose activities were most important in predicting their respective phenotypes using random forest modeling. While eQTLs were typically more significant than aQTLs in cis, the opposite was true among candidate MRs in trans. Several GWAS loci colocalized with MR trans-eQTLs/aQTLs in the absence of colocalized cis-QTLs. Intriguingly, at the 1p36.1 BMI GWAS locus the EPHB2 cis-aQTL was stronger than its cis-eQTL and colocalized with the GWAS signal and 35 BMI MR trans-aQTLs, suggesting the GWAS signal may be mediated by effects on EPHB2 activity and its downstream effects on a network of BMI MRs. These MR and aQTL analyses represent systems genetic methods that may be broadly applied to supplement standard eQTL analyses for suggesting molecular effects mediating GWAS signals.
Affiliation(s)
- Jason W. Hoskins
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- * E-mail: (JWH); (LTA)
| | - Charles C. Chung
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Cancer Genome Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Aidan O’Brien
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Jun Zhong
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Katelyn Connelly
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Irene Collins
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Jianxin Shi
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Laufey T. Amundadottir
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- * E-mail: (JWH); (LTA)
| |
|
35
|
The non-coding genome in genetic brain disorders: new targets for therapy? Essays Biochem 2021; 65:671-683. [PMID: 34414418 PMCID: PMC8564736 DOI: 10.1042/ebc20200121] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/24/2021] [Revised: 07/12/2021] [Accepted: 07/26/2021] [Indexed: 11/30/2022]
Abstract
The non-coding genome, consisting of more than 98% of all genetic information in humans and once judged as ‘Junk DNA’, is increasingly moving into the spotlight in the field of human genetics. Non-coding regulatory elements (NCREs) are crucial to ensure correct spatio-temporal gene expression. Technological advances have made it possible to identify NCREs on a large scale, and mechanistic studies have helped to understand the biological mechanisms underlying their function. It is increasingly becoming clear that genetic alterations of NCREs can cause genetic disorders, including brain diseases. In this review, we concisely discuss mechanisms of gene regulation and how to investigate them, and give examples of non-coding alterations of NCREs that give rise to human brain disorders. The cross-talk between basic and clinical studies enhances the understanding of normal and pathological function of NCREs, allowing better interpretation of already existing and novel data. Improved functional annotation of NCREs will not only benefit diagnostics for patients, but might also lead to novel areas of investigation for targeted therapies, applicable to a wide panel of genetic disorders. The intrinsic complexity and precision of the gene regulation process can be turned to the advantage of highly specific treatments. We further discuss this exciting new field of ‘enhancer therapy’ based on recent examples.
|
36
|
Yousefi S, Deng R, Lanko K, Salsench EM, Nikoncuk A, van der Linde HC, Perenthaler E, van Ham TJ, Mulugeta E, Barakat TS. Comprehensive multi-omics integration identifies differentially active enhancers during human brain development with clinical relevance. Genome Med 2021; 13:162. [PMID: 34663447 PMCID: PMC8524963 DOI: 10.1186/s13073-021-00980-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/28/2021] [Accepted: 09/29/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Non-coding regulatory elements (NCREs), such as enhancers, play a crucial role in gene regulation, and genetic aberrations in NCREs can lead to human disease, including brain disorders. The human brain is a complex organ that is susceptible to numerous disorders; many of these are caused by genetic changes, but a multitude remain currently unexplained. Understanding NCREs acting during brain development has the potential to shed light on previously unrecognized genetic causes of human brain disease. Despite immense community-wide efforts to understand the role of the non-coding genome and NCREs, annotating functional NCREs remains challenging. METHODS Here we performed an integrative computational analysis of virtually all currently available epigenome data sets related to human fetal brain. RESULTS Our in-depth analysis unravels 39,709 differentially active enhancers (DAEs) that show dynamic epigenomic rearrangement during early stages of human brain development, indicating likely biological function. Many of these DAEs are linked to clinically relevant genes, and functional validation of selected DAEs in cell models and zebrafish confirms their role in gene regulation. Compared to enhancers without dynamic epigenomic rearrangement, DAEs are subjected to higher sequence constraints in humans, have distinct sequence characteristics and are bound by a distinct transcription factor landscape. DAEs are enriched for GWAS loci for brain-related traits and for genetic variation found in individuals with neurodevelopmental disorders, including autism. CONCLUSION This compendium of high-confidence enhancers will assist in deciphering the mechanism behind developmental genetics of human brain and will be relevant to uncover missing heritability in human genetic brain disorders.
Affiliation(s)
- Soheil Yousefi
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Ruizhi Deng
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Kristina Lanko
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Eva Medico Salsench
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Anita Nikoncuk
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Herma C. van der Linde
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Elena Perenthaler
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Tjakko J. van Ham
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Eskeatnaf Mulugeta
- Department of Cell Biology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Tahsin Stefan Barakat
- Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
| |
|
37
|
Guerrini MM, Oguchi A, Suzuki A, Murakawa Y. Cap analysis of gene expression (CAGE) and noncoding regulatory elements. Semin Immunopathol 2021; 44:127-136. [PMID: 34468849 DOI: 10.1007/s00281-021-00886-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Accepted: 08/13/2021] [Indexed: 01/06/2023]
Abstract
Cap analysis of gene expression (CAGE) was developed to detect the 5' end of RNA. Trapping of the RNA 5'-cap structure enables the enrichment and selective sequencing of complete transcripts. Upscaled high-throughput versions of CAGE have enabled the genome-wide identification of transcription start sites, including transcriptionally active promoters and enhancers. CAGE sequencing can be exploited to draw comprehensive maps of active genomic regulatory elements in a cell type- and activation-specific manner. The cells of the immune system are among the best candidates to be analyzed in humans, since they are easily accessible. In this review, we discuss how CAGE data are instrumental for integrative analyses with quantitative trait loci and omics data, and their usefulness in the mechanistic interpretation of the effects of genetic variations over the entire human genome. Integrating CAGE data with the currently available omics information will contribute to better understanding of the genome-wide association study variants that lie outside of annotated genes, deepening our knowledge on human diseases, and enabling the targeted design of more specific therapeutic interventions.
Affiliation(s)
- Matteo Maurizio Guerrini
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan.
| | - Akiko Oguchi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Yasuhiro Murakawa
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- IFOM-the FIRC Institute of Molecular Oncology, Milan, Italy
| |
|
38
|
Claringbould A, Zaugg JB. Enhancers in disease: molecular basis and emerging treatment strategies. Trends Mol Med 2021; 27:1060-1073. [PMID: 34420874 DOI: 10.1016/j.molmed.2021.07.012] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 05/21/2021] [Revised: 07/22/2021] [Accepted: 07/26/2021] [Indexed: 02/07/2023]
Abstract
Enhancers are genomic sequences that play a key role in regulating tissue-specific gene expression levels. An increasing number of diseases are linked to impaired enhancer function through chromosomal rearrangement, genetic variation within enhancers, or epigenetic modulation. Here, we review how these enhancer disruptions have recently been implicated in congenital disorders, cancers, and common complex diseases and address the implications for diagnosis and treatment. Although further fundamental research into enhancer function, target genes, and context is required, enhancer-targeting drugs and gene editing approaches show great therapeutic promise for a range of diseases.
Affiliation(s)
- Annique Claringbould
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Judith B Zaugg
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, 69117 Heidelberg, Germany.
39
Puig RR, Boddie P, Khan A, Castro-Mondragon JA, Mathelier A. UniBind: maps of high-confidence direct TF-DNA interactions across nine species. BMC Genomics 2021; 22:482. [PMID: 34174819 PMCID: PMC8236138 DOI: 10.1186/s12864-021-07760-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 04/29/2021] [Accepted: 05/27/2021] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Transcription factors (TFs) bind specifically to TF binding sites (TFBSs) at cis-regulatory regions to control transcription. It is critical to locate these TF-DNA interactions to understand transcriptional regulation. Efforts to predict bona fide TFBSs benefit from the availability of experimental data mapping DNA binding regions of TFs (chromatin immunoprecipitation followed by sequencing, ChIP-seq). RESULTS In this study, we processed ~10,000 public ChIP-seq datasets from nine species to provide high-quality TFBS predictions. After quality control, this yielded predictions of ~56 million TFBSs with experimental and computational support for direct TF-DNA interactions for 644 TFs in >1000 cell lines and tissues. These TFBSs were used to predict >197,000 cis-regulatory modules representing clusters of binding events in the corresponding genomes. The high quality of the TFBSs was reinforced by their evolutionary conservation, enrichment at active cis-regulatory regions, and capacity to predict combinatorial binding of TFs. Further, we confirmed that the cell type and tissue specificity of enhancer activity was correlated with the number of TFs with binding sites predicted in these regions. All the data are provided to the community through the UniBind database, which can be accessed through its web interface (https://unibind.uio.no/), a dedicated RESTful API, and as genomic tracks. Finally, we provide an enrichment tool, available as a web service and an R package, for users to find TFs with enriched TFBSs in a set of provided genomic regions. CONCLUSIONS UniBind is the first resource of its kind, providing the largest collection of high-confidence direct TF-DNA interactions in nine species.
Affiliation(s)
- Rafael Riudavets Puig
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0349, Oslo, Norway
- Paul Boddie
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0349, Oslo, Norway
- Aziz Khan
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0349, Oslo, Norway
- Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Anthony Mathelier
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0349, Oslo, Norway.
- Department of Medical Genetics, Oslo University Hospital, Oslo, 0424, Norway.
40
Abstract
Shadow enhancers are seemingly redundant transcriptional cis-regulatory elements that regulate the same gene and drive overlapping expression patterns. Recent studies have shown that shadow enhancers are remarkably abundant and control most developmental gene expression in both invertebrates and vertebrates, including mammals. Shadow enhancers might provide an important mechanism for buffering gene expression against mutations in non-coding regulatory regions of genes implicated in human disease. Technological advances in genome editing and live imaging have shed light on how shadow enhancers establish precise gene expression patterns and confer phenotypic robustness. Shadow enhancers can interact in complex ways and may also help to drive the formation of transcriptional hubs within the nucleus. Despite their apparent redundancy, the prevalence and evolutionary conservation of shadow enhancers underscore their key role in emerging metazoan gene regulatory networks.
41
Genome-wide enhancer maps link risk variants to disease genes. Nature 2021; 593:238-243. [PMID: 33828297 PMCID: PMC9153265 DOI: 10.1038/s41586-021-03446-x] [Citation(s) in RCA: 255] [Impact Index Per Article: 85.0] [Received: 08/20/2020] [Accepted: 03/11/2021] [Indexed: 02/07/2023]
Abstract
Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease. Many of the underlying causal variants may affect enhancers, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.
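The abstract names the ABC model without stating its scoring rule. As a rough illustration only, the sketch below assumes the commonly described formulation: an element's activity is the geometric mean of its chromatin accessibility and H3K27ac signals, contact is a Hi-C contact frequency with the gene's promoter, and the score is the element's activity × contact divided by the sum of that product over all candidate elements near the gene. The function name, field names, and numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt


def abc_scores(elements):
    """Return an ABC-style score per candidate element for ONE gene.

    `elements` is a list of dicts with hypothetical keys:
      name          - element identifier
      accessibility - chromatin accessibility signal (e.g. DHS or ATAC reads)
      h3k27ac       - H3K27ac ChIP-seq signal
      contact       - contact frequency between element and gene promoter
    """
    raw = {}
    for e in elements:
        # Activity: geometric mean of the two signals (an assumption, see lead-in).
        activity = sqrt(e["accessibility"] * e["h3k27ac"])
        raw[e["name"]] = activity * e["contact"]

    total = sum(raw.values())
    if total == 0:
        # No active, contacting element: every score is zero.
        return {name: 0.0 for name in raw}
    # Normalise so the scores for this gene sum to 1.
    return {name: value / total for name, value in raw.items()}


# Toy input, invented for illustration.
candidate_elements = [
    {"name": "e1", "accessibility": 4.0, "h3k27ac": 9.0, "contact": 0.5},
    {"name": "e2", "accessibility": 1.0, "h3k27ac": 1.0, "contact": 1.0},
    {"name": "e3", "accessibility": 16.0, "h3k27ac": 1.0, "contact": 0.25},
]
scores = abc_scores(candidate_elements)
print(scores)
```

Because the scores are normalised per gene, a strongly active distal element can outrank a weak element that sits closer to the promoter, which is the intuition behind combining activity with contact.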
42
Mu Z, Wei W, Fair B, Miao J, Zhu P, Li YI. The impact of cell type and context-dependent regulatory variants on human immune traits. Genome Biol 2021; 22:122. [PMID: 33926512 PMCID: PMC8082814 DOI: 10.1186/s13059-021-02334-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 09/05/2020] [Accepted: 03/30/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND The vast majority of trait-associated variants identified using genome-wide association studies (GWAS) are noncoding, and therefore assumed to impact gene regulation. However, the majority of trait-associated loci are unexplained by regulatory quantitative trait loci (QTLs). RESULTS We perform a comprehensive characterization of the putative mechanisms by which GWAS loci impact human immune traits. By harmonizing four major immune QTL studies, we identify 26,271 expression QTLs (eQTLs) and 23,121 splicing QTLs (sQTLs) spanning 18 immune cell types. Our colocalization analyses between QTLs and trait-associated loci from 72 GWAS reveal that genetic effects on RNA expression and splicing in immune cells colocalize with 40.4% of GWAS loci for immune-related traits, in many cases increasing the fraction of colocalized loci twofold compared to previous studies. Notably, we find that the largest contributors to this increase are splicing QTLs, which colocalize on average with 14% of all GWAS loci that do not colocalize with eQTLs. By contrast, we find that cell type-specific eQTLs and eQTLs with small effect sizes contribute very few new colocalizations. To investigate the 60% of GWAS loci that remain unexplained, we collect H3K27ac CUT&Tag data from rheumatoid arthritis and healthy controls, and find large-scale differences between immune cells from the different disease contexts, including at regions overlapping unexplained GWAS loci. CONCLUSION Altogether, our work supports RNA splicing as an important mediator of genetic effects on immune traits, and suggests that we must expand our study of regulatory processes in disease contexts to improve functional interpretation of as yet unexplained GWAS loci.
Affiliation(s)
- Zepeng Mu
- Committee on Genetics, Genomics & Systems Biology, University of Chicago, Chicago, IL USA
- Wei Wei
- Department of Clinical Immunology, Xijing Hospital, Xi’an, China
- National Translational Science Center for Molecular Medicine, Xi’an, China
- Benjamin Fair
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL USA
- Jinlin Miao
- Department of Clinical Immunology, Xijing Hospital, Xi’an, China
- National Translational Science Center for Molecular Medicine, Xi’an, China
- Ping Zhu
- Department of Clinical Immunology, Xijing Hospital, Xi’an, China
- National Translational Science Center for Molecular Medicine, Xi’an, China
- Yang I. Li
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL USA
- Department of Human Genetics, Department of Medicine, University of Chicago, Chicago, IL USA
43
Kim SS, Dey KK, Weissbrod O, Márquez-Luna C, Gazal S, Price AL. Improving the informativeness of Mendelian disease-derived pathogenicity scores for common disease. Nat Commun 2020; 11:6258. [PMID: 33288751 PMCID: PMC7721881 DOI: 10.1038/s41467-020-20087-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/09/2020] [Accepted: 11/09/2020] [Indexed: 02/08/2023] Open
Abstract
Despite considerable progress on pathogenicity scores prioritizing variants for Mendelian disease, little is known about the utility of these scores for common disease. Here, we assess the informativeness of Mendelian disease-derived pathogenicity scores for common disease and improve upon existing scores. We first apply stratified linkage disequilibrium (LD) score regression to evaluate published pathogenicity scores across 41 common diseases and complex traits (average N = 320K). Several of the resulting annotations are informative for common disease, even after conditioning on a broad set of functional annotations. We then improve upon published pathogenicity scores by developing AnnotBoost, a machine learning framework to impute and denoise pathogenicity scores using a broad set of functional annotations. AnnotBoost substantially increases the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores, implying that Mendelian and common disease variants share similar properties. The boosted scores also produce improvements in heritability model fit and in classifying disease-associated, fine-mapped SNPs. Our boosted scores may improve fine-mapping and candidate gene discovery for common disease.
Affiliation(s)
- Samuel S Kim
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA.
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
- Kushal K Dey
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Carla Márquez-Luna
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
44
Viñuela A, Varshney A, van de Bunt M, Prasad RB, Asplund O, Bennett A, Boehnke M, Brown AA, Erdos MR, Fadista J, Hansson O, Hatem G, Howald C, Iyengar AK, Johnson P, Krus U, MacDonald PE, Mahajan A, Manning Fox JE, Narisu N, Nylander V, Orchard P, Oskolkov N, Panousis NI, Payne A, Stitzel ML, Vadlamudi S, Welch R, Collins FS, Mohlke KL, Gloyn AL, Scott LJ, Dermitzakis ET, Groop L, Parker SCJ, McCarthy MI. Genetic variant effects on gene expression in human pancreatic islets and their implications for T2D. Nat Commun 2020; 11:4912. [PMID: 32999275 PMCID: PMC7528108 DOI: 10.1038/s41467-020-18581-8] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 06/21/2019] [Accepted: 08/12/2020] [Indexed: 02/08/2023] Open
Abstract
Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, key tissues and cell-types required for functional inference are absent from large-scale resources. Here we explore the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using data from 420 donors. We find: (a) 7741 cis-eQTLs in islets with a replication rate across 44 GTEx tissues between 40% and 73%; (b) marked overlap between islet cis-eQTL signals and active regulatory sequences in islets, with reduced eQTL effect size observed in the stretch enhancers most strongly implicated in GWAS signal location; (c) enrichment of islet cis-eQTL signals with T2D risk variants identified in genome-wide association studies; and (d) colocalization between 47 islet cis-eQTLs and variants influencing T2D or glycemic traits, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in disease relevant tissues.
Affiliation(s)
- Ana Viñuela
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland; Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland; Biosciences Institute, Faculty of Medical Sciences, Newcastle University, NE1 4EP Newcastle, UK
- Arushi Varshney
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109 USA
- Martijn van de Bunt
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK; Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 7LE UK; Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, OX3 7LE UK
- Rashmi B. Prasad
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
- Olof Asplund
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
- Amanda Bennett
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK
- Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109 USA
- Andrew A. Brown
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland; Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland; Population Health and Genomics, University of Dundee, Dundee, Scotland, DD1 9SY UK
- Michael R. Erdos
- Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
- João Fadista
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden; Department of Epidemiology Research, Statens Serum Institut, Copenhagen, DK 2300 Denmark; Finnish Institute for Molecular Medicine (FIMM), University of Helsinki, Helsinki, Finland
- Ola Hansson
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden; Finnish Institute for Molecular Medicine (FIMM), University of Helsinki, Helsinki, Finland
- Gad Hatem
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
- Cédric Howald
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland; Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
- Apoorva K. Iyengar
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Paul Johnson
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK
- Ulrika Krus
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
- Patrick E. MacDonald
- Department of Pharmacology and Alberta Diabetes Institute, University of Alberta, Edmonton, Alberta Canada
- Anubha Mahajan
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK; Present Address: Human Genetics, Genentech, 1 DNA Way, South San Francisco, CA 94080 USA
- Jocelyn E. Manning Fox
- Department of Pharmacology and Alberta Diabetes Institute, University of Alberta, Edmonton, Alberta Canada
- Narisu Narisu
- Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
- Vibe Nylander
- Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 7LE UK
- Peter Orchard
- Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109 USA
- Nikolay Oskolkov
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
- Nikolaos I. Panousis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland; Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
- Anthony Payne
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK
- Michael L. Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA; Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032 USA
- Swarooparani Vadlamudi
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Ryan Welch
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109 USA
- Francis S. Collins
- Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
- Karen L. Mohlke
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Anna L. Gloyn
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK; Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 7LE UK; Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, OX3 7LE UK; Department of Pediatrics, Division of Endocrinology, Stanford School of Medicine, Stanford University, Stanford, CA USA
- Laura J. Scott
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109 USA
- Emmanouil T. Dermitzakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland; Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
- Leif Groop
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden; Finnish Institute for Molecular Medicine (FIMM), University of Helsinki, Helsinki, Finland
- Stephen C. J. Parker
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109 USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109 USA
- Mark I. McCarthy
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN UK; Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 7LE UK; Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, OX3 7LE UK; Present Address: Human Genetics, Genentech, 1 DNA Way, South San Francisco, CA 94080 USA
45
Verweij N, Benjamins JW, Morley MP, van de Vegte YJ, Teumer A, Trenkwalder T, Reinhard W, Cappola TP, van der Harst P. The Genetic Makeup of the Electrocardiogram. Cell Syst 2020; 11:229-238.e5. [PMID: 32916098 DOI: 10.1016/j.cels.2020.08.005] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 04/21/2020] [Revised: 05/27/2020] [Accepted: 08/06/2020] [Indexed: 12/11/2022]
Abstract
The electrocardiogram (ECG) is one of the most useful non-invasive diagnostic tests for a wide array of cardiac disorders. Traditional approaches to analyzing ECGs focus on individual segments. Here, we performed comprehensive deep phenotyping of 77,190 ECGs in the UK Biobank across the complete cycle of cardiac conduction, resulting in 500 spatial-temporal datapoints, across 10 million genetic variants. In addition to characterizing polygenic risk scores for the traditional ECG segments, we identified over 300 genetic loci that are statistically associated with the high-dimensional representation of the ECG. We established the genetic ECG signature for dilated cardiomyopathy, associated the BAG3, HSPB7/CLCNKA, PRKCA, TMEM43, and OBSCN loci with disease risk and confirmed this association in an independent cohort. In total, our work demonstrates that a high-dimensional analysis of the entire ECG provides unique opportunities for studying cardiac biology and disease and furthering drug development. A record of this paper's transparent peer review process is included in the Supplemental Information.
Affiliation(s)
- Niek Verweij
- University of Groningen, University Medical Center Groningen, Department of Cardiology, Groningen, the Netherlands; Genomics plc, Oxford, UK.
- Jan-Walter Benjamins
- University of Groningen, University Medical Center Groningen, Department of Cardiology, Groningen, the Netherlands
- Michael P Morley
- Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Yordi J van de Vegte
- University of Groningen, University Medical Center Groningen, Department of Cardiology, Groningen, the Netherlands
- Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany; DZHK (German Center for Cardiovascular Research), partner site Greifswald, Greifswald, Germany
- Teresa Trenkwalder
- Klinik für Herz- und Kreislauferkrankungen, Deutsches Herzzentrum München, Technical University Munich, Munich, Germany; DZHK (German Center for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
- Wibke Reinhard
- Klinik für Herz- und Kreislauferkrankungen, Deutsches Herzzentrum München, Technical University Munich, Munich, Germany
- Thomas P Cappola
- Division of Cardiovascular Medicine at the Perelman School of Medicine at the University of Pennsylvania, Philadelphia, USA
- Pim van der Harst
- University of Groningen, University Medical Center Groningen, Department of Cardiology, Groningen, the Netherlands; Department of Cardiology, Heart and Lung Division, University Medical Center Utrecht, Utrecht, the Netherlands
46
Boukas L, Bjornsson HT, Hansen KD. Promoter CpG Density Predicts Downstream Gene Loss-of-Function Intolerance. Am J Hum Genet 2020; 107:487-498. [PMID: 32800095 PMCID: PMC7477270 DOI: 10.1016/j.ajhg.2020.07.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/26/2020] [Accepted: 07/22/2020] [Indexed: 12/26/2022] Open
Abstract
The aggregation and joint analysis of large numbers of exome sequences has recently made it possible to derive estimates of intolerance to loss-of-function (LoF) variation for human genes. Here, we demonstrate strong and widespread coupling between genic LoF intolerance and promoter CpG density across the human genome. Genes downstream of the most CpG-rich promoters (top 10% CpG density) have a 67.2% probability of being highly LoF intolerant, using the LOEUF metric from gnomAD. This is in contrast to 7.4% of genes downstream of the most CpG-poor (bottom 10% CpG density) promoters. Combining promoter CpG density with exonic and promoter conservation explains 33.4% of the variation in LOEUF, and the contribution of CpG density exceeds the individual contributions of exonic and promoter conservation. We leverage this to train a simple and easily interpretable predictive model that outperforms other existing predictors and allows us to classify 1,760 genes (which are currently unascertained in gnomAD) as highly LoF intolerant or not. These predictions have the potential to aid in the interpretation of novel variants in the clinical setting. Moreover, our results reveal that high CpG density is not merely a generic feature of human promoters but is preferentially encountered at the promoters of the most selectively constrained genes, calling into question the prevailing view that CpG islands are not subject to selection.
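The abstract does not say how promoter CpG density is quantified. One widely used measure is the observed-to-expected CpG ratio, computed as (number of CpG dinucleotides × sequence length) / (number of C × number of G). The sketch below illustrates that ratio only; whether it matches the authors' exact definition is an assumption, and the function name and example sequences are hypothetical.

```python
def cpg_obs_exp(sequence: str) -> float:
    """Observed-to-expected CpG ratio of a DNA sequence.

    Returns 0.0 when the sequence has no C or no G, since the
    expected CpG count is then zero and the ratio is undefined.
    """
    seq = sequence.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the true dinucleotide count.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)


# Invented toy sequences: one CpG-rich, one with no CpG at all.
rich = cpg_obs_exp("CGCGATCG")
poor = cpg_obs_exp("ATATAT")
print(rich, poor)
```

A ratio near or above 1 means CpG dinucleotides occur about as often as base composition alone predicts, while most of the human genome is CpG-depleted and falls well below that.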
Affiliation(s)
- Leandros Boukas
- Human Genetics Training Program, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA; Department of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
- Hans T Bjornsson
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA; Department of Pediatrics, Johns Hopkins University School of Medicine, 1800 Orleans Street, Baltimore, MD 21287, USA; Faculty of Medicine, University of Iceland, Sturlugata 8, 101 Reykjavik, Iceland; Landspitali University Hospital, Hringbraut, 101 Reykjavik, Iceland.
- Kasper D Hansen
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe St, Baltimore, MD 21205, USA.
47
Diedisheim M, Carcarino E, Vandiedonck C, Roussel R, Gautier JF, Venteclef N. Regulation of inflammation in diabetes: From genetics to epigenomics evidence. Mol Metab 2020; 41:101041. [PMID: 32603690 PMCID: PMC7394913 DOI: 10.1016/j.molmet.2020.101041] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/10/2020] [Revised: 06/09/2020] [Accepted: 06/11/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Diabetes is one of the greatest public health challenges worldwide, and we still lack complementary approaches to significantly enhance the efficacy of preventive and therapeutic approaches. Genetic and environmental factors are the culprits involved in diabetes risk. Evidence from the last decade has highlighted that deregulation of the immune and inflammatory responses increases susceptibility to type 1 and type 2 diabetes. Spatiotemporal patterns of gene expression involved in immune cell polarisation depend on genomic enhancer elements in response to inflammatory and metabolic cues. Several studies have reported that most regulatory genetic variants are located in the non-protein coding regions of the genome and particularly in enhancer regions. The progress of high-throughput technologies has permitted the characterisation of enhancer chromatin properties. These advances support the concept that genetic alteration of enhancers may influence the immune and inflammatory responses in relation to diabetes. SCOPE OF REVIEW Results from genome-wide association studies (GWAS) combined with functional and integrative analyses have elucidated the impacts of some diabetes risk-associated variants that are involved in the regulation of the immune system. Additionally, genetic variants mapping to enhancer regions may alter enhancer status, which in turn leads to aberrant expression of inflammatory genes associated with diabetes susceptibility. The focus of this review was to provide an overview of the current indications that inflammatory processes are regulated at the genetic and epigenomic levels in diabetes, along with perspectives on future research avenues that may improve understanding of the disease. MAJOR CONCLUSIONS In this review, we provide genetic evidence in support of a deregulated immune response as a risk factor in diabetes. We also discuss the importance of enhancer regions in the regulation of immune cell polarisation and how recent advances in genome-wide methods for enhancer identification have enabled the determination of the impact of enhancer genetic variation on diabetes onset and phenotype. This could eventually lead to better management plans and improved treatment responses in human diabetes.
Affiliation(s)
- Marc Diedisheim
- Centre de Recherche des Cordeliers, INSERM, Université de Paris, IMMEDIAB Laboratory, F-75006, Paris, France
- Elena Carcarino
- Centre de Recherche des Cordeliers, INSERM, Université de Paris, IMMEDIAB Laboratory, F-75006, Paris, France
- Claire Vandiedonck
- Centre de Recherche des Cordeliers, INSERM, Université de Paris, IMMEDIAB Laboratory, F-75006, Paris, France
- Ronan Roussel
- Centre de Recherche des Cordeliers, INSERM, Université de Paris, IMMEDIAB Laboratory, F-75006, Paris, France; Bichat-Claude Bernard, Hospital, AP-HP, Diabetology Department, Université de Paris, Paris, France
- Jean-François Gautier
- Centre de Recherche des Cordeliers, INSERM, Université de Paris, IMMEDIAB Laboratory, F-75006, Paris, France; Lariboisière Hospital, AP-HP, Diabetology Department, Université de Paris, Paris, France
- Nicolas Venteclef
- Centre de Recherche des Cordeliers, INSERM, Université de Paris, IMMEDIAB Laboratory, F-75006, Paris, France.