1. Jiang K, Liu T, Kales S, Tewhey R, Kim D, Park Y, Jarvis JN. A systematic strategy for identifying causal single nucleotide polymorphisms and their target genes on Juvenile arthritis risk haplotypes. BMC Med Genomics 2024; 17:185. [PMID: 38997781; PMCID: PMC11241977; DOI: 10.1186/s12920-024-01954-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 06/27/2024] [Indexed: 07/14/2024] [Open Access]
Abstract
BACKGROUND: Although genome-wide association studies (GWAS) have identified multiple regions conferring genetic risk for juvenile idiopathic arthritis (JIA), we are still faced with the task of identifying the single nucleotide polymorphisms (SNPs) on the disease haplotypes that exert the biological effects that confer risk. Until we identify the risk-driving variants, identifying the genes influenced by these variants, and therefore translating genetic information to improved clinical care, will remain an insurmountable task. We used a function-based approach for identifying causal variant candidates and the target genes on JIA risk haplotypes.
METHODS: We used a massively parallel reporter assay (MPRA) in myeloid K562 cells to query the effects of 5,226 SNPs in non-coding regions on JIA risk haplotypes for their ability to alter gene expression when compared to the common allele. The assay relies on 180 bp oligonucleotide reporters ("oligos") in which the allele of interest is flanked by its cognate genomic sequence. Barcodes were added randomly by PCR to each oligo to achieve > 20 barcodes per oligo to provide a quantitative read-out of gene expression for each allele. Assays were performed in both unstimulated K562 cells and cells stimulated overnight with interferon gamma (IFNg). As proof of concept, we then used CRISPRi to demonstrate the feasibility of identifying the genes regulated by enhancers harboring expression-altering SNPs.
RESULTS: We identified 553 expression-altering SNPs in unstimulated K562 cells and an additional 490 in cells stimulated with IFNg. We further filtered the SNPs to identify those plausibly situated within functional chromatin, using open chromatin and H3K27ac ChIPseq peaks in unstimulated cells and open chromatin plus H3K4me1 in stimulated cells. These procedures yielded 42 unique SNPs (total = 84) for each set. Using CRISPRi, we demonstrated that enhancers harboring MPRA-screened variants in the TRAF1 and LNPEP/ERAP2 loci regulated multiple genes, suggesting complex influences of disease-driving variants.
CONCLUSION: Using MPRA and CRISPRi, JIA risk haplotypes can be queried to identify plausible candidates for disease-driving variants. Once these candidate variants are identified, target genes can be identified using CRISPRi informed by the 3D chromatin structures that encompass the risk haplotypes.
Affiliation(s)
- Kaiyu Jiang
  - Department of Pediatrics, Clinical and Translational Research Center, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 701 Ellicott St, Buffalo, NY, 14203, USA
- Tao Liu
  - Roswell Park Cancer Institute, 665 Elm St, Buffalo, NY, 14203, USA
- Susan Kales
  - Jackson Laboratories, 600 Main St, Bar Harbor, ME, 04609, USA
- Ryan Tewhey
  - Jackson Laboratories, 600 Main St, Bar Harbor, ME, 04609, USA
- Dongkyeong Kim
  - Department of Biochemistry, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
- Yungki Park
  - Department of Biochemistry, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
  - Genetics, Genomics, & Bioinformatics Program, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
- James N Jarvis
  - Department of Pediatrics, Clinical and Translational Research Center, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 701 Ellicott St, Buffalo, NY, 14203, USA
  - Genetics, Genomics, & Bioinformatics Program, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
  - University of Washington Rheumatology Research, 750 Republican St., E520, Seattle, WA, 98109, USA
2. Huang D, Ovcharenko I. The contribution of silencer variants to human diseases. Genome Biol 2024; 25:184. [PMID: 38978133; PMCID: PMC11232194; DOI: 10.1186/s13059-024-03328-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 06/28/2024] [Indexed: 07/10/2024] [Open Access]
Abstract
BACKGROUND: Although disease-causal genetic variants have been found within silencer sequences, we still lack a comprehensive analysis of the association of silencers with diseases. Here, we profiled GWAS variants in 2.8 million candidate silencers across 97 human samples derived from a diverse panel of tissues and developmental time points, using deep learning models.
RESULTS: We show that candidate silencers exhibit strong enrichment in disease-associated variants, and several diseases display a much stronger association with silencer variants than enhancer variants. Close to 52% of candidate silencers cluster, forming silencer-rich loci, and, in the loci of Parkinson's-disease-hallmark genes TRIM31 and MAL, the associated SNPs densely populate clustered candidate silencers rather than enhancers displaying an overall twofold enrichment in silencers versus enhancers. The disruption of apoptosis in neuronal cells is associated with both schizophrenia and bipolar disorder and can largely be attributed to variants within candidate silencers. Our model permits a mechanistic explanation of causative SNP effects by identifying altered binding of tissue-specific repressors and activators, validated with a 70% of directional concordance using SNP-SELEX. Narrowing the focus of the analysis to individual silencer variants, experimental data confirms the role of the rs62055708 SNP in Parkinson's disease, rs2535629 in schizophrenia, and rs6207121 in type 1 diabetes.
CONCLUSIONS: In summary, our results indicate that advances in deep learning models for the discovery of disease-causal variants within candidate silencers effectively "double" the number of functionally characterized GWAS variants. This provides a basis for explaining mechanisms of action and designing novel diagnostics and therapeutics.
Affiliation(s)
- Di Huang
  - Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
- Ivan Ovcharenko
  - Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
3. Frenkel M, Raman S. Discovering mechanisms of human genetic variation and controlling cell states at scale. Trends Genet 2024; 40:587-600. [PMID: 38658256; DOI: 10.1016/j.tig.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 03/29/2024] [Accepted: 03/29/2024] [Indexed: 04/26/2024]
Abstract
Population-scale sequencing efforts have catalogued substantial genetic variation in humans such that variant discovery dramatically outpaces interpretation. We discuss how single-cell sequencing is poised to reveal genetic mechanisms at a rate that may soon approach that of variant discovery. The functional genomics toolkit is sufficiently modular to systematically profile almost any type of variation within increasingly diverse contexts and with molecularly comprehensive and unbiased readouts. As a result, we can construct deep phenotypic atlases of variant effects that span the entire regulatory cascade. The same conceptual approach to interpreting genetic variation should be applied to engineering therapeutic cell states. In this way, variant mechanism discovery and cell state engineering will become reciprocating and iterative processes towards genomic medicine.
Affiliation(s)
- Max Frenkel
  - Cellular and Molecular Biology Graduate Program, University of Wisconsin, Madison, WI, USA
  - Medical Scientist Training Program, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
  - Department of Biochemistry, University of Wisconsin, Madison, WI, USA
- Srivatsan Raman
  - Department of Biochemistry, University of Wisconsin, Madison, WI, USA
  - Department of Bacteriology, University of Wisconsin, Madison, WI, USA
  - Department of Chemical and Biological Engineering, University of Wisconsin, Madison, WI, USA
4. Chang TY, Waxman DJ. HDI-STARR-seq: Condition-specific enhancer discovery in mouse liver in vivo. Research Square 2024:rs.3.rs-4559581. [PMID: 38978599; PMCID: PMC11230509; DOI: 10.21203/rs.3.rs-4559581/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background: STARR-seq and other massively-parallel reporter assays are widely used to discover functional enhancers in transfected cell models, which can be confounded by plasmid vector-induced type-I interferon immune responses and lack the multicellular environment and endogenous chromatin state of complex mammalian tissues.
Results: Here, we describe HDI-STARR-seq, which combines STARR-seq plasmid library delivery to the liver, by hydrodynamic tail vein injection (HDI), with reporter RNA transcriptional initiation driven by a minimal Albumin promoter, which we show is essential for mouse liver STARR-seq enhancer activity assayed 7 days after HDI. Importantly, little or no vector-induced innate type-I interferon responses were observed. Comparisons of HDI-STARR-seq activity between male and female mouse livers and in livers from males treated with an activating ligand of the transcription factor CAR (Nr1i3) identified many condition-dependent enhancers linked to condition-specific gene expression. Further, thousands of active liver enhancers were identified using a high complexity STARR-seq library comprised of ~50,000 genomic regions released by DNase-I digestion of mouse liver nuclei. When compared to stringently inactive library sequences, the active enhancer sequences identified were highly enriched for liver open chromatin regions with activating histone marks (H3K27ac, H3K4me1, H3K4me3), were significantly closer to gene transcriptional start sites, and were significantly depleted of repressive (H3K27me3, H3K9me3) and transcribed region histone marks (H3K36me3).
Conclusions: HDI-STARR-seq offers substantial improvements over current methodologies for large scale, functional profiling of enhancers, including condition-dependent enhancers, in liver tissue in vivo, and can be adapted to characterize enhancer activities in a variety of species and tissues by selecting suitable tissue- and species-specific promoter sequences.
5. Iñiguez-Muñoz S, Llinàs-Arias P, Ensenyat-Mendez M, Bedoya-López AF, Orozco JIJ, Cortés J, Roy A, Forsberg-Nilsson K, DiNome ML, Marzese DM. Hidden secrets of the cancer genome: unlocking the impact of non-coding mutations in gene regulatory elements. Cell Mol Life Sci 2024; 81:274. [PMID: 38902506; DOI: 10.1007/s00018-024-05314-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/07/2023] [Accepted: 06/06/2024] [Indexed: 06/22/2024]
Abstract
Discoveries in the field of genomics have revealed that non-coding genomic regions are not merely "junk DNA", but rather comprise critical elements involved in gene expression. These gene regulatory elements (GREs) include enhancers, insulators, silencers, and gene promoters. Notably, new evidence shows how mutations within these regions substantially influence gene expression programs, especially in the context of cancer. Advances in high-throughput sequencing technologies have accelerated the identification of somatic and germline single nucleotide mutations in non-coding genomic regions. This review provides an overview of somatic and germline non-coding single nucleotide alterations affecting transcription factor binding sites in GREs, specifically involved in cancer biology. It also summarizes the technologies available for exploring GREs and the challenges associated with studying and characterizing non-coding single nucleotide mutations. Understanding the role of GRE alterations in cancer is essential for improving diagnostic and prognostic capabilities in the precision medicine era, leading to enhanced patient-centered clinical outcomes.
Affiliation(s)
- Sandra Iñiguez-Muñoz
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Pere Llinàs-Arias
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Miquel Ensenyat-Mendez
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Andrés F Bedoya-López
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Javier I J Orozco
  - Saint John's Cancer Institute, Providence Saint John's Health Center, Santa Monica, CA, USA
- Javier Cortés
  - International Breast Cancer Center (IBCC), Pangaea Oncology, Quiron Group, 08017, Barcelona, Spain
  - Medica Scientia Innovation Research SL (MEDSIR), 08018, Barcelona, Spain
  - Faculty of Biomedical and Health Sciences, Department of Medicine, Universidad Europea de Madrid, 28670, Madrid, Spain
- Ananya Roy
  - Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Karin Forsberg-Nilsson
  - Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
  - University of Nottingham Biodiscovery Institute, Nottingham, UK
- Maggie L DiNome
  - Department of Surgery, Duke University School of Medicine, Durham, NC, USA
- Diego M Marzese
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
  - Department of Surgery, Duke University School of Medicine, Durham, NC, USA
6. Mishra A, Jajodia A, Weston E, Jayavelu ND, Garcia M, Hossack D, Hawkins RD. Identification of functional enhancer variants associated with type I diabetes in CD4+ T cells. Front Immunol 2024; 15:1387253. [PMID: 38947339; PMCID: PMC11211866; DOI: 10.3389/fimmu.2024.1387253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Accepted: 04/09/2024] [Indexed: 07/02/2024] [Open Access]
Abstract
Type I diabetes is an autoimmune disease mediated by T-cell destruction of β cells in pancreatic islets. Currently, there is no known cure, and treatment consists of daily insulin injections. Genome-wide association studies and twin studies have indicated a strong genetic heritability for type I diabetes and implicated several genes. As most strongly associated variants are noncoding, there is still a lack of identification of functional and, therefore, likely causal variants. Given that many of these genetic variants reside in enhancer elements, we have tested 121 CD4+ T-cell enhancer variants associated with T1D. We found four to be functional through massively parallel reporter assays. Three of the enhancer variants weaken activity, while the fourth strengthens activity. We link these to their cognate genes using 3D genome architecture or eQTL data and validate them using CRISPR editing. Validated target genes include CLEC16A and SOCS1. While these genes have been previously implicated in type 1 diabetes and other autoimmune diseases, we show that enhancers controlling their expression harbor functional variants. These variants, therefore, may act as causal type 1 diabetic variants.
Affiliation(s)
- Arpit Mishra
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
- Ajay Jajodia
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
- Eryn Weston
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
- Naresh Doni Jayavelu
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
- Mariana Garcia
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
- Daniel Hossack
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
- R. David Hawkins
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
  - Institute for Stem Cell and Regenerative Medicine, University of Washington School of Medicine, Seattle, WA, United States
  - Benaroya Research Institute at Virginia Mason, Seattle, WA, United States
7. Chang TY, Waxman DJ. HDI-STARR-seq: Condition-specific enhancer discovery in mouse liver in vivo. bioRxiv 2024:2024.06.10.598329. [PMID: 38915578; PMCID: PMC11195054; DOI: 10.1101/2024.06.10.598329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
STARR-seq and other massively-parallel reporter assays are widely used to discover functional enhancers in transfected cell models, which can be confounded by plasmid vector-induced type-I interferon immune responses and lack the multicellular environment and endogenous chromatin state of complex mammalian tissues. Here, we describe HDI-STARR-seq, which combines STARR-seq plasmid library delivery to the liver, by hydrodynamic tail vein injection (HDI), with reporter RNA transcriptional initiation driven by a minimal Albumin promoter, which we show is essential for mouse liver STARR-seq enhancer activity assayed 7 days after HDI. Importantly, little or no vector-induced innate type-I interferon responses were observed. Comparisons of HDI-STARR-seq activity between male and female mouse livers and in livers from males treated with an activating ligand of the transcription factor CAR (Nr1i3) identified many condition-dependent enhancers linked to condition-specific gene expression. Further, thousands of active liver enhancers were identified using a high complexity STARR-seq library comprised of ~50,000 genomic regions released by DNase-I digestion of mouse liver nuclei. When compared to stringently inactive library sequences, the active enhancer sequences identified were highly enriched for liver open chromatin regions with activating histone marks (H3K27ac, H3K4me1, H3K4me3), were significantly closer to gene transcriptional start sites, and were significantly depleted of repressive (H3K27me3, H3K9me3) and transcribed region histone marks (H3K36me3). HDI-STARR-seq offers substantial improvements over current methodologies for large scale, functional profiling of enhancers, including condition-dependent enhancers, in liver tissue in vivo, and can be adapted to characterize enhancer activities in a variety of species and tissues by selecting suitable tissue- and species-specific promoter sequences.
Affiliation(s)
- Ting-Ya Chang
  - Departments of Biology and Biomedical Engineering, and Bioinformatics program, Boston University, Boston, MA 02215
- David J Waxman
  - Departments of Biology and Biomedical Engineering, and Bioinformatics program, Boston University, Boston, MA 02215
8. Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704; DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Zachary A Gardell
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
9. Zeng B, Bendl J, Deng C, Lee D, Misir R, Reach SM, Kleopoulos SP, Auluck P, Marenco S, Lewis DA, Haroutunian V, Ahituv N, Fullard JF, Hoffman GE, Roussos P. Genetic regulation of cell type-specific chromatin accessibility shapes brain disease etiology. Science 2024; 384:eadh4265. [PMID: 38781378; DOI: 10.1126/science.adh4265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 12/20/2023] [Indexed: 05/25/2024]
Abstract
Nucleotide variants in cell type-specific gene regulatory elements in the human brain are risk factors for human disease. We measured chromatin accessibility in 1932 aliquots of sorted neurons and non-neurons from 616 human postmortem brains and identified 34,539 open chromatin regions with chromatin accessibility quantitative trait loci (caQTLs). Only 10.4% of caQTLs are shared between neurons and non-neurons, which supports cell type-specific genetic regulation of the brain regulome. Incorporating allele-specific chromatin accessibility improves statistical fine-mapping and refines molecular mechanisms that underlie disease risk. Using massively parallel reporter assays in induced excitatory neurons, we screened 19,893 brain QTLs and identified the functional impact of 476 regulatory variants. Combined, this comprehensive resource captures variation in the human brain regulome and provides insights into disease etiology.
Affiliation(s)
- Biao Zeng
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jaroslav Bendl
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Chengyu Deng
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Donghoon Lee
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Ruth Misir
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Sarah M Reach
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Steven P Kleopoulos
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Pavan Auluck
  - Human Brain Collection Core, National Institute of Mental Health-Intramural Research Program, Bethesda, MD 20892, USA
- Stefano Marenco
  - Human Brain Collection Core, National Institute of Mental Health-Intramural Research Program, Bethesda, MD 20892, USA
- David A Lewis
  - Translational Neuroscience Program, Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Vahram Haroutunian
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY 10468, USA
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- John F Fullard
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Gabriel E Hoffman
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Panos Roussos
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY 10468, USA
10. Siraj L, Castro RI, Dewey H, Kales S, Nguyen TTL, Kanai M, Berenzy D, Mouri K, Wang QS, McCaw ZR, Gosai SJ, Aguet F, Cui R, Vockley CM, Lareau CA, Okada Y, Gusev A, Jones TR, Lander ES, Sabeti PC, Finucane HK, Reilly SK, Ulirsch JC, Tewhey R. Functional dissection of complex and molecular trait variants at single nucleotide resolution. bioRxiv 2024:2024.05.05.592437. [PMID: 38766054; PMCID: PMC11100724; DOI: 10.1101/2024.05.05.592437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Identifying the causal variants and mechanisms that drive complex traits and diseases remains a core problem in human genetics. The majority of these variants have individually weak effects and lie in non-coding gene-regulatory elements where we lack a complete understanding of how single nucleotide alterations modulate transcriptional processes to affect human phenotypes. To address this, we measured the activity of 221,412 trait-associated variants that had been statistically fine-mapped using a Massively Parallel Reporter Assay (MPRA) in 5 diverse cell-types. We show that MPRA is able to discriminate between likely causal variants and controls, identifying 12,025 regulatory variants with high precision. Although the effects of these variants largely agree with orthogonal measures of function, only 69% can plausibly be explained by the disruption of a known transcription factor (TF) binding motif. We dissect the mechanisms of 136 variants using saturation mutagenesis and assign impacted TFs for 91% of variants without a clear canonical mechanism. Finally, we provide evidence that epistasis is prevalent for variants in close proximity and identify multiple functional variants on the same haplotype at a small, but important, subset of trait-associated loci. Overall, our study provides a systematic functional characterization of likely causal common variants underlying complex and molecular human traits, enabling new insights into the regulatory grammar underlying disease risk.
Affiliation(s)
- Layla Siraj
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biophysics, Harvard Graduate School of Arts and Sciences, Boston, MA, USA
  - Harvard-Massachusetts Institute of Technology MD/PhD Program, Harvard Medical School, Boston, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Masahiro Kanai
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
- Qingbo S. Wang
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
- Sager J. Gosai
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- François Aguet
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Ran Cui
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Caleb A. Lareau
  - Program in Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Yukinori Okada
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
  - Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
- Alexander Gusev
  - Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
- Thouis R. Jones
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric S. Lander
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Biology, MIT, Cambridge, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Pardis C. Sabeti
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Hilary K. Finucane
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Steven K. Reilly
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Wu Tsai Institute, Yale University, New Haven, CT, USA
- Jacob C. Ulirsch
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Ryan Tewhey
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
  - Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
11
Nigrovic PA, Wang Q, Kim T, Martinez-Bonet M, Aguiar VRC, Sim S, Cui J, Sparks JA, Chen X, Todd M, Wauford B, Marion MC, Langefeld CD, Weirauch MT, Gutierrez-Arcelus M. High-throughput identification of functional regulatory SNPs in systemic lupus erythematosus. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.08.16.553538. [PMID: 37645953 PMCID: PMC10462027 DOI: 10.1101/2023.08.16.553538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Genome-wide association studies implicate multiple loci in risk for systemic lupus erythematosus (SLE), but few contain exonic variants, rendering systematic identification of non-coding variants essential to decoding SLE genetics. We utilized SNP-seq and bioinformatic enrichment to interrogate 2,180 single-nucleotide polymorphisms (SNPs) from 87 SLE risk loci for potential binding of transcription factors and related proteins from B cells. The 52 SNPs that passed initial screening were tested by electrophoretic mobility shift and luciferase reporter assays. To validate the approach, we studied rs2297550 in detail, finding that the risk allele enhanced binding to the transcription factor Ikaros (IKZF1), thereby modulating expression of IKBKE. Correspondingly, primary cells from genotyped healthy donors bearing the risk allele expressed higher levels of the interferon/NF-κB regulator IKKϵ. Together, these findings define a set of likely functional non-coding lupus risk variants and identify a new regulatory pathway involving rs2297550, Ikaros, and IKKϵ implicated by human genetics in risk for SLE.
12
Fu Y, Kelly JA, Gopalakrishnan J, Pelikan RC, Tessneer KL, Pasula S, Grundahl K, Murphy DA, Gaffney PM. Massively parallel reporter assay confirms regulatory potential of hQTLs and reveals important variants in lupus and other autoimmune diseases. HGG ADVANCES 2024; 5:100279. [PMID: 38389303 PMCID: PMC10943488 DOI: 10.1016/j.xhgg.2024.100279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 02/15/2024] [Accepted: 02/18/2024] [Indexed: 02/24/2024] Open
Abstract
We designed a massively parallel reporter assay (MPRA) in an Epstein-Barr virus transformed B cell line to directly characterize the potential for histone post-translational modifications, i.e., histone quantitative trait loci (hQTLs), expression QTLs (eQTLs), and variants on systemic lupus erythematosus (SLE) and autoimmune (AI) disease risk haplotypes to modulate regulatory activity in an allele-dependent manner. Our study demonstrates that hQTLs, as a group, are more likely to modulate regulatory activity in an MPRA compared with other variant classes tested, including a set of eQTLs previously shown to interact with hQTLs and tested AI risk variants. In addition, we nominate 17 variants (including 11 previously unreported) as putative causal variants for SLE and another 14 for various other AI diseases, prioritizing these variants for future functional studies in primary and immortalized B cells. Thus, we uncover important insights into the mechanistic relationships among genotype, epigenetics, and gene expression in SLE and AI disease phenotypes.
Affiliation(s)
- Yao Fu
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Jennifer A Kelly
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Jaanam Gopalakrishnan
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA; Neuro-Immune Regulome Unit, National Eye Institute, National Institute of Health, Bethesda, MD 20892, USA
- Richard C Pelikan
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Kandice L Tessneer
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Satish Pasula
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Kiely Grundahl
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- David A Murphy
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Patrick M Gaffney
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
13
Lambourne L, Mattioli K, Santoso C, Sheynkman G, Inukai S, Kaundal B, Berenson A, Spirohn-Fitzgerald K, Bhattacharjee A, Rothman E, Shrestha S, Laval F, Yang Z, Bisht D, Sewell JA, Li G, Prasad A, Phanor S, Lane R, Campbell DM, Hunt T, Balcha D, Gebbia M, Twizere JC, Hao T, Frankish A, Riback JA, Salomonis N, Calderwood MA, Hill DE, Sahni N, Vidal M, Bulyk ML, Fuxman Bass JI. Widespread variation in molecular interactions and regulatory properties among transcription factor isoforms. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.12.584681. [PMID: 38617209 PMCID: PMC11014633 DOI: 10.1101/2024.03.12.584681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation. Relative to reference isoforms, two-thirds of alternative TF isoforms exhibit differences in one or more molecular activities, which often could not be predicted from sequence. We observed two primary categories of alternative TF isoforms: "rewirers" and "negative regulators", both of which were associated with differentiation and cancer. Our results support a model wherein the relative expression levels of, and interactions involving, TF isoforms add an understudied layer of complexity to gene regulatory networks, demonstrating the importance of isoform-aware characterization of TF functions and providing a rich resource for further studies.
Affiliation(s)
- Luke Lambourne
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Kaia Mattioli
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Clarissa Santoso
  - Department of Biology, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Gloria Sheynkman
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sachi Inukai
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Babita Kaundal
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anna Berenson
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
- Kerstin Spirohn-Fitzgerald
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Anukana Bhattacharjee
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Elisabeth Rothman
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Florent Laval
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Zhipeng Yang
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Deepa Bisht
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jared A Sewell
  - Department of Biology, Boston University, Boston, MA, USA
- Guangyuan Li
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anisa Prasad
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Harvard College, Cambridge, MA, USA
- Sabrina Phanor
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ryan Lane
  - Department of Biology, Boston University, Boston, MA, USA
- Toby Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dawit Balcha
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Marinella Gebbia
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
- Jean-Claude Twizere
  - TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Tong Hao
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Adam Frankish
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Josh A Riback
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Nathan Salomonis
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Michael A Calderwood
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- David E Hill
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nidhi Sahni
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Marc Vidal
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Juan I Fuxman Bass
  - Department of Biology, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
14
Nizomov J, Jin W, Xia Y, Liu Y, Li Z, Chen L. MPRAVarDB: an online database and web server for exploring regulatory effects of genetic variants. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.02.587790. [PMID: 38617248 PMCID: PMC11014600 DOI: 10.1101/2024.04.02.587790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
The massively parallel reporter assay (MPRA) is an important technology for evaluating the impact of genetic variants on gene regulation. Here, we present MPRAVarDB, an online database and web server for exploring regulatory effects of genetic variants. MPRAVarDB harbors 18 MPRA experiments designed to assess the regulatory effects of genetic variants associated with GWAS loci, eQTLs and various genomic features, resulting in a total of 242,818 variants tested across more than 30 cell lines and 30 human diseases or traits. MPRAVarDB supports queries of MPRA variants by genomic region, disease and cell line, or by any combination of these query terms. Notably, MPRAVarDB offers a suite of pretrained machine learning models tailored to the specific disease and cell line, facilitating the genome-wide prediction of regulatory variants. MPRAVarDB is user-friendly: a few clicks are enough to receive query and prediction results.
Affiliation(s)
- Javlon Nizomov
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603
- Weijia Jin
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603
- Yi Xia
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603
- Yunlong Liu
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, 46202
- Zhigang Li
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603
- Li Chen
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603
15
Lopdell TJ, Trevarton AJ, Moody J, Prowse-Wilkins C, Knowles S, Tiplady K, Chamberlain AJ, Goddard ME, Spelman RJ, Lehnert K, Snell RG, Davis SR, Littlejohn MD. A common regulatory haplotype doubles lactoferrin concentration in milk. Genet Sel Evol 2024; 56:22. [PMID: 38549172 PMCID: PMC11234695 DOI: 10.1186/s12711-024-00890-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 03/12/2024] [Indexed: 04/02/2024] Open
Abstract
BACKGROUND Bovine lactoferrin (Lf) is an iron-absorbing whey protein with antibacterial, antiviral, and antifungal activity. Lactoferrin is economically valuable and has an extremely variable concentration in milk, partly driven by environmental influences such as milking frequency, involution, or mastitis. A significant genetic influence on lactoferrin content in milk has also been observed previously. Here, we conducted genetic mapping of lactoferrin protein concentration in conjunction with RNA-seq, ChIP-seq, and ATAC-seq data to pinpoint candidate causative variants that regulate lactoferrin concentrations in milk. RESULTS We identified a highly significant lactoferrin protein quantitative trait locus (pQTL), as well as a cis lactotransferrin (LTF) expression QTL (cis-eQTL) mapping to the LTF locus. Using ChIP-seq and ATAC-seq datasets representing lactating mammary tissue samples, we also report a number of regions where the openness of chromatin is under genetic influence. Several of these also show highly significant QTL with genetic signatures similar to those highlighted through pQTL and eQTL analysis. By performing correlation analysis between these QTL, we revealed an ATAC-seq peak in the putative promoter region of LTF that highlights a set of 115 high-frequency variants that are potentially responsible for these effects. One of the 115 variants (rs110000337), which maps within the ATAC-seq peak, was predicted to alter binding sites of transcription factors known to be involved in lactation-related pathways. CONCLUSIONS Here, we report a regulatory haplotype of 115 variants with conspicuously large impacts on milk lactoferrin concentration. These findings could enable the selection of animals for high-producing specialist herds.
Affiliation(s)
- Thomas J Lopdell
  - Research & Development, Livestock Improvement Corporation, Ruakura Road, Hamilton, New Zealand
- Alexander J Trevarton
  - School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
- Janelle Moody
  - School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
- Claire Prowse-Wilkins
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
  - Faculty of Veterinarian and Agricultural Science, The University of Melbourne, Parkville, VIC, Australia
- Sarah Knowles
  - Auckland War Memorial Museum, Victoria Street West, Auckland, New Zealand
- Kathryn Tiplady
  - Research & Development, Livestock Improvement Corporation, Ruakura Road, Hamilton, New Zealand
- Amanda J Chamberlain
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
- Michael E Goddard
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
  - Faculty of Veterinarian and Agricultural Science, The University of Melbourne, Parkville, VIC, Australia
- Richard J Spelman
  - Research & Development, Livestock Improvement Corporation, Ruakura Road, Hamilton, New Zealand
- Klaus Lehnert
  - School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
- Russell G Snell
  - School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
- Stephen R Davis
  - Research & Development, Livestock Improvement Corporation, Ruakura Road, Hamilton, New Zealand
- Mathew D Littlejohn
  - Research & Development, Livestock Improvement Corporation, Ruakura Road, Hamilton, New Zealand
  - AL Rae Centre for Genetics and Breeding, Massey University, Palmerston North, New Zealand
16
Khetan S, Bulyk ML. Overlapping binding sites underlie TF genomic occupancy. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.05.583629. [PMID: 38496549 PMCID: PMC10942454 DOI: 10.1101/2024.03.05.583629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Sequence-specific DNA binding by transcription factors (TFs) is a crucial step in gene regulation. However, current high-throughput in vitro approaches cannot reliably detect lower affinity TF-DNA interactions, which play key roles in gene regulation. Here, we developed PADIT-seq (protein affinity to DNA by in vitro transcription and RNA sequencing) to assay TF binding preferences to all 10-bp DNA sequences at far greater sensitivity than prior approaches. The expanded catalogs of low affinity DNA binding sites for the human TFs HOXD13 and EGR1 revealed that nucleotides flanking high affinity DNA binding sites create overlapping lower affinity sites that together modulate TF genomic occupancy in vivo. Formation of such extended recognition sequences stems from an inherent property of TF binding sites to interweave each other and expands the genomic sequence space for identifying noncoding variants that directly alter TF binding. One-Sentence Summary: Overlapping DNA binding sites underlie TF genomic occupancy through their inherent propensity to interweave each other.
17
Kotliar D, Raju S, Tabrizi S, Odia I, Goba A, Momoh M, Sandi JD, Nair P, Phelan E, Tariyal R, Eromon PE, Mehta S, Robles-Sikisaka R, Siddle KJ, Stremlau M, Jalloh S, Gire SK, Winnicki S, Chak B, Schaffner SF, Pauthner M, Karlsson EK, Chapin SR, Kennedy SG, Branco LM, Kanneh L, Vitti JJ, Broodie N, Gladden-Young A, Omoniwa O, Jiang PP, Yozwiak N, Heuklom S, Moses LM, Akpede GO, Asogun DA, Rubins K, Kales S, Happi AN, Iruolagbe CO, Dic-Ijiewere M, Iraoyah K, Osazuwa OO, Okonkwo AK, Kunz S, McCormick JB, Khan SH, Honko AN, Lander ES, Oldstone MBA, Hensley L, Folarin OA, Okogbenin SA, Günther S, Ollila HM, Tewhey R, Okokhere PO, Schieffelin JS, Andersen KG, Reilly SK, Grant DS, Garry RF, Barnes KG, Happi CT, Sabeti PC. Genome-wide association study identifies human genetic variants associated with fatal outcome from Lassa fever. Nat Microbiol 2024; 9:751-762. [PMID: 38326571 PMCID: PMC10914620 DOI: 10.1038/s41564-023-01589-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 12/14/2023] [Indexed: 02/09/2024]
Abstract
Infection with Lassa virus (LASV) can cause Lassa fever, a haemorrhagic illness with an estimated fatality rate of 29.7%, but causes no or mild symptoms in many individuals. Here, to investigate whether human genetic variation underlies the heterogeneity of LASV infection, we carried out genome-wide association studies (GWAS) as well as seroprevalence surveys, human leukocyte antigen typing and high-throughput variant functional characterization assays. We analysed Lassa fever susceptibility and fatal outcomes in 533 cases of Lassa fever and 1,986 population controls recruited over a 7-year period in Nigeria and Sierra Leone. We detected genome-wide significant variant associations with Lassa fever fatal outcomes near GRM7 and LIF in the Nigerian cohort. We also show that a haplotype bearing signatures of positive selection and overlapping LARGE1, a required LASV entry factor, is associated with decreased risk of Lassa fever in the Nigerian cohort but not in the Sierra Leone cohort. Overall, we identified variants and genes that may impact the risk of severe Lassa fever, demonstrating how GWAS can provide insight into viral pathogenesis.
Affiliation(s)
- Dylan Kotliar
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA.
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Department of Internal Medicine, Brigham and Women's Hospital, Boston, MA, USA.
| | - Siddharth Raju
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Shervin Tabrizi
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Boston, MA, USA
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Ikponmwosa Odia
- Institute of Lassa Fever, Research and Control, Irrua Specialist Teaching Hospital, Irrua, Nigeria
| | - Augustine Goba
- College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone
| | - Mambu Momoh
- College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone
- Eastern Polytechnic College, Kenema, Sierra Leone
| | - John Demby Sandi
- College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone
| | - Parvathy Nair
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | | | | | - Philomena E Eromon
- Institute of Lassa Fever, Research and Control, Irrua Specialist Teaching Hospital, Irrua, Nigeria
- African Centre of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Nigeria
| | - Samar Mehta
- Department of Critical Care Medicine, University of Maryland Medical Center, Baltimore, MA, USA
| | - Refugio Robles-Sikisaka
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
| | - Katherine J Siddle
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
| | | | - Simbirie Jalloh
- College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone
| | | | - Sarah Winnicki
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
| | - Bridget Chak
- Biological Sciences Division, University of Chicago, Chicago, IL, USA
| | - Stephen F Schaffner
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
| | | | - Elinor K Karlsson
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Genomics and Computational Biology, UMass Chan Medical School, Worcester, MA, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA, USA
| | - Sarah R Chapin
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
| | - Sharon G Kennedy
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | | | - Lansana Kanneh
- Viral Hemorrhagic Fever Program, Kenema Government Hospital, Ministry of Health and Sanitation, Kenema, Sierra Leone
| | - Joseph J Vitti
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
| | - Nisha Broodie
- New York-Presbyterian Hospital-Columbia and Cornell, New York, NY, USA
| | - Adrianne Gladden-Young
- Molecular Microbiology, Graduate School of Biomedical Sciences, Tufts University, Boston, MA, USA
| | | | | | - Nathan Yozwiak
- Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA
| | - Shannon Heuklom
- San Francisco Community Health Center, San Francisco, CA, USA
| | - Lina M Moses
- Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
| | - George O Akpede
- Institute of Lassa Fever, Research and Control, Irrua Specialist Teaching Hospital, Irrua, Nigeria
- Department of Medicine, Ambrose Alli University, Ekpoma, Nigeria
| | - Danny A Asogun
- Department of Community Medicine, Ambrose Alli University, Ekpoma, Nigeria
| | - Kathleen Rubins
- National Aeronautics and Space Administration, Houston, TX, USA
| | | | - Anise N Happi
- African Centre of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Nigeria
| | | | - Mercy Dic-Ijiewere
- Department of Medicine, Irrua Specialist Teaching Hospital, Irrua, Nigeria
| | - Kelly Iraoyah
- Department of Medicine, Irrua Specialist Teaching Hospital, Irrua, Nigeria
| | - Omoregie O Osazuwa
- Department of Medicine, Irrua Specialist Teaching Hospital, Irrua, Nigeria
| | | | - Stefan Kunz
- Institute of Microbiology, University Hospital Center and University of Lausanne, Lausanne, Switzerland
| | - Joseph B McCormick
- UTHealth Houston School of Public Health, Brownsville Campus, Brownsville, TX, USA
| | - S Humarr Khan
- College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone
| | - Anna N Honko
- Boston University School of Medicine, Boston, MA, USA
| | - Eric S Lander
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Department of Biology, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
| | - Michael B A Oldstone
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
| | - Lisa Hensley
- National Institutes of Health Integrated Research Facility, Frederick, MD, USA
| | - Onikepe A Folarin
- African Centre of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Nigeria
- Department of Biological Sciences, Redeemer's University, Ede, Nigeria
| | - Sylvanus A Okogbenin
- Institute of Lassa Fever, Research and Control, Irrua Specialist Teaching Hospital, Irrua, Nigeria
| | - Stephan Günther
- Bernhard Nocht Institute for Tropical Medicine, Hamburg, Germany
| | - Hanna M Ollila
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Anesthesia, Critical Care, and Pain Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | | | - Peter O Okokhere
- Institute of Lassa Fever, Research and Control, Irrua Specialist Teaching Hospital, Irrua, Nigeria
- Department of Medicine, Ambrose Alli University, Ekpoma, Nigeria
- Department of Medicine, Irrua Specialist Teaching Hospital, Irrua, Nigeria
| | - John S Schieffelin
- Section of Infectious Disease, Department of Pediatrics, Tulane University School of Medicine, New Orleans, LA, USA
| | - Kristian G Andersen
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
| | - Steven K Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | - Donald S Grant
- College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone
- Viral Hemorrhagic Fever Program, Kenema Government Hospital, Ministry of Health and Sanitation, Kenema, Sierra Leone
| | - Robert F Garry
- Tulane University School of Medicine, New Orleans, LA, USA
| | - Kayla G Barnes
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Malawi-Liverpool-Wellcome Trust Clinical Research Programme, Kamuzu University of Health Sciences, Blantyre, Malawi
- Department of Vector Biology and Tropical Disease Biology, Liverpool School of Tropical Medicine, Liverpool, UK
| | - Christian T Happi
- African Centre of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Nigeria.
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA.
- Department of Biological Sciences, Redeemer's University, Ede, Nigeria.
| | - Pardis C Sabeti
- Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA.
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA.
- Department of Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Massachusetts Consortium on Pathogen Readiness, Boston, MA, USA.
18
Xiao F, Zhang X, Morton SU, Kim SW, Fan Y, Gorham JM, Zhang H, Berkson PJ, Mazumdar N, Cao Y, Chen J, Hagen J, Liu X, Zhou P, Richter F, Shen Y, Ward T, Gelb BD, Seidman JG, Seidman CE, Pu WT. Functional dissection of human cardiac enhancers and noncoding de novo variants in congenital heart disease. Nat Genet 2024; 56:420-430. [PMID: 38378865 PMCID: PMC11218660 DOI: 10.1038/s41588-024-01669-y] [Received: 07/21/2022] [Accepted: 01/23/2024] [Indexed: 02/22/2024]
Abstract
Rare coding mutations cause ∼45% of congenital heart disease (CHD). Noncoding mutations that perturb cis-regulatory elements (CREs) likely contribute to the remaining cases, but their identification has been problematic. Using a lentiviral massively parallel reporter assay (lentiMPRA) in human induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs), we functionally evaluated 6,590 noncoding de novo variants (ncDNVs) prioritized from the whole-genome sequencing of 750 CHD trios. A total of 403 ncDNVs substantially affected cardiac CRE activity. A majority increased enhancer activity, often at regions with undetectable reference sequence activity. Of ten DNVs tested by introduction into their native genomic context, four altered the expression of neighboring genes and iPSC-CM transcriptional state. To prioritize future DNVs for functional testing, we used the MPRA data to develop a regression model, EpiCard. Analysis of an independent CHD cohort by EpiCard found enrichment of DNVs. Together, we developed a scalable system to measure the effect of ncDNVs on CRE activity and deployed it to systematically assess the contribution of ncDNVs to CHD.
Affiliation(s)
- Feng Xiao
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Xiaoran Zhang
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Sarah U Morton
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA
| | - Seong Won Kim
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Youfei Fan
- Department of Pediatrics, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
| | - Joshua M Gorham
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Huan Zhang
- Department of Radiation Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Paul J Berkson
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Neil Mazumdar
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Yangpo Cao
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
- Department of Pharmacology, School of Medicine, Southern University of Science and Technology, Shenzhen, China
| | - Jian Chen
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Jacob Hagen
- Mindich Child Health and Development Institute and Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
| | - Xujie Liu
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Pingzhu Zhou
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Felix Richter
- Mindich Child Health and Development Institute and Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
| | - Yufeng Shen
- Departments of Systems Biology and Biomedical Informatics, Columbia University Medical Center, New York City, NY, USA
| | - Tarsha Ward
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Bruce D Gelb
- Mindich Child Health and Development Institute and Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
| | | | - Christine E Seidman
- Department of Genetics, Harvard Medical School, Boston, MA, USA.
- Division of Cardiology, Brigham and Women's Hospital, Boston, MA, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
| | - William T Pu
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA.
- Harvard Stem Cell Institute, Cambridge, MA, USA.
19
Liu J, Ashuach T, Inoue F, Ahituv N, Yosef N, Kreimer A. Optimizing sequence design strategies for perturbation MPRAs: a computational evaluation framework. Nucleic Acids Res 2024; 52:1613-1627. [PMID: 38296821 PMCID: PMC10939410 DOI: 10.1093/nar/gkae012] [Received: 04/25/2023] [Revised: 12/26/2023] [Accepted: 01/12/2024] [Indexed: 02/02/2024] Open
Abstract
The advent of perturbation-based massively parallel reporter assays (MPRAs) has facilitated the delineation of the roles of non-coding regulatory elements in orchestrating gene expression. However, computational efforts to evaluate and establish guidelines for sequence design strategies for perturbation MPRAs remain scant. In this study, we propose a framework for evaluating and comparing various perturbation strategies for MPRA experiments. Within this framework, we benchmark three different perturbation approaches from the perspectives of alteration in motif-based profiles, consistency of MPRA outputs, and robustness of models that predict the activities of putative regulatory motifs. While our analyses show very similar results across multiple benchmarking metrics, the predictive modeling for the approach involving random nucleotide shuffling shows significant robustness compared with the other two approaches. Thus, we recommend designing sequences by randomly shuffling the nucleotides of the perturbed site in perturbation-MPRA, followed by a coherence check to prevent the introduction of other variations of the target motifs. In summary, our evaluation framework and the benchmarking findings create a resource of computational pipelines and highlight the potential of perturbation-MPRA in predicting non-coding regulatory activities.
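The recommended design (shuffle the perturbed site, then run a coherence check) can be illustrated with a minimal sketch. It rests on assumptions that are not in the abstract: motif variants are given as plain strings and matched exactly, whereas the study works with motif-based profiles, and the names `overlaps_site` and `shuffle_site` are hypothetical.

```python
import random

def overlaps_site(candidate, start, end, motif):
    """True if `motif` occurs in `candidate` overlapping positions [start, end)."""
    lo = max(0, start - len(motif) + 1)
    hi = min(len(candidate), end + len(motif) - 1)
    return motif in candidate[lo:hi]

def shuffle_site(seq, start, end, motif_variants, seed=0, max_tries=100):
    """Shuffle seq[start:end]; reject results that still carry a motif variant.

    Exact string matching is an illustrative stand-in for a real motif scan.
    Returns the perturbed sequence, or None if no acceptable shuffle is found.
    """
    rng = random.Random(seed)
    site = list(seq[start:end])
    for _ in range(max_tries):
        rng.shuffle(site)
        candidate = seq[:start] + "".join(site) + seq[end:]
        if candidate == seq:
            continue  # the shuffle reproduced the original site
        # Coherence check: no variant of the target motif may overlap the site.
        if not any(overlaps_site(candidate, start, end, m) for m in motif_variants):
            return candidate
    return None

# Hypothetical example: a 7-nt site flanked by 8 nt of context on each side.
reference = "ACGTACGT" + "TGACTCA" + "ACGTACGT"
perturbed = shuffle_site(reference, 8, 15, ["TGACTCA", "TGAGTCA"], seed=1)
print(perturbed)
```

Shuffling keeps the nucleotide composition of the site unchanged, so the perturbation removes the motif without altering base content in the oligo.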
Affiliation(s)
- Jiayi Liu
- Graduate Program in Cell & Developmental Biology, Rutgers, The State University of New Jersey, 604 Allison Rd, Piscataway, NJ 08854, USA
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ 08854, USA
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
| | - Tal Ashuach
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, 387 Soda Hall, Berkeley, CA 94720, USA
| | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Faculty of Medicine Building B, Yoshidatachibanacho, Sakyo Ward, Kyoto 606-8303, Japan
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, 1700 4th Street, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California, 513 Parnassus Ave, San Francisco, CA 94143, USA
| | - Nir Yosef
- Department of Systems Immunology, Weizmann Institute of Science, 234 Herzl Street, Rehovot 7610001, Israel
- Chan-Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
- Department of Systems Immunology, Ragon Institute of MGH, MIT, and Harvard Institute of Science, 400 Technology Square, Cambridge, MA 02139, USA
| | - Anat Kreimer
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ 08854, USA
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
20
Farrow SL, Gokuladhas S, Schierding W, Pudjihartono M, Perry JK, Cooper AA, O'Sullivan JM. Identification of 27 allele-specific regulatory variants in Parkinson's disease using a massively parallel reporter assay. NPJ Parkinsons Dis 2024; 10:44. [PMID: 38413607 PMCID: PMC10899198 DOI: 10.1038/s41531-024-00659-5] [Received: 09/20/2023] [Accepted: 02/12/2024] [Indexed: 02/29/2024] Open
Abstract
Genome-wide association studies (GWAS) have identified a number of genomic loci that are associated with Parkinson's disease (PD) risk. However, the majority of these variants lie in non-coding regions, and thus the mechanisms by which they influence disease development, and/or potential subtypes, remain largely elusive. To address this, we used a massively parallel reporter assay (MPRA) to screen the regulatory function of 5254 variants that have a known or putative connection to PD. We identified 138 loci with enhancer activity, of which 27 exhibited allele-specific regulatory activity in HEK293 cells. The identified regulatory variant(s) typically did not match the original tag variant within the PD-associated locus, supporting the need for deeper exploration of these loci. The existence of allele-specific transcriptional impacts within HEK293 cells confirms that at least a subset of the PD-associated regions mark functional gene regulatory elements. Future functional studies that confirm the putative targets of the empirically verified regulatory variants will be crucial for gaining a greater understanding of how gene regulatory network(s) modulate PD risk.
Affiliation(s)
- Sophie L Farrow
- Liggins Institute, The University of Auckland, Auckland, New Zealand.
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand.
| | | | - William Schierding
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Department of Ophthalmology, The University of Auckland, Auckland, New Zealand
| | | | - Jo K Perry
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
| | - Antony A Cooper
- Australian Parkinsons Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - Justin M O'Sullivan
- Liggins Institute, The University of Auckland, Auckland, New Zealand.
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand.
- Australian Parkinsons Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia.
- Singapore Institute for Clinical Sciences, Agency for Science Technology and Research, Singapore, Singapore.
- MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, United Kingdom.
21
Kwak IY, Kim BC, Lee J, Kang T, Garry DJ, Zhang J, Gong W. Proformer: a hybrid macaron transformer model predicts expression values from promoter sequences. BMC Bioinformatics 2024; 25:81. [PMID: 38378442 PMCID: PMC10877777 DOI: 10.1186/s12859-024-05645-5] [Received: 05/24/2023] [Accepted: 01/08/2024] [Indexed: 02/22/2024] Open
Abstract
The breakthrough high-throughput measurement of the cis-regulatory activity of millions of randomly generated promoters provides an unprecedented opportunity to systematically decode the cis-regulatory logic that determines the expression values. We developed an end-to-end transformer encoder architecture named Proformer to predict the expression values from DNA sequences. Proformer used a Macaron-like Transformer encoder architecture, where two half-step feed forward (FFN) layers were placed at the beginning and the end of each encoder block, and a separable 1D convolution layer was inserted after the first FFN layer and in front of the multi-head attention layer. The sliding k-mers from one-hot encoded sequences were mapped onto a continuous embedding, combined with the learned positional embedding and strand embedding (forward strand vs. reverse complemented strand) as the sequence input. Moreover, Proformer introduced multiple expression heads with mask filling to prevent the transformer models from collapsing when training on a relatively small amount of data. We empirically determined that this design had significantly better performance than conventional designs, such as using a global pooling layer as the output layer for the regression task. These analyses support the notion that Proformer provides a novel method of learning and enhances our understanding of how cis-regulatory sequences determine the expression values.
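The input encoding described above (sliding k-mers, position, and strand) can be sketched as follows. This is only an illustration under stated assumptions: the base-4 k-mer indexing, the default `k=3`, and the function names are hypothetical, and the learned embedding tables that these indices would look up are omitted.

```python
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse-complemented strand of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def kmer_ids(seq, k):
    """One integer id per sliding k-mer, using base-4 digits A=0, C=1, G=2, T=3."""
    ids = []
    for i in range(len(seq) - k + 1):
        value = 0
        for base in seq[i:i + k]:
            value = value * 4 + BASES.index(base)
        ids.append(value)
    return ids

def encode(seq, k=3):
    """Build (kmer_id, position, strand) triples for both strands.

    In a model, each element of a triple would index its own embedding table
    (k-mer, positional, strand) and the three vectors would be combined.
    """
    tokens = []
    for strand, s in enumerate((seq, reverse_complement(seq))):
        for pos, kid in enumerate(kmer_ids(s, k)):
            tokens.append((kid, pos, strand))
    return tokens

print(kmer_ids("ACGT", 2))          # [1, 6, 11]
print(reverse_complement("AACG"))   # CGTT
```

With a window of length k there are 4^k possible k-mer ids, so the k-mer embedding table grows quickly with k; the value used by the authors is not stated in the abstract.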
Affiliation(s)
- Il-Youp Kwak
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
| | - Byeong-Chan Kim
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
| | - Juhyun Lee
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
| | - Taein Kang
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
| | - Daniel J Garry
- Cardiovascular Division, Department of Medicine, Lillehei Heart Institute, University of Minnesota, 2231 6th St SE, Minneapolis, MN, 55455, USA.
- Stem Cell Institute, University of Minnesota, Minneapolis, MN, 55455, USA.
- Paul and Sheila Wellstone Muscular Dystrophy Center, University of Minnesota, Minneapolis, MN, 55455, USA.
| | - Jianyi Zhang
- Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35233, USA
| | - Wuming Gong
- Cardiovascular Division, Department of Medicine, Lillehei Heart Institute, University of Minnesota, 2231 6th St SE, Minneapolis, MN, 55455, USA.
22
Boye C, Kalita CA, Findley AS, Alazizi A, Wei J, Wen X, Pique-Regi R, Luca F. Characterization of caffeine response regulatory variants in vascular endothelial cells. eLife 2024; 13:e85235. [PMID: 38334359 PMCID: PMC10901511 DOI: 10.7554/elife.85235] [Received: 11/30/2022] [Accepted: 02/08/2024] [Indexed: 02/10/2024] Open
Abstract
Genetic variants in gene regulatory sequences can modify gene expression and mediate the molecular response to environmental stimuli. In addition, genotype-environment interactions (GxE) contribute to complex traits such as cardiovascular disease. Caffeine is the most widely consumed stimulant and is known to produce a vascular response. To investigate GxE for caffeine, we treated vascular endothelial cells with caffeine and used a massively parallel reporter assay to measure allelic effects on gene regulation for over 43,000 genetic variants. We identified 665 variants with allelic effects on gene regulation and 6 variants that regulate the gene expression response to caffeine (GxE, false discovery rate [FDR] < 5%). When overlapping our GxE results with expression quantitative trait loci colocalized with coronary artery disease and hypertension, we dissected their regulatory mechanisms and showed a modulatory role for caffeine. Our results demonstrate that the massively parallel reporter assay is a powerful approach to identify and molecularly characterize GxE in the specific context of caffeine consumption.
Affiliation(s)
- Carly Boye
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
| | - Cynthia A Kalita
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
| | - Anthony S Findley
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
| | - Adnan Alazizi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
| | - Julong Wei
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
| | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, United States
| | - Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, United States
| | - Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, United States
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
23
Andreani V, South EJ, Dunlop MJ. Generating information-dense promoter sequences with optimal string packing. bioRxiv 2024:2023.11.01.565124. [PMID: 37961203 PMCID: PMC10635063 DOI: 10.1101/2023.11.01.565124] [Indexed: 11/15/2023]
Abstract
Dense arrangements of binding sites within nucleotide sequences can collectively influence downstream transcription rates or initiate biomolecular interactions. For example, natural promoter regions can harbor many overlapping transcription factor binding sites that influence the rate of transcription initiation. Despite the prevalence of overlapping binding sites in nature, rapid design of nucleotide sequences with many overlapping sites remains a challenge. Here, we show that this is an NP-hard problem, coined here as the nucleotide String Packing Problem (SPP). We then introduce a computational technique that efficiently assembles sets of DNA-protein binding sites into dense, contiguous stretches of double-stranded DNA. For the efficient design of nucleotide sequences spanning hundreds of base pairs, we reduce the SPP to an Orienteering Problem with integer distances, and then leverage modern integer linear programming solvers. Our method optimally packs libraries of 20-100 binding sites into dense nucleotide arrays of 50-300 base pairs in 0.05-10 seconds. Unlike approximation algorithms or meta-heuristics, our approach finds provably optimal solutions. We demonstrate how our method can generate large sets of diverse sequences suitable for library generation, where the frequency of binding site usage across the returned sequences can be controlled by modulating the objective function. As an example, we then show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The nucleotide string packing approach we present can accelerate the design of sequences with complex DNA-protein interactions. When used in combination with synthesis and high-throughput screening, this design strategy could help interrogate how complex binding site arrangements impact either gene expression or biomolecular mechanisms in varied cellular contexts. 
Author Summary The way protein binding sites are arranged on DNA can control the regulation and transcription of downstream genes. Areas with a high concentration of binding sites can enable complex interplay between transcription factors, a feature that is exploited by natural promoters. However, designing synthetic promoters that contain dense arrangements of binding sites is a challenge. The task involves overlapping many binding sites, each typically about 10 nucleotides long, within a constrained sequence area, which becomes increasingly difficult as sequence length decreases, and binding site variety increases. We introduce an approach to design nucleotide sequences with optimally packed protein binding sites, which we call the nucleotide String Packing Problem (SPP). We show that the SPP can be solved efficiently using integer linear programming to identify the densest arrangements of binding sites for a specified sequence length. We show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The presented approach enables the rapid design and study of nucleotide sequences with complex, dense binding site architectures.
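To make the packing idea concrete, here is a toy sketch of overlapping binding sites into one short sequence. It is not the authors' method: the abstract describes a reduction to an Orienteering Problem solved with integer linear programming, while this example swaps in brute-force enumeration that is only feasible for a handful of sites. It also considers a single strand and exact site strings, and all function names and site sequences are hypothetical.

```python
from itertools import permutations

def overlap(a, b):
    """Length of the longest proper suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def pack(order):
    """Concatenate sites in the given order, merging suffix/prefix overlaps."""
    seq = order[0]
    for site in order[1:]:
        seq += site[overlap(seq, site):]
    return seq

def best_packing(sites, max_len):
    """Brute force: the largest number of sites that fits within `max_len`.

    Returns (count, sequence) for the first ordering found at the largest
    feasible count, or (0, "") if not even one site fits.
    """
    for r in range(len(sites), 0, -1):
        for order in permutations(sites, r):
            seq = pack(order)
            if len(seq) <= max_len:
                return r, seq
    return 0, ""

# Three made-up 5-nt sites that overlap pairwise by 3 nt.
sites = ["ATGCA", "GCATT", "ATTCG"]
print(best_packing(sites, 9))  # (3, 'ATGCATTCG')
```

The number of nucleotides a site adds when appended after another one, `len(b) - overlap(a, b)`, plays the role of an integer distance between sites. Enumeration grows factorially with library size, which is one reason an exact solver is attractive for libraries of the size the abstract mentions.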
24
Shook MS, Lu X, Chen X, Parameswaran S, Edsall L, Trimarchi MP, Ernst K, Granitto M, Forney C, Donmez OA, Diouf AA, VonHandorf A, Rothenberg ME, Weirauch MT, Kottyan LC. Systematic identification of genotype-dependent enhancer variants in eosinophilic esophagitis. Am J Hum Genet 2024; 111:280-294. [PMID: 38183988 PMCID: PMC10870143 DOI: 10.1016/j.ajhg.2023.12.008] [Received: 05/24/2023] [Revised: 12/01/2023] [Accepted: 12/05/2023] [Indexed: 01/08/2024] Open
Abstract
Eosinophilic esophagitis (EoE) is a rare atopic disorder associated with esophageal dysfunction, including difficulty swallowing, food impaction, and inflammation, that develops in a small subset of people with food allergies. Genome-wide association studies (GWASs) have identified 9 independent EoE risk loci reaching genome-wide significance (p < 5 × 10⁻⁸) and 27 additional loci of suggestive significance (5 × 10⁻⁸ < p < 1 × 10⁻⁵). In the current study, we perform linkage disequilibrium (LD) expansion of these loci to nominate a set of 531 variants that are potentially causal. To systematically interrogate the gene regulatory activity of these variants, we designed a massively parallel reporter assay (MPRA) containing the alleles of each variant within their genomic sequence context cloned into a GFP reporter library. Analysis of reporter gene expression in TE-7, HaCaT, and Jurkat cells revealed cell-type-specific gene regulation. We identify 32 allelic enhancer variants, representing 6 genome-wide significant EoE loci and 7 suggestive EoE loci, that regulate reporter gene expression in a genotype-dependent manner in at least one cellular context. By annotating these variants with expression quantitative trait loci (eQTL) and chromatin looping data in related tissues and cell types, we identify putative target genes affected by genetic variation in individuals with EoE. Transcription factor enrichment analyses reveal possible roles for cell-type-specific regulators, including GATA3. Our approach reduces the large set of EoE-associated variants to a set of 32 with allelic regulatory activity, providing functional insights into the effects of genetic variation in this disease.
Affiliation(s)
- Molly S Shook
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Xiaoming Lu
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Xiaoting Chen
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Sreeja Parameswaran
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Lee Edsall
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Michael P Trimarchi
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Kevin Ernst
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Marissa Granitto
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Carmy Forney
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Omer A Donmez
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Arame A Diouf
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Andrew VonHandorf
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Marc E Rothenberg
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Division of Allergy and Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Matthew T Weirauch
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA.
| | - Leah C Kottyan
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Division of Allergy and Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA.
25
Lim F, Solvason JJ, Ryan GE, Le SH, Jindal GA, Steffen P, Jandu SK, Farley EK. Affinity-optimizing enhancer variants disrupt development. Nature 2024; 626:151-159. [PMID: 38233525 PMCID: PMC10830414 DOI: 10.1038/s41586-023-06922-8] [Received: 01/11/2023] [Accepted: 11/30/2023] [Indexed: 01/19/2024]
Abstract
Enhancers control the location and timing of gene expression and contain the majority of variants associated with disease [1-3]. The ZRS is arguably the most well-studied vertebrate enhancer and mediates the expression of Shh in the developing limb [4]. Thirty-one human single-nucleotide variants (SNVs) within the ZRS are associated with polydactyly [4-6]. However, how this enhancer encodes tissue-specific activity, and the mechanisms by which SNVs alter the number of digits, are poorly understood. Here we show that the ETS sites within the ZRS are low affinity, and identify a functional ETS site, ETS-A, with extremely low affinity. Two human SNVs and a synthetic variant optimize the binding affinity of ETS-A subtly from 15% to around 25% relative to the strongest ETS binding sequence, and cause polydactyly with the same penetrance and severity. A greater increase in affinity results in phenotypes that are more penetrant and more severe. Affinity-optimizing SNVs in other ETS sites in the ZRS, as well as in ETS, interferon regulatory factor (IRF), HOX and activator protein 1 (AP-1) sites within a wide variety of enhancers, cause gain-of-function gene expression. The prevalence of binding sites with suboptimal affinity in enhancers creates a vulnerability in genomes whereby SNVs that optimize affinity, even slightly, can be pathogenic. Searching for affinity-optimizing SNVs in genomes could provide a mechanistic approach to identify causal variants that underlie enhanceropathies.
Affiliation(s)
- Fabian Lim
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Biological Sciences Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - Joe J Solvason
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - Genevieve E Ryan
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Sophia H Le
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Granton A Jindal
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Paige Steffen
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Simran K Jandu
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Emma K Farley
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| |
|
26
|
Brown EA, Kales S, Boyle MJ, Vitti J, Kotliar D, Schaffner S, Tewhey R, Sabeti PC. Three linked variants have opposing regulatory effects on isovaleryl-CoA dehydrogenase gene expression. Hum Mol Genet 2024; 33:270-283. [PMID: 37930192 PMCID: PMC10800014 DOI: 10.1093/hmg/ddad177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 10/03/2023] [Accepted: 10/09/2023] [Indexed: 11/07/2023] Open
Abstract
While genome-wide association studies (GWAS) and positive selection scans identify genomic loci driving human phenotypic diversity, functional validation is required to discover the variant(s) responsible. We dissected the IVD gene locus, which encodes the isovaleryl-CoA dehydrogenase enzyme and is implicated by selection statistics, multiple GWAS, and clinical genetics as important to function and fitness. We combined luciferase assays, CRISPR/Cas9 genome-editing, massively parallel reporter assays (MPRA), and a deletion tiling MPRA strategy across regulatory loci. We identified three regulatory variants, including an indel, that may underpin GWAS signals for pulmonary fibrosis and testosterone, and that are linked on a positively selected haplotype in the Japanese population. These regulatory variants exhibit synergistic and opposing effects on IVD expression experimentally. Alleles at these variants lie on a haplotype tagged by the variant most strongly associated with IVD expression and metabolites, but with no functional evidence itself. This work demonstrates how comprehensive functional investigation and multiple technologies are needed to discover the true genetic drivers of phenotypic diversity.
Affiliation(s)
- Elizabeth A Brown
- The Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
- Broad Institute of MIT and Harvard, 75 Ames Street, Cambridge, MA 02142, United States
| | - Susan Kales
- The Jackson Laboratory, 600 Main St, Bar Harbor, ME 04609, United States
| | - Michael James Boyle
- The Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
| | - Joseph Vitti
- The Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
- Broad Institute of MIT and Harvard, 75 Ames Street, Cambridge, MA 02142, United States
| | - Dylan Kotliar
- The Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
- Broad Institute of MIT and Harvard, 75 Ames Street, Cambridge, MA 02142, United States
| | - Steve Schaffner
- Broad Institute of MIT and Harvard, 75 Ames Street, Cambridge, MA 02142, United States
| | - Ryan Tewhey
- The Jackson Laboratory, 600 Main St, Bar Harbor, ME 04609, United States
| | - Pardis C Sabeti
- The Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
- Broad Institute of MIT and Harvard, 75 Ames Street, Cambridge, MA 02142, United States
- Howard Hughes Medical Institute, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
| |
|
27
|
Kang CK, Kim AR. Deep molecular learning of transcriptional control of a synthetic CRE enhancer and its variants. iScience 2024; 27:108747. [PMID: 38222110 PMCID: PMC10784702 DOI: 10.1016/j.isci.2023.108747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 08/29/2023] [Accepted: 12/12/2023] [Indexed: 01/16/2024] Open
Abstract
A massively parallel reporter assay measures the transcriptional activities of various cis-regulatory modules (CRMs) in a single experiment. We developed a thermodynamic computational model framework that calculates quantitative levels of gene expression directly from regulatory DNA sequences. Using the framework, we investigated the molecular mechanisms of cis-regulatory mutations of a synthetic enhancer that cause abnormal gene expression. We found that, in a human cell line, competitive binding between family transcription factors (TFs) with slightly different binding preferences significantly increases the accuracy of recapitulating the transcriptional effects of thousands of single- or multi-mutations. We also discovered that even if various harmful mutations occurred in an activator binding site, a CRM could stably maintain or even increase gene expression through a certain form of competitive binding between family TFs. These findings enhance understanding of the effect of SNPs and indels on CRMs and would help in building robust custom-designed CRMs for biologics production and gene therapy.
Affiliation(s)
- Chan-Koo Kang
- School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
| | - Ah-Ram Kim
- School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- School of Applied Artificial Intelligence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
| |
|
28
|
Breeze CE, Haugen E, Gutierrez-Arcelus M, Yao X, Teschendorff A, Beck S, Dunham I, Stamatoyannopoulos J, Franceschini N, Machiela MJ, Berndt SI. FORGEdb: a tool for identifying candidate functional variants and uncovering target genes and mechanisms for complex diseases. Genome Biol 2024; 25:3. [PMID: 38167104 PMCID: PMC10763681 DOI: 10.1186/s13059-023-03126-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2023] [Accepted: 11/27/2023] [Indexed: 01/05/2024] Open
Abstract
The majority of disease-associated variants identified through genome-wide association studies are located outside of protein-coding regions. Prioritizing candidate regulatory variants and gene targets to identify potential biological mechanisms for further functional experiments can be challenging. To address this challenge, we developed FORGEdb ( https://forgedb.cancer.gov/ ; https://forge2.altiusinstitute.org/files/forgedb.html ; and https://doi.org/10.5281/zenodo.10067458 ), a standalone and web-based tool that integrates multiple datasets, delivering information on associated regulatory elements, transcription factor binding sites, and target genes for over 37 million variants. FORGEdb scores provide researchers with a quantitative assessment of the relative importance of each variant for targeted functional experiments.
Affiliation(s)
- Charles E Breeze
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue 98121, Seattle, USA
- UCL Cancer Institute, University College London, 72 Huntley Street, London, WC1E 6BT, UK
| | - Eric Haugen
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue 98121, Seattle, USA
| | - María Gutierrez-Arcelus
- Division of Immunology, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Xiaozheng Yao
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Andrew Teschendorff
- CAS Key Lab of Computational Biology, Shanghai Institute for Biological Sciences, CAS-MPG Partner Institute for Computational Biology, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai, 200031, China
| | - Stephan Beck
- UCL Cancer Institute, University College London, 72 Huntley Street, London, WC1E 6BT, UK
| | - Ian Dunham
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | | | - Nora Franceschini
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
| | - Mitchell J Machiela
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Sonja I Berndt
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| |
|
29
|
Teng J, Gao Y, Yin H, Bai Z, Liu S, Zeng H, Bai L, Cai Z, Zhao B, Li X, Xu Z, Lin Q, Pan Z, Yang W, Yu X, Guan D, Hou Y, Keel BN, Rohrer GA, Lindholm-Perry AK, Oliver WT, Ballester M, Crespo-Piazuelo D, Quintanilla R, Canela-Xandri O, Rawlik K, Xia C, Yao Y, Zhao Q, Yao W, Yang L, Li H, Zhang H, Liao W, Chen T, Karlskov-Mortensen P, Fredholm M, Amills M, Clop A, Giuffra E, Wu J, Cai X, Diao S, Pan X, Wei C, Li J, Cheng H, Wang S, Su G, Sahana G, Lund MS, Dekkers JCM, Kramer L, Tuggle CK, Corbett R, Groenen MAM, Madsen O, Gòdia M, Rocha D, Charles M, Li CJ, Pausch H, Hu X, Frantz L, Luo Y, Lin L, Zhou Z, Zhang Z, Chen Z, Cui L, Xiang R, Shen X, Li P, Huang R, Tang G, Li M, Zhao Y, Yi G, Tang Z, Jiang J, Zhao F, Yuan X, Liu X, Chen Y, Xu X, Zhao S, Zhao P, Haley C, Zhou H, Wang Q, Pan Y, Ding X, Ma L, Li J, Navarro P, Zhang Q, Li B, Tenesa A, Li K, Liu GE, Zhang Z, Fang L. A compendium of genetic regulatory effects across pig tissues. Nat Genet 2024; 56:112-123. [PMID: 38177344 PMCID: PMC10786720 DOI: 10.1038/s41588-023-01585-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/23/2022] [Accepted: 10/13/2023] [Indexed: 01/06/2024]
Abstract
The Farm Animal Genotype-Tissue Expression (FarmGTEx) project has been established to develop a public resource of genetic regulatory variants in livestock, which is essential for linking genetic polymorphisms to variation in phenotypes, helping fundamental biological discovery and exploitation in animal breeding and human biomedicine. Here we show results from the pilot phase of PigGTEx by processing 5,457 RNA-sequencing and 1,602 whole-genome sequencing samples passing quality control from pigs. We build a pig genotype imputation panel and associate millions of genetic variants with five types of transcriptomic phenotypes in 34 tissues. We evaluate tissue specificity of regulatory effects and elucidate molecular mechanisms of their action using multi-omics data. Leveraging this resource, we decipher regulatory mechanisms underlying 207 pig complex phenotypes and demonstrate the similarity of pigs to humans in gene expression and the genetic regulation behind complex phenotypes, supporting the importance of pigs as a human biomedical model.
Affiliation(s)
- Jinyan Teng
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Yahui Gao
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service (ARS), U.S. Department of Agriculture (USDA), Beltsville, MD, USA
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
| | - Hongwei Yin
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Zhonghao Bai
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Shuli Liu
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service (ARS), U.S. Department of Agriculture (USDA), Beltsville, MD, USA
- School of Life Sciences, Westlake University, Hangzhou, China
| | - Haonan Zeng
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Lijing Bai
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Zexi Cai
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Bingru Zhao
- College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Xiujin Li
- Guangdong Provincial Key Laboratory of Waterfowl Healthy Breeding, College of Animal Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China
| | - Zhiting Xu
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Qing Lin
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Zhangyuan Pan
- Department of Animal Science, University of California, Davis, Davis, CA, USA
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Wenjing Yang
- College of Animal Science and Technology, China Agricultural University, Beijing, China
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Xiaoshan Yu
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Dailu Guan
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Yali Hou
- Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, China
| | - Brittney N Keel
- ARS, USDA, U.S. Meat Animal Research Center, Clay Center, NE, USA
| | - Gary A Rohrer
- ARS, USDA, U.S. Meat Animal Research Center, Clay Center, NE, USA
| | | | - William T Oliver
- ARS, USDA, U.S. Meat Animal Research Center, Clay Center, NE, USA
| | - Maria Ballester
- Animal Breeding and Genetics Programme, Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Torre Marimon, Caldes de Montbui, Spain
| | - Daniel Crespo-Piazuelo
- Animal Breeding and Genetics Programme, Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Torre Marimon, Caldes de Montbui, Spain
| | - Raquel Quintanilla
- Animal Breeding and Genetics Programme, Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Torre Marimon, Caldes de Montbui, Spain
| | - Oriol Canela-Xandri
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Konrad Rawlik
- Baillie Gifford Pandemic Science Hub, University of Edinburgh, Edinburgh, UK
| | - Charley Xia
- Lothian Birth Cohort studies, University of Edinburgh, Edinburgh, UK
- Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Yuelin Yao
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- School of Informatics, The University of Edinburgh, Edinburgh, UK
| | - Qianyi Zhao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Wenye Yao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
| | - Liu Yang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Houcheng Li
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Huicong Zhang
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Wang Liao
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Tianshuo Chen
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Peter Karlskov-Mortensen
- Animal Genetics, Bioinformatics and Breeding, Department of Veterinary and Animal Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Merete Fredholm
- Animal Genetics, Bioinformatics and Breeding, Department of Veterinary and Animal Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Marcel Amills
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, Spain
- Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Alex Clop
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, Spain
- Consejo Superior de Investigaciones Científicas, Barcelona, Spain
| | - Elisabetta Giuffra
- Paris-Saclay University, INRAE, AgroParisTech, GABI, Jouy-en-Josas, France
| | - Jun Wu
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Xiaodian Cai
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Shuqi Diao
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Xiangchun Pan
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Chen Wei
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Jinghui Li
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Hao Cheng
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Sheng Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Guosheng Su
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Goutam Sahana
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Mogens Sandø Lund
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Jack C M Dekkers
- Department of Animal Science, Iowa State University, Ames, IA, USA
| | - Luke Kramer
- Department of Animal Science, Iowa State University, Ames, IA, USA
| | | | - Ryan Corbett
- Department of Animal Science, Iowa State University, Ames, IA, USA
| | - Martien A M Groenen
- Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
| | - Ole Madsen
- Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
| | - Marta Gòdia
- Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Dominique Rocha
- Paris-Saclay University, INRAE, AgroParisTech, GABI, Jouy-en-Josas, France
| | - Mathieu Charles
- Paris-Saclay University, INRAE, AgroParisTech, GABI, SIGENAE, Jouy-en-Josas, France
| | - Cong-Jun Li
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service (ARS), U.S. Department of Agriculture (USDA), Beltsville, MD, USA
| | - Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, Zurich, Switzerland
| | - Xiaoxiang Hu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
| | - Laurent Frantz
- Palaeogenomics Group, Department of Veterinary Sciences, Ludwig Maximilian University, Munich, Germany
- School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
| | - Yonglun Luo
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Steno Diabetes Center Aarhus, Aarhus University Hospital, Aarhus, Denmark
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Research, Qingdao, China
| | - Lin Lin
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Steno Diabetes Center Aarhus, Aarhus University Hospital, Aarhus, Denmark
| | - Zhongyin Zhou
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Zhe Zhang
- Department of Animal Science, College of Animal Sciences, Zhejiang University, Hangzhou, China
| | - Zitao Chen
- Department of Animal Science, College of Animal Sciences, Zhejiang University, Hangzhou, China
| | - Leilei Cui
- School of Life Sciences, Nanchang University, Nanchang, China
- Human Aging Research Institute and School of Life Science, Nanchang University, and Jiangxi Key Laboratory of Human Aging, Jiangxi, China
- UCL Genetics Institute, University College London, London, UK
| | - Ruidong Xiang
- Faculty of Veterinary and Agricultural Science, The University of Melbourne, Parkville, Victoria, Australia
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria, Australia
| | - Xia Shen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Center for Intelligent Medicine Research, Greater Bay Area Institute of Precision Medicine, Fudan University, Guangzhou, China
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, UK
| | - Pinghua Li
- Institute of Swine Science, Nanjing Agricultural University, Nanjing, China
| | - Ruihua Huang
- Institute of Swine Science, Nanjing Agricultural University, Nanjing, China
| | - Guoqing Tang
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
| | - Mingzhou Li
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
| | - Yunxiang Zhao
- College of Animal Science and Technology, Guangxi University, Nanning, China
| | - Guoqiang Yi
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Zhonglin Tang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Jicai Jiang
- Department of Animal Science, North Carolina State University, Raleigh, NC, USA
| | - Fuping Zhao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xiaolong Yuan
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Xiaohong Liu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Yaosheng Chen
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Xuewen Xu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education and College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education and College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
| | - Pengju Zhao
- Hainan Institute, Zhejiang University, Yongyou Industry Park, Yazhou Bay Sci-Tech City, Sanya, China
| | - Chris Haley
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | - Huaijun Zhou
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Qishan Wang
- Department of Animal Science, College of Animal Sciences, Zhejiang University, Hangzhou, China
| | - Yuchun Pan
- Department of Animal Science, College of Animal Sciences, Zhejiang University, Hangzhou, China
| | - Xiangdong Ding
- College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Li Ma
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
| | - Jiaqi Li
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Pau Navarro
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | - Qin Zhang
- College of Animal Science and Technology, Shandong Agricultural University, Tai'an, China
| | - Bingjie Li
- Scotland's Rural College (SRUC), Roslin Institute Building, Midlothian, UK
| | - Albert Tenesa
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | - Kui Li
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - George E Liu
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service (ARS), U.S. Department of Agriculture (USDA), Beltsville, MD, USA
| | - Zhe Zhang
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
| | - Lingzhao Fang
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| |
|
30
|
Wang J, Cheng X, Liang Q, Owen LA, Lu J, Zheng Y, Wang M, Chen S, DeAngelis MM, Li Y, Chen R. Single-cell multiomics of the human retina reveals hierarchical transcription factor collaboration in mediating cell type-specific effects of genetic variants on gene regulation. Genome Biol 2023; 24:269. [PMID: 38012720 PMCID: PMC10680294 DOI: 10.1186/s13059-023-03111-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 11/15/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND Systematic characterization of how genetic variation modulates gene regulation in a cell type-specific context is essential for understanding complex traits. To address this question, we profile gene expression and chromatin accessibility in cells from healthy retinae of 20 human donors through single-cell multiomics and genomic sequencing. RESULTS We map eQTL, caQTL, allele-specific expression, and allele-specific chromatin accessibility in major retinal cell types. By integrating these results, we identify and characterize regulatory elements and genetic variants that affect gene regulation in individual cell types. The majority of identified sc-eQTLs and sc-caQTLs display cell type-specific effects, while the cis-elements containing genetic variants with cell type-specific effects are often accessible in multiple cell types. Furthermore, the transcription factors whose binding sites are perturbed by genetic variants tend to have higher expression levels in the cell types where the variants exert their effects, compared to the cell types where the variants have no impact. We further validate our findings with high-throughput reporter assays. Lastly, we identify the enriched cell types, candidate causal variants and genes, and cell type-specific regulatory mechanisms underlying GWAS loci. CONCLUSIONS Overall, genetic effects on gene regulation are highly context dependent. Our results suggest that cell type-dependent genetic effects are driven by precise modulation of both trans-factor expression and chromatin accessibility of cis-elements. Our findings indicate that hierarchical collaboration among transcription factors plays a crucial role in mediating cell type-specific effects of genetic variants on gene regulation.
Affiliation(s)
- Jun Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Xuesen Cheng
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Qingnan Liang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Leah A Owen
- Department of Ophthalmology and Visual Sciences, John A. Moran Eye Center, University of Utah, Salt Lake City, UT, USA
| | - Jiaxiong Lu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
| | - Meng Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, MO, USA
| | - Margaret M DeAngelis
- Department of Ophthalmology, University at Buffalo the State University of New York, Buffalo, NY, USA
| | - Yumei Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Rui Chen
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
|
31
|
Chen Y, Paramo MI, Zhang Y, Yao L, Shah SR, Jin Y, Zhang J, Pan X, Yu H. Finding Needles in the Haystack: Strategies for Uncovering Noncoding Regulatory Variants. Annu Rev Genet 2023; 57:201-222. [PMID: 37562413 DOI: 10.1146/annurev-genet-030723-120717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Despite accumulating evidence implicating noncoding variants in human diseases, unraveling their functionality remains a significant challenge. Systematic annotations of the regulatory landscape and the growth of sequence variant data sets have fueled the development of tools and methods to identify causal noncoding variants and evaluate their regulatory effects. Here, we review the latest advances in the field and discuss potential future research avenues to gain a more in-depth understanding of noncoding regulatory variants.
Affiliation(s)
- You Chen
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
| | - Mauricio I Paramo
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
| | - Yingying Zhang
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
| | - Li Yao
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Department of Computational Biology, Cornell University, Ithaca, New York, USA
| | - Sagar R Shah
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
| | - Yiyang Jin
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
| | - Junke Zhang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Department of Computational Biology, Cornell University, Ithaca, New York, USA
| | - Xiuqi Pan
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
| | - Haiyuan Yu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Department of Computational Biology, Cornell University, Ithaca, New York, USA
|
32
|
Lagunas T, Plassmeyer SP, Fischer AD, Friedman RZ, Rieger MA, Selmanovic D, Sarafinovska S, Sol YK, Kasper MJ, Fass SB, Aguilar Lucero AF, An JY, Sanders SJ, Cohen BA, Dougherty JD. A Cre-dependent massively parallel reporter assay allows for cell-type specific assessment of the functional effects of non-coding elements in vivo. Commun Biol 2023; 6:1151. [PMID: 37953348 PMCID: PMC10641075 DOI: 10.1038/s42003-023-05483-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Accepted: 10/18/2023] [Indexed: 11/14/2023] Open
Abstract
The function of regulatory elements is highly dependent on the cellular context, and thus for understanding the function of elements associated with psychiatric diseases these would ideally be studied in neurons in a living brain. Massively Parallel Reporter Assays (MPRAs) are molecular genetic tools that enable functional screening of hundreds of predefined sequences in a single experiment. These assays have not yet been adapted to query specific cell types in vivo in a complex tissue like the mouse brain. Here, using a test-case 3'UTR MPRA library with genomic elements containing variants from autism patients, we developed a method to achieve reproducible measurements of element effects in vivo in a cell type-specific manner, using excitatory cortical neurons and striatal medium spiny neurons as test cases. This targeted technique should enable robust, functional annotation of genetic elements in the cellular contexts most relevant to psychiatric disease.
Affiliation(s)
- Tomas Lagunas
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Division of Biology and Biomedical Sciences, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Stephen P Plassmeyer
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Anthony D Fischer
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Ryan Z Friedman
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Division of Biology and Biomedical Sciences, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Michael A Rieger
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Division of Biology and Biomedical Sciences, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Din Selmanovic
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Division of Biology and Biomedical Sciences, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Simona Sarafinovska
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Yvette K Sol
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Michael J Kasper
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Stuart B Fass
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Alessandra F Aguilar Lucero
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neuroscience, University of California San Francisco, San Francisco, CA, 94518, USA
| | - Joon-Yong An
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, 02841, Republic of Korea
| | - Stephan J Sanders
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neuroscience, University of California San Francisco, San Francisco, CA, 94518, USA
| | - Barak A Cohen
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA
| | - Joseph D Dougherty
- Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA.
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave, Saint Louis, MO, 63108, USA.
|
33
|
Rummel CK, Gagliardi M, Ahmad R, Herholt A, Jimenez-Barron L, Murek V, Weigert L, Hausruckinger A, Maidl S, Hauger B, Raabe FJ, Fürle C, Trastulla L, Turecki G, Eder M, Rossner MJ, Ziller MJ. Massively parallel functional dissection of schizophrenia-associated noncoding genetic variants. Cell 2023; 186:5165-5182.e33. [PMID: 37852259 DOI: 10.1016/j.cell.2023.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Revised: 06/12/2023] [Accepted: 09/14/2023] [Indexed: 10/20/2023]
Abstract
Schizophrenia (SCZ) is a highly heritable mental disorder with thousands of associated genetic variants located mostly in the noncoding space of the genome. Translating these associations into insights regarding the underlying pathomechanisms has been challenging because the causal variants, their mechanisms of action, and their target genes remain largely unknown. We implemented a massively parallel variant annotation pipeline (MVAP) to perform SCZ variant-to-function mapping at scale in disease-relevant neural cell types. This approach identified 620 functional variants (1.7%) that operate in a highly developmental context and neuronal-activity-dependent manner. Multimodal integration of epigenomic and CRISPRi screening data enabled us to link these functional variants to target genes, biological processes, and ultimately alterations of neuronal physiology. These results provide a multistage prioritization strategy to map functional single-nucleotide polymorphism (SNP)-to-gene-to-endophenotype relations and offer biological insights into the context-dependent molecular processes modulated by SCZ-associated genetic variation.
Affiliation(s)
- Christine K Rummel
- Max Planck Institute of Psychiatry, Munich 80804, Germany; International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich 80804, Germany
| | - Miriam Gagliardi
- Department of Psychiatry, University of Münster, Münster 48149, Germany
| | - Ruhel Ahmad
- Max Planck Institute of Psychiatry, Munich 80804, Germany
| | - Alexander Herholt
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU, Munich 80336, Germany; Systasy Bioscience GmbH, Munich 81669, Germany
| | - Laura Jimenez-Barron
- Max Planck Institute of Psychiatry, Munich 80804, Germany; International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich 80804, Germany
| | - Vanessa Murek
- Max Planck Institute of Psychiatry, Munich 80804, Germany
| | - Liesa Weigert
- Max Planck Institute of Psychiatry, Munich 80804, Germany
| | | | - Susanne Maidl
- Max Planck Institute of Psychiatry, Munich 80804, Germany
| | - Barbara Hauger
- Max Planck Institute of Psychiatry, Munich 80804, Germany
| | - Florian J Raabe
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU, Munich 80336, Germany
| | | | - Lucia Trastulla
- International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich 80804, Germany; Department of Psychiatry, University of Münster, Münster 48149, Germany; Technische Universität München Medical Graduate Center Experimental Medicine, Munich 80333, Germany
| | - Gustavo Turecki
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
| | - Matthias Eder
- Max Planck Institute of Psychiatry, Munich 80804, Germany
| | - Moritz J Rossner
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU, Munich 80336, Germany; Systasy Bioscience GmbH, Munich 81669, Germany
| | - Michael J Ziller
- Max Planck Institute of Psychiatry, Munich 80804, Germany; Department of Psychiatry, University of Münster, Münster 48149, Germany; Center for Soft Nanoscience, University of Münster, Münster 48149, Germany.
|
34
|
Guo MG, Reynolds DL, Ang CE, Liu Y, Zhao Y, Donohue LKH, Siprashvili Z, Yang X, Yoo Y, Mondal S, Hong A, Kain J, Meservey L, Fabo T, Elfaki I, Kellman LN, Abell NS, Pershad Y, Bayat V, Etminani P, Holodniy M, Geschwind DH, Montgomery SB, Duncan LE, Urban AE, Altman RB, Wernig M, Khavari PA. Integrative analyses highlight functional regulatory variants associated with neuropsychiatric diseases. Nat Genet 2023; 55:1876-1891. [PMID: 37857935 PMCID: PMC10859123 DOI: 10.1038/s41588-023-01533-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2021] [Accepted: 09/15/2023] [Indexed: 10/21/2023]
Abstract
Noncoding variants of presumed regulatory function contribute to the heritability of neuropsychiatric disease. A total of 2,221 noncoding variants connected to risk for ten neuropsychiatric disorders, including autism spectrum disorder, attention deficit hyperactivity disorder, bipolar disorder, borderline personality disorder, major depression, generalized anxiety disorder, panic disorder, post-traumatic stress disorder, obsessive-compulsive disorder and schizophrenia, were studied in developing human neural cells. Integrating epigenomic and transcriptomic data with massively parallel reporter assays identified differentially-active single-nucleotide variants (daSNVs) in specific neural cell types. Expression-gene mapping, network analyses and chromatin looping nominated candidate disease-relevant target genes modulated by these daSNVs. Follow-up integration of daSNV gene editing with clinical cohort analyses suggested that magnesium transport dysfunction may increase neuropsychiatric disease risk and indicated that common genetic pathomechanisms may mediate specific symptoms that are shared across multiple neuropsychiatric diseases.
Affiliation(s)
- Margaret G Guo
- Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | - David L Reynolds
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | - Cheen E Ang
- Department of Pathology, Stanford University, Stanford, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
- Institute for Stem Cell Biology & Regenerative Medicine, Stanford University, Stanford, CA, USA
| | - Yingfei Liu
- Institute for Stem Cell Biology & Regenerative Medicine, Stanford University, Stanford, CA, USA
- Institute of Neurobiology, Xi'an Jiaotong University Health Science Center, Xi'an, China
| | - Yang Zhao
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | - Laura K H Donohue
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Zurab Siprashvili
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | - Xue Yang
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Stanford Program in Cancer Biology, Stanford University, Stanford, CA, USA
| | - Yongjin Yoo
- Institute for Stem Cell Biology & Regenerative Medicine, Stanford University, Stanford, CA, USA
| | - Smarajit Mondal
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | - Audrey Hong
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | - Jessica Kain
- Department of Genetics, Stanford University, Stanford, CA, USA
| | | | - Tania Fabo
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Ibtihal Elfaki
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Laura N Kellman
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Stanford Program in Cancer Biology, Stanford University, Stanford, CA, USA
| | - Nathan S Abell
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Yash Pershad
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | | | | | - Mark Holodniy
- Public Health Surveillance and Research, Department of Veterans Affairs, Washington, DC, USA
- Division of Infectious Disease & Geographic Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Daniel H Geschwind
- Program in Neurobehavioral Genetics, Semel Institute, UCLA, Los Angeles, CA, USA
| | - Stephen B Montgomery
- Department of Pathology, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Laramie E Duncan
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - Alexander E Urban
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - Russ B Altman
- Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Marius Wernig
- Department of Pathology, Stanford University, Stanford, CA, USA
- Institute for Stem Cell Biology & Regenerative Medicine, Stanford University, Stanford, CA, USA
| | - Paul A Khavari
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA.
- Stanford Program in Cancer Biology, Stanford University, Stanford, CA, USA.
- Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA.
|
35
|
McAfee JC, Lee S, Lee J, Bell JL, Krupa O, Davis J, Insigne K, Bond ML, Zhao N, Boyle AP, Phanstiel DH, Love MI, Stein JL, Ruzicka WB, Davila-Velderrain J, Kosuri S, Won H. Systematic investigation of allelic regulatory activity of schizophrenia-associated common variants. Cell Genomics 2023; 3:100404. [PMID: 37868037 PMCID: PMC10589626 DOI: 10.1016/j.xgen.2023.100404] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/22/2022] [Revised: 02/23/2023] [Accepted: 08/21/2023] [Indexed: 10/24/2023]
Abstract
Genome-wide association studies (GWASs) have successfully identified 145 genomic regions that contribute to schizophrenia risk, but linkage disequilibrium makes it challenging to discern causal variants. We performed a massively parallel reporter assay (MPRA) on 5,173 fine-mapped schizophrenia GWAS variants in primary human neural progenitors and identified 439 variants with allelic regulatory effects (MPRA-positive variants). Transcription factor binding had modest predictive power, while fine-map posterior probability, enhancer overlap, and evolutionary conservation failed to predict MPRA-positive variants. Furthermore, 64% of MPRA-positive variants did not exhibit an expression quantitative trait loci signature, suggesting that MPRA could identify yet unexplored variants with regulatory potential. To predict the combinatorial effect of MPRA-positive variants on gene regulation, we propose an accessibility-by-contact model that combines MPRA-measured allelic activity with neuronal chromatin architecture.
Affiliation(s)
- Jessica C. McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Sool Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jiseok Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica L. Bell
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Oleh Krupa
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kimberly Insigne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Marielle L. Bond
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Nanxiang Zhao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alan P. Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Douglas H. Phanstiel
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Michael I. Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - W. Brad Ruzicka
- Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02141, USA
- Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
36
|
Tovar A, Kyono Y, Nishino K, Bose M, Varshney A, Parker SCJ, Kitzman JO. Using a modular massively parallel reporter assay to discover context-specific regulatory grammars in type 2 diabetes. bioRxiv 2023:2023.10.08.561391. [PMID: 37873175 PMCID: PMC10592691 DOI: 10.1101/2023.10.08.561391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Recent genome-wide association studies have established that most complex disease-associated loci are found in noncoding regions where defining their function is nontrivial. In this study, we leverage a modular massively parallel reporter assay (MPRA) to uncover sequence features linked to context-specific regulatory activity. We screened enhancer activity across a panel of 198-bp fragments spanning over 10k type 2 diabetes- and metabolic trait-associated variants in the 832/13 rat insulinoma cell line, a relevant model of pancreatic beta cells. We explored these fragments' context sensitivity by comparing their activities when placed up- or downstream of a reporter gene, and in combination with either a synthetic housekeeping promoter (SCP1) or a more biologically relevant promoter corresponding to the human insulin gene (INS). We identified clear effects of MPRA construct design on measured fragment enhancer activity. Specifically, a subset of fragments (n = 702/11,656) displayed positional bias, evenly distributed across up- and downstream preference. A separate set of fragments exhibited promoter bias (n = 698/11,656), mostly towards the cell-specific INS promoter (73.4%). To identify sequence features associated with promoter preference, we used Lasso regression with 562 genomic annotations and discovered that fragments with INS promoter-biased activity are enriched for HNF1 motifs. HNF1 family transcription factors are key regulators of glucose metabolism disrupted in maturity onset diabetes of the young (MODY), suggesting genetic convergence between rare coding variants that cause MODY and common T2D-associated regulatory variants. We designed a follow-up MPRA containing HNF1 motif-enriched fragments and observed several instances where deletion or mutation of HNF1 motifs disrupted the INS promoter-biased enhancer activity, specifically in the beta cell model but not in a skeletal muscle cell line, another diabetes-relevant cell type.
Together, our study suggests that cell-specific regulatory activity is partially influenced by enhancer-promoter compatibility and indicates that careful attention should be paid when designing MPRA libraries to capture context-specific regulatory processes at disease-associated genetic signals.
|
37
|
Örd T, Örd D, Adler P, Örd T. Genome-wide census of ATF4 binding sites and functional profiling of trait-associated genetic variants overlapping ATF4 binding motifs. PLoS Genet 2023; 19:e1011014. [PMID: 37906604 PMCID: PMC10637723 DOI: 10.1371/journal.pgen.1011014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 11/10/2023] [Accepted: 10/11/2023] [Indexed: 11/02/2023] Open
Abstract
Activating Transcription Factor 4 (ATF4) is an important regulator of gene expression in stress responses and developmental processes in many cell types. Here, we catalogued ATF4 binding sites in the human genome and identified overlaps with trait-associated genetic variants. We probed these genetic variants for allelic regulatory activity using a massively parallel reporter assay (MPRA) in HepG2 hepatoma cells exposed to tunicamycin to induce endoplasmic reticulum stress and ATF4 upregulation. The results revealed that in the majority of cases, the MPRA allelic activity of these SNPs was in agreement with the nucleotide preference seen in the ATF4 binding motif from ChIP-Seq. Luciferase and electrophoretic mobility shift assays in additional cellular models further confirmed ATF4-dependent regulatory effects for the SNPs rs532446 (GADD45A intronic; linked to hematological parameters), rs7011846 (LPL upstream; myocardial infarction), rs2718215 (diastolic blood pressure), rs281758 (psychiatric disorders) and rs6491544 (educational attainment). CRISPR-Cas9 disruption and/or deletion of the regulatory elements harboring rs532446 and rs7011846 led to the downregulation of GADD45A and LPL, respectively. Thus, these SNPs could represent examples of GWAS genetic variants that affect gene expression by altering ATF4-mediated transcriptional activation.
Affiliation(s)
- Tiit Örd
  - Institute of Genomics, University of Tartu, Tartu, Estonia
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Daima Örd
  - Institute of Genomics, University of Tartu, Tartu, Estonia
- Priit Adler
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Tõnis Örd
  - Institute of Genomics, University of Tartu, Tartu, Estonia
38
Antontseva EV, Degtyareva AO, Korbolina EE, Damarov IS, Merkulova TI. Human-genome single nucleotide polymorphisms affecting transcription factor binding and their role in pathogenesis. Vavilovskii Zhurnal Genet Selektsii 2023; 27:662-675. [PMID: 37965371 PMCID: PMC10641029 DOI: 10.18699/vjgb-23-77] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 11/16/2023]
Abstract
Single nucleotide polymorphisms (SNPs) are the most common type of variation in the human genome. The vast majority of SNPs identified in the human genome do not have any effect on the phenotype; however, some can lead to changes in the function of a gene or the level of its expression. Most SNPs associated with certain traits or pathologies are mapped to regulatory regions of the genome and affect gene expression by changing transcription factor binding sites. In recent decades, substantial effort has been invested in searching for such regulatory SNPs (rSNPs) and understanding the mechanisms by which they lead to phenotypic differences, primarily to individual differences in susceptibility to diseases and in sensitivity to drugs. The development of the NGS (next-generation sequencing) technology has contributed not only to the identification of a huge number of SNPs and to the search for their association (genome-wide association studies, GWASs) with certain diseases or phenotypic manifestations, but also to the development of more productive approaches to their functional annotation. It should be noted that the presence of an association does not allow one to identify a functional, truly disease-associated DNA sequence variant among multiple marker SNPs that are detected due to linkage disequilibrium. Moreover, determination of associations of genetic variants with a disease does not provide information about the functionality of these variants, which is necessary to elucidate the molecular mechanisms of the development of pathology and to design effective methods for its treatment and prevention. In this regard, the functional analysis of SNPs annotated in the GWAS catalog, both at the genome-wide level and at the level of individual SNPs, has become especially relevant in recent years. A genome-wide search for potential rSNPs is possible without any prior knowledge of their association with a trait.
Thus, mapping expression quantitative trait loci (eQTLs) makes it possible to identify an SNP for which - among transcriptomes of homozygotes and heterozygotes for its various alleles - there are differences in the expression level of certain genes, which can be located at various distances from the SNP. To predict rSNPs, approaches based on searches for allele-specific events in RNA-seq, ChIP-seq, DNase-seq, ATAC-seq, MPRA, and other data are also used. Nonetheless, for a more complete functional annotation of such rSNPs, it is necessary to establish their association with a trait, in particular, with a predisposition to a certain pathology or sensitivity to drugs. Thus, approaches to finding SNPs important for the development of a trait can be categorized into two groups: (1) starting from data on an association of SNPs with a certain trait, (2) starting from the determination of allele-specific changes at the molecular level (in a transcriptome or regulome). Only comprehensive use of strategically different approaches can considerably enrich our knowledge about the role of genetic determinants in the molecular mechanisms of trait formation, including predisposition to multifactorial diseases.
Affiliation(s)
- E V Antontseva
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- A O Degtyareva
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- E E Korbolina
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- I S Damarov
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- T I Merkulova
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
39
Frenkel M, Hujoel ML, Morris Z, Raman S. Discovering chromatin dysregulation induced by protein-coding perturbations at scale. bioRxiv 2023:2023.09.20.555752. [PMID: 37781603 PMCID: PMC10541138 DOI: 10.1101/2023.09.20.555752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/03/2023]
Abstract
Although population-scale databases have expanded to millions of protein-coding variants, insight into variant mechanisms has not kept pace. We present PROD-ATAC, a high-throughput method for discovering the effects of protein-coding variants on chromatin. A pooled library of variants is expressed in a disease-agnostic cell line, and single-cell ATAC resolves each variant's effect on chromatin. Using PROD-ATAC, we characterized the effects of >100 oncofusions (a class of cancer-causing chimeric proteins) and controls and revealed that pioneer activity is a common feature of fusions spanning an enormous range of fusion frequencies. Further, fusion-induced dysregulation can be context-agnostic as observed mechanisms often overlapped with cancer and cell-type specific prior knowledge. We also showed that gain-of-function pioneering is common among oncofusions. This work provides a global view of fusion-induced chromatin. We uncovered convergent mechanisms among disparate oncofusions and shared modes of dysregulation across different cancers. PROD-ATAC is generalizable to any set of protein-coding variants.
Affiliation(s)
- Max Frenkel
  - Cellular and Molecular Biology Graduate Program, University of Wisconsin, Madison, Wisconsin, USA
  - Medical Scientist Training Program, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
  - Department of Biochemistry, University of Wisconsin, Madison, Wisconsin, USA
- Margaux L.A. Hujoel
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Zachary Morris
  - Department of Human Oncology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Srivatsan Raman
  - Department of Biochemistry, University of Wisconsin, Madison, Wisconsin, USA
  - Department of Bacteriology, University of Wisconsin, Madison, Wisconsin, USA
  - Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin, USA
40
Mulvey B, Selmanovic D, Dougherty JD. Sex Significantly Impacts the Function of Major Depression-Linked Variants In Vivo. Biol Psychiatry 2023; 94:466-478. [PMID: 36803612 PMCID: PMC10425576 DOI: 10.1016/j.biopsych.2023.02.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 01/19/2023] [Accepted: 02/07/2023] [Indexed: 02/17/2023]
Abstract
BACKGROUND Genome-wide association studies have discovered blocks of common variants (likely transcriptional-regulatory) associated with major depressive disorder (MDD), though the functional subset and their biological impacts remain unknown. Likewise, why depression occurs in females more frequently than males is unclear. We therefore tested the hypothesis that risk-associated functional variants interact with sex and produce greater impact in female brains. METHODS We developed techniques to directly measure regulatory variant activity and sex interactions using massively parallel reporter assays in the mouse brain in vivo, in a cell type-specific manner, and applied these approaches to measure activity of >1000 variants from >30 MDD loci. RESULTS We identified extensive sex-by-allele effects in mature hippocampal neurons, suggesting that sex-differentiated impacts of genetic risk may underlie sex bias in disease. Unbiased informatics approaches indicated that functional MDD variants recurrently disrupt a number of transcription factor binding motifs, including those of sex hormone receptors. We confirmed a role for the latter by performing massively parallel reporter assays in neonatal mice on the day of birth (during a sex-differentiating hormone surge) and hormonally quiescent juveniles. CONCLUSIONS Our study provides novel insights into the influence of age, biological sex, and cell type on regulatory variant function and provides a framework for in vivo parallel assays to functionally define interactions between organismal variables such as sex and regulatory variation. Moreover, we experimentally demonstrate that a portion of the sex differences seen in MDD occurrence may be a product of sex-differentiated effects at associated regulatory variants.
Affiliation(s)
- Bernard Mulvey
  - Division of Biology and Biomedical Sciences, Washington University in St. Louis School of Medicine, St. Louis, Missouri
  - Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, Missouri
- Din Selmanovic
  - Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, Missouri
- Joseph D Dougherty
  - Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, Missouri
  - Department of Psychiatry, Washington University in St. Louis School of Medicine, St. Louis, Missouri
  - Intellectual and Developmental Disabilities Research Center, Washington University in St. Louis School of Medicine, St. Louis, Missouri
41
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Cheng Xu
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
  - Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA
42
Fu Y, Kelly JA, Gopalakrishnan J, Pelikan RC, Tessneer KL, Pasula S, Grundahl K, Murphy DA, Gaffney PM. Massively Parallel Reporter Assay Confirms Regulatory Potential of hQTLs and Reveals Important Variants in Lupus and Other Autoimmune Diseases. bioRxiv 2023:2023.08.17.553722. [PMID: 37645944 PMCID: PMC10462090 DOI: 10.1101/2023.08.17.553722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
OBJECTIVE To systematically characterize the potential for histone post-translational modifications, i.e., histone quantitative trait loci (hQTLs), expression QTLs (eQTLs), and variants on systemic lupus erythematosus (SLE) and autoimmune (AI) disease risk haplotypes to modulate gene expression in an allele-dependent manner. METHODS We designed a massively parallel reporter assay (MPRA) containing ~32K variants and transfected it into an Epstein-Barr virus-transformed B cell line generated from an SLE case. RESULTS Our study expands our understanding of hQTLs, illustrating that epigenetic QTLs are more likely to contribute to functional mechanisms than eQTLs and other variant types, and a large proportion of hQTLs overlap transcription start sites (TSS) of noncoding RNAs. In addition, we nominate 17 variants (including 11 novel) as putative causal variants for SLE and another 14 for various other AI diseases, prioritizing these variants for future functional studies in primary and immortalized B cells. CONCLUSION We uncover important insights into the mechanistic relationships between genotype, epigenetics, gene expression, and SLE and AI disease phenotypes.
Affiliation(s)
- Yao Fu
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Jennifer A Kelly
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Jaanam Gopalakrishnan
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
  - Neuro-Immune Regulome Unit, National Eye Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Richard C Pelikan
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Kandice L Tessneer
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Satish Pasula
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Kiely Grundahl
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- David A Murphy
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Patrick M Gaffney
  - Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
43
Gosai SJ, Castro RI, Fuentes N, Butts JC, Kales S, Noche RR, Mouri K, Sabeti PC, Reilly SK, Tewhey R. Machine-guided design of synthetic cell type-specific cis-regulatory elements. bioRxiv 2023:2023.08.08.552077. [PMID: 37609287 PMCID: PMC10441439 DOI: 10.1101/2023.08.08.552077] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 08/24/2023]
Abstract
Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing, and stimulus responses, which collectively define the thousands of unique cell types in the body. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for an intended purpose has arisen naturally through evolution. Here, we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell type specificity. We leverage innovations in deep neural network modeling of CRE activity across three cell types, efficient in silico optimization, and massively parallel reporter assays (MPRAs) to design and empirically test thousands of CREs. Through in vitro and in vivo validation, we show that synthetic sequences outperform natural sequences from the human genome in driving cell type-specific expression. Synthetic sequences leverage unique sequence syntax to promote activity in the on-target cell type and simultaneously reduce activity in off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs and demonstrate the required literacy to write regulatory code that is fit-for-purpose in vivo across vertebrates.
Affiliation(s)
- SJ Gosai
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Graduate Program in Biological and Biomedical Science, Boston, MA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- RI Castro
  - The Jackson Laboratory, Bar Harbor, ME, USA
- N Fuentes
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Harvard College, Harvard University, Cambridge, MA, USA
- JC Butts
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- S Kales
  - The Jackson Laboratory, Bar Harbor, ME, USA
- RR Noche
  - Department of Comparative Medicine, Yale School of Medicine, New Haven, CT, USA
  - Yale Zebrafish Research Core, Yale School of Medicine, New Haven, CT, USA
- K Mouri
  - The Jackson Laboratory, Bar Harbor, ME, USA
- PC Sabeti
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- SK Reilly
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Wu Tsai Institute, Yale University, New Haven, CT, USA
- R Tewhey
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
  - Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
44
Dincer TU, Ernst J. Integrative epigenomic and functional characterization assay based annotation of regulatory activity across diverse human cell types. bioRxiv 2023:2023.07.14.549056. [PMID: 37503240 PMCID: PMC10369970 DOI: 10.1101/2023.07.14.549056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
We introduce ChromActivity, a computational framework for predicting and annotating regulatory activity across the genome through integration of multiple epigenomic maps and various functional characterization datasets. ChromActivity generates genome-wide predictions of regulatory activity associated with each functional characterization dataset across many cell types based on available epigenomic data. For each cell type, it then produces (1) ChromScoreHMM genome annotations based on the combinatorial and spatial patterns within these predictions and (2) ChromScore tracks of overall predicted regulatory activity. ChromActivity provides a resource for analyzing and interpreting the human regulatory genome across diverse cell types.
Affiliation(s)
- Tevfik Umut Dincer
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, 90095, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, 90095, USA
- Jason Ernst
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, 90095, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, 90095, USA
  - Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at University of California, Los Angeles, CA, 90095, USA
  - Computer Science Department, University of California, Los Angeles, CA, 90095, USA
  - Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, 90095, USA
  - Molecular Biology Institute, University of California, Los Angeles, CA, 90095, USA
  - Department of Computational Medicine, University of California, Los Angeles, CA, 90095, USA
45
Jeong R, Bulyk ML. Blood cell traits' GWAS loci colocalization with variation in PU.1 genomic occupancy prioritizes causal noncoding regulatory variants. Cell Genomics 2023; 3:100327. [PMID: 37492098 PMCID: PMC10363807 DOI: 10.1016/j.xgen.2023.100327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 02/10/2023] [Accepted: 04/25/2023] [Indexed: 07/27/2023]
Abstract
Genome-wide association studies (GWASs) have uncovered numerous trait-associated loci across the human genome, most of which are located in noncoding regions, making interpretation difficult. Moreover, causal variants are hard to statistically fine-map at many loci because of widespread linkage disequilibrium. To address this challenge, we present a strategy utilizing transcription factor (TF) binding quantitative trait loci (bQTLs) for colocalization analysis to identify trait associations likely mediated by TF occupancy variation and to pinpoint likely causal variants using motif scores. We applied this approach to PU.1 bQTLs in lymphoblastoid cell lines and blood cell trait GWAS data. Colocalization analysis revealed 69 blood cell trait GWAS loci putatively driven by PU.1 occupancy variation. We nominate PU.1 motif-altering variants as the likely shared causal variants at 51 loci. Such integration of TF bQTL data with other GWAS data may reveal transcriptional regulatory mechanisms and causal noncoding variants underlying additional complex traits.
Affiliation(s)
- Raehoon Jeong
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Martha L. Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
  - Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
46
Oliveros W, Delfosse K, Lato DF, Kiriakopulos K, Mokhtaridoost M, Said A, McMurray BJ, Browning JW, Mattioli K, Meng G, Ellis J, Mital S, Melé M, Maass PG. Systematic characterization of regulatory variants of blood pressure genes. Cell Genomics 2023; 3:100330. [PMID: 37492106 PMCID: PMC10363820 DOI: 10.1016/j.xgen.2023.100330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 03/29/2023] [Accepted: 04/28/2023] [Indexed: 07/27/2023]
Abstract
High blood pressure (BP) is the major risk factor for cardiovascular disease. Genome-wide association studies have identified genetic variants for BP, but functional insights into causality and related molecular mechanisms lag behind. We functionally characterize 4,608 genetic variants in linkage with 135 BP loci in vascular smooth muscle cells and cardiomyocytes by massively parallel reporter assays. High densities of regulatory variants at BP loci (i.e., ULK4, MAP4, CFDP1, PDE5A) indicate that multiple variants drive genetic association. Regulatory variants are enriched in repeats, alter cardiovascular-related transcription factor motifs, and spatially converge with genes controlling specific cardiovascular pathways. Using heuristic scoring, we define likely causal variants, and CRISPR prime editing finally determines causal variants for KCNK9, SFXN2, and PCGF6, which are candidates for developing high BP. Our systems-level approach provides a catalog of functionally relevant variants and their genomic architecture in two trait-relevant cell lines for a better understanding of BP gene regulation.
Affiliation(s)
- Winona Oliveros
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Catalonia, Spain
- Kate Delfosse
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Daniella F. Lato
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Katerina Kiriakopulos
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Milad Mokhtaridoost
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Abdelrahman Said
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Brandon J. McMurray
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Jared W.L. Browning
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Kaia Mattioli
  - Division of Genetics, Department of Medicine, Brigham & Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Guoliang Meng
  - Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- James Ellis
  - Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Seema Mital
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Ted Rogers Centre for Heart Research, Toronto, ON M5G 1X8, Canada
  - Department of Pediatrics, The Hospital for Sick Children, University of Toronto, Toronto, ON M5G 0A4, Canada
- Marta Melé
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Catalonia, Spain
- Philipp G. Maass
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
47
Viñas R, Joshi CK, Georgiev D, Lin P, Dumitrascu B, Gamazon ER, Liò P. Hypergraph factorization for multi-tissue gene expression imputation. Nat Mach Intell 2023; 5:739-753. [PMID: 37771758 PMCID: PMC10538467 DOI: 10.1038/s42256-023-00684-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/21/2022] [Accepted: 06/02/2023] [Indexed: 09/30/2023]
Abstract
Integrating gene expression across tissues and cell types is crucial for understanding the coordinated biological mechanisms that drive disease and characterise homeostasis. However, traditional multitissue integration methods cannot handle uncollected tissues or rely on genotype information, which is often unavailable and subject to privacy concerns. Here we present HYFA (Hypergraph Factorisation), a parameter-efficient graph representation learning approach for joint imputation of multi-tissue and cell-type gene expression. HYFA is genotype-agnostic, supports a variable number of collected tissues per individual, and imposes strong inductive biases to leverage the shared regulatory architecture of tissues and genes. In performance comparison on Genotype-Tissue Expression project data, HYFA achieves superior performance over existing methods, especially when multiple reference tissues are available. The HYFA-imputed dataset can be used to identify replicable regulatory genetic variations (eQTLs), with substantial gains over the original incomplete dataset. HYFA can accelerate the effective and scalable integration of tissue and cell-type transcriptome biorepositories.
Affiliation(s)
- Ramon Viñas
  - Department of Computer Science and Technology, University of Cambridge
- Dobrik Georgiev
  - Department of Computer Science and Technology, University of Cambridge
- Phillip Lin
  - Division of Genetic Medicine, Vanderbilt University Medical Center
- Bianca Dumitrascu
  - Department of Statistics and Irving Institute for Cancer Dynamics, Columbia University
- Eric R. Gamazon
  - Vanderbilt Genetics Institute and Data Science Institute, MRC Epidemiology Unit, University of Cambridge
- Pietro Liò
  - Department of Computer Science and Technology, University of Cambridge
48
Rummel CK, Gagliardi M, Herholt A, Ahmad R, Murek V, Weigert L, Hausruckinger A, Maidl S, Jimenez-Barron L, Trastulla L, Eder M, Rossner M, Ziller MJ. Cell type and condition specific functional annotation of schizophrenia associated non-coding genetic variants. bioRxiv 2023:2023.06.27.545266. [PMID: 37425902 PMCID: PMC10326990 DOI: 10.1101/2023.06.27.545266] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/11/2023]
Abstract
Schizophrenia (SCZ) is a highly polygenic disease, and genome-wide association studies have identified thousands of genetic variants that are statistically associated with this psychiatric disorder. However, translating these associations into insights into disease mechanisms has been challenging because the causal genetic variants, their molecular functions, and their target genes remain largely unknown. To address these questions, we established a functional genomics pipeline in combination with induced pluripotent stem cell technology to functionally characterize ~35,000 non-coding genetic variants associated with schizophrenia along with their target genes. This analysis identified a set of 620 (1.7%) single nucleotide polymorphisms as functional on a molecular level in a highly cell type- and condition-specific fashion. These results provide a high-resolution map of functional variant-gene combinations and offer comprehensive biological insights into the developmental context and stimulation-dependent molecular processes modulated by SCZ-associated genetic variation.
Affiliation(s)
- Christine K. Rummel
  - Max Planck Institute of Psychiatry, Munich, Germany
  - International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany
- Miriam Gagliardi
  - Department of Psychiatry, University of Münster, Münster, Germany
- Alexander Herholt
  - Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Munich, Germany
- Ruhel Ahmad
  - Max Planck Institute of Psychiatry, Munich, Germany
- Laura Jimenez-Barron
  - Max Planck Institute of Psychiatry, Munich, Germany
  - International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany
- Lucia Trastulla
  - Department of Psychiatry, University of Münster, Münster, Germany
- Mathias Eder
  - Max Planck Institute of Psychiatry, Munich, Germany
- Moritz Rossner
  - Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Munich, Germany
- Michael J. Ziller
  - Max Planck Institute of Psychiatry, Munich, Germany
  - Department of Psychiatry, University of Münster, Münster, Germany
  - Center for Soft Nanoscience, University of Münster, Münster, Germany

49
Brown AC, Cohen CJ, Mielczarek O, Migliorini G, Costantino F, Allcock A, Davidson C, Elliott KS, Fang H, Lledó Lara A, Martin AC, Osgood JA, Sanniti A, Scozzafava G, Vecellio M, Zhang P, Black MH, Li S, Truong D, Molineros J, Howe T, Wordsworth BP, Bowness P, Knight JC. Comprehensive epigenomic profiling reveals the extent of disease-specific chromatin states and informs target discovery in ankylosing spondylitis. Cell Genomics 2023; 3:100306. [PMID: 37388915 PMCID: PMC10300554 DOI: 10.1016/j.xgen.2023.100306] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/29/2022] [Revised: 01/30/2023] [Accepted: 03/27/2023] [Indexed: 07/01/2023]
Abstract
Ankylosing spondylitis (AS) is a common, highly heritable inflammatory arthritis characterized by enthesitis of the spine and sacroiliac joints. Genome-wide association studies (GWASs) have revealed more than 100 genetic associations whose functional effects remain largely unresolved. Here, we present a comprehensive transcriptomic and epigenomic map of disease-relevant blood immune cell subsets from AS patients and healthy controls. We find that, while CD14+ monocytes and CD4+ and CD8+ T cells show disease-specific differences at the RNA level, epigenomic differences are only apparent upon multi-omics integration. The latter reveals enrichment at disease-associated loci in monocytes. We link putative functional SNPs to genes using high-resolution Capture-C at 10 loci, including PTGER4 and ETS1, and show how disease-specific functional genomic data can be integrated with GWASs to enhance therapeutic target discovery. This study combines epigenetic and transcriptional analysis with GWASs to identify disease-relevant cell types and gene regulation of likely pathogenic relevance and prioritize drug targets.
Affiliation(s)
- Andrew C. Brown
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Carla J. Cohen
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DS, UK
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Olga Mielczarek
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Horizon Discovery (PerkinElmer) Cambridge Research Park, 8100 Beach Dr., Waterbeach, Cambridge CB25 9TL, UK
- Gabriele Migliorini
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK
- Félicie Costantino
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - UVSQ, INSERM UMR1173, Infection et Inflammation, Laboratory of Excellence INFLAMEX, Université Paris-Saclay, Paris, France
  - Rheumatology Department, AP-HP, Ambroise Paré Hospital, Paris, France
- Alice Allcock
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Connor Davidson
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Hai Fang
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Centre for Translational Medicine at Shanghai, Ruijin Hospital affiliated with Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Alicia Lledó Lara
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Alice C. Martin
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Julie A. Osgood
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Anna Sanniti
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Giuseppe Scozzafava
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Matteo Vecellio
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
  - Centro Ricerche Fondazione Italiana Ricerca sull’Artrite (FIRA), Fondazione Pisana per la Scienza ONLUS, Via Ferruccio Giovannini 13, 56017 San Giuliano Terme (Pisa), Italy
- Ping Zhang
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Chinese Academy of Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Mary Helen Black
  - Data Science, Population Analytics, Janssen R&D, Spring House, PA 19002, USA
- Shuwei Li
  - Data Science, Population Analytics, Janssen R&D, Spring House, PA 19002, USA
- Dongnhu Truong
  - Data Science, Population Analytics, Janssen R&D, Spring House, PA 19002, USA
- Julio Molineros
  - Data Science, Population Analytics, Janssen R&D, Spring House, PA 19002, USA
- Trevor Howe
  - Data Science, External Innovation, Janssen R&D, London W1G 0BG, UK
- B. Paul Wordsworth
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
  - National Institute for Health Research, Comprehensive Biomedical Research Centre, Oxford OX4 2PG, UK
- Paul Bowness
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
  - National Institute for Health Research, Comprehensive Biomedical Research Centre, Oxford OX4 2PG, UK
- Julian C. Knight
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Chinese Academy of Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
  - National Institute for Health Research, Comprehensive Biomedical Research Centre, Oxford OX4 2PG, UK

50
Nott A, Holtman IR. Genetic insights into immune mechanisms of Alzheimer's and Parkinson's disease. Front Immunol 2023; 14:1168539. [PMID: 37359515 PMCID: PMC10285485 DOI: 10.3389/fimmu.2023.1168539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 04/17/2023] [Indexed: 06/28/2023] Open
Abstract
Microglia, the macrophages of the brain, are vital for brain homeostasis and have been implicated in a broad range of brain disorders. Neuroinflammation has gained traction as a possible therapeutic target for neurodegeneration; however, the precise function of microglia in specific neurodegenerative disorders is an ongoing area of research. Genetic studies offer valuable insights into causality, rather than merely observing a correlation. Genome-wide association studies (GWAS) have identified many genetic loci that are linked to susceptibility to neurodegenerative disorders. (Post)-GWAS studies have determined that microglia likely play an important role in the development of Alzheimer's disease (AD) and Parkinson's disease (PD). The process of understanding how individual GWAS risk loci affect microglia function and mediate susceptibility is complex. A rapidly growing number of publications with genomic datasets and computational tools have formulated new hypotheses that guide the biological interpretation of AD and PD genetic risk. In this review, we discuss the key concepts and challenges in the post-GWAS interpretation of AD and PD GWAS risk alleles. Post-GWAS challenges include the identification of target cell (sub)type(s), causal variants, and target genes. Crucially, the prediction of GWAS-identified disease-risk cell types, variants, and genes requires validation and functional testing to understand the biological consequences within the pathology of the disorders. Many AD and PD risk genes are highly pleiotropic and perform multiple important functions that might not be equally relevant for the mechanisms by which GWAS risk alleles exert their effect(s). Ultimately, many GWAS risk alleles exert their effect by changing microglia function, thereby altering the pathophysiology of these disorders; hence, we believe that modelling this context is crucial for a deeper understanding of these disorders.
Affiliation(s)
- Alexi Nott
  - Department of Brain Sciences, Imperial College London, London, United Kingdom
  - UK Dementia Research Institute, Imperial College London, London, United Kingdom
- Inge R. Holtman
  - Department of Biomedical Sciences of Cells and Systems, Section Molecular Neurobiology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands