1
Shepherdson JL, Granas DM, Li J, Shariff Z, Plassmeyer SP, Holehouse AS, White MA, Cohen BA. Mutational scanning of CRX classifies clinical variants and reveals biochemical properties of the transcriptional effector domain. bioRxiv 2024:2024.03.21.585809. [PMID: 38585983 PMCID: PMC10996540 DOI: 10.1101/2024.03.21.585809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Cone-Rod Homeobox, encoded by CRX, is a transcription factor (TF) essential for the terminal differentiation and maintenance of mammalian photoreceptors. Structurally, CRX comprises an ordered DNA-binding homeodomain and an intrinsically disordered transcriptional effector domain. Although a handful of human variants in CRX have been shown to cause several different degenerative retinopathies with varying cone and rod predominance, as with most human disease genes the vast majority of observed CRX genetic variants are uncharacterized variants of uncertain significance (VUS). We performed a deep mutational scan (DMS) of nearly all possible single amino acid substitution variants in CRX, using an engineered cell-based transcriptional reporter assay. We measured the ability of each CRX missense variant to transactivate a synthetic fluorescent reporter construct in a pooled fluorescence-activated cell sorting assay and compared the activation strength of each variant to that of wild-type CRX to compute an activity score, identifying thousands of variants with altered transcriptional activity. We calculated a statistical confidence for each activity score derived from multiple independent measurements of each variant marked by unique sequence barcodes, curating a high-confidence list of nearly 2,000 variants with significantly altered transcriptional activity compared to wild-type CRX. We evaluated the performance of the DMS assay as a clinical variant classification tool using gold-standard classified human variants from ClinVar, and determined that activity scores could be used to identify pathogenic variants with high specificity. That this performance could be achieved using a synthetic reporter assay in a foreign cell type, even for a highly cell type-specific TF like CRX, suggests that this approach shows promise for DMS of other TFs that function in cell types that are not easily accessible. 
Per-position average activity scores closely aligned to a predicted structure of the ordered homeodomain and demonstrated position-specific residue requirements. The intrinsically disordered transcriptional effector domain, by contrast, displayed a qualitatively different pattern of substitution effects, following compositional constraints without specific residue position requirements in the peptide chain. The observed compositional constraints of the effector domain were consistent with the acidic exposure model of transcriptional activation. Together, the results of the CRX DMS identify molecular features of the CRX effector domain and demonstrate clinical utility for variant classification.
Affiliation(s)
- James L. Shepherdson
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- David M. Granas
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Jie Li
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Zara Shariff
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Stephen P. Plassmeyer
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Center for Biomolecular Condensates, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Alex S. Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Center for Biomolecular Condensates, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Michael A. White
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Barak A. Cohen
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
2
Shepherdson JL, Friedman RZ, Zheng Y, Sun C, Oh IY, Granas DM, Cohen BA, Chen S, White MA. Pathogenic variants in CRX have distinct cis-regulatory effects on enhancers and silencers in photoreceptors. Genome Res 2024; 34:243-255. [PMID: 38355306 PMCID: PMC10984388 DOI: 10.1101/gr.278133.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 02/01/2024] [Indexed: 02/16/2024]
Abstract
Dozens of variants in the gene for the homeodomain transcription factor (TF) cone-rod homeobox (CRX) are linked with human blinding diseases that vary in their severity and age of onset. How different variants in this single TF alter its function in ways that lead to a range of phenotypes is unclear. We characterized the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in mouse retina explants carrying knock-ins of two variants, one in the DNA-binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation in these mutant Crx retinas corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, and p.E168d2 has distinct effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are derepressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci are partially predictive of episomal MPRA activity, and distal elements whose accessibility increases later in retinal development are enriched for CREs with silencer activity. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers while having a qualitatively different impact on silencers.
Affiliation(s)
- James L Shepherdson
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Ryan Z Friedman
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Chi Sun
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Inez Y Oh
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- David M Granas
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Barak A Cohen
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Michael A White
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
3
Zheng Y, Stormo GD, Chen S. Aberrant homeodomain-DNA cooperative dimerization underlies distinct developmental defects in two dominant CRX retinopathy models. bioRxiv 2024:2024.03.12.584677. [PMID: 38559186 PMCID: PMC10979960 DOI: 10.1101/2024.03.12.584677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Paired-class homeodomain transcription factors (HD TFs) play essential roles in vertebrate development, and their mutations are linked to human diseases. One unique feature of paired-class HDs is cooperative dimerization on specific palindromic DNA sequences. Yet the functional significance of HD cooperative dimerization in animal development and its dysregulation in disease remain elusive. Using the retinal TF Cone-rod Homeobox (CRX) as a model, we studied how two blindness-causing mutations in the paired HD, p.E80A and p.K88N, alter CRX's cooperative dimerization and lead to gene misexpression and photoreceptor developmental deficits in a dominant manner. CRX^E80A maintains binding at monomeric WT CRX motifs but is deficient in cooperative binding at dimeric motifs. The cooperativity defect of CRX^E80A impairs the exponential increase of photoreceptor gene expression during terminal differentiation and produces immature, non-functional photoreceptors in Crx^E80A retinas. CRX^K88N is highly cooperative and localizes to ectopic genomic sites with strong enrichment of dimeric HD motifs. The altered biochemical properties of CRX^K88N disrupt CRX's ability to direct dynamic chromatin remodeling during development to activate photoreceptor differentiation programs and silence progenitor programs. Our study provides in vitro and in vivo molecular evidence that paired-class HD cooperative dimerization regulates neuronal development and that dysregulation of cooperative binding contributes to severe dominant blinding retinopathies.
Affiliation(s)
- Yiqiao Zheng
- Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
- Gary D. Stormo
- Department of Genetics, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
- Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
4
Lim B, Domsch K, Mall M, Lohmann I. Canalizing cell fate by transcriptional repression. Mol Syst Biol 2024; 20:144-161. [PMID: 38302581 PMCID: PMC10912439 DOI: 10.1038/s44320-024-00014-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/28/2023] [Accepted: 12/15/2023] [Indexed: 02/03/2024]
Abstract
Precision in the establishment and maintenance of cellular identities is crucial for the development of multicellular organisms and requires tight regulation of gene expression. While extensive research has focused on understanding cell type-specific gene activation, the complex mechanisms underlying the transcriptional repression of alternative fates are not fully understood. Here, we provide an overview of the repressive mechanisms involved in cell fate regulation. We discuss the molecular machinery responsible for suppressing alternative fates and highlight the crucial role of sequence-specific transcription factors (TFs) in this process. Depletion of these TFs can result in unwanted gene expression and increased cellular plasticity. We suggest that these TFs recruit cell type-specific repressive complexes to their cis-regulatory elements, enabling them to modulate chromatin accessibility in a context-dependent manner. This modulation effectively suppresses master regulators of alternative fate programs and their downstream targets. The modularity and dynamic behavior of these repressive complexes enable a limited number of repressors to canalize and maintain major and minor cell fate decisions at different stages of development.
Affiliation(s)
- Bryce Lim
- Cell Fate Engineering and Disease Modeling Group, German Cancer Research Center (DKFZ) and DKFZ-ZMBH Alliance, 69120, Heidelberg, Germany
- HITBR Hector Institute for Translational Brain Research gGmbH, 69120, Heidelberg, Germany
- Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159, Mannheim, Germany
- Katrin Domsch
- Heidelberg University, Centre for Organismal Studies (COS) Heidelberg, Department of Developmental Biology and Cell Networks - Cluster of Excellence, Heidelberg, Germany
- Moritz Mall
- Cell Fate Engineering and Disease Modeling Group, German Cancer Research Center (DKFZ) and DKFZ-ZMBH Alliance, 69120, Heidelberg, Germany
- HITBR Hector Institute for Translational Brain Research gGmbH, 69120, Heidelberg, Germany
- Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159, Mannheim, Germany
- Ingrid Lohmann
- Heidelberg University, Centre for Organismal Studies (COS) Heidelberg, Department of Developmental Biology and Cell Networks - Cluster of Excellence, Heidelberg, Germany
5
Zheng Y, Chen S. Transcriptional precision in photoreceptor development and diseases - Lessons from 25 years of CRX research. Front Cell Neurosci 2024; 18:1347436. [PMID: 38414750 PMCID: PMC10896975 DOI: 10.3389/fncel.2024.1347436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 01/19/2024] [Indexed: 02/29/2024]
Abstract
The vertebrate retina is made up of six specialized neuronal cell types and one glia that are generated from a common retinal progenitor. The development of these distinct cell types is programmed by transcription factors that regulate the expression of specific genes essential for cell fate specification and differentiation. Because of the complex nature of transcriptional regulation, understanding transcription factor functions in development and disease is challenging. Research on the Cone-rod homeobox transcription factor CRX provides an excellent model to address these challenges. In this review, we reflect on 25 years of mammalian CRX research and discuss recent progress in elucidating the distinct pathogenic mechanisms of four CRX coding variant classes. We highlight how in vitro biochemical studies of CRX protein functions facilitate understanding CRX regulatory principles in animal models. We conclude with a brief discussion of the emerging systems biology approaches that could accelerate precision medicine for CRX-linked diseases and beyond.
Affiliation(s)
- Yiqiao Zheng
- Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Saint Louis, MO, United States
- Department of Ophthalmology and Visual Sciences, Saint Louis, MO, United States
- Shiming Chen
- Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Saint Louis, MO, United States
- Department of Ophthalmology and Visual Sciences, Saint Louis, MO, United States
- Department of Developmental Biology, Washington University in St. Louis, Saint Louis, MO, United States
6
Loell KJ, Friedman RZ, Myers CA, Corbo JC, Cohen BA, White MA. Transcription factor interactions explain the context-dependent activity of CRX binding sites. PLoS Comput Biol 2024; 20:e1011802. [PMID: 38227575 PMCID: PMC10817189 DOI: 10.1371/journal.pcbi.1011802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 01/26/2024] [Accepted: 01/06/2024] [Indexed: 01/18/2024]
Abstract
The effects of transcription factor binding sites (TFBSs) on the activity of a cis-regulatory element (CRE) depend on the local sequence context. In rod photoreceptors, binding sites for the transcription factor (TF) Cone-rod homeobox (CRX) occur in both enhancers and silencers, but the sequence context that determines whether CRX binding sites contribute to activation or repression of transcription is not understood. To investigate the context-dependent activity of CRX sites, we fit neural network-based models to the activities of synthetic CREs composed of photoreceptor TFBSs. The models revealed that CRX binding sites consistently make positive, independent contributions to CRE activity, while negative homotypic interactions between sites cause CREs composed of multiple CRX sites to function as silencers. The effects of negative homotypic interactions can be overcome by the presence of other TFBSs that either interact cooperatively with CRX sites or make independent positive contributions to activity. The context-dependent activity of CRX sites is thus determined by the balance between positive heterotypic interactions, independent contributions of TFBSs, and negative homotypic interactions. Our findings explain observed patterns of activity among genomic CRX-bound enhancers and silencers, and suggest that enhancers may require diverse TFBSs to overcome negative homotypic interactions between TFBSs.
Affiliation(s)
- Kaiser J. Loell
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- Ryan Z. Friedman
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- Connie A. Myers
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- Joseph C. Corbo
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- Barak A. Cohen
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- Michael A. White
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
7
Shepherdson JL, Friedman RZ, Zheng Y, Sun C, Oh IY, Granas DM, Cohen BA, Chen S, White MA. Pathogenic variants in Crx have distinct cis-regulatory effects on enhancers and silencers in photoreceptors. bioRxiv 2023:2023.05.27.542576. [PMID: 37292699 PMCID: PMC10245955 DOI: 10.1101/2023.05.27.542576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Dozens of variants in the photoreceptor-specific transcription factor (TF) CRX are linked with human blinding diseases that vary in their severity and age of onset. It is unclear how different variants in this single TF alter its function in ways that lead to a range of phenotypes. We examined the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in live mouse retinas carrying knock-ins of two variants, one in the DNA binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation caused by the variants corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, while p.E168d2 has stronger effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are de-repressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci were partially predictive of episomal MPRA activity, and silencers were notably enriched among distal elements whose accessibility increases later in retinal development. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers, while having a qualitatively different impact on silencers.
Affiliation(s)
- James L. Shepherdson
- Department of Genetics
- Edison Family Center for Genome Sciences & Systems Biology
- Ryan Z. Friedman
- Department of Genetics
- Edison Family Center for Genome Sciences & Systems Biology
- Chi Sun
- Department of Ophthalmology and Visual Sciences
- Inez Y. Oh
- Department of Ophthalmology and Visual Sciences
- David M. Granas
- Department of Genetics
- Edison Family Center for Genome Sciences & Systems Biology
- Barak A. Cohen
- Department of Genetics
- Edison Family Center for Genome Sciences & Systems Biology
- Shiming Chen
- Department of Ophthalmology and Visual Sciences
- Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Michael A. White
- Department of Genetics
- Edison Family Center for Genome Sciences & Systems Biology
8
Wang J, Cheng X, Liang Q, Owen LA, Lu J, Zheng Y, Wang M, Chen S, DeAngelis MM, Li Y, Chen R. Single-cell multiomics of the human retina reveals hierarchical transcription factor collaboration in mediating cell type-specific effects of genetic variants on gene regulation. Genome Biol 2023; 24:269. [PMID: 38012720 PMCID: PMC10680294 DOI: 10.1186/s13059-023-03111-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 11/15/2023] [Indexed: 11/29/2023]
Abstract
BACKGROUND: Systematic characterization of how genetic variation modulates gene regulation in a cell type-specific context is essential for understanding complex traits. To address this question, we profile gene expression and chromatin accessibility in cells from healthy retinae of 20 human donors through single-cell multiomics and genomic sequencing.
RESULTS: We map eQTL, caQTL, allelic-specific expression, and allelic-specific chromatin accessibility in major retinal cell types. By integrating these results, we identify and characterize regulatory elements and genetic variants that affect gene regulation in individual cell types. The majority of identified sc-eQTLs and sc-caQTLs display cell type-specific effects, while the cis-elements containing genetic variants with cell type-specific effects are often accessible in multiple cell types. Furthermore, the transcription factors whose binding sites are perturbed by genetic variants tend to have higher expression levels in the cell types where the variants exert their effects than in the cell types where the variants have no impact. We further validate our findings with high-throughput reporter assays. Lastly, we identify the enriched cell types, candidate causal variants and genes, and cell type-specific regulatory mechanisms underlying GWAS loci.
CONCLUSIONS: Overall, genetic effects on gene regulation are highly context dependent. Our results suggest that cell type-dependent genetic effects are driven by precise modulation of both trans-factor expression and chromatin accessibility of cis-elements. Our findings indicate that hierarchical collaboration among transcription factors plays a crucial role in mediating cell type-specific effects of genetic variants on gene regulation.
Affiliation(s)
- Jun Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Xuesen Cheng
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Qingnan Liang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Leah A Owen
- Department of Ophthalmology and Visual Sciences, John A. Moran Eye Center, University of Utah, Salt Lake City, UT, USA
- Jiaxiong Lu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
- Meng Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, MO, USA
- Margaret M DeAngelis
- Department of Ophthalmology, University at Buffalo the State University of New York, Buffalo, NY, USA
- Yumei Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Rui Chen
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
9
Li J, Wang J, Ibarra IL, Cheng X, Luecken MD, Lu J, Monavarfeshani A, Yan W, Zheng Y, Zuo Z, Colborn SLZ, Cortez BS, Owen LA, Tran NM, Shekhar K, Sanes JR, Stout JT, Chen S, Li Y, DeAngelis MM, Theis FJ, Chen R. Integrated multi-omics single cell atlas of the human retina. Research Square 2023:rs.3.rs-3471275. [PMID: 38014002 PMCID: PMC10680922 DOI: 10.21203/rs.3.rs-3471275/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Single-cell sequencing has revolutionized the scale and resolution of molecular profiling of tissues and organs. Here, we present an integrated multimodal reference atlas of the most accessible portion of the mammalian central nervous system, the retina. We compiled around 2.4 million cells from 55 donors, including 1.4 million unpublished data points, to create a comprehensive human retina cell atlas (HRCA) of transcriptome and chromatin accessibility, unveiling over 110 cell types. Engaging the retina community, we annotated each cluster, refined the Cell Ontology for the retina, identified distinct marker genes, and characterized cis-regulatory elements and gene regulatory networks (GRNs) for these cell types. Our analysis uncovered intriguing differences in transcriptome, chromatin, and GRNs across cell types. In addition, we modeled changes in gene expression and chromatin openness across gender and age. This integrated atlas also enabled the fine-mapping of GWAS and eQTL variants. Accessible through interactive browsers, this multimodal, cross-donor, and cross-lab HRCA can facilitate a better understanding of retinal function and pathology.
Affiliation(s)
- Jin Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States
- Jun Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States
- Ignacio L Ibarra
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Xuesen Cheng
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States
- Malte D Luecken
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Institute of Lung Health & Immunity, Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany
- Jiaxiong Lu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, United States
- Aboozar Monavarfeshani
- Center for Brain Science and Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
| | - Wenjun Yan
- Center for Brain Science and Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
| | - Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, United States
| | - Zhen Zuo
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
| | | | | | - Leah A Owen
- John A. Moran Eye Center, Department of Ophthalmology and Visual Sciences, University of Utah, Salt Lake City, Utah, United States
| | - Nicholas M Tran
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
| | - Karthik Shekhar
- Department of Chemical and Biomolecular Engineering; Helen Wills Neuroscience Institute; Center for Computational Biology; California Institute for Quantitative Biosciences, QB3, University of California, Berkeley, Berkeley, California, United States
| | - Joshua R Sanes
- Center for Brain Science and Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
| | - J Timothy Stout
- Department of Ophthalmology, Cullen Eye Institute, Baylor College of Medicine, Houston, Texas, United States
| | - Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, United States
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, Missouri, United States
| | - Yumei Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States
| | - Margaret M DeAngelis
- Department of Ophthalmology, Ross Eye Institute, Jacobs School of Medicine and Biomedical Sciences, State University of New York at Buffalo, Buffalo, New York, United States
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Rui Chen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, United States
|
10
|
Friedman RZ, Ramu A, Lichtarge S, Myers CA, Granas DM, Gause M, Corbo JC, Cohen BA, White MA. Active learning of enhancer and silencer regulatory grammar in photoreceptors. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.21.554146. [PMID: 37662358 PMCID: PMC10473580 DOI: 10.1101/2023.08.21.554146] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/05/2023]
Abstract
Cis-regulatory elements (CREs) direct gene expression in health and disease, and models that can accurately predict their activities from DNA sequences are crucial for biomedicine. Deep learning represents one emerging strategy to model the regulatory grammar that relates CRE sequence to function. However, these models require training data on a scale that exceeds the number of CREs in the genome. We address this problem using active machine learning to iteratively train models on multiple rounds of synthetic DNA sequences assayed in live mammalian retinas. During each round of training the model actively selects sequence perturbations to assay, thereby efficiently generating informative training data. We iteratively trained a model that predicts the activities of sequences containing binding motifs for the photoreceptor transcription factor Cone-rod homeobox (CRX) using an order of magnitude less training data than current approaches. The model's internal confidence estimates of its predictions are reliable guides for designing sequences with high activity. The model correctly identified critical sequence differences between active and inactive sequences with nearly identical transcription factor binding sites, and revealed order and spacing preferences for combinations of motifs. Our results establish active learning as an effective method to train accurate deep learning models of cis-regulatory function after exhausting naturally occurring training examples in the genome.
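The selection step described in this abstract, in which the model chooses which sequences to assay next, can be illustrated with a generic uncertainty-sampling sketch. The abstract does not state the acquisition rule, so ranking candidates by disagreement across an ensemble is an assumption here, and `select_most_uncertain`, `ensemble`, and the toy scoring functions are hypothetical names used only for illustration.

```python
import statistics
from typing import Callable, List, Sequence


def select_most_uncertain(
    candidates: Sequence[str],
    ensemble: Sequence[Callable[[str], float]],
    batch_size: int,
) -> List[str]:
    """Return the candidates on which the ensemble members disagree most.

    Assumption: disagreement (population standard deviation of the member
    predictions) stands in for the model's confidence estimate.
    """
    def uncertainty(seq: str) -> float:
        predictions = [model(seq) for model in ensemble]
        return statistics.pstdev(predictions)

    # Highest-uncertainty sequences first; these would be assayed next.
    ranked = sorted(candidates, key=uncertainty, reverse=True)
    return ranked[:batch_size]


# Toy ensemble (hypothetical): three scorers that disagree on motif-bearing DNA.
ensemble = [
    lambda s: float(s.count("TAATCC")),
    lambda s: 0.5 * s.count("TAAT"),
    lambda s: 0.0,
]
next_batch = select_most_uncertain(["AAAA", "TAATCC", "GGGG"], ensemble, 1)
```

In a full loop, the selected batch would be measured, added to the training set, and the ensemble retrained before the next round of selection.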
Affiliation(s)
- Ryan Z. Friedman
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Avinash Ramu
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Sara Lichtarge
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Connie A. Myers
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- David M. Granas
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Maria Gause
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- Joseph C. Corbo
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- Barak A. Cohen
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Michael A. White
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
|
11
|
DelRosso N, Tycko J, Suzuki P, Andrews C, Aradhana, Mukund A, Liongson I, Ludwig C, Spees K, Fordyce P, Bassik MC, Bintu L. Large-scale mapping and mutagenesis of human transcriptional effector domains. Nature 2023; 616:365-372. [PMID: 37020022 PMCID: PMC10484233 DOI: 10.1038/s41586-023-05906-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Received: 07/14/2022] [Accepted: 03/01/2023] [Indexed: 04/07/2023]
Abstract
Human gene expression is regulated by more than 2,000 transcription factors and chromatin regulators [1,2]. Effector domains within these proteins can activate or repress transcription. However, for many of these regulators we do not know what type of effector domains they contain, their location in the protein, their activation and repression strengths, and the sequences that are necessary for their functions. Here, we systematically measure the effector activity of more than 100,000 protein fragments tiling across most chromatin regulators and transcription factors in human cells (2,047 proteins). By testing the effect they have when recruited at reporter genes, we annotate 374 activation domains and 715 repression domains, roughly 80% of which are new and have not been previously annotated [3-5]. Rational mutagenesis and deletion scans across all the effector domains reveal aromatic and/or leucine residues interspersed with acidic, proline, serine and/or glutamine residues are necessary for activation domain activity. Furthermore, most repression domain sequences contain sites for small ubiquitin-like modifier (SUMO)ylation, short interaction motifs for recruiting corepressors or are structured binding domains for recruiting other repressive proteins. We discover bifunctional domains that can both activate and repress, some of which dynamically split a cell population into high- and low-expression subpopulations. Our systematic annotation and characterization of effector domains provide a rich resource for understanding the function of human transcription factors and chromatin regulators, engineering compact tools for controlling gene expression and refining predictive models of effector domain function.
Affiliation(s)
- Josh Tycko
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Peter Suzuki
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Cecelia Andrews
  - Department of Developmental Biology, Stanford University, Stanford, CA, USA
- Aradhana
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Adi Mukund
  - Biophysics Program, Stanford University, Stanford, CA, USA
- Ivan Liongson
  - Department of Biology, Stanford University, Stanford, CA, USA
- Connor Ludwig
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Kaitlyn Spees
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Polly Fordyce
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
  - ChEM-H Institute, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Lacramioara Bintu
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
|
12
|
A single-cell massively parallel reporter assay detects cell-type-specific gene regulation. Nat Genet 2023; 55:346-354. [PMID: 36635387 PMCID: PMC9931678 DOI: 10.1038/s41588-022-01278-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 12/07/2021] [Accepted: 12/05/2022] [Indexed: 01/14/2023]
Abstract
Massively parallel reporter gene assays are key tools in regulatory genomics but cannot be used to identify cell-type-specific regulatory elements without performing assays serially across different cell types. To address this problem, we developed a single-cell massively parallel reporter assay (scMPRA) to measure the activity of libraries of cis-regulatory sequences (CRSs) across multiple cell types simultaneously. We assayed a library of core promoters in a mixture of HEK293 and K562 cells and showed that scMPRA is a reproducible, highly parallel, single-cell reporter gene assay that detects cell-type-specific cis-regulatory activity. We then measured a library of promoter variants across multiple cell types in live mouse retinas and showed that subtle genetic variants can produce cell-type-specific effects on cis-regulatory activity. We anticipate that scMPRA will be widely applicable for studying the role of CRSs across diverse cell types.
|
13
|
Pang B, van Weerd JH, Hamoen FL, Snyder MP. Identification of non-coding silencer elements and their regulation of gene expression. Nat Rev Mol Cell Biol 2022; 24:383-395. [DOI: 10.1038/s41580-022-00549-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/29/2022] [Indexed: 11/09/2022]
|
14
|
VandenBosch LS, Luu K, Timms AE, Challam S, Wu Y, Lee AY, Cherry TJ. Machine Learning Prediction of Non-Coding Variant Impact in Human Retinal cis-Regulatory Elements. Transl Vis Sci Technol 2022; 11:16. [PMID: 35435921 PMCID: PMC9034719 DOI: 10.1167/tvst.11.4.16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 03/25/2022] [Indexed: 11/24/2022]
Abstract
Purpose: Prior studies have demonstrated the significance of specific cis-regulatory variants in retinal disease; however, determining the functional impact of regulatory variants remains a major challenge. In this study, we utilized a machine learning approach, trained on epigenomic data from the adult human retina, to systematically quantify the predicted impact of cis-regulatory variants.
Methods: We used human retinal DNA accessibility data (ATAC-seq) to determine a set of 18.9k high-confidence, putative cis-regulatory elements. Eighty percent of these elements were used to train a machine learning model utilizing a gapped k-mer support vector machine-based approach. In silico saturation mutagenesis and variant scoring was applied to predict the functional impact of all potential single nucleotide variants within cis-regulatory elements. Impact scores were tested in a 20% hold-out dataset and compared to allele population frequency, phylogenetic conservation, transcription factor (TF) binding motifs, and existing massively parallel reporter assay data.
Results: We generated a model that distinguishes between human retinal regulatory elements and negative test sequences with 95% accuracy. Among a hold-out test set of 3.7k human retinal CREs, all possible single nucleotide variants were scored. Variants with negative impact scores correlated with higher phylogenetic conservation of the reference allele, disruption of predicted TF binding motifs, and massively parallel reporter expression.
Conclusions: We demonstrated the utility of human retinal epigenomic data to train a machine learning model for the purpose of predicting the impact of non-coding regulatory sequence variants. Our model accurately scored sequences and predicted putative transcription factor binding motifs. This approach has the potential to expedite the characterization of pathogenic non-coding sequence variants in the context of unexplained retinal disease.
Translational Relevance: This workflow and resulting dataset serve as a promising genomic tool to facilitate the clinical prioritization of functionally disruptive non-coding mutations in the retina.
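The in silico saturation mutagenesis step named in this abstract can be sketched as follows: enumerate every single nucleotide variant of a sequence and score each as the change in model output relative to the reference. This is a minimal illustration, not the authors' pipeline; `score_fn` stands in for a trained model such as a gapped k-mer SVM, and `toy_score`, which simply counts an illustrative TAATCC motif, is a hypothetical placeholder.

```python
from typing import Callable, Dict, Iterator, Tuple

BASES = "ACGT"


def single_nucleotide_variants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref base, alt base, mutated sequence) for every SNV."""
    seq = seq.upper()
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt, seq[:pos] + alt + seq[pos + 1:]


def impact_scores(
    seq: str, score_fn: Callable[[str], float]
) -> Dict[Tuple[int, str, str], float]:
    """Impact = score(variant) - score(reference).

    A negative value means the variant is predicted to reduce activity.
    """
    ref_score = score_fn(seq.upper())
    return {
        (pos, ref, alt): score_fn(mutated) - ref_score
        for pos, ref, alt, mutated in single_nucleotide_variants(seq)
    }


def toy_score(seq: str) -> float:
    # Hypothetical stand-in for a trained model: counts one illustrative motif.
    return float(seq.count("TAATCC"))


scores = impact_scores("GGTAATCCGG", toy_score)
```

A sequence of length n yields 3n variants, so scoring every element in a set of thousands of regulatory elements is cheap once the model is trained.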
Affiliation(s)
- Leah S. VandenBosch
  - Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA, USA
- Kelsey Luu
  - Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA, USA
- Andrew E. Timms
  - Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA, USA
- Shriya Challam
  - Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA, USA
- Yue Wu
  - University of Washington Department of Ophthalmology, Seattle, WA, USA
- Aaron Y. Lee
  - University of Washington Department of Ophthalmology, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Timothy J. Cherry
  - Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - University of Washington Department of Pediatrics, Seattle, WA, USA
|
15
|
Friedman RZ, Granas DM, Myers CA, Corbo JC, Cohen BA, White MA. Information content differentiates enhancers from silencers in mouse photoreceptors. eLife 2021; 10:67403. [PMID: 34486522 PMCID: PMC8492058 DOI: 10.7554/elife.67403] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/09/2021] [Accepted: 09/03/2021] [Indexed: 12/12/2022]
Abstract
Enhancers and silencers often depend on the same transcription factors (TFs) and are conflated in genomic assays of TF binding or chromatin state. To identify sequence features that distinguish enhancers and silencers, we assayed massively parallel reporter libraries of genomic sequences targeted by the photoreceptor TF cone-rod homeobox (CRX) in mouse retinas. Both enhancers and silencers contain more TF motifs than inactive sequences, but relative to silencers, enhancers contain motifs from a more diverse collection of TFs. We developed a measure of information content that describes the number and diversity of motifs in a sequence and found that, while both enhancers and silencers depend on CRX motifs, enhancers have higher information content. The ability of information content to distinguish enhancers and silencers targeted by the same TF illustrates how motif context determines the activity of cis-regulatory sequences.
Different cell types are established by activating and repressing the activity of specific sets of genes, a process controlled by proteins called transcription factors. Transcription factors work by recognizing and binding short stretches of DNA in parts of the genome called cis-regulatory sequences. A cis-regulatory sequence that increases the activity of a gene when bound by transcription factors is called an enhancer, while a sequence that causes a decrease in gene activity is called a silencer. To establish a cell type, a particular transcription factor will act on both enhancers and silencers that control the activity of different genes. For example, the transcription factor cone-rod homeobox (CRX) is critical for specifying different types of cells in the retina, and it acts on both enhancers and silencers. In rod photoreceptors, CRX activates rod genes by binding their enhancers, while repressing cone photoreceptor genes by binding their silencers.
However, CRX always recognizes and binds to the same DNA sequence, known as its binding site, making it unclear why some cis-regulatory sequences bound to CRX act as silencers, while others act as enhancers. Friedman et al. sought to understand how enhancers and silencers, both bound by CRX, can have different effects on the genes they control. Since both enhancers and silencers contain CRX binding sites, the difference between the two must lie in the sequence of the DNA surrounding these binding sites. Using retinas that have been explanted from mice and kept alive in the laboratory, Friedman et al. tested the activity of thousands of CRX-binding sequences from the mouse genome. This showed that both enhancers and silencers have more copies of CRX-binding sites than sequences of the genome that are inactive. Additionally, the results revealed that enhancers have a diverse collection of binding sites for other transcription factors, while silencers do not. Friedman et al. developed a new metric they called information content, which captures the diverse combinations of different transcription factor binding sites that cis-regulatory sequences can have. Using this metric, Friedman et al. showed that it is possible to distinguish enhancers from silencers based on their information content. It is critical to understand how the DNA sequences of cis-regulatory regions determine their activity, because mutations in these regions of the genome can cause disease. However, since every person has thousands of benign mutations in cis-regulatory sequences, it is a challenge to identify specific disease-causing mutations, which are relatively rare. One long-term goal of models of enhancers and silencers, such as Friedman et al.’s information content model, is to understand how mutations can affect cis-regulatory sequences, and, in some cases, lead to disease.
Affiliation(s)
- Ryan Z Friedman
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, United States
  - Department of Genetics, Washington University School of Medicine, St. Louis, United States
- David M Granas
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, United States
  - Department of Genetics, Washington University School of Medicine, St. Louis, United States
- Connie A Myers
  - Department of Pathology and Immunology, Washington University School of Medicine, St Louis, United States
- Joseph C Corbo
  - Department of Pathology and Immunology, Washington University School of Medicine, St Louis, United States
- Barak A Cohen
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, United States
  - Department of Genetics, Washington University School of Medicine, St. Louis, United States
- Michael A White
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, United States
  - Department of Genetics, Washington University School of Medicine, St. Louis, United States
|