1
Bower G, Kvon EZ. Genetic factors mediating long-range enhancer-promoter communication in mammalian development. Curr Opin Genet Dev 2025; 90:102282. [PMID: 39579740 DOI: 10.1016/j.gde.2024.102282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 10/20/2024] [Accepted: 10/28/2024] [Indexed: 11/25/2024]
Abstract
Enhancers are remotely located noncoding DNA sequences that regulate gene expression in response to developmental, homeostatic, and environmental cues. Canonical short-range enhancers located <50 kb from their cognate promoters function by binding transcription factors, coactivators, and chromatin modifiers. In this review, we discuss recent evidence that medium-range (50-400 kb) and long-range (>400 kb) enhancers rely on additional mechanisms, including cohesin, CCCTC-binding factor, and high-affinity protein-protein interactions. These mechanisms are crucial for establishing the physical proximity and interaction between enhancers and their target promoters over extended genomic distances and ensuring robust gene activation during mammalian development. Future studies will be critical to unravel their prevalence and evolutionary significance across various genomic loci, cell types, and species.
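The distance classes named in the abstract above can be written as a small helper. This is only an illustrative sketch: the function name `classify_enhancer_range` is hypothetical, and assigning the exact boundary values (50 kb and 400 kb) to the medium-range class is an assumption, since the abstract gives the ranges only as <50 kb, 50-400 kb, and >400 kb.

```python
def classify_enhancer_range(distance_bp: int) -> str:
    """Classify an enhancer by its distance (in bp) to its target promoter.

    Thresholds follow the abstract: <50 kb short-range, 50-400 kb
    medium-range, >400 kb long-range. Treating exactly 50 kb and exactly
    400 kb as medium-range is an assumption made for this sketch.
    """
    if distance_bp < 0:
        raise ValueError("distance_bp must be non-negative")
    distance_kb = distance_bp / 1_000
    if distance_kb < 50:
        return "short-range"
    if distance_kb <= 400:
        return "medium-range"
    return "long-range"


# Hypothetical distances, chosen only to exercise each class.
for bp in (10_000, 120_000, 850_000):
    print(bp, classify_enhancer_range(bp))
```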
Affiliation(s)
- Grace Bower
- Department of Developmental and Cell Biology, University of California, Irvine, CA 92967, USA. https://twitter.com/@gracecbower
| | - Evgeny Z Kvon
- Department of Developmental and Cell Biology, University of California, Irvine, CA 92967, USA.
2
Kassouf MT, Francis HS, Gosden M, Suciu MC, Downes DJ, Harrold C, Larke M, Oudelaar M, Cornell L, Blayney J, Telenius J, Xella B, Shen Y, Sousos N, Sharpe JA, Sloane-Stanley J, Smith AJH, Babbs C, Hughes JR, Higgs DR. The α-globin super-enhancer acts in an orientation-dependent manner. Nat Commun 2025; 16:1033. [PMID: 39863595 PMCID: PMC11762767 DOI: 10.1038/s41467-025-56380-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2024] [Accepted: 01/16/2025] [Indexed: 01/27/2025] Open
Abstract
Individual enhancers are defined as short genomic regulatory elements, bound by transcription factors, and able to activate cell-specific gene expression at a distance, in an orientation-independent manner. Within mammalian genomes, enhancer-like elements may be found individually or within clusters referred to as locus control regions or super-enhancers (SEs). While these behave similarly to individual enhancers with respect to cell specificity, distribution and distance, their orientation-dependence has not been formally tested. Here, using the α-globin locus as a model, we show that while an individual enhancer works in an orientation-independent manner, the direction of activity of a SE changes with its orientation. When the SE is inverted within its normal chromosomal context, expression of its normal targets, the α-globin genes, is severely reduced and the normally silent genes lying upstream of the α-globin locus are upregulated. These findings add to our understanding of the enhancer-promoter specificity that precisely activates transcription.
Affiliation(s)
- Mira T Kassouf
- Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK.
| | - Helena S Francis
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Matthew Gosden
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Maria C Suciu
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Damien J Downes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Caroline Harrold
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Martin Larke
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Marieke Oudelaar
- Max Planck Institute for Multidisciplinary Sciences, 37077, Gottingen, Germany
| | - Lucy Cornell
- Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Joseph Blayney
- Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Jelena Telenius
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Barbara Xella
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Yuki Shen
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Nikolaos Sousos
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Jacqueline A Sharpe
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Jacqueline Sloane-Stanley
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Andrew J H Smith
- Institute for Regeneration and Repair, MRC Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, Scotland, EH16 4UU, UK
| | - Christian Babbs
- Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Jim R Hughes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK
| | - Douglas R Higgs
- Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK.
3
Du Y, Yang Y, Zheng B, Zhang Q, Zhou S, Zhao L. Finding a needle in a haystack: functional screening for novel targets in cancer immunology and immunotherapies. Oncogene 2025:10.1038/s41388-025-03273-8. [PMID: 39863748 DOI: 10.1038/s41388-025-03273-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Revised: 12/06/2024] [Accepted: 01/14/2025] [Indexed: 01/27/2025]
Abstract
Genome-wide functional genetic screening has been widely used in biomedicine, making it possible to find a needle in a haystack at the genetic level. In cancer research, gene mutations are closely related to tumor development, metastasis, and recurrence. The use of state-of-the-art screening technologies, such as clustered regularly interspaced short palindromic repeats (CRISPR), to search for the most critical genes or coding products has further refined the cancer atlas and provided new possibilities for the treatment of cancer patients. Immunotherapy, a highly promising cancer treatment method, has been widely validated in the clinic, but it meets the needs of only a small proportion of cancer patients. Finding new immunotherapy targets is the key to the future of tumor immunotherapy. Here, we revisit the application of functional screening in cancer immunology from different perspectives, from the selection of diverse in vitro and in vivo screening models to the screening of potential immune checkpoints and potentiating genes for CAR-T cells. The data will offer fresh therapeutic clues for cancer patients.
Affiliation(s)
- Yi Du
- Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, West China Second Hospital, State Key Laboratory of Biotherapy, and Department of Neurosurgery, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, P. R. China
| | - Yang Yang
- Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, West China Second Hospital, State Key Laboratory of Biotherapy, and Department of Neurosurgery, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, P. R. China
| | - Bohao Zheng
- Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, West China Second Hospital, State Key Laboratory of Biotherapy, and Department of Neurosurgery, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, P. R. China
- Wuxi School of Medicine, Jiangnan University, Wuxi, Jiangsu, China
| | - Qian Zhang
- Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, West China Second Hospital, State Key Laboratory of Biotherapy, and Department of Neurosurgery, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, P. R. China.
| | - Shengtao Zhou
- Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, West China Second Hospital, State Key Laboratory of Biotherapy, and Department of Neurosurgery, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, P. R. China.
| | - Linjie Zhao
- Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, West China Second Hospital, State Key Laboratory of Biotherapy, and Department of Neurosurgery, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, P. R. China.
4
Metzner E, Southard KM, Norman TM. Multiome Perturb-seq unlocks scalable discovery of integrated perturbation effects on the transcriptome and epigenome. Cell Syst 2025; 16:101161. [PMID: 39689711 PMCID: PMC11738662 DOI: 10.1016/j.cels.2024.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 10/14/2024] [Accepted: 12/04/2024] [Indexed: 12/19/2024]
Abstract
Single-cell CRISPR screens link genetic perturbations to transcriptional states, but high-throughput methods connecting these induced changes to their regulatory foundations are limited. Here, we introduce Multiome Perturb-seq, extending single-cell CRISPR screens to simultaneously measure perturbation-induced changes in gene expression and chromatin accessibility. We apply Multiome Perturb-seq in a CRISPRi screen of 13 chromatin remodelers in human RPE-1 cells, achieving efficient assignment of sgRNA identities to single nuclei via an improved method for capturing barcode transcripts from nuclear RNA. We organize expression and accessibility measurements into coherent programs describing the integrated effects of perturbations on cell state, finding that ARID1A and SUZ12 knockdowns induce programs enriched for developmental features. Modeling of perturbation-induced heterogeneity connects accessibility changes to changes in gene expression, highlighting the value of multimodal profiling. Overall, our method provides a scalable and simply implemented system to dissect the regulatory logic underpinning cell state. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Eli Metzner
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
| | - Kaden M Southard
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Thomas M Norman
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
5
Linder J, Srivastava D, Yuan H, Agarwal V, Kelley DR. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nat Genet 2025:10.1038/s41588-024-02053-6. [PMID: 39779956 DOI: 10.1038/s41588-024-02053-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 12/04/2024] [Indexed: 01/11/2025]
Abstract
Sequence-based machine-learning models trained on genomics data improve genetic variant interpretation by providing functional predictions describing their impact on the cis-regulatory code. However, current tools do not predict RNA-seq expression profiles because of modeling challenges. Here, we introduce Borzoi, a model that learns to predict cell-type-specific and tissue-specific RNA-seq coverage from DNA sequence. Using statistics derived from Borzoi's predicted coverage, we isolate and accurately score DNA variant effects across multiple layers of regulation, including transcription, splicing and polyadenylation. Evaluated on quantitative trait loci, Borzoi is competitive with and often outperforms state-of-the-art models trained on individual regulatory functions. By applying attribution methods to the derived statistics, we extract cis-regulatory motifs driving RNA expression and post-transcriptional regulation in normal tissues. The wide availability of RNA-seq data across species, conditions and assays profiling specific aspects of regulation emphasizes the potential of this approach to decipher the mapping from DNA sequence to regulatory function.
Affiliation(s)
| | - Han Yuan
- Calico Life Sciences LLC, South San Francisco, CA, USA
| | - Vikram Agarwal
- mRNA Center of Excellence, Sanofi Pasteur Inc., Cambridge, MA, USA
6
Dunn J, Moore C, Kim NS, Gao T, Cheng Z, Jin P, Ming GL, Qian J, Su Y, Song H, Zhu H. Transcription Factor-Wide Association Studies to Identify Functional SNPs in Alzheimer's Disease. J Neurosci 2025; 45:e1800242024. [PMID: 39622643 PMCID: PMC11714347 DOI: 10.1523/jneurosci.1800-24.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2024] [Revised: 11/01/2024] [Accepted: 11/08/2024] [Indexed: 12/12/2024] Open
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative disorder with profound global impact. While genome-wide association studies (GWAS) have revealed genomic variants linked to AD, their translational impact has been limited due to challenges in interpreting the identified genetic associations. To address this challenge, we have devised a novel approach termed transcription factor-wide association studies (TF-WAS). By integrating the GWAS, expression quantitative trait loci, and transcriptome analyses, we selected 30 AD single nucleotide polymorphisms (SNPs) in noncoding regions that are likely to be functional. Using human transcription factor (TF) microarrays, we have identified 90 allele-specific TF interactions with 53 unique TFs. We then focused on several interactions involving SMAD4 and further validated them using electrophoretic mobility shift assay, luciferase, and chromatin immunoprecipitation on engineered genetic backgrounds (female cells). This approach holds promise for unraveling the intricacies of not just AD, but any complex disease with available GWAS data, providing insight into underlying molecular mechanisms and clues toward potential therapeutic targets.
Affiliation(s)
- Jessica Dunn
- Department of Pharmacology, Johns Hopkins University, Baltimore, Maryland 21205
| | - Cedric Moore
- Department of Pharmacology, Johns Hopkins University, Baltimore, Maryland 21205
| | - Nam-Shik Kim
- Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | - Tianshun Gao
- Department of Ophthalmology, Johns Hopkins University, Baltimore, Maryland 21205
| | - Zhiqiang Cheng
- Department of Pharmacology, Johns Hopkins University, Baltimore, Maryland 21205
| | - Peng Jin
- Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia 30322
| | - Guo-Li Ming
- Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | - Jiang Qian
- Department of Ophthalmology, Johns Hopkins University, Baltimore, Maryland 21205
| | - Yijing Su
- Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Department of Oral Medicine, School of Dental Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | - Hongjun Song
- Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | - Heng Zhu
- Department of Pharmacology, Johns Hopkins University, Baltimore, Maryland 21205
7
Cheng W, Song Z, Zhang Y, Wang S, Wang D, Yang M, Li L, Ma J. DNALongBench: A Benchmark Suite for Long-Range DNA Prediction Tasks. bioRxiv 2025:2025.01.06.631595. [PMID: 39829833 PMCID: PMC11741265 DOI: 10.1101/2025.01.06.631595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Modeling long-range DNA dependencies is crucial for understanding genome structure and function across a wide range of biological contexts. However, effectively capturing these extensive dependencies, which may span millions of base pairs in tasks such as three-dimensional (3D) chromatin folding prediction, remains a significant challenge. Furthermore, a comprehensive benchmark suite for evaluating tasks that rely on long-range dependencies is notably absent. To address this gap, we introduce DNALongBench, a benchmark dataset encompassing five important genomics tasks that consider long-range dependencies up to 1 million base pairs: enhancer-target gene interaction, expression quantitative trait loci, 3D genome organization, regulatory sequence activity, and transcription initiation signals. To comprehensively assess DNALongBench, we evaluate the performance of five methods: a task-specific expert model, a convolutional neural network (CNN)-based model, and three fine-tuned DNA foundation models - HyenaDNA, Caduceus-Ph, and Caduceus-PS. We envision DNALongBench as a standardized resource with the potential to facilitate comprehensive comparisons and rigorous evaluations of emerging DNA sequence-based deep learning models that account for long-range dependencies.
Affiliation(s)
- Wenduo Cheng
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Zhenqiao Song
- Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Yang Zhang
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Shike Wang
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Danqing Wang
- Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Muyu Yang
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Lei Li
- Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Jian Ma
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
8
Conery M, Pippin JA, Wagley Y, Trang K, Pahl MC, Villani DA, Favazzo LJ, Ackert-Bicknell CL, Zuscik MJ, Katsevich E, Wells AD, Zemel BS, Voight BF, Hankenson KD, Chesi A, Grant SF. GWAS-Informed data integration and non-coding CRISPRi screen illuminate genetic etiology of bone mineral density. bioRxiv 2024:2024.03.19.585778. [PMID: 38562830 PMCID: PMC10983984 DOI: 10.1101/2024.03.19.585778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Over 1,100 independent signals have been identified with genome-wide association studies (GWAS) for bone mineral density (BMD), a key risk factor for mortality-increasing fragility fractures; however, the effector gene(s) for most remain unknown. Informed by a variant-to-gene mapping strategy implicating 89 non-coding elements predicted to regulate osteoblast gene expression at BMD GWAS loci, we executed a single-cell CRISPRi screen in human fetal osteoblasts (hFOBs). The BMD relevance of hFOBs was supported by heritability enrichment from stratified LD-score regression involving 98 cell types grouped into 15 tissues. 23 genes showed perturbation in the screen, with four (ARID5B, CC2D1B, EIF4G2, and NCOA3) exhibiting consistent effects upon siRNA knockdown on three measures of osteoblast maturation and mineralization. Lastly, additional heritability enrichments, genetic correlations, and multi-trait fine-mapping revealed unexpectedly that many BMD GWAS signals are pleiotropic and likely mediate their effects via non-bone tissues. Extending our CRISPRi screening approach to these tissues could play a key role in fully elucidating the etiology of BMD.
Affiliation(s)
- Mitchell Conery
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - James A. Pippin
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Yadav Wagley
- Department of Orthopaedic Surgery, University of Michigan Medical School, Ann Arbor, MI 48109
| | - Khanh Trang
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Matthew C. Pahl
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - David A. Villani
- Colorado Program for Musculoskeletal Research, University of Colorado Anschutz Medical Campus, Aurora, CO
- Cell Biology, Stems Cells and Development Ph.D. Program, University of Colorado Anschutz Medical Campus, Aurora, CO
| | - Lacey J. Favazzo
- Colorado Program for Musculoskeletal Research, University of Colorado Anschutz Medical Campus, Aurora, CO
- Department of Orthopedics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States
- University of Colorado Interdisciplinary Joint Biology Program, University of Colorado Anschutz Medical Campus, Aurora, CO
| | - Cheryl L. Ackert-Bicknell
- Colorado Program for Musculoskeletal Research, University of Colorado Anschutz Medical Campus, Aurora, CO
- Department of Orthopedics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States
- University of Colorado Interdisciplinary Joint Biology Program, University of Colorado Anschutz Medical Campus, Aurora, CO
| | - Michael J. Zuscik
- Colorado Program for Musculoskeletal Research, University of Colorado Anschutz Medical Campus, Aurora, CO
- Department of Orthopedics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States
- University of Colorado Interdisciplinary Joint Biology Program, University of Colorado Anschutz Medical Campus, Aurora, CO
| | - Eugene Katsevich
- Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Andrew D. Wells
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Babette S. Zemel
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Division of Gastroenterology, Hepatology and Nutrition, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Benjamin F. Voight
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute of Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Kurt D. Hankenson
- Department of Orthopaedic Surgery, University of Michigan Medical School, Ann Arbor, MI 48109
| | - Alessandra Chesi
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Struan F.A. Grant
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute of Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Division of Endocrinology and Diabetes, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
9
Wan J, van Ouwerkerk A, Mouren JC, Heredia C, Pradel L, Ballester B, Andrau JC, Spicuglia S. Comprehensive mapping of genetic variation at Epromoters reveals pleiotropic association with multiple disease traits. Nucleic Acids Res 2024:gkae1270. [PMID: 39727170 DOI: 10.1093/nar/gkae1270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 10/28/2024] [Accepted: 12/19/2024] [Indexed: 12/28/2024] Open
Abstract
There is growing evidence that a wide range of human diseases and physiological traits are influenced by genetic variation of cis-regulatory elements. We and others have shown that a subset of promoter elements, termed Epromoters, also function as enhancer regulators of distal genes. This opens a paradigm in the study of regulatory variants, as single nucleotide polymorphisms (SNPs) within Epromoters might influence the expression of several (distal) genes at the same time, which could disentangle the identification of disease-associated genes. Here, we built a comprehensive resource of human Epromoters using newly generated and publicly available high-throughput reporter assays. We showed that Epromoters display intrinsic and epigenetic features that distinguish them from typical promoters. By integrating Genome-Wide Association Studies (GWAS), expression Quantitative Trait Loci (eQTLs) and 3D chromatin interactions, we found that regulatory variants at Epromoters are concurrently associated with more disease and physiological traits, as compared with typical promoters. To dissect the regulatory impact of Epromoter variants, we evaluated their impact on regulatory activity by analyzing allelic-specific high-throughput reporter assays and provided reliable examples of pleiotropic Epromoters. In summary, our study represents a comprehensive resource of regulatory variants supporting the pleiotropic role of Epromoters.
Affiliation(s)
- Jing Wan
- Aix-Marseille University, INSERM, TAGC, UMR 1090 Marseille, France
- Equipe Labellisée LIGUE, 2023 Marseille, France
| | - Antoinette van Ouwerkerk
- Aix-Marseille University, INSERM, TAGC, UMR 1090 Marseille, France
- Equipe Labellisée LIGUE, 2023 Marseille, France
| | - Carla Heredia
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, UMR 5535, Montpellier, France
| | - Lydie Pradel
- Aix-Marseille University, INSERM, TAGC, UMR 1090 Marseille, France
- Equipe Labellisée LIGUE, 2023 Marseille, France
| | - Benoit Ballester
- Aix-Marseille University, INSERM, TAGC, UMR 1090 Marseille, France
| | - Jean-Christophe Andrau
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, UMR 5535, Montpellier, France
| | - Salvatore Spicuglia
- Aix-Marseille University, INSERM, TAGC, UMR 1090 Marseille, France
- Equipe Labellisée LIGUE, 2023 Marseille, France
10
Golov AK, Gavrilov AA, Kaplan N, Razin SV. A genome-wide nucleosome-resolution map of promoter-centered interactions in human cells corroborates the enhancer-promoter looping model. eLife 2024; 12:RP91596. [PMID: 39688903 DOI: 10.7554/elife.91596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2024] Open
Abstract
The enhancer-promoter looping model, in which enhancers activate their target genes via physical contact, has long dominated the field of gene regulation. However, the ubiquity of this model has been questioned due to evidence of alternative mechanisms and the lack of its systematic validation, primarily owing to the absence of suitable experimental techniques. In this study, we present a new MNase-based proximity ligation method called MChIP-C, allowing for the measurement of protein-mediated chromatin interactions at single-nucleosome resolution on a genome-wide scale. By applying MChIP-C to study H3K4me3 promoter-centered interactions in K562 cells, we found that it had greatly improved resolution and sensitivity compared to restriction endonuclease-based C-methods. This allowed us to identify EP300 histone acetyltransferase and the SWI/SNF remodeling complex as potential candidates for establishing and/or maintaining enhancer-promoter interactions. Finally, leveraging data from published CRISPRi screens, we found that most functionally verified enhancers do physically interact with their cognate promoters, supporting the enhancer-promoter looping model.
Affiliation(s)
- Arkadiy K Golov
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russian Federation
- Department of Physiology, Biophysics & Systems Biology, Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
| | - Alexey A Gavrilov
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russian Federation
| | - Noam Kaplan
- Department of Physiology, Biophysics & Systems Biology, Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
| | - Sergey V Razin
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russian Federation
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russian Federation
| |
|
11
|
Dong G, Wu Y, Huang L, Li F, Zhou F. TExCNN: Leveraging Pre-Trained Models to Predict Gene Expression from Genomic Sequences. Genes (Basel) 2024; 15:1593. [PMID: 39766860 PMCID: PMC11675716 DOI: 10.3390/genes15121593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2024] [Revised: 12/02/2024] [Accepted: 12/10/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND/OBJECTIVES Understanding the relationship between DNA sequences and gene expression levels is of significant biological importance. Recent advancements have demonstrated the ability of deep learning to predict gene expression levels directly from genomic data. However, traditional methods are limited by basic word encoding techniques, which fail to capture the inherent features and patterns of DNA sequences. METHODS We introduce TExCNN, a novel framework that integrates the pre-trained models DNABERT and DNABERT-2 to generate word embeddings for DNA sequences. We partitioned the DNA sequences into manageable segments and computed their respective embeddings using the pre-trained models. These embeddings were then utilized as inputs to our deep learning framework, which was based on a convolutional neural network. RESULTS TExCNN outperformed current state-of-the-art models, achieving an average R² score of 0.622, compared to the 0.596 score achieved by the DeepLncLoc model, which is based on the Word2Vec model and a text convolutional neural network. Furthermore, when the sequence length was extended from 10,500 bp to 50,000 bp, TExCNN achieved an even higher average R² score of 0.639. The prediction accuracy improved further when additional biological features were incorporated. CONCLUSIONS Our experimental results demonstrate that the use of pre-trained models for word embedding generation significantly improves the accuracy of predicting gene expression. The proposed TExCNN pipeline performs optimally with longer DNA sequences and is adaptable for both cell-type-independent and cell-type-dependent predictions.
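As a rough illustration of two ideas in this abstract, the sketch below splits a sequence into fixed-length segments and computes the coefficient of determination (R²) used as the reported metric. It is not the TExCNN code: the function names, the toy sequence, and the segment length of 4 are assumptions chosen for readability, and the abstract does not state how the authors segmented their inputs.

```python
def split_into_segments(sequence, segment_length):
    """Split a DNA string into consecutive, non-overlapping segments.

    The final segment may be shorter than segment_length.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    return [sequence[i:i + segment_length]
            for i in range(0, len(sequence), segment_length)]


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy inputs (hypothetical): a 10-base sequence and made-up expression values.
print(split_into_segments("ACGTACGTAC", 4))
print(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

Predicting the mean of the observed values for every gene gives R² = 0, which is why scores such as 0.622 and 0.639 are read as the fraction of expression variance explained beyond that baseline.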
Affiliation(s)
- Guohao Dong
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Yuqian Wu
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- College of Software, Jilin University, Changchun 130012, China
| | - Lan Huang
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Fei Li
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Fengfeng Zhou
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
| |
|
12
|
Battivelli D, Fan Z, Hu H, Gross CT. How can ethology inform the neuroscience of fear, aggression and dominance? Nat Rev Neurosci 2024; 25:809-819. [PMID: 39402310 DOI: 10.1038/s41583-024-00858-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/20/2024] [Indexed: 11/20/2024]
Abstract
The study of behaviour is dominated by two approaches. On the one hand, ethologists aim to understand how behaviour promotes adaptation to natural contexts. On the other, neuroscientists aim to understand the molecular, cellular, circuit and psychological origins of behaviour. These two complementary approaches must be combined to arrive at a full understanding of behaviour in its natural setting. However, methodological limitations have restricted most neuroscientific research to the study of how discrete sensory stimuli elicit simple behavioural responses under controlled laboratory conditions that are only distantly related to those encountered in real life. Fortunately, the recent advent of neural monitoring and manipulation tools adapted for use in freely behaving animals has enabled neuroscientists to incorporate naturalistic behaviours into their studies and to begin to consider ethological questions. Here, we examine the promises and pitfalls of this trend by describing how investigations of rodent fear, aggression and dominance behaviours are changing to take advantage of an ethological appreciation of behaviour. We lay out current impediments to this approach and propose a framework for the evolution of the field that will allow us to take maximal advantage of an ethological approach to neuroscience and to increase its relevance for understanding human behaviour.
Affiliation(s)
- Dorian Battivelli
- Epigenetics & Neurobiology Unit, EMBL Rome, European Molecular Biology Laboratory, Monterotondo, Italy
| | - Zhengxiao Fan
- School of Brain Science and Brain Medicine, New Cornerstone Science Laboratory, Zhejiang University School of Medicine, Hangzhou, China
| | - Hailan Hu
- School of Brain Science and Brain Medicine, New Cornerstone Science Laboratory, Zhejiang University School of Medicine, Hangzhou, China.
| | - Cornelius T Gross
- Epigenetics & Neurobiology Unit, EMBL Rome, European Molecular Biology Laboratory, Monterotondo, Italy.
| |
|
13
|
Lee SH, Park J, Hwang B. Multiplexed multimodal single-cell technologies: From observation to perturbation analysis. Mol Cells 2024; 47:100147. [PMID: 39522648 PMCID: PMC11626049 DOI: 10.1016/j.mocell.2024.100147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Revised: 10/17/2024] [Accepted: 11/03/2024] [Indexed: 11/16/2024] Open
Abstract
Single-cell technologies have undergone a significant transformation, expanding from their initial focus on transcriptomics to encompass a diverse range of modalities. Recent advancements have markedly improved scalability and reduced costs, facilitating the processing of larger cell populations and broadening the scope of single-cell research. The incorporation of clustered regularly interspaced short palindromic repeats (CRISPR)-based perturbations has revolutionized the field by enabling precise functional genomics and detailed studies of gene regulation at the single-cell level. Despite these advancements, challenges persist, particularly in achieving genome-wide perturbations and managing the complexity of high-throughput data. This review discusses the technological milestones that have driven these changes, the current limitations of single-cell CRISPR technologies, and the future directions needed to address these challenges and advance our understanding of cellular biology.
Affiliation(s)
- Su-Hyeon Lee
- Department of Biomedical Sciences, Yonsei University College of Medicine, Seoul, South Korea
| | - Junha Park
- Yonsei University College of Medicine, Seoul, South Korea
| | - Byungjin Hwang
- Department of Biomedical Sciences, Yonsei University College of Medicine, Seoul, South Korea; Brain Korea 21 Project, Graduate School of Medical Science, Yonsei University College of Medicine, Seoul, South Korea.
| |
|
14
|
Fair T, Pavlovic BJ, Swope D, Castillo OE, Schaefer NK, Pollen AA. Mapping cis- and trans-regulatory target genes of human-specific deletions. bioRxiv 2024:2023.12.27.573461. [PMID: 38234800 PMCID: PMC10793408 DOI: 10.1101/2023.12.27.573461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Deletion of functional sequence is predicted to represent a fundamental mechanism of molecular evolution [1,2]. Comparative genetic studies of primates [2,3] have identified thousands of human-specific deletions (hDels), and the cis-regulatory potential of short (≤31 base pairs) hDels has been assessed using reporter assays [4]. However, how structural variant-sized (≥50 base pairs) hDels influence molecular and cellular processes in their native genomic contexts remains unexplored. Here, we design genome-scale libraries of single-guide RNAs targeting 7.2 megabases of sequence in 6,358 hDels and present a systematic CRISPR interference (CRISPRi) screening approach to identify hDels that modify cellular proliferation in chimpanzee pluripotent stem cells. By intersecting hDels with chromatin state features and performing single-cell CRISPRi (Perturb-seq) to identify their cis- and trans-regulatory target genes, we discovered 20 hDels controlling gene expression. We highlight two hDels, hDel_2247 and hDel_585, with tissue-specific activity in the brain. Our findings reveal a molecular and cellular role for sequences lost in the human lineage and establish a framework for functionally interrogating human-specific genetic variants.
Affiliation(s)
- Tyler Fair
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Bryan J Pavlovic
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Dani Swope
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Octavio E Castillo
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Nathan K Schaefer
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Alex A Pollen
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| |
|
15
|
Zhou JL, Guruvayurappan K, Toneyan S, Chen HV, Chen AR, Koo P, McVicker G. Analysis of single-cell CRISPR perturbations indicates that enhancers predominantly act multiplicatively. Cell Genomics 2024; 4:100672. [PMID: 39406234 PMCID: PMC11605691 DOI: 10.1016/j.xgen.2024.100672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 07/08/2024] [Accepted: 09/16/2024] [Indexed: 10/30/2024]
Abstract
A single gene may have multiple enhancers, but how they work in concert to regulate transcription is poorly understood. To analyze enhancer interactions throughout the genome, we developed a generalized linear modeling framework, GLiMMIRS, for interrogating enhancer effects from single-cell CRISPR experiments. We applied GLiMMIRS to a published dataset and tested for interactions between 46,166 enhancer pairs and corresponding genes, including 264 "high-confidence" enhancer pairs. We found that enhancer effects combine multiplicatively but with limited evidence for further interactions. Only 31 enhancer pairs exhibited significant interactions (false discovery rate <0.1), none of which came from the high-confidence set, and 20 were driven by outlier expression values. Additional analyses of a second CRISPR dataset and in silico enhancer perturbations with Enformer both support a multiplicative model of enhancer effects without interactions. Altogether, our results indicate that enhancer interactions are uncommon or have small effects that are difficult to detect.
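To make the multiplicative-versus-interaction distinction concrete, here is a toy calculation. It is not GLiMMIRS; the function names and all numbers are illustrative assumptions. Each enhancer perturbation is summarized as a fold change in expression; under a purely multiplicative model the fold changes multiply, so on a log scale they add and the interaction term is zero.

```python
import math


def multiplicative_prediction(baseline, fold_a, fold_b):
    """Expression with both enhancers perturbed if fold changes multiply."""
    return baseline * fold_a * fold_b


def additive_prediction(baseline, fold_a, fold_b):
    """Expression with both enhancers perturbed if absolute losses add."""
    loss_a = baseline * (1 - fold_a)
    loss_b = baseline * (1 - fold_b)
    return baseline - loss_a - loss_b


def log_interaction(baseline, fold_a, fold_b, observed_double):
    """Log-scale deviation of the observed double perturbation from the
    multiplicative expectation; zero means no interaction."""
    expected = multiplicative_prediction(baseline, fold_a, fold_b)
    return math.log(observed_double) - math.log(expected)


# Hypothetical values: baseline expression 80, enhancer A perturbation halves
# expression, enhancer B perturbation leaves 75% of it.
baseline, fold_a, fold_b = 80.0, 0.5, 0.75
print(multiplicative_prediction(baseline, fold_a, fold_b))
print(additive_prediction(baseline, fold_a, fold_b))
print(log_interaction(baseline, fold_a, fold_b, observed_double=30.0))
```

In this framing, an "interaction" is any systematic departure from the multiplicative expectation, which is the quantity the abstract reports as rarely significant.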
Affiliation(s)
- Jessica L Zhou
- Integrative Biology Laboratory, Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, La Jolla, CA 92037, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA; Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
| | - Karthik Guruvayurappan
- Integrative Biology Laboratory, Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, La Jolla, CA 92037, USA; School of Biological Sciences, University of California, San Diego, La Jolla, CA 92037, USA; Halicioglu Data Science Institute, University of California, San Diego, La Jolla, CA 92093, USA; Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
| | - Shushan Toneyan
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
| | - Hsiuyi V Chen
- Integrative Biology Laboratory, Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, La Jolla, CA 92037, USA
| | - Aaron R Chen
- Integrative Biology Laboratory, Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, La Jolla, CA 92037, USA
| | - Peter Koo
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
| | - Graham McVicker
- Integrative Biology Laboratory, Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, La Jolla, CA 92037, USA.
| |
|
16
|
Ko BS, Lee SB, Kim TK. A brief guide to analyzing expression quantitative trait loci. Mol Cells 2024; 47:100139. [PMID: 39447874 PMCID: PMC11600780 DOI: 10.1016/j.mocell.2024.100139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Revised: 10/14/2024] [Accepted: 10/17/2024] [Indexed: 10/26/2024] Open
Abstract
Molecular quantitative trait locus (molQTL) mapping has emerged as an important approach for elucidating the functional consequences of genetic variants and unraveling the causal mechanisms underlying diseases or complex traits. However, the variety of analysis tools and sophisticated methodologies available for molQTL studies can be overwhelming for researchers with limited computational expertise. Here, we provide a brief guideline with a curated list of methods and software tools for analyzing expression quantitative trait loci, the most widely studied type of molQTL.
Affiliation(s)
- Byung Su Ko
- Department of Brain Sciences, DGIST, Daegu 42988, Republic of Korea
| | - Sung Bae Lee
- Department of Brain Sciences, DGIST, Daegu 42988, Republic of Korea
| | - Tae-Kyung Kim
- Department of Life Sciences, Pohang University of Science and Technology (POSTECH), Pohang 37673, Republic of Korea; Institute for Convergence Research and Education in Advanced Technology, Yonsei University, Seoul 03722, Republic of Korea.
| |
|
17
|
Toneyan S, Koo PK. Interpreting cis-regulatory interactions from large-scale deep neural networks. Nat Genet 2024; 56:2517-2527. [PMID: 39284975 DOI: 10.1038/s41588-024-01923-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 08/21/2024] [Indexed: 09/25/2024]
Abstract
The rise of large-scale, sequence-based deep neural networks (DNNs) for predicting gene expression has introduced challenges in their evaluation and interpretation. Current evaluations align DNN predictions with orthogonal experimental data, providing insights into generalization but offering limited insights into their decision-making process. Existing model explainability tools focus mainly on motif analysis, which becomes complex when interpreting longer sequences. Here we present cis-regulatory element model explanations (CREME), an in silico perturbation toolkit that interprets the rules of gene regulation learned by a genomic DNN. Applying CREME to Enformer, a state-of-the-art DNN, we identify cis-regulatory elements that enhance or silence gene expression and characterize their complex interactions. CREME can provide interpretations across multiple scales of genomic organization, from cis-regulatory elements to fine-mapped functional sequence elements within them, offering high-resolution insights into the regulatory architecture of the genome. CREME provides a powerful toolkit for translating the predictions of genomic DNNs into mechanistic insights of gene regulation.
Affiliation(s)
- Shushan Toneyan
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, New York, NY, USA
| | - Peter K Koo
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, New York, NY, USA.
| |
|
18
|
Ren YY, Liu Z. Characterization of Single-Cell Cis-regulatory Elements Informs Implications for Cell Differentiation. Genome Biol Evol 2024; 16:evae241. [PMID: 39506564 PMCID: PMC11580522 DOI: 10.1093/gbe/evae241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Revised: 10/17/2024] [Accepted: 11/04/2024] [Indexed: 11/08/2024] Open
Abstract
Cis-regulatory elements govern the specific patterns and dynamics of gene expression in cells during development, which are the fundamental mechanisms behind cell differentiation. However, the genomic characteristics of single-cell cis-regulatory elements closely linked to cell differentiation during development remain unclear. To explore this, we systematically analyzed ∼250,000 putative single-cell cis-regulatory elements obtained from snATAC-seq analysis of the developing mouse cerebellum. We found that over 80% of these single-cell cis-regulatory elements show pleiotropic effects, being active in 2 or more cell types. The pleiotropic degrees of proximal and distal single-cell cis-regulatory elements are positively correlated with the density and diversity of transcription factor binding motifs and GC content. There is a negative correlation between the pleiotropic degrees of single-cell cis-regulatory elements and their distances to the nearest transcription start sites, and proximal single-cell cis-regulatory elements display higher relevance strengths than distal ones. Furthermore, both proximal and distal single-cell cis-regulatory elements related to cell differentiation exhibit enhanced sequence-level evolutionary conservation, increased density and diversity of transcription factor binding motifs, elevated GC content, and greater distances from their nearest genes. Together, our findings reveal the general genomic characteristics of putative single-cell cis-regulatory elements and provide insights into the genomic and evolutionary mechanisms by which single-cell cis-regulatory elements regulate cell differentiation during development.
Affiliation(s)
- Ying-Ying Ren
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Key Laboratory of Genetic Evolution & Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Zhen Liu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Key Laboratory of Genetic Evolution & Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Biodiversity Information, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| |
|
19
|
Chen S, Keleş S. GEEES: inferring cell-specific gene-enhancer interactions from multi-modal single-cell data. Bioinformatics 2024; 40:btae638. [PMID: 39468737 PMCID: PMC11549018 DOI: 10.1093/bioinformatics/btae638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 10/17/2024] [Accepted: 10/25/2024] [Indexed: 10/30/2024] Open
Abstract
MOTIVATION Gene-enhancer interactions are central to transcriptional regulation. Current multi-modal single-cell datasets that profile transcriptome and chromatin accessibility simultaneously in a single cell are yielding opportunities to infer gene-enhancer associations in a cell type specific manner. Computational efforts for such multi-modal single-cell datasets thus far focused on methods for identification and refinement of cell types and trajectory construction. While initial attempts for inferring gene-enhancer interactions have emerged, these have not been evaluated against benchmark datasets that materialized from bulk genomic experiments. Furthermore, existing approaches are limited to inferring gene-enhancer associations at the level of grouped cells as opposed to individual cells, thereby ignoring regulatory heterogeneity among the cells. RESULTS We present a new approach, GEEES for "Gene EnhancEr IntEractions from Multi-modal Single Cell Data," for inferring gene-enhancer associations at the single-cell level using multi-modal single-cell transcriptome and chromatin accessibility data. We evaluated GEEES alongside several multivariate regression-based alternatives we devised and state-of-the-art methods using a large number of benchmark datasets, providing a comprehensive assessment of current approaches. This analysis revealed significant discrepancies between gold-standard interactions and gene-enhancer associations derived from multi-modal single-cell data. Notably, incorporating gene-enhancer distance into the analysis markedly improved performance across all methods, positioning GEEES as a leading approach in this domain. While the overall improvement in performance metrics by GEEES is modest, it provides enhanced cell representation learning which can be leveraged for more effective downstream analysis. Furthermore, our review of existing experimentally driven benchmark datasets uncovers their limited concordance, underscoring the necessity for new high-throughput experiments to validate gene-enhancer interactions inferred from single-cell data. AVAILABILITY AND IMPLEMENTATION https://github.com/keleslab/GEEES.
Affiliation(s)
- Shuyang Chen
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, United States
| | - Sündüz Keleş
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, United States
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, United States
| |
|
20
|
He AY, Palamuttam NP, Danko CG. Training deep learning models on personalized genomic sequences improves variant effect prediction. bioRxiv 2024:2024.10.15.618510. [PMID: 39463940 PMCID: PMC11507713 DOI: 10.1101/2024.10.15.618510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Sequence-to-function models have broad applications in interpreting the molecular impact of genetic variation, yet have been criticized for poor performance in this task. Here we show that training models on functional genomic data with matched personal genomes improves their performance at variant effect prediction. Variant effect representations are retained even when transfer learning models to unseen cellular contexts and experimental readouts. Our results have implications for interpreting trait-associated genetic variation.
|
21
|
Barry T, Roeder K, Katsevich E. Exponential family measurement error models for single-cell CRISPR screens. Biostatistics 2024; 25:1254-1272. [PMID: 38649751 PMCID: PMC11471999 DOI: 10.1093/biostatistics/kxae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 01/10/2024] [Accepted: 03/11/2024] [Indexed: 04/25/2024] Open
Abstract
CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present considerable statistical challenges. We demonstrate through theoretical and real data analyses that a standard method for estimation and inference in single-cell CRISPR screens, "thresholded regression", exhibits attenuation bias and a bias-variance tradeoff as a function of an intrinsic, challenging-to-select tuning parameter. To overcome these difficulties, we introduce GLM-EIV ("GLM-based errors-in-variables"), a new method for single-cell CRISPR screen analysis. GLM-EIV extends the classical errors-in-variables model to responses and noisy predictors that are exponential family-distributed and potentially impacted by the same set of confounding variables. We develop a computational infrastructure to deploy GLM-EIV across hundreds of processors on clouds (e.g. Microsoft Azure) and high-performance clusters. Leveraging this infrastructure, we apply GLM-EIV to analyze two recent, large-scale, single-cell CRISPR screen datasets, yielding several new insights.
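The attenuation bias mentioned here can be illustrated with a deliberately simplified calculation. This is a sketch under stated assumptions, not the paper's model: it replaces the regression with a difference in group means, and the function names, error rates, and expression values are hypothetical. When thresholding assigns some cells to the wrong group, each labelled group becomes a mixture of perturbed and unperturbed cells, and the estimated effect shrinks toward zero.

```python
def observed_difference(mu_unperturbed, mu_perturbed,
                        false_pos_rate, false_neg_rate):
    """Difference in labelled-group means when labels are partly wrong.

    false_pos_rate: fraction of cells labelled perturbed that are truly
                    unperturbed.
    false_neg_rate: fraction of cells labelled unperturbed that are truly
                    perturbed.
    """
    mean_labelled_perturbed = ((1 - false_pos_rate) * mu_perturbed
                               + false_pos_rate * mu_unperturbed)
    mean_labelled_unperturbed = ((1 - false_neg_rate) * mu_unperturbed
                                 + false_neg_rate * mu_perturbed)
    return mean_labelled_perturbed - mean_labelled_unperturbed


def attenuation_factor(false_pos_rate, false_neg_rate):
    """Ratio of the observed effect to the true effect in this toy setup."""
    return 1 - false_pos_rate - false_neg_rate


# Hypothetical numbers: true mean expression 8 (unperturbed) vs 4 (perturbed),
# with 25% mislabelling in each direction.
true_effect = 4.0 - 8.0
estimated = observed_difference(8.0, 4.0, 0.25, 0.25)
print(true_effect, estimated, attenuation_factor(0.25, 0.25))
```

Moving the threshold trades one error rate against the other and changes how many cells fall in each group, which gives an intuition for the bias-variance tradeoff the abstract attributes to the tuning parameter.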
Affiliation(s)
- Timothy Barry
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Building 2 435, 655 Huntington Ave, Boston, MA 02115, United States
| | - Kathryn Roeder
- Department of Statistics and Data Science, Carnegie Mellon University, Baker Hall 228B, 4909 Frew St, Pittsburgh, PA 15213, United States
| | - Eugene Katsevich
- Department of Statistics and Data Science, University of Pennsylvania, Academic Research Building 311, 265 South 37th Street Philadelphia, PA 19104, United States
| |
|
22
|
Loeb GB, Kathail P, Shuai RW, Chung R, Grona RJ, Peddada S, Sevim V, Federman S, Mader K, Chu AY, Davitte J, Du J, Gupta AR, Ye CJ, Shafer S, Przybyla L, Rapiteanu R, Ioannidis NM, Reiter JF. Variants in tubule epithelial regulatory elements mediate most heritable differences in human kidney function. Nat Genet 2024; 56:2078-2092. [PMID: 39256582 DOI: 10.1038/s41588-024-01904-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 08/12/2024] [Indexed: 09/12/2024]
Abstract
Kidney failure, the decrease of kidney function below a threshold necessary to support life, is a major cause of morbidity and mortality. We performed a genome-wide association study (GWAS) of 406,504 individuals in the UK Biobank, identifying 430 loci affecting kidney function in middle-aged adults. To investigate the cell types affected by these loci, we integrated the GWAS with human kidney candidate cis-regulatory elements (cCREs) identified using single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq). Overall, 56% of kidney function heritability localized to kidney tubule epithelial cCREs and an additional 7% to kidney podocyte cCREs. Thus, most heritable differences in adult kidney function are a result of altered gene expression in these two cell types. Using enhancer assays, allele-specific scATAC-seq and machine learning, we found that many kidney function variants alter tubule epithelial cCRE chromatin accessibility and function. Using CRISPRi, we determined which genes some of these cCREs regulate, implicating NDRG1, CCNB1 and STC1 in human kidney function.
Affiliation(s)
- Gabriel B Loeb
- Division of Nephrology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA.
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, USA.
| | - Pooja Kathail
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Richard W Shuai
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
| | - Ryan Chung
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Reinier J Grona
- Division of Nephrology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Sailaja Peddada
- Laboratory for Genomics Research, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Volkan Sevim
- Laboratory for Genomics Research, San Francisco, CA, USA
- Target Discovery, GSK, San Francisco, CA, USA
| | - Scot Federman
- Laboratory for Genomics Research, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Karl Mader
- Laboratory for Genomics Research, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Audrey Y Chu
- Human Genetics and Genomics, GSK, Cambridge, MA, USA
| | | | - Juan Du
- Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
| | - Alexander R Gupta
- Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
| | - Chun Jimmie Ye
- Division of Rheumatology, Department of Medicine; Bakar Computational Health Sciences Institute; Parker Institute for Cancer Immunotherapy; Institute for Human Genetics; Department of Epidemiology & Biostatistics; Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
- Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA, USA
- Arc Institute, Palo Alto, CA, USA
| | - Shawn Shafer
- Laboratory for Genomics Research, San Francisco, CA, USA
- Target Discovery, GSK, San Francisco, CA, USA
| | - Laralynne Przybyla
- Laboratory for Genomics Research, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Radu Rapiteanu
- Genome Biology, Research Technologies, GSK, Stevenage, UK
| | - Nilah M Ioannidis
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Jeremy F Reiter
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, USA.
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, USA.
|
23
|
Su C, Lee D, Jin P, Zhang J. Cell-type-specific mapping of enhancers and target genes from single-cell multimodal data. bioRxiv 2024:2024.09.24.614814. [PMID: 39386519 PMCID: PMC11463474 DOI: 10.1101/2024.09.24.614814] [Indexed: 10/12/2024]
Abstract
Mapping enhancers and target genes in disease-related cell types has provided critical insights into the functional mechanisms of genetic variants identified by genome-wide association studies (GWAS). However, most existing analyses rely on bulk data or cultured cell lines, which may fail to identify cell-type-specific enhancers and target genes. Recently, single-cell multimodal data measuring both gene expression and chromatin accessibility within the same cells have enabled the inference of enhancer-gene pairs in a cell-type-specific and context-specific manner. However, this task is challenged by the data's high sparsity, sequencing depth variation, and the computational burden of analyzing a large number of enhancer-gene pairs. To address these challenges, we propose scMultiMap, a statistical method that infers enhancer-gene association from sparse multimodal counts using a joint latent-variable model. It adjusts for technical confounding, permits fast moment-based estimation, and provides analytically derived p-values. In systematic analyses of blood and brain data, scMultiMap shows appropriate type I error control, high statistical power with greater reproducibility across independent datasets, and stronger consistency with orthogonal data modalities. Meanwhile, its computational cost is less than 1% of existing methods. When applied to single-cell multimodal data from postmortem brain samples from Alzheimer's disease (AD) patients and controls, scMultiMap gave the highest heritability enrichment in microglia and revealed new insights into the regulatory mechanisms of AD GWAS variants in microglia.
Affiliation(s)
- Chang Su
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA
| | - Dongsoo Lee
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA
| | - Peng Jin
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA, USA
| | - Jingfei Zhang
- Information Systems and Operations Management, Emory University, Atlanta, GA, USA
|
24
|
Wan M, Liu Y, Li D, Snyder R, Elkin L, Day C, Rodriguez J, Grunseich C, Mahley R, Watts J, Cheung V. The enhancer RNA, AANCR, regulates APOE expression in astrocytes and microglia. Nucleic Acids Res 2024; 52:10235-10254. [PMID: 39162226 PMCID: PMC11417409 DOI: 10.1093/nar/gkae696] [Received: 05/01/2024] [Revised: 07/26/2024] [Accepted: 08/01/2024] [Indexed: 08/21/2024]
Abstract
Enhancers, critical regulatory elements within the human genome, are often transcribed into enhancer RNAs. The dysregulation of enhancers leads to diseases collectively termed enhanceropathies. While it is known that enhancers play a role in diseases by regulating gene expression, the specific mechanisms by which individual enhancers cause diseases are not well understood. Studies of individual enhancers are needed to fill this gap. This study delves into the role of APOE-activating noncoding RNA, AANCR, in the central nervous system, elucidating its function as a genetic modifier in Alzheimer's Disease. We employed RNA interference, RNaseH-mediated degradation, and single-molecule RNA fluorescence in situ hybridization to demonstrate that mere transcription of AANCR is insufficient; rather, its transcripts are crucial for promoting APOE expression. Our findings revealed that AANCR is induced by ATM-mediated ERK phosphorylation and subsequent AP-1 transcription factor activation. Once activated, AANCR enhances APOE expression, which in turn imparts an inflammatory phenotype to astrocytes. These findings demonstrate that AANCR is a key enhancer RNA in some cell types within the nervous system, pivotal for regulating APOE expression and influencing inflammatory responses, underscoring its potential as a therapeutic target in neurodegenerative diseases.
Affiliation(s)
- Ma Wan
- Epigenetics and Stem Cell Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Yaojuan Liu
- Department of Pediatrics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Dongjun Li
- Department of Pediatrics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Ryan J Snyder
- Epigenetics and Stem Cell Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Lillian B Elkin
- Epigenetics and Stem Cell Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Christopher R Day
- Epigenetics and Stem Cell Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Joseph Rodriguez
- Epigenetics and Stem Cell Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Christopher Grunseich
- National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
| | - Robert W Mahley
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA
- Department of Pathology and Medicine, University of California, San Francisco, CA, USA
| | - Jason A Watts
- Epigenetics and Stem Cell Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Vivian G Cheung
- Department of Pediatrics, University of Michigan, Ann Arbor, MI 48109, USA
|
25
|
Goell J, Li J, Mahata B, Ma AJ, Kim S, Shah S, Shah S, Contreras M, Misra S, Reed D, Bedford GC, Escobar M, Hilton IB. Tailoring a CRISPR/Cas-based Epigenome Editor for Programmable Chromatin Acylation and Decreased Cytotoxicity. bioRxiv 2024:2024.09.22.611000. [PMID: 39345554 PMCID: PMC11429961 DOI: 10.1101/2024.09.22.611000] [Indexed: 10/01/2024]
Abstract
Engineering histone acylation states can inform mechanistic epigenetics and catalyze therapeutic epigenome editing opportunities. Here, we developed engineered lysine acyltransferases that enable the programmable deposition of acetylation and longer-chain acylations. We show that targeting an engineered lysine crotonyltransferase results in weak levels of endogenous enhancer activation yet retains potency when targeted to promoters. We further identify a single mutation within the catalytic core of human p300 that preserves enzymatic activity while substantially reducing cytotoxicity, enabling improved viral delivery. We leveraged these capabilities to perform single-cell CRISPR activation screening and map enhancers to the genes they regulate in situ. We also discover acylation-specific interactions and find that recruitment of p300, regardless of catalytic activity, to prime editing sites can improve editing efficiency. These new programmable epigenome editing tools and insights expand our ability to understand the mechanistic role of lysine acylation in epigenetic and cellular processes and perform functional genomic screens.
Affiliation(s)
- Jacob Goell
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Jing Li
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Barun Mahata
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Alex J Ma
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Sunghwan Kim
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Spencer Shah
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Shriya Shah
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Maria Contreras
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Suchir Misra
- Department of Biosciences, Rice University, Houston, TX 77030, USA
| | - Daniel Reed
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Guy C Bedford
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Mario Escobar
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
| | - Isaac B Hilton
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
- Department of Biosciences, Rice University, Houston, TX 77030, USA
|
26
|
Chardon FM, McDiarmid TA, Page NF, Daza RM, Martin BK, Domcke S, Regalado SG, Lalanne JB, Calderon D, Li X, Starita LM, Sanders SJ, Ahituv N, Shendure J. Multiplex, single-cell CRISPRa screening for cell type specific regulatory elements. Nat Commun 2024; 15:8209. [PMID: 39294132 PMCID: PMC11411074 DOI: 10.1038/s41467-024-52490-4] [Received: 03/07/2024] [Accepted: 09/10/2024] [Indexed: 09/20/2024]
Abstract
CRISPR-based gene activation (CRISPRa) is a strategy for upregulating gene expression by targeting promoters or enhancers in a tissue/cell-type specific manner. Here, we describe an experimental framework that combines highly multiplexed perturbations with single-cell RNA sequencing (sc-RNA-seq) to identify cell-type-specific, CRISPRa-responsive cis-regulatory elements and the gene(s) they regulate. Random combinations of many gRNAs are introduced to each of many cells, which are then profiled and partitioned into test and control groups to test for effect(s) of CRISPRa perturbations of both enhancers and promoters on the expression of neighboring genes. Applying this method to a library of 493 gRNAs targeting candidate cis-regulatory elements in both K562 cells and iPSC-derived excitatory neurons, we identify gRNAs capable of specifically upregulating intended target genes and no other neighboring genes within 1 Mb, including gRNAs yielding upregulation of six autism spectrum disorder (ASD) and neurodevelopmental disorder (NDD) risk genes in neurons. A consistent pattern is that the responsiveness of individual enhancers to CRISPRa is restricted by cell type, implying a dependency on either chromatin landscape and/or additional trans-acting factors for successful gene activation. The approach outlined here may facilitate large-scale screens for gRNAs that activate genes in a cell type-specific manner.
Affiliation(s)
- Florence M Chardon
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Seattle Hub for Synthetic Biology, Seattle, WA, USA
| | - Troy A McDiarmid
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Seattle Hub for Synthetic Biology, Seattle, WA, USA
| | - Nicholas F Page
- Department of Psychiatry and Behavioral Sciences, Kavli Institute for Fundamental Neuroscience, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Riza M Daza
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Seattle Hub for Synthetic Biology, Seattle, WA, USA
| | - Beth K Martin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Seattle Hub for Synthetic Biology, Seattle, WA, USA
| | - Silvia Domcke
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Samuel G Regalado
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | | | - Diego Calderon
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Xiaoyi Li
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Seattle Hub for Synthetic Biology, Seattle, WA, USA
| | - Lea M Starita
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Stephan J Sanders
- Department of Psychiatry and Behavioral Sciences, Kavli Institute for Fundamental Neuroscience, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, OX3 7TY, UK
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Seattle Hub for Synthetic Biology, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, Seattle, WA, USA.
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA.
|
27
|
Bhattacharyya S, Ay F. Identifying genetic variants associated with chromatin looping and genome function. Nat Commun 2024; 15:8174. [PMID: 39289357 PMCID: PMC11408621 DOI: 10.1038/s41467-024-52296-4] [Received: 09/08/2023] [Accepted: 08/30/2024] [Indexed: 09/19/2024]
Abstract
Here we present a comprehensive HiChIP dataset on naïve CD4 T cells (nCD4) from 30 donors and identify QTLs that associate with genotype-dependent and/or allele-specific variation of HiChIP contacts defining loops between active regulatory regions (iQTLs). We observe a substantial overlap between iQTLs and previously defined eQTLs and histone QTLs, and an enrichment for fine-mapped QTLs and GWAS variants. Furthermore, we describe a distinct subset of nCD4 iQTLs, for which the significant variation of chromatin contacts in nCD4 are translated into significant eQTL trends in CD4 T cell memory subsets. Finally, we define connectivity-QTLs as iQTLs that are significantly associated with concordant genotype-dependent changes in chromatin contacts over a broad genomic region (e.g., GWAS SNP in the RNASET2 locus). Our results demonstrate the importance of chromatin contacts as a complementary modality for QTL mapping and their power in identifying previously uncharacterized QTLs linked to cell-specific gene expression and connectivity.
Affiliation(s)
| | - Ferhat Ay
- La Jolla Institute for Immunology, La Jolla, CA, USA.
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
|
28
|
Liu S, Hamilton MC, Cowart T, Barrera A, Bounds LR, Nelson AC, Doty RW, Allen AS, Crawford GE, Majoros WH, Gersbach CA. Characterization and bioinformatic filtering of ambient gRNAs in single-cell CRISPR screens using CLEANSER. bioRxiv 2024:2024.09.04.611293. [PMID: 39282389 PMCID: PMC11398468 DOI: 10.1101/2024.09.04.611293] [Indexed: 09/22/2024]
Abstract
Recent technological developments in single-cell RNA-seq CRISPR screens enable high-throughput investigation of the genome. Through transduction of a gRNA library to a cell population followed by transcriptomic profiling by scRNA-seq, it is possible to characterize the effects of thousands of genomic perturbations on global gene expression. A major source of noise in scRNA-seq CRISPR screens is ambient gRNAs, which are contaminating gRNAs that likely originate from other cells. If not properly filtered, ambient gRNAs can result in an excess of false positive gRNA assignments. Here, we utilize CRISPR barnyard assays to characterize ambient gRNA noise in single-cell CRISPR screens. We use these datasets to develop and train CLEANSER, a mixture model that identifies and filters ambient gRNA noise. This model takes advantage of the bimodal distribution between native and ambient gRNAs and includes both gRNA and cell-specific normalization parameters, correcting for confounding technical factors that affect individual gRNAs and cells. The output of CLEANSER is the probability that a gRNA-cell assignment is in the native distribution over the ambient distribution. We find that ambient gRNA filtering methods impact differential gene expression analysis outcomes and that CLEANSER outperforms alternate approaches by increasing gRNA-cell assignment accuracy.
Affiliation(s)
- Siyan Liu
- Computational Biology and Bioinformatics, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
| | - Marisa C Hamilton
- University Program in Genetics and Genomics, Duke University Medical Center, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
| | - Thomas Cowart
- Computational Biology and Bioinformatics, Duke University, Durham, NC, USA
| | - Alejandro Barrera
- Computational Biology and Bioinformatics, Duke University, Durham, NC, USA
| | - Lexi R Bounds
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
| | | | - Richard W Doty
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA
| | - Andrew S Allen
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA
| | | | - William H Majoros
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
| | - Charles A Gersbach
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
|
29
|
Chowdhary K, Léon J, Mathis D, Benoist C. An integrated transcription factor framework for Treg identity and diversity. Proc Natl Acad Sci U S A 2024; 121:e2411301121. [PMID: 39196621 PMCID: PMC11388289 DOI: 10.1073/pnas.2411301121] [Received: 06/05/2024] [Accepted: 07/12/2024] [Indexed: 08/29/2024]
Abstract
Vertebrate cell identity depends on the combined activity of scores of transcription factors (TF). While TFs have often been studied in isolation, a systematic perspective on their integration has been missing. Focusing on FoxP3+ regulatory T cells (Tregs), key guardians of immune tolerance, we combined single-cell chromatin accessibility, machine learning, and high-density genetic variation, to resolve a validated framework of diverse Treg chromatin programs, each shaped by multi-TF inputs. This framework identified previously unrecognized Treg controllers (Smarcc1) and illuminated the mechanism of action of FoxP3, which amplified a pre-existing Treg identity, diversely activating or repressing distinct programs, dependent on different regulatory partners. Treg subpopulations in the colon relied variably on FoxP3, Helios+ Tregs being completely dependent, but RORγ+ Tregs largely independent. These differences were rooted in intrinsic biases decoded by the integrated framework. Moving beyond master regulators, this work unravels how overlapping TF activities coalesce into Treg identity and diversity.
Affiliation(s)
| | - Juliette Léon
- Department of Immunology, Harvard Medical School, Boston, MA 02115
- INSERM UMR 1163, Imagine Institute, University of Paris, Paris, France 75015
| | - Diane Mathis
- Department of Immunology, Harvard Medical School, Boston, MA 02115
|
30
|
Mulet-Lazaro R, Delwel R. Oncogenic Enhancers in Leukemia. Blood Cancer Discov 2024; 5:303-317. [PMID: 39093124 PMCID: PMC11369600 DOI: 10.1158/2643-3230.bcd-23-0211] [Received: 03/01/2024] [Revised: 06/06/2024] [Accepted: 07/17/2024] [Indexed: 08/04/2024]
Abstract
Although the study of leukemogenesis has traditionally focused on protein-coding genes, the role of enhancer dysregulation is becoming increasingly recognized. The advent of high-throughput sequencing, together with a better understanding of enhancer biology, has revealed how various genetic and epigenetic lesions produce oncogenic enhancers that drive transformation. These aberrations include translocations that lead to enhancer hijacking, point mutations that modulate enhancer activity, and copy number alterations that modify enhancer dosage. In this review, we describe these mechanisms in the context of leukemia and discuss potential therapeutic avenues to target these regulatory elements. Significance: Large-scale sequencing projects have uncovered recurrent gene mutations in leukemia, but the picture remains incomplete: some patients harbor no such aberrations, whereas others carry only a few that are insufficient to bring about transformation on their own. One of the missing pieces is enhancer dysfunction, which only recently has emerged as a critical driver of leukemogenesis. Knowledge of the various mechanisms of enhancer dysregulation is thus key for a complete understanding of leukemia and its causes, as well as the development of targeted therapies in the era of precision medicine.
Affiliation(s)
- Roger Mulet-Lazaro
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands.
- Oncode Institute, Utrecht, the Netherlands.
| | - Ruud Delwel
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands.
- Oncode Institute, Utrecht, the Netherlands.
|
31
|
Rood JE, Hupalowska A, Regev A. Toward a foundation model of causal cell and tissue biology with a Perturbation Cell and Tissue Atlas. Cell 2024; 187:4520-4545. [PMID: 39178831 DOI: 10.1016/j.cell.2024.07.035] [Received: 05/03/2024] [Revised: 07/15/2024] [Accepted: 07/21/2024] [Indexed: 08/26/2024]
Abstract
Comprehensively charting the biologically causal circuits that govern the phenotypic space of human cells has often been viewed as an insurmountable challenge. However, in the last decade, a suite of interleaved experimental and computational technologies has arisen that is making this fundamental goal increasingly tractable. Pooled CRISPR-based perturbation screens with high-content molecular and/or image-based readouts are now enabling researchers to probe, map, and decipher genetically causal circuits at increasing scale. This scale is now eminently suitable for the deployment of artificial intelligence and machine learning (AI/ML) both to direct further experiments and to predict or generate information that was not, and sometimes cannot be, gathered experimentally. By combining and iterating those through experiments that are designed for inference, we now envision a Perturbation Cell Atlas as a generative causal foundation model to unify human cell biology.
Affiliation(s)
| | | | - Aviv Regev
- Genentech, South San Francisco, CA, USA.
|
32
|
Kimura Y, Ono Y, Katayama K, Imoto S. IVEA: an integrative variational Bayesian inference method for predicting enhancer-gene regulatory interactions. Bioinformatics Advances 2024; 4:vbae118. [PMID: 39193566 PMCID: PMC11349192 DOI: 10.1093/bioadv/vbae118] [Received: 01/05/2024] [Revised: 06/26/2024] [Accepted: 08/18/2024] [Indexed: 08/29/2024]
Abstract
Motivation: Enhancers play critical roles in cell-type-specific transcriptional control. Despite the identification of thousands of candidate enhancers, unravelling their regulatory relationships with their target genes remains challenging. Therefore, computational approaches are needed to accurately infer enhancer-gene regulatory relationships. Results: In this study, we propose a new method, IVEA, that predicts enhancer-gene regulatory interactions by estimating promoter and enhancer activities. Its statistical model is based on the gene regulatory mechanism of transcriptional bursting, which is characterized by burst size and frequency controlled by promoters and enhancers, respectively. Using transcriptional readouts, chromatin accessibility, and chromatin contact data as inputs, promoter and enhancer activities were estimated using variational Bayesian inference, and the contribution of each enhancer-promoter pair to target gene transcription was calculated. Our analysis demonstrates that the proposed method can achieve high prediction accuracy and provide biologically relevant enhancer-gene regulatory interactions. Availability and implementation: The IVEA code is available on GitHub at https://github.com/yasumasak/ivea. The publicly available datasets used in this study are described in Supplementary Table S4.
Affiliation(s)
- Yasumasa Kimura
- DX Drug Discovery Department, Daiichi Sankyo RD Novare Co., Ltd., Edogawa-ku, Tokyo 134-8630, Japan
- Division of Health Medical Intelligence, Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo 108-8639, Japan
- Research Function Research Innovation Planning Department, Daiichi Sankyo Co., Ltd., Edogawa-ku, Tokyo 134-8630, Japan
| | - Yoshimasa Ono
- DX Drug Discovery Department, Daiichi Sankyo RD Novare Co., Ltd., Edogawa-ku, Tokyo 134-8630, Japan
| | - Kotoe Katayama
- Division of Health Medical Intelligence, Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo 108-8639, Japan
| | - Seiya Imoto
- Division of Health Medical Intelligence, Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo 108-8639, Japan
|
33
|
Xiang G, He X, Giardine BM, Isaac KJ, Taylor DJ, McCoy RC, Jansen C, Keller CA, Wixom AQ, Cockburn A, Miller A, Qi Q, He Y, Li Y, Lichtenberg J, Heuston EF, Anderson SM, Luan J, Vermunt MW, Yue F, Sauria MEG, Schatz MC, Taylor J, Göttgens B, Hughes JR, Higgs DR, Weiss MJ, Cheng Y, Blobel GA, Bodine DM, Zhang Y, Li Q, Mahony S, Hardison RC. Interspecies regulatory landscapes and elements revealed by novel joint systematic integration of human and mouse blood cell epigenomes. Genome Res 2024; 34:1089-1105. [PMID: 38951027 PMCID: PMC11368181 DOI: 10.1101/gr.277950.123] [Received: 04/03/2023] [Accepted: 06/24/2024] [Indexed: 07/03/2024]
Abstract
Knowledge of locations and activities of cis-regulatory elements (CREs) is needed to decipher basic mechanisms of gene regulation and to understand the impact of genetic variants on complex traits. Previous studies identified candidate CREs (cCREs) using epigenetic features in one species, making comparisons difficult between species. In contrast, we conducted an interspecies study defining epigenetic states and identifying cCREs in blood cell types to generate regulatory maps that are comparable between species, using integrative modeling of eight epigenetic features jointly in human and mouse in our Validated Systematic Integration (VISION) Project. The resulting catalogs of cCREs are useful resources for further studies of gene regulation in blood cells, indicated by high overlap with known functional elements and strong enrichment for human genetic variants associated with blood cell phenotypes. The contribution of each epigenetic state in cCREs to gene regulation, inferred from a multivariate regression, was used to estimate epigenetic state regulatory potential (esRP) scores for each cCRE in each cell type, which were used to categorize dynamic changes in cCREs. Groups of cCREs displaying similar patterns of regulatory activity in human and mouse cell types, obtained by joint clustering on esRP scores, harbor distinctive transcription factor binding motifs that are similar between species. An interspecies comparison of cCREs revealed both conserved and species-specific patterns of epigenetic evolution. Finally, we show that comparisons of the epigenetic landscape between species can reveal elements with similar roles in regulation, even in the absence of genomic sequence alignment.
Affiliation(s)
- Guanjue Xiang
- Bioinformatics and Genomics Graduate Program, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02215, USA
| | - Xi He
- Bioinformatics and Genomics Graduate Program, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Belinda M Giardine
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Kathryn J Isaac
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Dylan J Taylor
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Camden Jansen
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Cheryl A Keller
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Alexander Q Wixom
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - April Cockburn
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Amber Miller
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Qian Qi
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Yanghua He
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Department of Human Nutrition, Food and Animal Sciences, University of Hawaiʻi at Mānoa, Honolulu, Hawaii 96822, USA
| | - Yichao Li
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Jens Lichtenberg
- Genetics and Molecular Biology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Elisabeth F Heuston
- Genetics and Molecular Biology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Stacie M Anderson
- Flow Cytometry Core, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Jing Luan
- Department of Pediatrics, Children's Hospital of Philadelphia, and Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Marit W Vermunt
- Department of Pediatrics, Children's Hospital of Philadelphia, and Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Evanston, Illinois 60611, USA
| | - Michael E G Sauria
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Michael C Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - James Taylor
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Berthold Göttgens
- Wellcome and MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, United Kingdom
| | - Jim R Hughes
- MRC Weatherall Institute of Molecular Medicine, Oxford University, Oxford OX3 9DS, United Kingdom
| | - Douglas R Higgs
- MRC Weatherall Institute of Molecular Medicine, Oxford University, Oxford OX3 9DS, United Kingdom
| | - Mitchell J Weiss
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Yong Cheng
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Gerd A Blobel
- Department of Pediatrics, Children's Hospital of Philadelphia, and Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - David M Bodine
- Genetics and Molecular Biology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Yu Zhang
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Qunhua Li
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Computational Biology and Bioinformatics, Genome Sciences Institute, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Shaun Mahony
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Computational Biology and Bioinformatics, Genome Sciences Institute, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Computational Biology and Bioinformatics, Genome Sciences Institute, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| |
|
34
|
Liu B, Hu S, Wang X. Applications of single-cell technologies in drug discovery for tumor treatment. iScience 2024; 27:110486. [PMID: 39171294 PMCID: PMC11338156 DOI: 10.1016/j.isci.2024.110486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024] Open
Abstract
Single-cell technologies are advanced and powerful tools for studying tumor biological systems at single-cell resolution and are playing increasingly critical roles in multiple stages of drug discovery and development. Specifically, single-cell technologies can promote the discovery of drug targets, support high-throughput screening at the single-cell level, and contribute to pharmacokinetic studies of anti-tumor drugs. Emerging single-cell analysis technologies have been developed to further integrate multidimensional single-cell molecular features, expand the scale of single-cell data, profile the phenotypic impact of genes in single cells, and provide full-length-coverage single-cell sequencing. In this review, we systematically summarize the applications of single-cell technologies in various stages of drug discovery for tumor treatment, including target identification, high-throughput drug screening, and pharmacokinetic evaluation, and highlight emerging single-cell technologies that provide an in-depth understanding of tumor biology. Single-cell-technology-based drug discovery is expected to further optimize therapeutic strategies and improve clinical outcomes of tumor patients.
Affiliation(s)
- Bingyu Liu
- Department of Hematology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong 250021, China
| | - Shunfeng Hu
- Department of Hematology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong 250021, China
- Department of Hematology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong 250021, China
| | - Xin Wang
- Department of Hematology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong 250021, China
- Department of Hematology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong 250021, China
- Taishan Scholars Program of Shandong Province, Jinan, Shandong 250021, China
| |
|
35
|
Zhou H, Gelernter J. Human genetics and epigenetics of alcohol use disorder. J Clin Invest 2024; 134:e172885. [PMID: 39145449 PMCID: PMC11324314 DOI: 10.1172/jci172885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024] Open
Abstract
Alcohol use disorder (AUD) is a prominent contributor to global morbidity and mortality. Its complex etiology involves genetics, epigenetics, and environmental factors. We review progress in understanding the genetics and epigenetics of AUD, summarizing the key findings. Advancements in technology over the decades have elevated research from early candidate gene studies to present-day genome-wide scans, unveiling numerous genetic and epigenetic risk factors for AUD. The latest GWAS on more than one million participants identified more than 100 genetic variants, and the largest epigenome-wide association studies (EWAS) in blood and brain samples have revealed tissue-specific epigenetic changes. Downstream analyses revealed enriched pathways, genetic correlations with other traits, transcriptome-wide association in brain tissues, and drug-gene interactions for AUD. We also discuss limitations and future directions, including increasing the power of GWAS and EWAS studies as well as expanding the diversity of populations included in these analyses. Larger samples, novel technologies, and analytic approaches are essential; these include whole-genome sequencing, multiomics, single-cell sequencing, spatial transcriptomics, deep-learning prediction of variant function, and integrated methods for disease risk prediction.
Affiliation(s)
- Hang Zhou
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare System, West Haven, Connecticut, USA
- Department of Biomedical Informatics and Data Science
- Center for Brain and Mind Health
| | - Joel Gelernter
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare System, West Haven, Connecticut, USA
- Department of Genetics, and
- Department of Neuroscience, Yale School of Medicine, New Haven, Connecticut, USA
| |
|
36
|
Southard KM, Ardy RC, Tang A, O’Sullivan DD, Metzner E, Guruvayurappan K, Norman TM. Comprehensive transcription factor perturbations recapitulate fibroblast transcriptional states. bioRxiv 2024:2024.07.31.606073. [PMID: 39131349 PMCID: PMC11312553 DOI: 10.1101/2024.07.31.606073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Cell atlas projects have nominated recurrent transcriptional states as drivers of biological processes and disease, but their origins, regulation, and properties remain unclear. To enable complementary functional studies, we developed a scalable approach for recapitulating cell states in vitro using CRISPR activation (CRISPRa) Perturb-seq. Aided by a novel multiplexing method, we activated 1,836 transcription factors in two cell types. Measuring 21,958 perturbations showed that CRISPRa activated targets within physiological ranges, that epigenetic features predicted activatable genes, and that the protospacer seed region drove an off-target effect. Perturbations recapitulated in vivo fibroblast states, including universal and inflammatory states, and identified KLF4 and KLF5 as key regulators of the universal state. Inducing the universal state suppressed disease-associated states, highlighting its therapeutic potential. Our findings cement CRISPRa as a tool for perturbing differentiated cells and indicate that in vivo states can be elicited via perturbation, enabling studies of clinically relevant states ex vivo.
Affiliation(s)
- Kaden M. Southard
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Rico C. Ardy
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Anran Tang
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Deirdre D. O’Sullivan
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, USA
| | - Eli Metzner
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, USA
| | - Karthik Guruvayurappan
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, USA
| | - Thomas M. Norman
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| |
|
37
|
McCutcheon SR, Rohm D, Iglesias N, Gersbach CA. Epigenome editing technologies for discovery and medicine. Nat Biotechnol 2024; 42:1199-1217. [PMID: 39075148 DOI: 10.1038/s41587-024-02320-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 06/19/2024] [Indexed: 07/31/2024]
Abstract
Epigenome editing has rapidly evolved in recent years, with diverse applications that include elucidating gene regulation mechanisms, annotating coding and noncoding genome functions and programming cell state and lineage specification. Importantly, given the ubiquitous role of epigenetics in complex phenotypes, epigenome editing has unique potential to impact a broad spectrum of diseases. By leveraging powerful DNA-targeting technologies, such as CRISPR, epigenome editing exploits the heritable and reversible mechanisms of epigenetics to alter gene expression without introducing DNA breaks, inducing DNA damage or relying on DNA repair pathways.
Affiliation(s)
- Sean R McCutcheon
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Dahlia Rohm
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Nahid Iglesias
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Charles A Gersbach
- Department of Biomedical Engineering, Duke University, Durham, NC, USA.
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA.
| |
|
38
|
Yao D, Binan L, Bezney J, Simonton B, Freedman J, Frangieh CJ, Dey K, Geiger-Schuller K, Eraslan B, Gusev A, Regev A, Cleary B. Scalable genetic screening for regulatory circuits using compressed Perturb-seq. Nat Biotechnol 2024; 42:1282-1295. [PMID: 37872410 PMCID: PMC11035494 DOI: 10.1038/s41587-023-01964-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/05/2023] [Accepted: 08/22/2023] [Indexed: 10/25/2023]
Abstract
Pooled CRISPR screens with single-cell RNA sequencing readout (Perturb-seq) have emerged as a key technique in functional genomics, but they are limited in scale by cost and combinatorial complexity. In this study, we modified the design of Perturb-seq by incorporating algorithms applied to random, low-dimensional observations. Compressed Perturb-seq measures multiple random perturbations per cell or multiple cells per droplet and computationally decompresses these measurements by leveraging the sparse structure of regulatory circuits. Applied to 598 genes in the immune response to bacterial lipopolysaccharide, compressed Perturb-seq achieves the same accuracy as conventional Perturb-seq with an order of magnitude cost reduction and greater power to learn genetic interactions. We identified known and novel regulators of immune responses and uncovered evolutionarily constrained genes with downstream targets enriched for immune disease heritability, including many missed by existing genome-wide association studies. Our framework enables new scales of interrogation for a foundational method in functional genomics.
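The decompression idea in this abstract can be sketched with a generic sparse-recovery routine. The abstract does not specify the authors' algorithm, so the sketch below swaps in plain matching pursuit as a stand-in. The pooling design, the effect sizes, and the `matching_pursuit` helper are all hypothetical.

```python
from itertools import combinations


def matching_pursuit(design, y, max_iter=10, tol=1e-9):
    """Greedy sparse recovery: explain y using few columns of `design`."""
    n_rows, n_cols = len(design), len(design[0])
    columns = [[design[i][j] for i in range(n_rows)] for j in range(n_cols)]
    effects = [0.0] * n_cols
    residual = list(y)
    for _ in range(max_iter):
        if sum(r * r for r in residual) <= tol:
            break  # everything measured has been explained

        def score(j):
            # Normalized correlation between column j and the unexplained signal.
            col = columns[j]
            dot = sum(c * r for c, r in zip(col, residual))
            return abs(dot) / sum(c * c for c in col) ** 0.5

        best = max(range(n_cols), key=score)
        col = columns[best]
        step = sum(c * r for c, r in zip(col, residual)) / sum(c * c for c in col)
        effects[best] += step
        residual = [r - step * c for r, c in zip(residual, col)]
    return effects


# Hypothetical pooling design: 6 perturbations measured in only 4 composite
# samples, each perturbation present in a distinct pair of samples.
pairs = list(combinations(range(4), 2))
design = [[1.0 if row in pair else 0.0 for pair in pairs] for row in range(4)]

true_effects = [2.0, 0.0, 0.0, 0.0, 0.0, -1.0]  # sparse: 2 of 6 are nonzero
y = [sum(a * x for a, x in zip(row, true_effects)) for row in design]
recovered = matching_pursuit(design, y)
```

In this noise-free toy case four composite measurements are enough to recover six effects because only two are nonzero. Greedy recovery of this kind is not guaranteed to succeed for arbitrary designs or noisy data.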
Affiliation(s)
- Douglas Yao
- Program in Systems, Synthetic, and Quantitative Biology, Harvard University, Cambridge, MA, USA
| | - Loic Binan
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jon Bezney
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Brooke Simonton
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jahanara Freedman
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Chris J Frangieh
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Kushal Dey
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | | | | | - Alexander Gusev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
| | - Aviv Regev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Genentech, South San Francisco, CA, USA
| | - Brian Cleary
- Faculty of Computing and Data Sciences, Boston University, Boston, MA, USA.
- Department of Biology, Boston University, Boston, MA, USA.
- Department of Biomedical Engineering, Boston University, Boston, MA, USA.
- Program in Bioinformatics, Boston University, Boston, MA, USA.
- Biological Design Center, Boston University, Boston, MA, USA.
| |
|
39
|
Qi T, Song L, Guo Y, Chen C, Yang J. From genetic associations to genes: methods, applications, and challenges. Trends Genet 2024; 40:642-667. [PMID: 38734482 DOI: 10.1016/j.tig.2024.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 05/13/2024]
Abstract
Genome-wide association studies (GWASs) have identified numerous genetic loci associated with human traits and diseases. However, pinpointing the causal genes remains a challenge, which impedes the translation of GWAS findings into biological insights and medical applications. In this review, we provide an in-depth overview of the methods and technologies used for prioritizing genes from GWAS loci, including gene-based association tests, integrative analysis of GWAS and molecular quantitative trait loci (xQTL) data, linking GWAS variants to target genes through enhancer-gene connection maps, and network-based prioritization. We also outline strategies for generating context-dependent xQTL data and their applications in gene prioritization. We further highlight the potential of gene prioritization in drug repurposing. Lastly, we discuss future challenges and opportunities in this field.
Affiliation(s)
- Ting Qi
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
| | - Liyang Song
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
| | - Yazhou Guo
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
| | - Chang Chen
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
| | - Jian Yang
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
| |
|
40
|
Jiang K, Liu T, Kales S, Tewhey R, Kim D, Park Y, Jarvis JN. A systematic strategy for identifying causal single nucleotide polymorphisms and their target genes on Juvenile arthritis risk haplotypes. BMC Med Genomics 2024; 17:185. [PMID: 38997781 PMCID: PMC11241977 DOI: 10.1186/s12920-024-01954-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 06/27/2024] [Indexed: 07/14/2024] Open
Abstract
BACKGROUND: Although genome-wide association studies (GWAS) have identified multiple regions conferring genetic risk for juvenile idiopathic arthritis (JIA), we are still faced with the task of identifying the single nucleotide polymorphisms (SNPs) on the disease haplotypes that exert the biological effects that confer risk. Until we identify the risk-driving variants, identifying the genes influenced by these variants, and therefore translating genetic information to improved clinical care, will remain an insurmountable task. We used a function-based approach for identifying causal variant candidates and the target genes on JIA risk haplotypes.
METHODS: We used a massively parallel reporter assay (MPRA) in myeloid K562 cells to query the effects of 5,226 SNPs in non-coding regions on JIA risk haplotypes for their ability to alter gene expression when compared to the common allele. The assay relies on 180 bp oligonucleotide reporters ("oligos") in which the allele of interest is flanked by its cognate genomic sequence. Barcodes were added randomly by PCR to each oligo to achieve > 20 barcodes per oligo to provide a quantitative read-out of gene expression for each allele. Assays were performed in both unstimulated K562 cells and cells stimulated overnight with interferon gamma (IFNg). As proof of concept, we then used CRISPRi to demonstrate the feasibility of identifying the genes regulated by enhancers harboring expression-altering SNPs.
RESULTS: We identified 553 expression-altering SNPs in unstimulated K562 cells and an additional 490 in cells stimulated with IFNg. We further filtered the SNPs to identify those plausibly situated within functional chromatin, using open chromatin and H3K27ac ChIPseq peaks in unstimulated cells and open chromatin plus H3K4me1 in stimulated cells. These procedures yielded 42 unique SNPs (total = 84) for each set. Using CRISPRi, we demonstrated that enhancers harboring MPRA-screened variants in the TRAF1 and LNPEP/ERAP2 loci regulated multiple genes, suggesting complex influences of disease-driving variants.
CONCLUSION: Using MPRA and CRISPRi, JIA risk haplotypes can be queried to identify plausible candidates for disease-driving variants. Once these candidate variants are identified, target genes can be identified using CRISPRi informed by the 3D chromatin structures that encompass the risk haplotypes.
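The barcode-based read-out described in the methods can be illustrated with a minimal sketch. The abstract does not describe the statistical pipeline, so the sketch assumes a common simplification: each barcode yields an RNA count and a DNA (plasmid) count, activity is the mean log2 RNA/DNA ratio over barcodes, and the allelic effect is the alternate-minus-reference difference. The counts, the pseudocount, and both function names are hypothetical.

```python
import math
from statistics import mean


def activity(barcodes):
    """Mean log2(RNA/DNA) over barcodes, with a pseudocount of 1 on each count."""
    return mean(math.log2((rna + 1) / (dna + 1)) for rna, dna in barcodes)


def allelic_effect(ref_barcodes, alt_barcodes):
    """log2 fold change in reporter activity of the alternate vs. reference allele."""
    return activity(alt_barcodes) - activity(ref_barcodes)


# Toy (RNA, DNA) counts per barcode; a real assay would have > 20 barcodes per oligo.
ref = [(99, 99), (49, 49)]
alt = [(199, 99), (99, 49)]
effect = allelic_effect(ref, alt)
```

Averaging over many barcodes per oligo is what makes the read-out quantitative: a single barcode's ratio is noisy, while the mean across barcodes is more stable.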
Affiliation(s)
- Kaiyu Jiang
- Department of Pediatrics, Clinical and Translational Research Center, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 701 Ellicott St, Buffalo, NY, 14203, USA
| | - Tao Liu
- Roswell Park Cancer Institute, 665 Elm St, Buffalo, NY, 14203, USA
| | - Susan Kales
- Jackson Laboratories, 600 Main St, Bar Harbor, ME, 04609, USA
| | - Ryan Tewhey
- Jackson Laboratories, 600 Main St, Bar Harbor, ME, 04609, USA
| | - Dongkyeong Kim
- Department of Biochemistry, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
| | - Yungki Park
- Department of Biochemistry, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
- Genetics, Genomics, & Bioinformatics Program, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA
| | - James N Jarvis
- Department of Pediatrics, Clinical and Translational Research Center, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 701 Ellicott St, Buffalo, NY, 14203, USA.
- Genetics, Genomics, & Bioinformatics Program, University at Buffalo Jacobs School of Medicine & Biomedical Sciences, 955 Main St, Buffalo, NY, 14203, USA.
- University of Washington Rheumatology Research, 750 Republican St., E520, Seattle, WA, 98109, USA.
| |
|
41
|
Oguchi A, Suzuki A, Komatsu S, Yoshitomi H, Bhagat S, Son R, Bonnal RJP, Kojima S, Koido M, Takeuchi K, Myouzen K, Inoue G, Hirai T, Sano H, Takegami Y, Kanemaru A, Yamaguchi I, Ishikawa Y, Tanaka N, Hirabayashi S, Konishi R, Sekito S, Inoue T, Kere J, Takeda S, Takaori-Kondo A, Endo I, Kawaoka S, Kawaji H, Ishigaki K, Ueno H, Hayashizaki Y, Pagani M, Carninci P, Yanagita M, Parrish N, Terao C, Yamamoto K, Murakawa Y. An atlas of transcribed enhancers across helper T cell diversity for decoding human diseases. Science 2024; 385:eadd8394. [PMID: 38963856 DOI: 10.1126/science.add8394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Accepted: 05/01/2024] [Indexed: 07/06/2024]
Abstract
Transcribed enhancer maps can reveal nuclear interactions underpinning each cell type and connect specific cell types to diseases. Using a 5' single-cell RNA sequencing approach, we defined transcription start sites of enhancer RNAs and other classes of coding and noncoding RNAs in human CD4+ T cells, revealing cellular heterogeneity and differentiation trajectories. Integration of these datasets with single-cell chromatin profiles showed that active enhancers with bidirectional RNA transcription are highly cell type-specific and that disease heritability is strongly enriched in these enhancers. The resulting cell type-resolved multimodal atlas of bidirectionally transcribed enhancers, which we linked with promoters using fine-scale chromatin contact maps, enabled us to systematically interpret genetic variants associated with a range of immune-mediated diseases.
Affiliation(s)
- Akiko Oguchi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Nephrology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Shuichiro Komatsu
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- IFOM ETS - the AIRC Institute of Molecular Oncology, Milan, Italy
| | - Hiroyuki Yoshitomi
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Immunology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Shruti Bhagat
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
| | - Raku Son
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Nephrology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | | | - Shohei Kojima
- Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Masaru Koido
- Division of Molecular Pathology, Department of Cancer Biology, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Kazuhiro Takeuchi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Medical Systems Genomics, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Keiko Myouzen
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Gyo Inoue
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Tomoya Hirai
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Department of Gastroenterological Surgery, Yokohama City University Graduate School of Medicine, Yokohama City University, Yokohama, Japan
| | - Hiromi Sano
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | | | | | | | - Yuki Ishikawa
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Nao Tanaka
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Shigeki Hirabayashi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Department of Hematology and Oncology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Division of Precision Medicine, Kyushu University Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
| | - Riyo Konishi
- Inter-Organ Communication Research Team, Institute for Life and Medical Sciences, Kyoto University, Kyoto, Japan
| | - Sho Sekito
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Nephro-Urologic Surgery and Andrology, Mie University Graduate School of Medicine, Mie University, Tsu, Japan
| | - Takahiro Inoue
- Department of Nephro-Urologic Surgery and Andrology, Mie University Graduate School of Medicine, Mie University, Tsu, Japan
| | - Juha Kere
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
- Stem Cells and Metabolism Research Program, University of Helsinki, Helsinki, Finland
- Folkhalsan Research Center, Helsinki, Finland
| | - Shunichi Takeda
- Department of Radiation Genetics, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Shenzhen University School of Medicine, Shenzhen, Guangdong, China
| | - Akifumi Takaori-Kondo
- Department of Hematology and Oncology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Itaru Endo
- Department of Gastroenterological Surgery, Yokohama City University Graduate School of Medicine, Yokohama City University, Yokohama, Japan
| | - Shinpei Kawaoka
- Inter-Organ Communication Research Team, Institute for Life and Medical Sciences, Kyoto University, Kyoto, Japan
- Department of Integrative Bioanalytics, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
| | - Hideya Kawaji
- Research Center for Genome & Medical Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Preventive Medicine and Applied Genomics Unit, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
| | - Kazuyoshi Ishigaki
- Laboratory for Human Immunogenetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Hideki Ueno
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Immunology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Yoshihide Hayashizaki
- K.K. DNAFORM, Yokohama, Japan
- RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
| | - Massimiliano Pagani
- IFOM ETS - the AIRC Institute of Molecular Oncology, Milan, Italy
- Department of Medical Biotechnology and Translational Medicine, Università degli Studi, Milan, Italy
| | - Piero Carninci
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Human Technopole, Milan, Italy
| | - Motoko Yanagita
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Department of Nephrology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Nicholas Parrish
- Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Kazuhiko Yamamoto
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Yasuhiro Murakawa
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- IFOM ETS - the AIRC Institute of Molecular Oncology, Milan, Italy
- Department of Medical Systems Genomics, Graduate School of Medicine, Kyoto University, Kyoto, Japan
|
42
|
Yang JH, Hansen AS. Enhancer selectivity in space and time: from enhancer-promoter interactions to promoter activation. Nat Rev Mol Cell Biol 2024; 25:574-591. [PMID: 38413840 PMCID: PMC11574175 DOI: 10.1038/s41580-024-00710-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/30/2024] [Indexed: 02/29/2024]
Abstract
The primary regulators of metazoan gene expression are enhancers, originally functionally defined as DNA sequences that can activate transcription at promoters in an orientation-independent and distance-independent manner. Despite being crucial for gene regulation in animals, what mechanisms underlie enhancer selectivity for promoters, and more fundamentally, how enhancers interact with promoters and activate transcription, remain poorly understood. In this Review, we first discuss current models of enhancer-promoter interactions in space and time and how enhancers affect transcription activation. Next, we discuss different mechanisms that mediate enhancer selectivity, including repression, biochemical compatibility and regulation of 3D genome structure. Through 3D polymer simulations, we illustrate how the ability of 3D genome folding mechanisms to mediate enhancer selectivity strongly varies for different enhancer-promoter interaction mechanisms. Finally, we discuss how recent technical advances may provide new insights into mechanisms of enhancer-promoter interactions and how technical biases in methods such as Hi-C and Micro-C and imaging techniques may affect their interpretation.
Affiliation(s)
- Jin H Yang
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Cambridge, MA, USA
| | - Anders S Hansen
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Koch Institute for Integrative Cancer Research, Cambridge, MA, USA.
|
43
|
Li Y, Tan M, Akkari-Henić A, Zhang L, Kip M, Sun S, Sepers JJ, Xu N, Ariyurek Y, Kloet SL, Davis RP, Mikkers H, Gruber JJ, Snyder MP, Li X, Pang B. Genome-wide Cas9-mediated screening of essential non-coding regulatory elements via libraries of paired single-guide RNAs. Nat Biomed Eng 2024; 8:890-908. [PMID: 38778183 PMCID: PMC11310080 DOI: 10.1038/s41551-024-01204-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 03/27/2024] [Indexed: 05/25/2024]
Abstract
The functions of non-coding regulatory elements (NCREs), which constitute a major fraction of the human genome, have not been systematically studied. Here we report a method involving libraries of paired single-guide RNAs targeting both ends of an NCRE as a screening system for the Cas9-mediated deletion of thousands of NCREs genome-wide to study their functions in distinct biological contexts. By using K562 and 293T cell lines and human embryonic stem cells, we show that NCREs can have redundant functions, and that many ultra-conserved elements have silencer activity and play essential roles in cell growth and in cellular responses to drugs (notably, the ultra-conserved element PAX6_Tarzan may be critical for heart development, as removing it from human embryonic stem cells led to defects in cardiomyocyte differentiation). The high-throughput screen, which is compatible with single-cell sequencing, may allow for the identification of druggable NCREs.
Affiliation(s)
- Yufeng Li
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Minkang Tan
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Almira Akkari-Henić
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Limin Zhang
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Maarten Kip
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Shengnan Sun
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Jorian J Sepers
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Ningning Xu
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Yavuz Ariyurek
- Leiden Genome Technology Center, Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - Susan L Kloet
- Leiden Genome Technology Center, Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - Richard P Davis
- Department of Anatomy and Embryology, The Novo Nordisk Foundation Center for Stem Cell Medicine (reNEW), Leiden University Medical Center, Leiden, the Netherlands
| | - Harald Mikkers
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Joshua J Gruber
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | | | - Xiao Li
- Department of Biochemistry, The Center for RNA Science and Therapeutics, Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH, USA.
| | - Baoxu Pang
- Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands.
|
44
|
Lim LQJ, Adler L, Hajaj E, Soria LR, Perry RBT, Darzi N, Brody R, Furth N, Lichtenstein M, Bab-Dinitz E, Porat Z, Melman T, Brandis A, Malitsky S, Itkin M, Aylon Y, Ben-Dor S, Orr I, Pri-Or A, Seger R, Shaul Y, Ruppin E, Oren M, Perez M, Meier J, Brunetti-Pierri N, Shema E, Ulitsky I, Erez A. ASS1 metabolically contributes to the nuclear and cytosolic p53-mediated DNA damage response. Nat Metab 2024; 6:1294-1309. [PMID: 38858597 PMCID: PMC11272581 DOI: 10.1038/s42255-024-01060-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Accepted: 04/30/2024] [Indexed: 06/12/2024]
Abstract
Downregulation of the urea cycle enzyme argininosuccinate synthase (ASS1) in multiple tumors is associated with a poor prognosis partly because of the metabolic diversion of cytosolic aspartate for pyrimidine synthesis, supporting proliferation and mutagenesis owing to nucleotide imbalance. Here, we find that prolonged loss of ASS1 promotes DNA damage in colon cancer cells and fibroblasts from subjects with citrullinemia type I. Following acute induction of DNA damage with doxorubicin, ASS1 expression is elevated in the cytosol and the nucleus with at least a partial dependency on p53; ASS1 metabolically restrains cell cycle progression in the cytosol by restricting nucleotide synthesis. In the nucleus, ASS1 and ASL generate fumarate for the succination of SMARCC1, destabilizing the chromatin-remodeling complex SMARCC1-SNF5 to decrease gene transcription, specifically in a subset of the p53-regulated cell cycle genes. Thus, following DNA damage, ASS1 is part of the p53 network that pauses cell cycle progression, enabling genome maintenance and survival. Loss of ASS1 contributes to DNA damage and promotes cell cycle progression, likely contributing to cancer mutagenesis and, hence, adaptability potential.
Affiliation(s)
- Lisha Qiu Jin Lim
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Lital Adler
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Emma Hajaj
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Department of Medicine D, Beilinson Hospital, Petah Tikva, Israel
| | - Leandro R Soria
- Telethon Institute of Genetics and Medicine, Pozzuoli, Italy
| | - Rotem Ben-Tov Perry
- Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel
| | - Naama Darzi
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Ruchama Brody
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Noa Furth
- Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Michal Lichtenstein
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Elizabeta Bab-Dinitz
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Ziv Porat
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Tevie Melman
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Alexander Brandis
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Sergey Malitsky
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Maxim Itkin
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Yael Aylon
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Shifra Ben-Dor
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Irit Orr
- Department of Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel
| | - Amir Pri-Or
- The De Botton Protein Profiling Institute of the Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, Israel
| | - Rony Seger
- Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Yoav Shaul
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Eytan Ruppin
- Cancer Data Science Lab, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Moshe Oren
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Minervo Perez
- Cancer Data Science Lab, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jordan Meier
- Cancer Data Science Lab, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Nicola Brunetti-Pierri
- Telethon Institute of Genetics and Medicine, Pozzuoli, Italy
- Department of Translational Medicine, Medical Genetics, University of Naples Federico II, Naples, Italy
- Scuola Superiore Meridionale (SSM, School of Advanced Studies), Genomics and Experimental Medicine Program, University of Naples Federico II, Naples, Italy
| | - Efrat Shema
- Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Igor Ulitsky
- Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel
| | - Ayelet Erez
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel.
|
45
|
Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 PMCID: PMC11444527 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
| | - Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Yasin Uzun
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
|
46
|
Loeb GB, Kathail P, Shuai R, Chung R, Grona RJ, Peddada S, Sevim V, Federman S, Mader K, Chu A, Davitte J, Du J, Gupta AR, Ye CJ, Shafer S, Przybyla L, Rapiteanu R, Ioannidis N, Reiter JF. Variants in tubule epithelial regulatory elements mediate most heritable differences in human kidney function. bioRxiv 2024:2024.06.18.599625. [PMID: 38948875 PMCID: PMC11212968 DOI: 10.1101/2024.06.18.599625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Kidney disease is highly heritable; however, the causal genetic variants, the cell types in which these variants function, and the molecular mechanisms underlying kidney disease remain largely unknown. To identify genetic loci affecting kidney function, we performed a GWAS using multiple kidney function biomarkers and identified 462 loci. To begin to investigate how these loci affect kidney function, we generated single-cell chromatin accessibility (scATAC-seq) maps of the human kidney and identified candidate cis-regulatory elements (cCREs) for kidney podocytes, tubule epithelial cells, and kidney endothelial, stromal, and immune cells. Kidney tubule epithelial cCREs explained 58% of kidney function SNP-heritability and kidney podocyte cCREs explained an additional 6.5% of SNP-heritability. In contrast, little kidney function heritability was explained by kidney endothelial, stromal, or immune cell-specific cCREs. Through functionally informed fine-mapping, we identified putative causal kidney function variants and their corresponding cCREs. Using kidney scATAC-seq data, we created a deep learning model (which we named ChromKid) to predict kidney cell type-specific chromatin accessibility from sequence. ChromKid and allele specific kidney scATAC-seq revealed that many fine-mapped kidney function variants locally change chromatin accessibility in tubule epithelial cells. Enhancer assays confirmed that fine-mapped kidney function variants alter tubule epithelial regulatory element function. To map the genes which these regulatory elements control, we used CRISPR interference (CRISPRi) to target these regulatory elements in tubule epithelial cells and assessed changes in gene expression. CRISPRi of enhancers harboring kidney function variants regulated NDRG1 and RBPMS expression. Thus, inherited differences in tubule epithelial NDRG1 and RBPMS expression may predispose to kidney disease in humans. 
We conclude that genetic variants affecting tubule epithelial regulatory element function account for most SNP-heritability of human kidney function. This work provides an experimental approach to identify the variants, regulatory elements, and genes involved in polygenic disease.
Affiliation(s)
- Gabriel B. Loeb
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Pooja Kathail
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
| | - Richard Shuai
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
| | - Ryan Chung
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
| | - Reinier J. Grona
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Sailaja Peddada
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Volkan Sevim
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
| | - Scot Federman
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Karl Mader
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Audrey Chu
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
| | | | - Juan Du
- Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
| | - Alexander R. Gupta
- Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
| | - Chun Jimmie Ye
- Division of Rheumatology, Department of Medicine; Bakar Computational Health Sciences Institute; Parker Institute for Cancer Immunotherapy; Institute for Human Genetics; Department of Epidemiology & Biostatistics; Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA and Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA, USA
| | - Shawn Shafer
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
| | - Laralynne Przybyla
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Radu Rapiteanu
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
| | - Nilah Ioannidis
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Jeremy F. Reiter
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
|
47
|
Barth D, Van R, Cardwell J, Han MV. Supervised learning of enhancer-promoter specificity based on genome-wide perturbation studies highlights areas for improvement in learning. Bioinformatics 2024; 40:btae367. [PMID: 38870532 PMCID: PMC11211214 DOI: 10.1093/bioinformatics/btae367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 05/29/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024] Open
Abstract
MOTIVATION: Understanding the rules that govern enhancer-driven transcription remains a central unsolved problem in genomics. Now with multiple massively parallel enhancer perturbation assays published, there are enough data that we can utilize to learn to predict enhancer-promoter (EP) relationships in a data-driven manner.
RESULTS: We applied machine learning to one of the largest enhancer perturbation studies integrated with transcription factor (TF) and histone modification ChIP-seq. The results uncovered a discrepancy in the prediction of genome-wide data compared to data from targeted experiments. Relative strength of contact was important for prediction, confirming the basic principle of EP regulation. Novel features such as the density of the enhancers/promoters in the genomic region was found to be important, highlighting our lack of understanding on how other elements in the region contribute to the regulation. Several TF peaks were identified that improved the prediction by identifying the negatives and reducing False Positives. In summary, integrating genomic assays with enhancer perturbation studies increased the accuracy of the model, and provided novel insights into the understanding of enhancer-driven transcription.
AVAILABILITY AND IMPLEMENTATION: The trained models, data, and the source code are available at http://doi.org/10.5281/zenodo.11290386 and https://github.com/HanLabUNLV/sleps.
Affiliation(s)
- Dylan Barth
- School of Life Sciences, University of Nevada, Las Vegas, NV 89154, United States
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, NV 89154, United States
| | - Richard Van
- School of Life Sciences, University of Nevada, Las Vegas, NV 89154, United States
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, NV 89154, United States
| | - Jonathan Cardwell
- Department of Medicine, University of Colorado School of Medicine, Denver, CO 80045, United States
| | - Mira V Han
- School of Life Sciences, University of Nevada, Las Vegas, NV 89154, United States
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, NV 89154, United States
|
48
|
Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704 DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Zachary A Gardell
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - M Ryan Corces
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
|
49
|
Qin M, Deng C, Wen L, Luo G, Meng Y. CRISPR-Cas and CRISPR-based screening system for precise gene editing and targeted cancer therapy. J Transl Med 2024; 22:516. [PMID: 38816739 PMCID: PMC11138051 DOI: 10.1186/s12967-024-05235-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 04/24/2024] [Indexed: 06/01/2024] Open
Abstract
Target cancer therapy has been developed for clinical cancer treatment based on the discovery of CRISPR (clustered regularly interspaced short palindromic repeat)-Cas system. This forefront and cutting-edge scientific technique improves the cancer research into molecular level and is currently widely utilized in genetic investigation and clinical precision cancer therapy. In this review, we summarized the genetic modification by CRISPR/Cas and CRISPR screening system, discussed key components for successful CRISPR screening, including Cas enzymes, guide RNA (gRNA) libraries, target cells or organs. Furthermore, we focused on the application for CAR-T cell therapy, drug target, drug screening, or drug selection in both ex vivo and in vivo with CRISPR screening system. In addition, we elucidated the advantages and potential obstacles of CRISPR system in precision clinical medicine and described the prospects for future genetic therapy. In summary, we provide a comprehensive and practical perspective on the development of CRISPR/Cas and CRISPR screening system for the treatment of cancer defects, aiming to further improve the precision and accuracy for clinical treatment and individualized gene therapy.
Affiliation(s)
- Mingming Qin
- Reproductive Medical Center, Affiliated Foshan Maternity & Child Healthcare Hospital, Southern Medical University (Foshan Women and Children Hospital), Foshan, Guangdong, 528000, China
- Department of Developmental Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, Guangdong, 510515, China
| | - Chunhao Deng
- Chinese Medicine and Translational Medicine R&D center, Zhuhai UM Science & Technology Research Institute, Zhuhai, Guangdong, 519031, China
| | - Liewei Wen
- Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital, Zhuhai Clinical Medical College of Jinan University, Zhuhai, Guangdong, 519000, China
| | - Guoqun Luo
- Reproductive Medical Center, Affiliated Foshan Maternity & Child Healthcare Hospital, Southern Medical University (Foshan Women and Children Hospital), Foshan, Guangdong, 528000, China.
| | - Ya Meng
- Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital, Zhuhai Clinical Medical College of Jinan University, Zhuhai, Guangdong, 519000, China.
|
50
|
Dorans E, Jagadeesh K, Dey K, Price AL. Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance. medRxiv 2024:2024.05.24.24307813. [PMID: 38826240 PMCID: PMC11142273 DOI: 10.1101/2024.05.24.24307813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Methods that analyze single-cell paired RNA-seq and ATAC-seq multiome data have shown great promise in linking regulatory elements to genes. However, existing methods differ in their modeling assumptions and approaches to account for biological and technical noise (leading to low concordance in their linking scores) and do not capture the effects of genomic distance. We propose pgBoost, an integrative modeling framework that trains a non-linear combination of existing linking strategies (including genomic distance) on fine-mapped eQTL data to assign a probabilistic score to each candidate SNP-gene link. We applied pgBoost to single-cell multiome data from 85k cells representing 6 major immune/blood cell types. pgBoost attained higher enrichment for fine-mapped eSNP-eGene pairs (e.g. 21x at distance >10kb) than existing methods (1.2-10x; p-value for difference = 5e-13 vs. distance-based method and < 4e-35 for each other method), with larger improvements at larger distances (e.g. 35x vs. 0.89-6.6x at distance >100kb; p-value for difference < 0.002 vs. each other method). pgBoost also outperformed existing methods in enrichment for CRISPR-validated links (e.g. 4.8x vs. 1.6-4.1x at distance >10kb; p-value for difference = 0.25 vs. distance-based method and < 2e-5 for each other method), with larger improvements at larger distances (e.g. 15x vs. 1.6-2.5x at distance >100kb; p-value for difference < 0.009 for each other method). Similar improvements in enrichment were observed for links derived from Activity-By-Contact (ABC) scores and GWAS data. We further determined that restricting pgBoost to features from a focal cell type improved the identification of SNP-gene links relevant to that cell type. We highlight several examples where pgBoost linked fine-mapped GWAS variants to experimentally validated or biologically plausible target genes that were not implicated by other methods.
In conclusion, a non-linear combination of linking strategies, including genomic distance, improves power to identify target genes underlying GWAS associations.
|