1
Li Z, Zhang Y, Peng B, Qin S, Zhang Q, Chen Y, Chen C, Bao Y, Zhu Y, Hong Y, Liu B, Liu Q, Xu L, Chen X, Ma X, Wang H, Xie L, Yao Y, Deng B, Li J, De B, Chen Y, Wang J, Li T, Liu R, Tang Z, Cao J, Zuo E, Mei C, Zhu F, Shao C, Wang G, Sun T, Wang N, Liu G, Ni JQ, Liu Y. A novel interpretable deep learning-based computational framework designed synthetic enhancers with broad cross-species activity. Nucleic Acids Res 2024; 52:13447-13468. [PMID: 39420601 PMCID: PMC11602155 DOI: 10.1093/nar/gkae912] [Received: 02/21/2024] [Revised: 09/25/2024] [Accepted: 10/03/2024] [Indexed: 10/19/2024] Open
Abstract
Enhancers play a critical role in dynamically regulating spatial-temporal gene expression and establishing cell identity, underscoring the significance of designing them with specific properties for applications in biosynthetic engineering and gene therapy. Despite numerous high-throughput methods facilitating genome-wide enhancer identification, deciphering the sequence determinants of their activity remains challenging. Here, we present the DREAM (DNA cis-Regulatory Elements with controllable Activity design platforM) framework, a novel deep learning-based approach for synthetic enhancer design. Proficient in uncovering subtle and intricate patterns within extensive enhancer screening data, DREAM achieves cutting-edge sequence-based enhancer activity prediction and highlights critical sequence features implicating strong enhancer activity. Leveraging DREAM, we have engineered enhancers that surpass the potency of the strongest enhancer within the Drosophila genome by approximately 3.6-fold. Remarkably, these synthetic enhancers exhibited conserved functionality across species that diverged more than a billion years ago, indicating that DREAM was able to learn highly conserved enhancer regulatory grammar. Additionally, we designed silencers and cell line-specific enhancers using DREAM, demonstrating its versatility. Overall, our study not only introduces an interpretable approach for enhancer design but also lays out a general framework applicable to the design of other types of cis-regulatory elements.
Affiliation(s)
- Zhaohong Li
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Yuanyuan Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Bo Peng
- Gene Regulatory Lab, School of Basic Medical Sciences, Tsinghua University, NO. 30 Shuangqing road, Haidian district, Beijing 100084, China
- State Key Laboratory of Molecular Oncology, Tsinghua University, NO. 30 Shuangqing road, Haidian district, Beijing 100084, China
- Shenghua Qin
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Qian Zhang
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, NO.1 Beichen West Road, Chaoyang District, Beijing 100101, China
- Yun Chen
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Choulin Chen
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Yongzhou Bao
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Yuqi Zhu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, NO. 7 Pengfei Road, Dapeng District, Shenzhen 518124, China
- Yi Hong
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, NO. 7 Pengfei Road, Dapeng District, Shenzhen 518124, China
- Binghua Liu
- State Key Laboratory of Maricultural Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, NO.106 Nanjing Road, Shinan District, Qingdao, Shandong 266071, China
- Qian Liu
- State Key Laboratory of Maricultural Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, NO.106 Nanjing Road, Shinan District, Qingdao, Shandong 266071, China
- Lingna Xu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Xi Chen
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Xinhao Ma
- College of Grassland Agriculture, National Beef Cattle Improvement Center, College of Animal Science and Technology, Northwest A&F University, NO. 3 Taicheng Road, Yangling District, Yangling, Shaanxi 712100, China
- Hongyan Wang
- State Key Laboratory of Maricultural Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, NO.106 Nanjing Road, Shinan District, Qingdao, Shandong 266071, China
- Long Xie
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Yilong Yao
- Green Healthy Aquaculture Research Center, Kunpeng Institute of Modern Agriculture at Foshan, Chinese Academy of Agricultural Sciences, Building 26 Lihe Technology Park, Auxiliary Road of Xinxi Avenue South, Nanhai District, Foshan 528226, China
- Biao Deng
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Jiaying Li
- Department of Ophthalmology, Beijing Institute of Ophthalmology, Beijing Tongren Eye Center, Beijing Tongren Hospital, Capital Medical University, Dongjiaomin lane No1, Dongcheng District, Beijing 100101, China
- Baojun De
- College of Life Sciences, Inner Mongolia Autonomous Region Key Laboratory of Biomanufacturing, Inner Mongolia Agricultural University, NO. 306 Zhaowuda Road, Saihan District, Hohhot 010018, China
- Yuting Chen
- College of Life Sciences, Inner Mongolia Autonomous Region Key Laboratory of Biomanufacturing, Inner Mongolia Agricultural University, NO. 306 Zhaowuda Road, Saihan District, Hohhot 010018, China
- Jing Wang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Tian Li
- College of JUNCAO Science and Ecology, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University (FAFU), NO.15 Shangxiadian Road, Cangshan District, Fuzhou 0350002, China
- Ranran Liu
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Yuanmingyuan West Road NO. 2, Haidian District, Beijing 100193, China
- Zhonglin Tang
- Green Healthy Aquaculture Research Center, Kunpeng Institute of Modern Agriculture at Foshan, Chinese Academy of Agricultural Sciences, Building 26 Lihe Technology Park, Auxiliary Road of Xinxi Avenue South, Nanhai District, Foshan 528226, China
- Junwei Cao
- College of Life Sciences, Inner Mongolia Autonomous Region Key Laboratory of Biomanufacturing, Inner Mongolia Agricultural University, NO. 306 Zhaowuda Road, Saihan District, Hohhot 010018, China
- Erwei Zuo
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Chugang Mei
- College of Grassland Agriculture, National Beef Cattle Improvement Center, College of Animal Science and Technology, Northwest A&F University, NO. 3 Taicheng Road, Yangling District, Yangling, Shaanxi 712100, China
- Fangjie Zhu
- College of JUNCAO Science and Ecology, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University (FAFU), NO.15 Shangxiadian Road, Cangshan District, Fuzhou 0350002, China
- Changwei Shao
- State Key Laboratory of Maricultural Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, NO.106 Nanjing Road, Shinan District, Qingdao, Shandong 266071, China
- Guirong Wang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Tongjun Sun
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, NO. 7 Pengfei Road, Dapeng District, Shenzhen 518124, China
- Ningli Wang
- Department of Ophthalmology, Beijing Institute of Ophthalmology, Beijing Tongren Eye Center, Beijing Tongren Hospital, Capital Medical University, Dongjiaomin lane No1, Dongcheng District, Beijing 100101, China
- Gang Liu
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, NO.1 Beichen West Road, Chaoyang District, Beijing 100101, China
- Jian-Quan Ni
- Gene Regulatory Lab, School of Basic Medical Sciences, Tsinghua University, NO. 30 Shuangqing road, Haidian district, Beijing 100084, China
- State Key Laboratory of Molecular Oncology, Tsinghua University, NO. 30 Shuangqing road, Haidian district, Beijing 100084, China
- SXMU-Tsinghua Collaborative Innovation Center for Frontier Medicine, Shanxi Medical University, NO. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China
- Yuwen Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China
- Green Healthy Aquaculture Research Center, Kunpeng Institute of Modern Agriculture at Foshan, Chinese Academy of Agricultural Sciences, Building 26 Lihe Technology Park, Auxiliary Road of Xinxi Avenue South, Nanhai District, Foshan 528226, China
2
Przytycki PF, Pollard KS. Hierarchical annotation of eQTLs by H-eQTL enables identification of genes with cell type-divergent regulation. Genome Biol 2024; 25:299. [PMID: 39587678 PMCID: PMC11587609 DOI: 10.1186/s13059-024-03440-2] [Received: 10/27/2023] [Accepted: 11/19/2024] [Indexed: 11/27/2024] Open
Abstract
While context-type-specific regulation of genes is largely determined by cis-regulatory regions, attempts to identify cell type-specific eQTLs are complicated by the nested nature of cell types. We present hierarchical eQTL (H-eQTL), a network-based model for hierarchical annotation of bulk-derived eQTLs to levels of a cell type tree using single-cell chromatin accessibility data and no clustering of cells into discrete cell types. Using our model, we annotate bulk-derived eQTLs from the developing brain with high specificity to levels of a cell type hierarchy, which allows sensitive detection of genes with multiple distinct non-coding elements regulating their expression in different cell types.
Affiliation(s)
- Pawel F Przytycki
- Gladstone Institutes, San Francisco, CA, USA
- Present address: Faculty of Computing & Data Sciences, Boston University, Boston, MA, USA
- Katherine S Pollard
- Gladstone Institutes, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California, San Francisco, CA, USA
3
Zhao Y, Xue L, Huang Z, Lei Z, Xie S, Cai Z, Rao X, Zheng Z, Xiao N, Zhang X, Ma F, Yu H, Xie S. Lignin valorization to bioplastics with an aromatic hub metabolite-based autoregulation system. Nat Commun 2024; 15:9288. [PMID: 39468081 PMCID: PMC11519575 DOI: 10.1038/s41467-024-53609-3] [Received: 03/01/2024] [Accepted: 10/16/2024] [Indexed: 10/30/2024] Open
Abstract
Exploring microorganisms with downstream synthetic advantages in lignin valorization is an effective strategy to increase target product diversity and yield. This study ingeniously engineers the non-lignin-degrading bacterium Ralstonia eutropha H16 (also known as Cupriavidus necator H16) to convert lignin, a typically underutilized by-product of biorefinery, into the valuable bioplastic polyhydroxybutyrate (PHB). The aromatic metabolism capacities of R. eutropha H16 for different lignin-derived aromatics (LDAs) are systematically characterized and complemented by integrating robust functional modules including O-demethylation, aromatic aldehyde metabolism and the mitigation of by-product inhibition. A pivotal discovery is the regulatory element PcaQ, which is highly responsive to the aromatic hub metabolite protocatechuic acid during lignin degradation. Based on the computer-aided design of PcaQ, we develop a hub metabolite-based autoregulation (HMA) system. This system can control the expression of functional genes in response to heterologous LDAs and enhance metabolism efficiency. Multi-module genome integration and directed evolution further fortify the strain's stability and lignin conversion capacities, leading to a PHB production titer of 2.38 g/L using heterologous LDAs as the sole carbon source. This work not only marks a leap in bioplastic production from lignin components but also provides a strategy to redesign non-LDA-degrading microbes for efficient lignin valorization.
Affiliation(s)
- Yiquan Zhao
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Le Xue
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Zhiyi Huang
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Zixian Lei
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Shiyu Xie
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Zhenzhen Cai
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xinran Rao
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Ze Zheng
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Ning Xiao
- National key Laboratory of Non-food Biomass Energy Technology, Guangxi Academy of Sciences, Nanning, Guangxi, China
- Xiaoyu Zhang
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Fuying Ma
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Hongbo Yu
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Shangxian Xie
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- National key Laboratory of Non-food Biomass Energy Technology, Guangxi Academy of Sciences, Nanning, Guangxi, China
4
La Fleur A, Shi Y, Seelig G. Decoding biology with massively parallel reporter assays and machine learning. Genes Dev 2024; 38:843-865. [PMID: 39362779 PMCID: PMC11535156 DOI: 10.1101/gad.351800.124] [Indexed: 10/05/2024]
Abstract
Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
Affiliation(s)
- Alyssa La Fleur
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Yongsheng Shi
- Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA
- Georg Seelig
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA
5
Bond ML, Quiroga-Barber IY, D’Costa S, Wu Y, Bell JL, McAfee JC, Kramer NE, Lee S, Patrucco M, Phanstiel DH, Won H. Deciphering the functional impact of Alzheimer's Disease-associated variants in resting and proinflammatory immune cells. medRxiv 2024:2024.09.13.24313654. [PMID: 39371155 PMCID: PMC11451667 DOI: 10.1101/2024.09.13.24313654] [Indexed: 10/08/2024]
Abstract
Genome-wide association studies have identified loci associated with Alzheimer's Disease (AD), but identifying the exact causal variants and genes at each locus is challenging due to linkage disequilibrium and their largely non-coding nature. To address this, we performed a massively parallel reporter assay of 3,576 AD-associated variants in THP-1 macrophages in both resting and proinflammatory states and identified 47 expression-modulating variants (emVars). To understand the endogenous chromatin context of emVars, we built an activity-by-contact model using epigenomic maps of macrophage inflammation and inferred condition-specific enhancer-promoter pairs. Intersection of emVars with enhancer-promoter pairs and microglia expression quantitative trait loci allowed us to connect 39 emVars to 76 putative AD risk genes enriched for AD-associated molecular signatures. Overall, systematic characterization of AD-associated variants enhances our understanding of the regulatory mechanisms underlying AD pathogenesis.
Affiliation(s)
- Marielle L. Bond
- Curriculum in Genetics & Molecular Biology, University of North Carolina at Chapel Hill
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
- Susan D’Costa
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
- Yijia Wu
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
- Jessica L. Bell
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
- Jessica C. McAfee
- Curriculum in Genetics & Molecular Biology, University of North Carolina at Chapel Hill
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
- Nicole E. Kramer
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill
- Sool Lee
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill
- Mary Patrucco
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
- Douglas H. Phanstiel
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
- Department of Cell Biology & Physiology, University of North Carolina at Chapel Hill
- Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill
- Neuroscience Center, University of North Carolina at Chapel Hill
6
Xu L, Liu Y. Identification, Design, and Application of Noncoding Cis-Regulatory Elements. Biomolecules 2024; 14:945. [PMID: 39199333 PMCID: PMC11352686 DOI: 10.3390/biom14080945] [Received: 05/26/2024] [Revised: 07/25/2024] [Accepted: 07/30/2024] [Indexed: 09/01/2024] Open
Abstract
Cis-regulatory elements (CREs) play a pivotal role in orchestrating interactions with trans-regulatory factors such as transcription factors, RNA-binding proteins, and noncoding RNAs. These interactions are fundamental to the molecular architecture underpinning complex and diverse biological functions in living organisms, facilitating a myriad of sophisticated and dynamic processes. The rapid advancement in the identification and characterization of these regulatory elements has been marked by initiatives such as the Encyclopedia of DNA Elements (ENCODE) project, which represents a significant milestone in the field. Concurrently, the development of CRE detection technologies, exemplified by massively parallel reporter assays, has progressed at an impressive pace, providing powerful tools for CRE discovery. The exponential growth of multimodal functional genomic data has necessitated the application of advanced analytical methods. Deep learning algorithms, particularly large language models, have emerged as invaluable tools for deconstructing the intricate nucleotide sequences governing CRE function. These advancements facilitate precise predictions of CRE activity and enable the de novo design of CREs. A deeper understanding of CRE operational dynamics is crucial for harnessing their versatile regulatory properties. Such insights are instrumental in refining gene therapy techniques, enhancing the efficacy of selective breeding programs, pushing the boundaries of genetic innovation, and opening new possibilities in microbial synthetic biology.
Affiliation(s)
- Lingna Xu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Yuwen Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Kunpeng Institute of Modern Agriculture at Foshan, Chinese Academy of Agricultural Sciences, Foshan 528226, China
7
Retallick-Townsley KG, Lee S, Cartwright S, Cohen S, Sen A, Jia M, Young H, Dobbyn L, Deans M, Fernandez-Garcia M, Huckins LM, Brennand KJ. Dynamic stress- and inflammatory-based regulation of psychiatric risk loci in human neurons. bioRxiv 2024:2024.07.09.602755. [PMID: 39026810 PMCID: PMC11257632 DOI: 10.1101/2024.07.09.602755] [Indexed: 07/20/2024]
Abstract
The prenatal environment can alter neurodevelopmental and clinical trajectories, markedly increasing risk for psychiatric disorders in childhood and adolescence. To understand if and how fetal exposures to stress and inflammation exacerbate manifestation of genetic risk for complex brain disorders, we report a large-scale context-dependent massively parallel reporter assay (MPRA) in human neurons designed to catalogue genotype x environment (GxE) interactions. Across 240 genome-wide association study (GWAS) loci linked to ten brain traits/disorders, the impact of hydrocortisone, interleukin 6, and interferon alpha on transcriptional activity is empirically evaluated in human induced pluripotent stem cell (hiPSC)-derived glutamatergic neurons. Of ~3,500 candidate regulatory risk elements (CREs), 11% of variants are active at baseline, whereas cue-specific CRE regulatory activity ranges from a high of 23% (hydrocortisone) to a low of 6% (IL-6). Cue-specific regulatory activity is driven, at least in part, by differences in transcription factor binding activity, the gene targets of which show unique enrichments for brain disorders as well as co-morbid metabolic and immune syndromes. The dynamic nature of genetic regulation informs the influence of environmental factors, reveals a mechanism underlying pleiotropy and variable penetrance, and identifies specific risk variants that confer greater disorder susceptibility after exposure to stress or inflammation. Understanding neurodevelopmental GxE interactions will inform mental health trajectories and uncover novel targets for therapeutic intervention.
Affiliation(s)
- Kayla G. Retallick-Townsley
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Seoyeon Lee
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511
- Sam Cartwright
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Sophie Cohen
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Annabel Sen
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511
- Meng Jia
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511
- Hannah Young
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lee Dobbyn
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Deans
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511
- Meilin Fernandez-Garcia
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511
- Laura M. Huckins
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Kristen J. Brennand
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511
- Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511
|
8
|
Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 PMCID: PMC11444527 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Yasin Uzun
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
|
9
|
Yin C, Hair SC, Byeon GW, Bromley P, Meuleman W, Seelig G. Iterative deep learning-design of human enhancers exploits condensed sequence grammar to achieve cell type-specificity. bioRxiv 2024:2024.06.14.599076. [PMID: 38915713 PMCID: PMC11195158 DOI: 10.1101/2024.06.14.599076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the predictor, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs with higher frequencies than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single-cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show enhancers as short as 50 bp can maintain specificity.
Affiliation(s)
- Christopher Yin
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
- Gun Woo Byeon
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
- Peter Bromley
- Altius Institute for Biomedical Sciences, Seattle, WA
- Wouter Meuleman
- Altius Institute for Biomedical Sciences, Seattle, WA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA
- Georg Seelig
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA
|
10
|
Lalanne JB, Regalado SG, Domcke S, Calderon D, Martin BK, Li X, Li T, Suiter CC, Lee C, Trapnell C, Shendure J. Multiplex profiling of developmental cis-regulatory elements with quantitative single-cell expression reporters. Nat Methods 2024; 21:983-993. [PMID: 38724692 PMCID: PMC11166576 DOI: 10.1038/s41592-024-02260-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 03/22/2024] [Indexed: 06/13/2024]
Abstract
The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.
Affiliation(s)
- Samuel G Regalado
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Silvia Domcke
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Diego Calderon
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Beth K Martin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Xiaoyi Li
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Tony Li
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Chase C Suiter
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
- Choli Lee
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Cole Trapnell
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA.
- Howard Hughes Medical Institute, Seattle, WA, USA.
|
11
|
Quantitative profiling of regulatory DNA activity at single-cell resolution. Nat Methods 2024; 21:936-937. [PMID: 38724694 DOI: 10.1038/s41592-024-02261-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2024]
|
12
|
Milne TA. Chromatin and aberrant enhancer activity in KMT2A rearranged acute lymphoblastic leukemia. Curr Opin Genet Dev 2024; 86:102191. [PMID: 38579381 DOI: 10.1016/j.gde.2024.102191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 03/11/2024] [Accepted: 03/19/2024] [Indexed: 04/07/2024]
Abstract
To make a multicellular organism, genes need to be transcribed at the right developmental stages and in the right tissues. DNA sequences termed 'enhancers' are crucial to achieve this. Despite concerted efforts, the exact mechanisms of enhancer activity remain elusive. Mixed lineage leukemia (MLL or KMT2A) rearrangements (MLLr), commonly observed in cases of acute lymphoblastic leukemia (ALL) and acute myeloid leukemia, produce novel in-frame fusion proteins. Recent work has shown that the MLL-AF4 fusion protein drives aberrant enhancer activity at key oncogenes in ALL, dependent on the continued presence of MLL-AF4 complex components. As well as providing some general insights into enhancer function, these observations may also provide an explanation for transcriptional heterogeneity observed in MLLr patients.
Affiliation(s)
- Thomas A Milne
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford OX3 9DS, UK.
|
13
|
Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704 DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Zachary A Gardell
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
|
14
|
Deng C, Whalen S, Steyert M, Ziffra R, Przytycki PF, Inoue F, Pereira DA, Capauto D, Norton S, Vaccarino FM, Pollen AA, Nowakowski TJ, Ahituv N, Pollard KS. Massively parallel characterization of regulatory elements in the developing human cortex. Science 2024; 384:eadh0559. [PMID: 38781390 DOI: 10.1126/science.adh0559] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/15/2023] [Accepted: 03/13/2024] [Indexed: 05/25/2024]
Abstract
Nucleotide changes in gene regulatory elements are important determinants of neuronal development and diseases. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 open chromatin regions, including thousands of sequences with cell type-specific accessibility and variants associated with brain gene regulation. In primary cells, we identified 46,802 active enhancer sequences and 164 variants that alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.
Affiliation(s)
- Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Sean Whalen
- Gladstone Institutes, San Francisco, CA 94158, USA
- Marilyn Steyert
- Department of Anatomy, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Chan Zuckerberg Biohub, San Francisco, San Francisco, CA 94158, USA
- Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143, USA
- Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
- Ryan Ziffra
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Department of Anatomy, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA 94143, USA
- Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto 606-8501, Japan
- Daniela A Pereira
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Graduate Program of Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais 31270-901, Brazil
- Davide Capauto
- Child Study Center, Yale University, New Haven, CT 06520, USA
- Scott Norton
- Child Study Center, Yale University, New Haven, CT 06520, USA
- Flora M Vaccarino
- Child Study Center, Yale University, New Haven, CT 06520, USA
- Department of Neuroscience, Yale University, New Haven, CT 06520, USA
- Alex A Pollen
- Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143, USA
- Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA
- Tomasz J Nowakowski
- Department of Anatomy, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143, USA
- Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Katherine S Pollard
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, San Francisco, CA 94158, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA
|
15
|
Shepherdson JL, Friedman RZ, Zheng Y, Sun C, Oh IY, Granas DM, Cohen BA, Chen S, White MA. Pathogenic variants in CRX have distinct cis-regulatory effects on enhancers and silencers in photoreceptors. Genome Res 2024; 34:243-255. [PMID: 38355306 PMCID: PMC10984388 DOI: 10.1101/gr.278133.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 02/01/2024] [Indexed: 02/16/2024]
Abstract
Dozens of variants in the gene for the homeodomain transcription factor (TF) cone-rod homeobox (CRX) are linked with human blinding diseases that vary in their severity and age of onset. How different variants in this single TF alter its function in ways that lead to a range of phenotypes is unclear. We characterized the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in mouse retina explants carrying knock-ins of two variants, one in the DNA-binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation in these mutant Crx retinas corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, and p.E168d2 has distinct effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are derepressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci are partially predictive of episomal MPRA activity, and distal elements whose accessibility increases later in retinal development are enriched for CREs with silencer activity. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers while having a qualitatively different impact on silencers.
Affiliation(s)
- James L Shepherdson
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Ryan Z Friedman
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Chi Sun
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Inez Y Oh
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- David M Granas
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Barak A Cohen
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Michael A White
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110, USA
|
16
|
Capauto D, Wang Y, Wu F, Norton S, Mariani J, Inoue F, Crawford GE, Ahituv N, Abyzov A, Vaccarino FM. Characterization of enhancer activity in early human neurodevelopment using Massively Parallel Reporter Assay (MPRA) and forebrain organoids. Sci Rep 2024; 14:3936. [PMID: 38365907 PMCID: PMC10873509 DOI: 10.1038/s41598-024-54302-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 02/11/2024] [Indexed: 02/18/2024] Open
Abstract
Regulation of gene expression through enhancers is one of the major processes shaping the structure and function of the human brain during development. High-throughput assays have predicted thousands of enhancers involved in neurodevelopment, and confirming their activity through orthogonal functional assays is crucial. Here, we utilized Massively Parallel Reporter Assays (MPRAs) in stem cells and forebrain organoids to evaluate the activity of ~7000 gene-linked enhancers previously identified in human fetal tissues and brain organoids. We used a Gaussian mixture model to evaluate the contribution of background noise in the measured activity signal to confirm the activity of ~35% of the tested enhancers, with most showing temporal-specific activity, suggesting their evolving role in neurodevelopment. The temporal specificity was further supported by the correlation of activity with gene expression. Our findings provide a valuable gene regulatory resource to the scientific community.
Affiliation(s)
- Davide Capauto
- Child Study Center, Yale University, New Haven, CT, 06520, USA
- Yifan Wang
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA
- Feinan Wu
- Child Study Center, Yale University, New Haven, CT, 06520, USA
- Scott Norton
- Child Study Center, Yale University, New Haven, CT, 06520, USA
- Jessica Mariani
- Child Study Center, Yale University, New Haven, CT, 06520, USA
- Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Alexej Abyzov
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA.
- Flora M Vaccarino
- Child Study Center, Yale University, New Haven, CT, 06520, USA.
- Department of Neuroscience, Yale University, New Haven, CT, 06520, USA.
- Yale Stem Cell Center, Yale University, New Haven, CT, 06520, USA.
|
17
|
Sun J, Noss S, Banerjee D, Das M, Girirajan S. Strategies for dissecting the complexity of neurodevelopmental disorders. Trends Genet 2024; 40:187-202. [PMID: 37949722 PMCID: PMC10872993 DOI: 10.1016/j.tig.2023.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 09/20/2023] [Accepted: 10/16/2023] [Indexed: 11/12/2023]
Abstract
Neurodevelopmental disorders (NDDs) are associated with a wide range of clinical features, affecting multiple pathways involved in brain development and function. Recent advances in high-throughput sequencing have unveiled numerous genetic variants associated with NDDs, which further contribute to disease complexity and make it challenging to infer disease causation and underlying mechanisms. Herein, we review current strategies for dissecting the complexity of NDDs using model organisms, induced pluripotent stem cells, single-cell sequencing technologies, and massively parallel reporter assays. We further highlight single-cell CRISPR-based screening techniques that allow genomic investigation of cellular transcriptomes with high efficiency, accuracy, and throughput. Overall, we provide an integrated review of experimental approaches that can be applicable for investigating a broad range of complex disorders.
Affiliation(s)
- Jiawan Sun
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Serena Noss
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Deepro Banerjee
- Bioinformatics and Genomics Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Maitreya Das
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Santhosh Girirajan
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA; Bioinformatics and Genomics Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA; Department of Anthropology, Pennsylvania State University, University Park, PA 16802, USA.
|
18
|
Zhao Y, Deng W, Wang Z, Wang Y, Zheng H, Zhou K, Xu Q, Bai L, Liu H, Ren Z, Jiang Z. Genetics of congenital heart disease. Clin Chim Acta 2024; 552:117683. [PMID: 38030030 DOI: 10.1016/j.cca.2023.117683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 11/22/2023] [Accepted: 11/24/2023] [Indexed: 12/01/2023]
Abstract
During embryonic development, the cardiovascular system and the central nervous system exhibit a coordinated developmental process through intricate interactions. Congenital heart disease (CHD) refers to structural or functional abnormalities that occur during embryonic or prenatal heart development and is the most common congenital disorder. One of the most common complications in CHD patients is neurodevelopmental disorders (NDD). However, the specific mechanisms, connections, and precise ways in which CHD co-occurs with NDD remain unclear. According to relevant research, both genetic and non-genetic factors are significant contributors to the co-occurrence of sporadic CHD and NDD. Genetic variations, such as chromosomal abnormalities and gene mutations, play a role in the susceptibility to both CHD and NDD. Further research should aim to identify common molecular mechanisms that underlie the co-occurrence of CHD and NDD, possibly originating from shared genetic mutations or shared gene regulation. Therefore, this review article summarizes the current advances in the genetics of CHD co-occurring with NDD, elucidating the application of relevant gene detection techniques. This is done with the aim of exploring the genetic regulatory mechanisms of CHD co-occurring with NDD at the gene level and promoting research and treatment of developmental disorders related to the cardiovascular and central nervous systems.
Affiliation(s)
- Yuanqin Zhao
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Wei Deng
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Zhaoyue Wang
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Yanxia Wang
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Hongyu Zheng
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Kun Zhou
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Qian Xu
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Le Bai
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Huiting Liu
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Zhong Ren
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
- Zhisheng Jiang
- Institute of Cardiovascular Disease, Key Lab for Arteriosclerology of Hunan Province, International Joint Laboratory for Arteriosclerotic Disease Research of Hunan Province, University of South China, Hengyang 421001, China.
19
de Boer CG, Taipale J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 2024; 625:41-50. [PMID: 38093018 DOI: 10.1038/s41586-023-06661-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 02/17/2023] [Accepted: 09/20/2023] [Indexed: 01/05/2024]
Abstract
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Affiliation(s)
- Carl G de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
- Jussi Taipale
- Applied Tumor Genomics Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.
- Department of Biochemistry, University of Cambridge, Cambridge, UK.
20
Minow MAA, Marand AP, Schmitz RJ. Leveraging Single-Cell Populations to Uncover the Genetic Basis of Complex Traits. Annu Rev Genet 2023; 57:297-319. [PMID: 37562412 PMCID: PMC10775913 DOI: 10.1146/annurev-genet-022123-110824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
The ease and throughput of single-cell genomics have steadily improved, and its current trajectory suggests that surveying single-cell populations will become routine. We discuss the merger of quantitative genetics with single-cell genomics and emphasize how this synergizes with advantages intrinsic to plants. Single-cell population genomics provides increased detection resolution when mapping variants that control molecular traits, including gene expression or chromatin accessibility. Additionally, single-cell population genomics reveals the cell types in which variants act and, when combined with organism-level phenotype measurements, unveils which cellular contexts impact higher-order traits. Emerging technologies, notably multiomics, can facilitate the measurement of both genetic changes and genomic traits in single cells, enabling single-cell genetic experiments. The implementation of single-cell genetics will advance the investigation of the genetic architecture of complex molecular traits and provide new experimental paradigms to study eukaryotic genetics.
Affiliation(s)
- Mark A A Minow
- Department of Genetics, University of Georgia, Athens, Georgia, USA
- Robert J Schmitz
- Department of Genetics, University of Georgia, Athens, Georgia, USA
21
Chen Y, Paramo MI, Zhang Y, Yao L, Shah SR, Jin Y, Zhang J, Pan X, Yu H. Finding Needles in the Haystack: Strategies for Uncovering Noncoding Regulatory Variants. Annu Rev Genet 2023; 57:201-222. [PMID: 37562413 DOI: 10.1146/annurev-genet-030723-120717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Despite accumulating evidence implicating noncoding variants in human diseases, unraveling their functionality remains a significant challenge. Systematic annotations of the regulatory landscape and the growth of sequence variant data sets have fueled the development of tools and methods to identify causal noncoding variants and evaluate their regulatory effects. Here, we review the latest advances in the field and discuss potential future research avenues to gain a more in-depth understanding of noncoding regulatory variants.
Affiliation(s)
- You Chen
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Mauricio I Paramo
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Yingying Zhang
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Li Yao
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Department of Computational Biology, Cornell University, Ithaca, New York, USA
- Sagar R Shah
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Yiyang Jin
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Junke Zhang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Department of Computational Biology, Cornell University, Ithaca, New York, USA
- Xiuqi Pan
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Haiyuan Yu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Department of Computational Biology, Cornell University, Ithaca, New York, USA
22
Fu ZH, He SZ, Wu Y, Zhao GR. Design and deep learning of synthetic B-cell-specific promoters. Nucleic Acids Res 2023; 51:11967-11979. [PMID: 37889080 PMCID: PMC10681721 DOI: 10.1093/nar/gkad930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 09/20/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023] Open
Abstract
Synthetic biology and deep learning synergistically revolutionize our ability to decode and recode DNA regulatory grammar. B-cell-specific transcriptional regulation is intricate, and unlocking the potential of B-cell-specific promoters as synthetic elements is important for B-cell engineering. Here, we designed and pool-synthesized 23 640 B-cell-specific promoters that exhibit a larger sequence space and B-cell-specific expression, and enable diverse transcriptional patterns in B-cells. Using massively parallel reporter assays (MPRAs), we deciphered the sequence features that regulate promoter transcription, including motifs and motif syntax (their combination and distance). Finally, we built and trained a deep learning model capable of predicting the transcriptional strength of the immunoglobulin V gene promoter directly from sequence. Prediction of thousands of promoter variants identified in the global human population shows that polymorphisms in promoters influence the transcription of immunoglobulin V genes, which may contribute to individual differences in adaptive humoral immune responses. Our work helps to decipher the transcription mechanism in immunoglobulin genes and offers thousands of non-similar promoters for B-cell engineering.
Affiliation(s)
- Zong-Heng Fu
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Si-Zhe He
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Yi Wu
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Guang-Rong Zhao
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
23
Mukund AX, Tycko J, Allen SJ, Robinson SA, Andrews C, Sinha J, Ludwig CH, Spees K, Bassik MC, Bintu L. High-throughput functional characterization of combinations of transcriptional activators and repressors. Cell Syst 2023; 14:746-763.e5. [PMID: 37543039 PMCID: PMC10642976 DOI: 10.1016/j.cels.2023.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 06/26/2023] [Accepted: 07/06/2023] [Indexed: 08/07/2023]
Abstract
Despite growing knowledge of the functions of individual human transcriptional effector domains, much less is understood about how multiple effector domains within the same protein combine to regulate gene expression. Here, we measure transcriptional activity for 8,400 effector domain combinations by recruiting them to reporter genes in human cells. In our assay, weak and moderate activation domains synergize to drive strong gene expression, whereas combining strong activators often results in weaker activation. In contrast, repressors combine linearly and produce full gene silencing, and repressor domains often overpower activation domains. We use this information to build a synthetic transcription factor whose function can be tuned between repression and activation independent of recruitment to target genes by using a small-molecule drug. Altogether, we outline the basic principles of how effector domains combine to regulate gene expression and demonstrate their value in building precise and flexible synthetic biology tools. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Adi X Mukund
- Biophysics Program, Stanford University, Stanford, CA 94305, USA
- Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Sage J Allen
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Cecelia Andrews
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA
- Joydeb Sinha
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
- Connor H Ludwig
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Kaitlyn Spees
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Michael C Bassik
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Lacramioara Bintu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA.
24
Quaye LNK, Dalzell CE, Deloukas P, Smith AJP. The Genetics of Coronary Artery Disease: A Vascular Perspective. Cells 2023; 12:2232. [PMID: 37759455 PMCID: PMC10527262 DOI: 10.3390/cells12182232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/29/2023] Open
Abstract
Genome-wide association studies (GWAS) have identified a large number of genetic loci for coronary artery disease (CAD), with many located close to genes associated with traditional CAD risk pathways, such as lipid metabolism and inflammation. It is becoming evident with recent CAD GWAS meta-analyses that vascular pathways are also highly enriched and present an opportunity for novel therapeutics. This review examines GWAS-enriched vascular gene loci, the pathways involved and their potential role in CAD pathogenesis. The functionality of variants is explored from expression quantitative trait loci, massively parallel reporter assays and CRISPR-based gene-editing tools. We discuss how this research may lead to novel therapeutic tools to treat cardiovascular disorders.
Affiliation(s)
- Panos Deloukas
- William Harvey Research Institute, Faculty of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK; (L.N.K.Q.); (C.E.D.); (A.J.P.S.)
25
Capauto D, Wang Y, Wu F, Norton S, Mariani J, Inoue F, Crawford GE, Ahituv N, Abyzov A, Vaccarino FM. Characterization of enhancer activity in early human neurodevelopment using Massively parallel reporter assay (MPRA) and forebrain organoids. bioRxiv 2023:2023.08.14.553170. [PMID: 37645832 PMCID: PMC10461976 DOI: 10.1101/2023.08.14.553170] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/31/2023]
Abstract
Regulation of gene expression through enhancers is one of the major processes shaping the structure and function of the human brain during development. High-throughput assays have predicted thousands of enhancers involved in neurodevelopment, and confirming their activity through orthogonal functional assays is crucial. Here, we utilized Massively Parallel Reporter Assays (MPRAs) in stem cells and forebrain organoids to evaluate the activity of ~7,000 gene-linked enhancers previously identified in human fetal tissues and brain organoids. We used a Gaussian mixture model to evaluate the contribution of background noise in the measured activity signal to confirm the activity of ~35% of the tested enhancers, with most showing temporal-specific activity, suggesting their evolving role in neurodevelopment. The temporal specificity was further supported by the correlation of activity with gene expression. Our findings provide a valuable gene regulatory resource to the scientific community.
Affiliation(s)
- Davide Capauto
- Child Study Center, Yale University, New Haven, CT 06520
- Yifan Wang
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Feinan Wu
- Child Study Center, Yale University, New Haven, CT 06520
- Scott Norton
- Child Study Center, Yale University, New Haven, CT 06520
- Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University; Kyoto, Japan
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Alexej Abyzov
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Flora M. Vaccarino
- Child Study Center, Yale University, New Haven, CT 06520
- Department of Neuroscience, Yale University, New Haven, CT 06520, USA
26
Armendariz DA, Sundarrajan A, Hon GC. Breaking enhancers to gain insights into developmental defects. eLife 2023; 12:e88187. [PMID: 37497775 PMCID: PMC10374278 DOI: 10.7554/elife.88187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/19/2023] [Indexed: 07/28/2023] Open
Abstract
Despite ground-breaking genetic studies that have identified thousands of risk variants for developmental diseases, how these variants lead to molecular and cellular phenotypes remains a gap in knowledge. Many of these variants are non-coding and occur at enhancers, which orchestrate key regulatory programs during development. The prevailing paradigm is that non-coding variants alter the activity of enhancers, impacting gene expression programs, and ultimately contributing to disease risk. A key obstacle to progress is the systematic functional characterization of non-coding variants at scale, especially since enhancer activity is highly specific to cell type and developmental stage. Here, we review the foundational studies of enhancers in developmental disease and current genomic approaches to functionally characterize developmental enhancers and their variants at scale. In the coming decade, we anticipate systematic enhancer perturbation studies to link non-coding variants to molecular mechanisms, changes in cell state, and disease phenotypes.
Affiliation(s)
- Daniel A Armendariz
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Anjana Sundarrajan
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Gary C Hon
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, United States
- Lyda Hill Department of Bioinformatics, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, United States
27
Smith GD, Ching WH, Cornejo-Páramo P, Wong ES. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol 2023; 24:116. [PMID: 37173718 PMCID: PMC10176946 DOI: 10.1186/s13059-023-02955-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/18/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open
Abstract
Enhancers are genomic DNA elements controlling spatiotemporal gene expression. Their flexible organization and functional redundancies make deciphering their sequence-function relationships challenging. This article provides an overview of the current understanding of enhancer organization and evolution, with an emphasis on factors that influence these relationships. Technological advancements, particularly in machine learning and synthetic biology, are discussed in light of how they provide new ways to understand this complexity. Exciting opportunities lie ahead as we continue to unravel the intricacies of enhancer function.
Affiliation(s)
- Gabrielle D Smith
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
- Wan Hern Ching
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- Paola Cornejo-Páramo
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
- Emily S Wong
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia.
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia.
28
Guo Q, Wu S, Geschwind DH. Characterization of Gene Regulatory Elements in Human Fetal Cortical Development: Enhancing Our Understanding of Neurodevelopmental Disorders and Evolution. Dev Neurosci 2023; 46:69-83. [PMID: 37231806 DOI: 10.1159/000530929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 04/24/2023] [Indexed: 05/27/2023] Open
Abstract
The neocortex is the region that most distinguishes human brain from other mammals and primates [Annu Rev Genet. 2021 Nov;55(1):555-81]. Studying the development of human cortex is important in understanding the evolutionary changes occurring in humans relative to other primates, as well as in elucidating mechanisms underlying neurodevelopmental disorders. Cortical development is a highly regulated process, spatially and temporally coordinated by expression of essential transcriptional factors in response to signaling pathways [Neuron. 2019 Sep;103(6):980-1004]. Enhancers are the most well-understood cis-acting, non-protein-coding regulatory elements that regulate gene expression [Nat Rev Genet. 2014 Apr;15(4):272-86]. Importantly, given the conservation of both DNA sequence and molecular function of the majority of proteins across mammals [Genome Res. 2003 Dec;13(12):2507-18], enhancers [Science. 2015 Mar;347(6226):1155-9], which are far more divergent at the sequence level, likely account for the phenotypes that distinguish the human brain by changing the regulation of gene expression. In this review, we will revisit the conceptual framework of gene regulation during human brain development, as well as the evolution of technologies to study transcriptional regulation, with recent advances in genome biology that open a window allowing us to systematically characterize cis-regulatory elements in developing human brain [Hum Mol Genet. 2022 Oct;31(R1):R84-96]. We provide an update on work to characterize the suite of all enhancers in the developing human brain and the implications for understanding neuropsychiatric disorders. Finally, we discuss emerging therapeutic ideas that utilize our emerging knowledge of enhancer function.
Affiliation(s)
- Qiuyu Guo
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, USA
- Center for Autism Research and Treatment, Semel Institute, University of California Los Angeles, Los Angeles, California, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, USA
- Sarah Wu
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, USA
- Daniel H Geschwind
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, USA
- Center for Autism Research and Treatment, Semel Institute, University of California Los Angeles, Los Angeles, California, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, USA
- Institute of Precision Health, University of California Los Angeles, Los Angeles, California, USA
29
Zheng Y, VanDusen NJ. Massively Parallel Reporter Assays for High-Throughput In Vivo Analysis of Cis-Regulatory Elements. J Cardiovasc Dev Dis 2023; 10:144. [PMID: 37103023 PMCID: PMC10146671 DOI: 10.3390/jcdd10040144] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/27/2023] [Revised: 03/24/2023] [Accepted: 03/27/2023] [Indexed: 03/31/2023] Open
Abstract
The rapid improvement of descriptive genomic technologies has fueled a dramatic increase in hypothesized connections between cardiovascular gene expression and phenotypes. However, in vivo testing of these hypotheses has predominantly been relegated to slow, expensive, and linear generation of genetically modified mice. In the study of genomic cis-regulatory elements, generation of mice featuring transgenic reporters or cis-regulatory element knockout remains the standard approach. While the data obtained is of high quality, the approach is insufficient to keep pace with candidate identification and therefore results in biases introduced during the selection of candidates for validation. However, recent advances across a range of disciplines are converging to enable functional genomic assays that can be conducted in a high-throughput manner. Here, we review one such method, massively parallel reporter assays (MPRAs), in which the activities of thousands of candidate genomic regulatory elements are simultaneously assessed via the next-generation sequencing of a barcoded reporter transcript. We discuss best practices for MPRA design and use, with a focus on practical considerations, and review how this emerging technology has been successfully deployed in vivo. Finally, we discuss how MPRAs are likely to evolve and be used in future cardiovascular research.
30
Deng C, Whalen S, Steyert M, Ziffra R, Przytycki PF, Inoue F, Pereira DA, Capauto D, Norton S, Vaccarino FM, Pollen A, Nowakowski TJ, Ahituv N, Pollard KS. Massively parallel characterization of psychiatric disorder-associated and cell-type-specific regulatory elements in the developing human cortex. bioRxiv 2023:2023.02.15.528663. [PMID: 36824845 PMCID: PMC9949039 DOI: 10.1101/2023.02.15.528663] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 02/17/2023]
Abstract
Nucleotide changes in gene regulatory elements are important determinants of neuronal development and disease. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 sequences, including differentially accessible cell-type specific regions in the developing cortex and single-nucleotide variants associated with psychiatric disorders. In primary cells, we identified 46,802 active enhancer sequences and 164 disorder-associated variants that significantly alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning, we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.
Affiliation(s)
- Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Sean Whalen
- Gladstone Institutes; San Francisco, CA, USA
- Marilyn Steyert
- Department of Anatomy, University of California, San Francisco; San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco; San Francisco, CA, USA
- Department of Neurological Surgery, University of California, San Francisco; San Francisco, CA, USA
- Ryan Ziffra
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University; Kyoto, Japan
- Daniela A. Pereira
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Graduate Program of Genetics, Institute of Biological Sciences, Federal University of Minas Gerais; Belo Horizonte, Minas Gerais, Brazil
- Scott Norton
- Child Study Center, Yale University; New Haven, CT, USA
- Flora M. Vaccarino
- Child Study Center, Yale University; New Haven, CT, USA
- Department of Neuroscience, Yale University; New Haven, CT, USA
- Alex Pollen
- Department of Neurology, University of California, San Francisco; San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco; San Francisco, CA, USA
- Tomasz J. Nowakowski
- Department of Anatomy, University of California, San Francisco; San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco; San Francisco, CA, USA
- Department of Neurological Surgery, University of California, San Francisco; San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco; San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco; San Francisco, CA, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Katherine S. Pollard
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Gladstone Institutes; San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco; San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco; San Francisco, CA, USA