1
Catta-Preta R, Lindtner S, Ypsilanti A, Seban N, Price JD, Abnousi A, Su-Feher L, Wang Y, Cichewicz K, Boerma SA, Juric I, Jones IR, Akiyama JA, Hu M, Shen Y, Visel A, Pennacchio LA, Dickel DE, Rubenstein JLR, Nord AS. Combinatorial transcription factor binding encodes cis-regulatory wiring of mouse forebrain GABAergic neurogenesis. Dev Cell 2024:S1534-5807(24)00603-8. [PMID: 39481376 DOI: 10.1016/j.devcel.2024.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 06/17/2024] [Accepted: 10/03/2024] [Indexed: 11/02/2024]
Abstract
Transcription factors (TFs) bind combinatorially to cis-regulatory elements, orchestrating transcriptional programs. Although studies of chromatin state and chromosomal interactions have demonstrated dynamic neurodevelopmental cis-regulatory landscapes, parallel understanding of TF interactions lags. To elucidate combinatorial TF binding driving mouse basal ganglia development, we integrated chromatin immunoprecipitation sequencing (ChIP-seq) for twelve TFs, H3K4me3-associated enhancer-promoter interactions, chromatin and gene expression data, and functional enhancer assays. We identified sets of putative regulatory elements with shared TF binding (TF-pRE modules) that orchestrate distinct processes of GABAergic neurogenesis and suppress other cell fates. The majority of pREs were bound by one or two TFs; however, a small proportion were extensively bound. These sequences had exceptional evolutionary conservation and motif density, complex chromosomal interactions, and activity as in vivo enhancers. Our results provide insights into the combinatorial TF-pRE interactions that activate and repress expression programs during telencephalon neurogenesis and demonstrate the value of TF binding toward modeling developmental transcriptional wiring.
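The observation that most pREs were bound by one or two TFs while a few were extensively bound amounts to a tally of TFs per element. The following is a minimal sketch of such a tally, not the authors' pipeline: the input mapping, the element identifiers, and the TF labels are all hypothetical placeholders, and the step that assigns ChIP-seq peaks to elements is assumed to have happened already.

```python
from collections import Counter


def tf_count_distribution(binding):
    """Return (TFs bound per element, tally of how many elements have each count).

    `binding` maps an element identifier to the set of TFs bound there.
    """
    per_element = {element: len(tfs) for element, tfs in binding.items()}
    return per_element, Counter(per_element.values())


# Hypothetical toy input: element identifier -> set of TFs with an overlapping peak.
toy_binding = {
    "pRE_1": {"TF_A"},
    "pRE_2": {"TF_A", "TF_B"},
    "pRE_3": {"TF_C"},
    "pRE_4": {"TF_A", "TF_B", "TF_C", "TF_D", "TF_E"},
}

per_element, distribution = tf_count_distribution(toy_binding)
for n_tfs in sorted(distribution):
    print(f"{distribution[n_tfs]} element(s) bound by {n_tfs} TF(s)")
```

Sets are used so that a TF with several peaks in the same element is counted once.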
Affiliation(s)
- Rinaldo Catta-Preta
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Susan Lindtner
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Athena Ypsilanti
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Nicolas Seban
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - James D Price
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Armen Abnousi
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Linda Su-Feher
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Yurong Wang
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Karol Cichewicz
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Sally A Boerma
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Ivan Juric
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Ian R Jones
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Jennifer A Akiyama
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ming Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Yin Shen
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA; School of Natural Sciences, University of California, Merced, Merced, CA 95343, USA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA; Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Diane E Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - John L R Rubenstein
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA.
| | - Alex S Nord
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA.
2
Crossley ER, Fedorova L, Mulyar O, Freeman R, Khuder S, Fedorov A. Computational identification of ultra-conserved elements in the human genome: a hypothesis on homologous DNA pairing. NAR Genom Bioinform 2024; 6:lqae074. [PMID: 38962254 PMCID: PMC11217675 DOI: 10.1093/nargab/lqae074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 05/29/2024] [Accepted: 06/19/2024] [Indexed: 07/05/2024] Open
Abstract
Thousands of prolonged sequences of human ultra-conserved non-coding elements (UCNEs) share only one common feature: peculiarities in the unique composition of their dinucleotides. Here we investigate whether the numerous weak signals emanating from these dinucleotide arrangements can be used for computational identification of UCNEs within the human genome. For this purpose, we analyzed 4272 UCNE sequences, encompassing 1 393 448 nucleotides, alongside equally sized control samples of randomly selected human genomic sequences. Our research identified nine different features of dinucleotide arrangements that enable differentiation of UCNEs from the rest of the genome. We employed these nine features, implementing three Machine Learning techniques (Support Vector Machine, Random Forest, and Artificial Neural Networks) to classify UCNEs, achieving an accuracy rate of 82-84%, with specific conditions allowing for over 90% accuracy. Notably, the strongest feature for UCNE identification was the frequency ratio between GpC dinucleotides and the sum of GpG and CpC dinucleotides. Additionally, we investigated the entire pool of 31 046 SNPs located within UCNEs for their representation in the ClinVar database, which catalogs human SNPs with known phenotypic effects. The presence of UCNE-associated SNPs in ClinVar aligns with the expectation of a random distribution, emphasizing the enigmatic nature of UCNE phenotypic manifestation.
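The strongest feature named in the abstract, the ratio of GpC frequency to the summed GpG and CpC frequencies, can be illustrated with a short sketch. The abstract does not say how dinucleotides were counted, so the choices here are assumptions: overlapping windows, a single strand read 5' to 3', and windows containing non-ACGT symbols skipped. The function names are hypothetical.

```python
def dinucleotide_counts(seq):
    """Count overlapping dinucleotides on the given strand, skipping ambiguous bases."""
    seq = seq.upper()
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if set(pair) <= set("ACGT"):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def gpc_to_gpg_cpc_ratio(seq):
    """Return count(GC) / (count(GG) + count(CC)), or None if the denominator is zero."""
    counts = dinucleotide_counts(seq)
    denominator = counts.get("GG", 0) + counts.get("CC", 0)
    if denominator == 0:
        return None
    return counts.get("GC", 0) / denominator


# Toy sequence: GC occurs 3 times, GG once, CC once.
example_ratio = gpc_to_gpg_cpc_ratio("GCGCGGCC")
print(example_ratio)  # 1.5
```

Because all three counts come from the same sequence, the ratio of counts equals the ratio of frequencies; returning None rather than raising keeps the feature usable on short sequences that lack GpG and CpC.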
Affiliation(s)
- Emily R Crossley
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
| | | | | | | | - Sadik Khuder
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
| | - Alexei Fedorov
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
3
Cummins M, Watson C, Edwards RJ, Mattick JS. The Evolution of Ultraconserved Elements in Vertebrates. Mol Biol Evol 2024; 41:msae146. [PMID: 39058500 PMCID: PMC11276968 DOI: 10.1093/molbev/msae146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 06/29/2024] [Accepted: 07/08/2024] [Indexed: 07/18/2024] Open
Abstract
Ultraconserved elements were discovered two decades ago, arbitrarily defined as sequences that are identical over a length ≥ 200 bp in the human, mouse, and rat genomes. The definition was subsequently extended to sequences ≥ 100 bp identical in at least three of five mammalian genomes (including dog and cow), and shown to have undergone rapid expansion from ancestors in fish and strong negative selection in birds and mammals. Since then, many more genomes have become available, allowing better definition and more thorough examination of ultraconserved element distribution and evolutionary history. We developed a fast and flexible analytical pipeline for identifying ultraconserved elements in multiple genomes, dedUCE, which allows manipulation of minimum length, sequence identity, and number of species with a detectable ultraconserved element according to specified parameters. We suggest an updated definition of ultraconserved elements as sequences ≥ 100 bp and ≥97% sequence identity in ≥50% of placental mammal orders (12,813 ultraconserved elements). By mapping ultraconserved elements to ∼200 species, we find that placental ultraconserved elements appeared early in vertebrate evolution, well before land colonization, suggesting that the evolutionary pressures driving ultraconserved element selection were present in aquatic environments in the Cambrian-Devonian periods. Most (>90%) ultraconserved elements likely appeared after the divergence of gnathostomes from jawless predecessors, were largely established in sequence identity by early Sarcopterygii evolution (before the divergence of lobe-finned fishes from tetrapods), and became near fixed in the amniotes. Ultraconserved elements are mainly located in the introns of protein-coding and noncoding genes involved in neurological and skeletomuscular development, enriched in regulatory elements, and dynamically expressed throughout embryonic development.
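The updated definition proposed in the abstract (length ≥ 100 bp, ≥97% sequence identity, present in ≥50% of placental mammal orders) can be read as a three-threshold filter. The sketch below illustrates only that reading and is not the dedUCE implementation: the function name, the input layout, and the idea of summarising each order by a single best percent identity are assumptions, and the order names in the example are toy labels.

```python
def meets_uce_definition(length_bp, identity_by_order,
                         min_length=100, min_identity=97.0,
                         min_order_fraction=0.5):
    """Check a candidate element against length, identity, and order-coverage thresholds.

    `identity_by_order` maps each order considered to the best percent identity
    observed in that order, or None when no match was detected.
    """
    if length_bp < min_length or not identity_by_order:
        return False
    passing = sum(
        1 for identity in identity_by_order.values()
        if identity is not None and identity >= min_identity
    )
    return passing / len(identity_by_order) >= min_order_fraction


# Hypothetical candidate: 2 of 4 orders reach 97% identity, so it passes at the 50% cutoff.
candidate = {"order_1": 99.0, "order_2": 97.0, "order_3": 92.5, "order_4": None}
print(meets_uce_definition(150, candidate))  # True
print(meets_uce_definition(99, candidate))   # False: shorter than 100 bp
```

Exposing the three thresholds as keyword arguments mirrors the abstract's point that minimum length, identity, and species coverage are adjustable parameters.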
Affiliation(s)
- Mitchell Cummins
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
| | - Cadel Watson
- School of Engineering, UNSW Sydney, Sydney, NSW 2052, Australia
| | - Richard J Edwards
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
| | - John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
4
Singh AK, Walavalkar K, Tavernari D, Ciriello G, Notani D, Sabarinathan R. Cis-regulatory effect of HPV integration is constrained by host chromatin architecture in cervical cancers. Mol Oncol 2024; 18:1189-1208. [PMID: 38013620 PMCID: PMC11076994 DOI: 10.1002/1878-0261.13559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 11/24/2023] [Indexed: 11/29/2023] Open
Abstract
Human papillomavirus (HPV) infections are the primary drivers of cervical cancers, and often HPV DNA gets integrated into the host genome. Although the oncogenic impact of HPV-encoded genes is relatively well known, the cis-regulatory effect of integrated HPV DNA on host chromatin structure and gene regulation remains less understood. We investigated genome-wide patterns of HPV integrations and associated host gene expression changes in the context of host chromatin states and topologically associating domains (TADs). HPV integrations were significantly enriched in active chromatin regions and depleted in inactive ones. Interestingly, regardless of chromatin state, genomic regions flanking HPV integrations showed transcriptional upregulation. Nevertheless, upregulation (both local and long-range) was mostly confined to TADs with integration, but not affecting adjacent TADs. Few TADs showed recurrent integrations associated with overexpression of oncogenes within them (e.g. MYC, PVT1, TP63 and ERBB2) regardless of proximity. Hi-C and 4C-seq analyses in a cervical cancer cell line (HeLa) demonstrated chromatin looping interactions between integrated HPV and MYC/PVT1 regions (~500 kb apart), leading to allele-specific overexpression. Based on these, we propose HPV integrations can trigger multimodal oncogenic activation to promote cancer progression.
Affiliation(s)
- Anurag Kumar Singh
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru, India
| | - Kaivalya Walavalkar
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru, India
| | - Daniele Tavernari
- Department of Computational Biology, University of Lausanne (UNIL), Switzerland
- Swiss Cancer Center Leman, Lausanne, Switzerland
- Swiss Institute for Experimental Cancer Research (ISREC), EPFL, Lausanne, Switzerland
| | - Giovanni Ciriello
- Department of Computational Biology, University of Lausanne (UNIL), Switzerland
- Swiss Cancer Center Leman, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Dimple Notani
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru, India
5
Fleck K, Luria V, Garag N, Karger A, Hunter T, Marten D, Phu W, Nam KM, Sestan N, O’Donnell-Luria AH, Erceg J. Functional associations of evolutionarily recent human genes exhibit sensitivity to the 3D genome landscape and disease. bioRxiv 2024:2024.03.17.585403. [PMID: 38559085 PMCID: PMC10980080 DOI: 10.1101/2024.03.17.585403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Genome organization is intricately tied to regulating genes and associated cell fate decisions. In this study, we examine the positioning and functional significance of human genes, grouped by their evolutionary age, within the 3D organization of the genome. We reveal that genes of different evolutionary origin have distinct positioning relationships with both domains and loop anchors, and remarkably consistent relationships with boundaries across cell types. While the functional associations of each group of genes are primarily cell type-specific, such associations of conserved genes maintain greater stability across 3D genomic features and disease than recently evolved genes. Furthermore, the expression of these genes across various tissues follows an evolutionary progression, such that RNA levels increase from young genes to ancient genes. Thus, the distinct relationships of gene evolutionary age, function, and positioning within 3D genomic features contribute to tissue-specific gene regulation in development and disease.
Affiliation(s)
- Katherine Fleck
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
| | - Victor Luria
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
| | - Nitanta Garag
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
| | - Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA 02115
| | - Trevor Hunter
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
| | - Daniel Marten
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
| | - William Phu
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
| | - Kee-Myoung Nam
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06510
| | - Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
| | - Anne H. O’Donnell-Luria
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Department of Pediatrics, Harvard Medical School, Boston, MA 02115
| | - Jelena Erceg
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030
6
Buckberry S, Liu X, Poppe D, Tan JP, Sun G, Chen J, Nguyen TV, de Mendoza A, Pflueger J, Frazer T, Vargas-Landín DB, Paynter JM, Smits N, Liu N, Ouyang JF, Rossello FJ, Chy HS, Rackham OJL, Laslett AL, Breen J, Faulkner GJ, Nefzger CM, Polo JM, Lister R. Transient naive reprogramming corrects hiPS cells functionally and epigenetically. Nature 2023; 620:863-872. [PMID: 37587336 PMCID: PMC10447250 DOI: 10.1038/s41586-023-06424-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 06/04/2021] [Accepted: 07/11/2023] [Indexed: 08/18/2023]
Abstract
Cells undergo a major epigenome reconfiguration when reprogrammed to human induced pluripotent stem cells (hiPS cells). However, the epigenomes of hiPS cells and human embryonic stem (hES) cells differ significantly, which affects hiPS cell function [1-8]. These differences include epigenetic memory and aberrations that emerge during reprogramming, for which the mechanisms remain unknown. Here we characterized the persistence and emergence of these epigenetic differences by performing genome-wide DNA methylation profiling throughout primed and naive reprogramming of human somatic cells to hiPS cells. We found that reprogramming-induced epigenetic aberrations emerge midway through primed reprogramming, whereas DNA demethylation begins early in naive reprogramming. Using this knowledge, we developed a transient-naive-treatment (TNT) reprogramming strategy that emulates the embryonic epigenetic reset. We show that the epigenetic memory in hiPS cells is concentrated in cell of origin-dependent repressive chromatin marked by H3K9me3, lamin-B1 and aberrant CpH methylation. TNT reprogramming reconfigures these domains to a hES cell-like state and does not disrupt genomic imprinting. Using an isogenic system, we demonstrate that TNT reprogramming can correct the transposable element overexpression and differential gene expression seen in conventional hiPS cells, and that TNT-reprogrammed hiPS and hES cells show similar differentiation efficiencies. Moreover, TNT reprogramming enhances the differentiation of hiPS cells derived from multiple cell types. Thus, TNT reprogramming corrects epigenetic memory and aberrations, producing hiPS cells that are molecularly and functionally more similar to hES cells than conventional hiPS cells. We foresee TNT reprogramming becoming a new standard for biomedical and therapeutic applications and providing a novel system for studying epigenetic memory.
Affiliation(s)
- Sam Buckberry
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
- Telethon Kids Institute, Perth, Western Australia, Australia
- John Curtin School of Medical Research, College of Health and Medicine, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Xiaodong Liu
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
- School of Life Sciences, Westlake University, Hangzhou, China
- Research Center for Industries of the Future, Westlake University, Hangzhou, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
- Westlake Institute for Advanced Study, Hangzhou, China
| | - Daniel Poppe
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
| | - Jia Ping Tan
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
| | - Guizhi Sun
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
| | - Joseph Chen
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
| | - Trung Viet Nguyen
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
| | - Alex de Mendoza
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
- School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
| | - Jahnvi Pflueger
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
| | - Thomas Frazer
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
| | - Dulce B Vargas-Landín
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
| | - Jacob M Paynter
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
| | - Nathan Smits
- Mater Research Institute, University of Queensland, Brisbane, Queensland, Australia
| | - Ning Liu
- South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
| | - John F Ouyang
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore Medical School, Singapore, Singapore
| | - Fernando J Rossello
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
- Murdoch Children's Research Institute, Melbourne, Victoria, Australia
| | - Hun S Chy
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
- Biomedical Manufacturing, Commonwealth Scientific and Industrial Research Organisation, Melbourne, Victoria, Australia
| | - Owen J L Rackham
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore Medical School, Singapore, Singapore
- School of Biological Sciences, University of Southampton, Southampton, UK
| | - Andrew L Laslett
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia
- Biomedical Manufacturing, Commonwealth Scientific and Industrial Research Organisation, Melbourne, Victoria, Australia
| | - James Breen
- John Curtin School of Medical Research, College of Health and Medicine, Australian National University, Canberra, Australian Capital Territory, Australia
- South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
| | - Geoffrey J Faulkner
- Mater Research Institute, University of Queensland, Brisbane, Queensland, Australia
- Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
| | - Christian M Nefzger
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
| | - Jose M Polo
- Department of Anatomy and Developmental Biology, Monash University, Melbourne, Victoria, Australia.
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Melbourne, Victoria, Australia.
- Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia.
- Adelaide Centre for Epigenetics, School of Biomedicine, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, South Australia, Australia.
- The South Australian Immunogenomics Cancer Institute, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, South Australia, Australia.
| | - Ryan Lister
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Western Australia, Australia.
- ARC Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia.
7
Catta-Preta R, Lindtner S, Ypsilanti A, Price J, Abnousi A, Su-Feher L, Wang Y, Juric I, Jones IR, Akiyama JA, Hu M, Shen Y, Visel A, Pennacchio LA, Dickel D, Rubenstein JLR, Nord AS. Combinatorial transcription factor binding encodes cis-regulatory wiring of forebrain GABAergic neurogenesis. bioRxiv 2023:2023.06.28.546894. [PMID: 37425940 PMCID: PMC10327028 DOI: 10.1101/2023.06.28.546894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Transcription factors (TFs) bind combinatorially to genomic cis-regulatory elements (cREs), orchestrating transcription programs. While studies of chromatin state and chromosomal interactions have revealed dynamic neurodevelopmental cRE landscapes, parallel understanding of the underlying TF binding lags. To elucidate the combinatorial TF-cRE interactions driving mouse basal ganglia development, we integrated ChIP-seq for twelve TFs, H3K4me3-associated enhancer-promoter interactions, chromatin and transcriptional state, and transgenic enhancer assays. We identified TF-cREs modules with distinct chromatin features and enhancer activity that have complementary roles driving GABAergic neurogenesis and suppressing other developmental fates. While the majority of distal cREs were bound by one or two TFs, a small proportion were extensively bound, and these enhancers also exhibited exceptional evolutionary conservation, motif density, and complex chromosomal interactions. Our results provide new insights into how modules of combinatorial TF-cRE interactions activate and repress developmental expression programs and demonstrate the value of TF binding data in modeling gene regulatory wiring.
Affiliation(s)
- Rinaldo Catta-Preta
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
- Current Address: Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Susan Lindtner
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Athena Ypsilanti
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - James Price
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Armen Abnousi
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
- Current Address: NovaSignal, Los Angeles, CA 90064, USA
| | - Linda Su-Feher
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Yurong Wang
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Ivan Juric
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Ian R Jones
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurology, University of California, San Francisco, CA 94143, USA
| | - Jennifer A Akiyama
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ming Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Yin Shen
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurology, University of California, San Francisco, CA 94143, USA
| | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- School of Natural Sciences, University of California, Merced, Merced, CA 95343, USA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Diane Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - John L R Rubenstein
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Alex S Nord
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
|
8
|
Smith GD, Ching WH, Cornejo-Páramo P, Wong ES. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol 2023; 24:116. [PMID: 37173718 PMCID: PMC10176946 DOI: 10.1186/s13059-023-02955-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/18/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open
Abstract
Enhancers are genomic DNA elements controlling spatiotemporal gene expression. Their flexible organization and functional redundancies make deciphering their sequence-function relationships challenging. This article provides an overview of the current understanding of enhancer organization and evolution, with an emphasis on factors that influence these relationships. Technological advancements, particularly in machine learning and synthetic biology, are discussed in light of how they provide new ways to understand this complexity. Exciting opportunities lie ahead as we continue to unravel the intricacies of enhancer function.
Affiliation(s)
- Gabrielle D Smith
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
| | - Wan Hern Ching
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
| | - Paola Cornejo-Páramo
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
| | - Emily S Wong
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia.
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia.
|
9
|
Fedorova L, Mulyar OA, Lim J, Fedorov A. Nucleotide Composition of Ultra-Conserved Elements Shows Excess of GpC and Depletion of GG and CC Dinucleotides. Genes (Basel) 2022; 13:2053. [PMID: 36360290 PMCID: PMC9690913 DOI: 10.3390/genes13112053] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2022] [Revised: 10/25/2022] [Accepted: 11/03/2022] [Indexed: 08/27/2023] Open
Abstract
The public UCNEbase database, comprising 4273 human ultra-conserved noncoding elements (UCNEs), was thoroughly investigated with the aim of finding any nucleotide signals or motifs that have kept these DNA sequences practically unchanged over three hundred million years of evolution. Each UCNE comprises over 200 nucleotides and has at least 95% identity between humans and chickens. A total of 31,046 SNPs were found within the UCNE database. We demonstrated that every human has over 300 mutations within 4273 UCNEs. No association of UCNEs with non-coding RNAs, nor preference for a particular meiotic recombination rate within them, was found. No sequence motifs associated with UCNEs or their flanking regions were found. However, we demonstrated that UCNEs have strong nucleotide and dinucleotide sequence abnormalities compared to genome averages. Specifically, UCNEs are depleted for CC and GG dinucleotides, while GC dinucleotides are in excess of 28%. Importantly, GC dinucleotides have extraordinarily strong stacking free-energy inside the DNA helix and unique resistance to dissociation. Based on the adjacent nucleotide stacking abnormalities within UCNEs, we conjecture that peculiarities in dinucleotide distribution within UCNEs may create unique 3D conformation and specificity to bind proteins. We also discuss the strange dynamics of multiple SNPs inside UCNEs and reasons why these sequences are extraordinarily conserved.
Affiliation(s)
| | | | - Jan Lim
- CRI Genetics LLC, Santa Monica, CA 90404, USA
| | - Alexei Fedorov
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
|
10
|
Hayeck TJ, Stong N, Baugh E, Dhindsa R, Turner TN, Malakar A, Mosbruger TL, Shaw GTW, Duan Y, Ionita-Laza I, Goldstein D, Allen AS. Ancestry adjustment improves genome-wide estimates of regional intolerance. Genetics 2022; 221:iyac050. [PMID: 35385101 PMCID: PMC9157129 DOI: 10.1093/genetics/iyac050] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/16/2022] [Accepted: 02/24/2022] [Indexed: 11/12/2022] Open
Abstract
Genomic regions subject to purifying selection are more likely to carry disease-causing mutations than regions not under selection. Cross-species conservation is often used to identify such regions, but it has limited resolution to detect selection on short evolutionary timescales, such as that occurring in only one species. In contrast, genetic intolerance looks for depletion of variation relative to expectation within a species, allowing species-specific features to be identified. When estimating the intolerance of noncoding sequence, methods strongly leverage variant frequency distributions. As the expected distributions depend on ancestry, if not properly controlled for, ancestral population source may obfuscate signals of selection. We demonstrate that properly incorporating ancestry in intolerance estimation greatly improved variant classification. We provide a genome-wide intolerance map that is conditional on ancestry and likely to be particularly valuable for variant prioritization.
Affiliation(s)
- Tristan J Hayeck
- Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Nicholas Stong
- Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Evan Baugh
- Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Ryan Dhindsa
- Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Tychele N Turner
- Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Ayan Malakar
- Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Timothy L Mosbruger
- Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Grace Tzun-Wen Shaw
- Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Yuncheng Duan
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA
| | | | - David Goldstein
- Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Andrew S Allen
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA
|
11
|
Snetkova V, Pennacchio LA, Visel A, Dickel DE. Perfect and imperfect views of ultraconserved sequences. Nat Rev Genet 2022; 23:182-194. [PMID: 34764456 PMCID: PMC8858888 DOI: 10.1038/s41576-021-00424-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Accepted: 09/30/2021] [Indexed: 12/12/2022]
Abstract
Across the human genome, there are nearly 500 'ultraconserved' elements: regions of at least 200 contiguous nucleotides that are perfectly conserved in both the mouse and rat genomes. Remarkably, the majority of these sequences are non-coding, and many can function as enhancers that activate tissue-specific gene expression during embryonic development. From their first description more than 15 years ago, their extreme conservation has both fascinated and perplexed researchers in genomics and evolutionary biology. The intrigue around ultraconserved elements only grew with the observation that they are dispensable for viability. Here, we review recent progress towards understanding the general importance and the specific functions of ultraconserved sequences in mammalian development and human disease and discuss possible explanations for their extreme conservation.
Affiliation(s)
- Valentina Snetkova
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Molecular Biology, Genentech, South San Francisco, CA, USA
| | - Len A Pennacchio
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Comparative Biochemistry Program, University of California, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
| | - Axel Visel
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
- School of Natural Sciences, University of California, Merced, Merced, CA, USA.
| | - Diane E Dickel
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
|
12
|
Antoine-Lorquin A, Arensburger P, Arnaoty A, Asgari S, Batailler M, Beauclair L, Belleannée C, Buisine N, Coustham V, Guyetant S, Helou L, Lecomte T, Pitard B, Stévant I, Bigot Y. Two repeated motifs enriched within some enhancers and origins of replication are bound by SETMAR isoforms in human colon cells. Genomics 2021; 113:1589-1604. [PMID: 33812898 DOI: 10.1016/j.ygeno.2021.03.032] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/04/2020] [Revised: 03/25/2021] [Accepted: 03/30/2021] [Indexed: 11/15/2022]
Abstract
Setmar is a gene specific to simian genomes. The function(s) of its isoforms are poorly understood and their existence in healthy tissues remains to be validated. Here we profiled SETMAR expression and its genome-wide binding landscape in colon tissue. We found isoforms V3 and V6 in healthy and tumour colon tissues as well as in cell lines. In two colorectal cell lines SETMAR binds to several thousand Hsmar1 and MADE1 terminal ends, transposons mostly located in non-genic regions of active chromatin including in enhancers. It also binds to a 12-bp motif similar to an inner motif in Hsmar1 and MADE1 terminal ends. This motif is interspersed throughout the genome and is enriched in GC-rich regions as well as in CpG islands that contain constitutive replication origins. It is also found in enhancers other than those associated with Hsmar1 and MADE1. The roles of SETMAR in the expression of genes, in DNA replication and in DNA repair are discussed.
Affiliation(s)
| | - Peter Arensburger
- Biological Sciences Department, California State Polytechnic University, Pomona, CA 91768, United States
| | - Ahmed Arnaoty
- EA GICC, 7501, CHRU de Tours, 37044 TOURS, Cedex 09, France
| | - Sassan Asgari
- School of Biological Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Martine Batailler
- PRC, UMR INRA 0085, CNRS 7247, Centre INRA Val de Loire, 37380 Nouzilly, France
| | - Linda Beauclair
- PRC, UMR INRA 0085, CNRS 7247, Centre INRA Val de Loire, 37380 Nouzilly, France
| | | | - Nicolas Buisine
- UMR CNRS 7221, Muséum National d'Histoire Naturelle, 75005 Paris, France
| | | | - Serge Guyetant
- Tumorothèque du CHRU de Tours, 37044 Tours, Cedex, France
| | - Laura Helou
- PRC, UMR INRA 0085, CNRS 7247, Centre INRA Val de Loire, 37380 Nouzilly, France
| | | | - Bruno Pitard
- Université de Nantes, CNRS ERL6001, Inserm 1232, CRCINA, F-44000 Nantes, France
| | - Isabelle Stévant
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon, 1, 46 allée d'Italie, 69364 Lyon, France
| | - Yves Bigot
- PRC, UMR INRA 0085, CNRS 7247, Centre INRA Val de Loire, 37380 Nouzilly, France.
|
13
|
Snetkova V, Ypsilanti AR, Akiyama JA, Mannion BJ, Plajzer-Frick I, Novak CS, Harrington AN, Pham QT, Kato M, Zhu Y, Godoy J, Meky E, Hunter RD, Shi M, Kvon EZ, Afzal V, Tran S, Rubenstein JLR, Visel A, Pennacchio LA, Dickel DE. Ultraconserved enhancer function does not require perfect sequence conservation. Nat Genet 2021; 53:521-528. [PMID: 33782603 PMCID: PMC8038972 DOI: 10.1038/s41588-021-00812-3] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 05/14/2020] [Accepted: 02/04/2021] [Indexed: 01/09/2023]
Abstract
Ultraconserved enhancer sequences show perfect conservation between human and rodent genomes, suggesting that their functions are highly sensitive to mutation. However, current models of enhancer function do not sufficiently explain this extreme evolutionary constraint. We subjected 23 ultraconserved enhancers to different levels of mutagenesis, collectively introducing 1,547 mutations, and examined their activities in transgenic mouse reporter assays. Overall, we find that the regulatory properties of ultraconserved enhancers are robust to mutation. Upon mutagenesis, nearly all (19/23, 83%) still functioned as enhancers at one developmental stage, as did most of those tested again later in development (5/9, 56%). Replacement of endogenous enhancers with mutated alleles in mice corroborated results of transgenic assays, including the functional resilience of ultraconserved enhancers to mutation. Our findings show that the currently known activities of ultraconserved enhancers do not necessarily require the perfect conservation observed in evolution and suggest that additional regulatory or other functions contribute to their sequence constraint.
Affiliation(s)
- Valentina Snetkova
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Athena R Ypsilanti
- Department of Psychiatry, Neuroscience Program, UCSF Weill Institute for Neurosciences, and the Nina Ireland Laboratory of Developmental Neurobiology, University of California, San Francisco, San Francisco, CA, USA
| | - Jennifer A Akiyama
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Brandon J Mannion
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA, USA
| | - Ingrid Plajzer-Frick
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Catherine S Novak
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Anne N Harrington
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Quan T Pham
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Momoe Kato
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Yiwen Zhu
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Janeth Godoy
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Eman Meky
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Riana D Hunter
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Marie Shi
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Evgeny Z Kvon
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Developmental & Cell Biology, Department of Ecology & Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
| | - Veena Afzal
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Stella Tran
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - John L R Rubenstein
- Department of Psychiatry, Neuroscience Program, UCSF Weill Institute for Neurosciences, and the Nina Ireland Laboratory of Developmental Neurobiology, University of California, San Francisco, San Francisco, CA, USA
| | - Axel Visel
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
- School of Natural Sciences, University of California, Merced, Merced, CA, USA.
| | - Len A Pennacchio
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
| | - Diane E Dickel
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
|
14
|
Van Dam MH, Henderson JB, Esposito L, Trautwein M. Genomic Characterization and Curation of UCEs Improves Species Tree Reconstruction. Syst Biol 2020; 70:307-321. [PMID: 32750133 PMCID: PMC7875437 DOI: 10.1093/sysbio/syaa063] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/16/2019] [Revised: 07/26/2020] [Accepted: 07/29/2020] [Indexed: 12/12/2022] Open
Abstract
Ultraconserved genomic elements (UCEs) are generally treated as independent loci in phylogenetic analyses. The identification pipeline for UCE probes does not require prior knowledge of genetic identity, only selecting loci that are highly conserved, single copy, without repeats, and of a particular length. Here, we characterized UCEs from 11 phylogenomic studies across the animal tree of life, from birds to marine invertebrates. We found that within vertebrate lineages, UCEs are mostly intronic and intergenic, while in invertebrates, the majority are in exons. We then curated four different sets of UCE markers by genomic category from five different studies including: birds, mammals, fish, Hymenoptera (ants, wasps, and bees), and Coleoptera (beetles). Of genes captured by UCEs, we find that many are represented by two or more UCEs, corresponding to nonoverlapping segments of a single gene. We considered these UCEs to be nonindependent, merged all UCEs that belonged to a particular gene, constructed gene and species trees, and then evaluated the subsequent effect of merging cogenic UCEs on gene and species tree reconstruction. Average bootstrap support for merged UCE gene trees was significantly improved across all data sets apparently driven by the increase in loci length. Additionally, we conducted simulations and found that gene trees generated from merged UCEs were more accurate than those generated by unmerged UCEs. As loci length improves gene tree accuracy, this modest degree of UCE characterization and curation impacts downstream analyses and demonstrates the advantages of incorporating basic genomic characterizations into phylogenomic analyses. [Anchored hybrid enrichment; ants; ASTRAL; bait capture; carangimorph; Coleoptera; conserved nonexonic elements; exon capture; gene tree; Hymenoptera; mammal; phylogenomic markers; songbird; species tree; ultraconserved elements; weevils.]
Affiliation(s)
- Matthew H Van Dam
- Entomology Department, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
- Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
| | - James B Henderson
- Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
| | - Lauren Esposito
- Entomology Department, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
- Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
| | - Michelle Trautwein
- Entomology Department, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
- Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
|
15
|
Habic A, Mattick JS, Calin GA, Krese R, Konc J, Kunej T. Genetic Variations of Ultraconserved Elements in the Human Genome. OMICS: A Journal of Integrative Biology 2020; 23:549-559. [PMID: 31689173 DOI: 10.1089/omi.2019.0156] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022]
Abstract
Ultraconserved elements (UCEs) are among the most popular DNA markers for phylogenomic analysis. In at least three of five placental mammalian genomes (human, dog, cow, mouse, and rat), 2189 UCEs of at least 200 bp in length that are identical have been identified. Most of these regions have not yet been functionally annotated, and their associations with diseases remain largely unknown. This is an important knowledge gap in human genomics with regard to UCE roles in physiologically critical functions, and by extension, their relevance for shared susceptibilities to common complex diseases across several mammalian organisms in the event of their polymorphic variations. In the present study, we remapped the genomic locations of these UCEs to the latest human genome assembly, and examined them for documented polymorphisms in sequenced human genomes. We identified 29,983 polymorphisms within analyzed UCEs, but revealed that a vast majority exhibit very low minor allele frequencies. Notably, only 112 of the identified polymorphisms are associated with a phenotype in the Ensembl genome browser. Through literature analyses, we confirmed associations of 37 (i.e., out of the 112) polymorphisms within 23 UCEs with 25 diseases and phenotypic traits, including muscular dystrophies, eye diseases, and cancers (e.g., familial adenomatous polyposis). Most reports of UCE polymorphism-disease associations appeared not to be cognizant that their candidate polymorphisms were actually within UCEs. The present study offers strategic directions and knowledge gaps for future computational and experimental work so as to better understand the thus far intriguing and puzzling role(s) of UCEs in mammalian genomes.
Affiliation(s)
- Anamarija Habic
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Domzale, Slovenia
| | - John S Mattick
- School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia
- Green Templeton College, University of Oxford, Oxford, United Kingdom
| | - George Adrian Calin
- Department of Experimental Therapeutics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas
- The Center for RNA Interference and Noncoding RNAs, The University of Texas M.D. Anderson Cancer Center, Houston, Texas
| | - Rok Krese
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Domzale, Slovenia
| | - Janez Konc
- National Institute of Chemistry, Ljubljana, Slovenia
| | - Tanja Kunej
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Domzale, Slovenia
|
16
|
Erceg J, AlHaj Abed J, Goloborodko A, Lajoie BR, Fudenberg G, Abdennur N, Imakaev M, McCole RB, Nguyen SC, Saylor W, Joyce EF, Senaratne TN, Hannan MA, Nir G, Dekker J, Mirny LA, Wu CT. The genome-wide multi-layered architecture of chromosome pairing in early Drosophila embryos. Nat Commun 2019; 10:4486. [PMID: 31582744 PMCID: PMC6776651 DOI: 10.1038/s41467-019-12211-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 01/25/2019] [Accepted: 08/27/2019] [Indexed: 12/13/2022] Open
Abstract
Genome organization involves cis and trans chromosomal interactions, both implicated in gene regulation, development, and disease. Here, we focus on trans interactions in Drosophila, where homologous chromosomes are paired in somatic cells from embryogenesis through adulthood. We first address long-standing questions regarding the structure of embryonic homolog pairing and, to this end, develop a haplotype-resolved Hi-C approach to minimize homolog misassignment and thus robustly distinguish trans-homolog from cis contacts. This computational approach, which we call Ohm, reveals pairing to be surprisingly structured genome-wide, with trans-homolog domains, compartments, and interaction peaks, many coinciding with analogous cis features. We also find a significant genome-wide correlation between pairing, transcription during zygotic genome activation, and binding of the pioneer factor Zelda. Our findings reveal a complex, highly structured organization underlying homolog pairing, first discovered a century ago in Drosophila. Finally, we demonstrate the versatility of our haplotype-resolved approach by applying it to mammalian embryos.
Affiliation(s)
- Jelena Erceg
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Jumana AlHaj Abed
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Anton Goloborodko
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA
| | - Bryan R Lajoie
- Howard Hughes Medical Institute and Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, 01605-0103, USA
- Illumina, San Diego, CA, USA
| | - Geoffrey Fudenberg
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA
- Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, 94158, USA
| | - Nezar Abdennur
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA
| | - Maxim Imakaev
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA
| | - Ruth B McCole
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Son C Nguyen
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104-6145, USA
| | - Wren Saylor
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Eric F Joyce
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104-6145, USA
| | - T Niroshini Senaratne
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
| | - Mohammed A Hannan
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Guy Nir
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Job Dekker
- Howard Hughes Medical Institute and Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, 01605-0103, USA
| | - Leonid A Mirny
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA.
- Department of Physics, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA.
| | - C-Ting Wu
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA.
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, 02115, USA.
|
17
|
Tan G, Polychronopoulos D, Lenhard B. CNEr: A toolkit for exploring extreme noncoding conservation. PLoS Comput Biol 2019; 15:e1006940. [PMID: 31449516 PMCID: PMC6730951 DOI: 10.1371/journal.pcbi.1006940] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/06/2019] [Revised: 09/06/2019] [Accepted: 06/25/2019] [Indexed: 12/18/2022] Open
Abstract
Conserved Noncoding Elements (CNEs) are elements exhibiting extreme noncoding conservation in Metazoan genomes. They cluster around developmental genes and act as long-range enhancers, yet nothing that we know about their function explains the observed conservation levels. Clusters of CNEs coincide with topologically associating domains (TADs), indicating ancient origins and stability of TAD locations. This has suggested further hypotheses about the still elusive origin of CNEs, and has provided a comparative genomics-based method of estimating the position of TADs around developmentally regulated genes in genomes where chromatin conformation capture data is missing. To enable researchers in gene regulation and chromatin biology to start deciphering this phenomenon, we developed CNEr, an R/Bioconductor toolkit for large-scale identification of CNEs and for studying their genomic properties. We apply CNEr to two novel genome comparisons (fruit fly vs tsetse fly, and two sea urchin genomes) and report novel insights gained from their analysis. We also show how to reveal interesting characteristics of CNEs by coupling CNEr with existing Bioconductor packages. CNEr is available at Bioconductor (https://bioconductor.org/packages/CNEr/) and maintained at GitHub (https://github.com/ge11232002/CNEr).
Affiliation(s)
- Ge Tan
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, London, United Kingdom
- Dimitris Polychronopoulos
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, London, United Kingdom
- Boris Lenhard
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, London, United Kingdom
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
18
Perenthaler E, Yousefi S, Niggl E, Barakat TS. Beyond the Exome: The Non-coding Genome and Enhancers in Neurodevelopmental Disorders and Malformations of Cortical Development. Front Cell Neurosci 2019; 13:352. [PMID: 31417368 PMCID: PMC6685065 DOI: 10.3389/fncel.2019.00352] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 05/31/2019] [Accepted: 07/16/2019] [Indexed: 12/22/2022]
Abstract
The development of the human cerebral cortex is a complex and dynamic process, in which neural stem cell proliferation, neuronal migration, and post-migratory neuronal organization need to occur in a well-organized fashion. Alterations at any of these crucial stages can result in malformations of cortical development (MCDs), a group of genetically heterogeneous neurodevelopmental disorders that present with developmental delay, intellectual disability and epilepsy. Recent progress in genetic technologies, such as next generation sequencing, most often focusing on all protein-coding exons (e.g., whole exome sequencing), allowed the discovery of more than 100 genes associated with various types of MCDs. Although this has considerably increased the diagnostic yield, most MCD cases remain unexplained. As whole exome sequencing investigates only a minor part of the human genome (1-2%), it is likely that patients in whom no disease-causing mutation has been identified could harbor mutations in genomic regions beyond the exome. Even though functional annotation of non-coding regions is still lagging behind that of protein-coding genes, tremendous progress has been made in the field of gene regulation. One group of non-coding regulatory regions are enhancers, which can be distantly located upstream or downstream of genes and which can mediate temporal and tissue-specific transcriptional control via long-distance interactions with promoter regions. Although some examples exist in the literature that link alterations of enhancers to genetic disorders, a widespread appreciation of the putative roles of these sequences in MCDs is still lacking. Here, we summarize the current state of knowledge on cis-regulatory regions and discuss novel technologies such as massively parallel reporter assay systems, CRISPR-Cas9-based screens and computational approaches that help to further elucidate the emerging role of the non-coding genome in disease. Moreover, we discuss existing literature on mutations or copy number alterations of regulatory regions involved in brain development. We foresee that the future implementation of the knowledge obtained through ongoing gene regulation studies will benefit patients and will provide an explanation for part of the missing heritability of MCDs and other genetic disorders.
Affiliation(s)
- Tahsin Stefan Barakat
- Department of Clinical Genetics, Erasmus MC – University Medical Center, Rotterdam, Netherlands
19
Hedin M, Derkarabetian S, Alfaro A, Ramírez MJ, Bond JE. Phylogenomic analysis and revised classification of atypoid mygalomorph spiders (Araneae, Mygalomorphae), with notes on arachnid ultraconserved element loci. PeerJ 2019; 7:e6864. [PMID: 31110925 PMCID: PMC6501763 DOI: 10.7717/peerj.6864] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 02/01/2019] [Accepted: 03/28/2019] [Indexed: 12/18/2022]
Abstract
The atypoid mygalomorphs include spiders from three described families that build a diverse array of entrance web constructs, including funnel-and-sheet webs, purse webs, trapdoors, turrets and silken collars. Molecular phylogenetic analyses have generally supported the monophyly of Atypoidea, but prior studies have not sampled all relevant taxa. Here we generated a dataset of ultraconserved element loci for all described atypoid genera, including taxa (Mecicobothrium and Hexurella) key to understanding familial monophyly, divergence times, and patterns of entrance web evolution. We show that the conserved regions of the arachnid UCE probe set target exons, such that it should be possible to combine UCE and transcriptome datasets in arachnids. We also show that different UCE probes sometimes target the same protein, and under the matching parameters used here show that UCE alignments sometimes include non-orthologs. Using multiple curated phylogenomic matrices we recover a monophyletic Atypoidea, and reveal that the family Mecicobothriidae comprises four separate and divergent lineages. Fossil-calibrated divergence time analyses suggest ancient Triassic (or older) origins for several relictual atypoid lineages, with late Cretaceous/early Tertiary divergences within some genera indicating a high potential for cryptic species diversity. The ancestral entrance web construct for atypoids, and all mygalomorphs, is reconstructed as a funnel-and-sheet web.
Affiliation(s)
- Marshal Hedin
- Department of Biology, San Diego State University, San Diego, CA, United States of America
- Shahan Derkarabetian
- Department of Biology, San Diego State University, San Diego, CA, United States of America
- Department of Biology, University of California, Riverside, Riverside, CA, United States of America
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States of America
- Adan Alfaro
- Department of Biology, San Diego State University, San Diego, CA, United States of America
- Martín J. Ramírez
- Division of Arachnology, Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
- Jason E. Bond
- Department of Entomology and Nematology, University of California, Davis, CA, United States of America