1
Crossley ER, Fedorova L, Mulyar O, Freeman R, Khuder S, Fedorov A. Computational identification of ultra-conserved elements in the human genome: a hypothesis on homologous DNA pairing. NAR Genom Bioinform 2024; 6:lqae074. [PMID: 38962254 PMCID: PMC11217675 DOI: 10.1093/nargab/lqae074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 05/29/2024] [Accepted: 06/19/2024] [Indexed: 07/05/2024] Open
Abstract
Thousands of prolonged sequences of human ultra-conserved non-coding elements (UCNEs) share only one common feature: peculiarities in the unique composition of their dinucleotides. Here we investigate whether the numerous weak signals emanating from these dinucleotide arrangements can be used for computational identification of UCNEs within the human genome. For this purpose, we analyzed 4272 UCNE sequences, encompassing 1 393 448 nucleotides, alongside equally sized control samples of randomly selected human genomic sequences. Our research identified nine different features of dinucleotide arrangements that enable differentiation of UCNEs from the rest of the genome. We employed these nine features, implementing three Machine Learning techniques - Support Vector Machine, Random Forest, and Artificial Neural Networks - to classify UCNEs, achieving an accuracy rate of 82-84%, with specific conditions allowing for over 90% accuracy. Notably, the strongest feature for UCNE identification was the frequency ratio between GpC dinucleotides and the sum of GpG and CpC dinucleotides. Additionally, we investigated the entire pool of 31 046 SNPs located within UCNEs for their representation in the ClinVar database, which catalogs human SNPs with known phenotypic effects. The presence of UCNE-associated SNPs in ClinVar aligns with the expectation of a random distribution, emphasizing the enigmatic nature of UCNE phenotypic manifestation.
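The strongest feature named in this abstract, the frequency ratio of GpC to the sum of GpG and CpC dinucleotides, can be illustrated with a minimal Python sketch. The function names, the single-strand overlapping count, and the handling of non-ACGT bases are assumptions made for illustration; the abstract does not state how the authors computed the ratio.

```python
from collections import Counter


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides on one strand.

    Assumption: pairs containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if set(pair) <= set("ACGT"):
            counts[pair] += 1
    return counts


def gc_to_gg_cc_ratio(seq: str) -> float:
    """Return count(GpC) / (count(GpG) + count(CpC)), or NaN if the denominator is zero."""
    counts = dinucleotide_counts(seq)
    denominator = counts["GG"] + counts["CC"]
    if denominator == 0:
        return float("nan")
    return counts["GC"] / denominator
```

For example, `gc_to_gg_cc_ratio("GCGCGGCC")` returns 1.5, because the sequence contains three GpC, one GpG and one CpC. Computing this value per sequence for UCNEs and for length-matched control sequences would give one column of a feature table of the kind the abstract describes.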
Affiliation(s)
- Emily R Crossley
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Sadik Khuder
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
- Alexei Fedorov
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
2
Castillo H, Hanna P, Sachs LM, Buisine N, Godoy F, Gilbert C, Aguilera F, Muñoz D, Boisvert C, Debiais-Thibaud M, Wan J, Spicuglia S, Marcellini S. Xenopus tropicalis osteoblast-specific open chromatin regions reveal promoters and enhancers involved in human skeletal phenotypes and shed light on early vertebrate evolution. Cells Dev 2024:203924. [PMID: 38692409 DOI: 10.1016/j.cdev.2024.203924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 04/18/2024] [Accepted: 04/26/2024] [Indexed: 05/03/2024]
Abstract
While understanding the genetic underpinnings of osteogenesis has far-reaching implications for skeletal diseases and evolution, a comprehensive characterization of the osteoblastic regulatory landscape in non-mammalian vertebrates is still lacking. Here, we compared the ATAC-Seq profile of Xenopus tropicalis (Xt) osteoblasts to a variety of non mineralizing control tissues, and identified osteoblast-specific nucleosome free regions (NFRs) at 527 promoters and 6747 distal regions. Sequence analyses, Gene Ontology, RNA-Seq and ChIP-Seq against four key histone marks confirmed that the distal regions correspond to bona fide osteogenic transcriptional enhancers exhibiting a shared regulatory logic with mammals. We report 425 regulatory regions conserved with human and globally associated to skeletogenic genes. Of these, 35 regions have been shown to impact human skeletal phenotypes by GWAS, including one trps1 enhancer and the runx2 promoter, two genes which are respectively involved in trichorhinophalangeal syndrome type I and cleidocranial dysplasia. Intriguingly, 60 osteoblastic NFRs also align to the genome of the elephant shark, a species lacking osteoblasts and bone tissue. To tackle this paradox, we chose to focus on dlx5 because its conserved promoter, known to integrate regulatory inputs during mammalian osteogenesis, harbours an osteoblast-specific NFR in both frog and human. Hence, we show that dlx5 is expressed in Xt and elephant shark odontoblasts, supporting a common cellular and genetic origin of bone and dentine. Taken together, our work (i) unravels the Xt osteogenic regulatory landscape, (ii) illustrates how cross-species comparisons harvest data relevant to human biology and (iii) reveals that a set of genes including bnc2, dlx5, ebf3, mir199a, nfia, runx2 and zfhx4 drove the development of a primitive form of mineralized skeletal tissue deep in the vertebrate lineage.
Affiliation(s)
- Héctor Castillo
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile.
- Patricia Hanna
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
- Laurent M Sachs
- UMR7221, Physiologie Moléculaire et Adaptation, CNRS, MNHN, Paris Cedex 05, France
- Nicolas Buisine
- UMR7221, Physiologie Moléculaire et Adaptation, CNRS, MNHN, Paris Cedex 05, France
- Francisco Godoy
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
- Clément Gilbert
- Université Paris-Saclay, CNRS, IRD, UMR Évolution, Génomes, Comportement et Écologie, 12 route 128, 91190 Gif-sur-Yvette, France
- Felipe Aguilera
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
- David Muñoz
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
- Catherine Boisvert
- School of Molecular and Life Sciences, Curtin University, Perth, WA, Australia
- Mélanie Debiais-Thibaud
- Institut des Sciences de l'Evolution de Montpellier, ISEM, Univ Montpellier, CNRS, IRD, Montpellier, France
- Jing Wan
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France; Equipe Labelisée LIGUE contre le Cancer, Marseille, France
- Salvatore Spicuglia
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France; Equipe Labelisée LIGUE contre le Cancer, Marseille, France
- Sylvain Marcellini
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile.
3
Sakamoto F, Kanamori S, Díaz LM, Cádiz A, Ishii Y, Yamaguchi K, Shigenobu S, Nakayama T, Makino T, Kawata M. Detection of evolutionary conserved and accelerated genomic regions related to adaptation to thermal niches in Anolis lizards. Ecol Evol 2024; 14:e11117. [PMID: 38455144 PMCID: PMC10920033 DOI: 10.1002/ece3.11117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 02/18/2024] [Accepted: 02/22/2024] [Indexed: 03/09/2024] Open
Abstract
Understanding the genetic basis for adapting to thermal environments is important due to serious effects of global warming on ectothermic species. Various genes associated with thermal adaptation in lizards have been identified mainly focusing on changes in gene expression or the detection of positively selected genes using coding regions. Only a few comprehensive genome-wide analyses have included noncoding regions. This study aimed to identify evolutionarily conserved and accelerated genomic regions using whole genomes of eight Anolis lizard species that have repeatedly adapted to similar thermal environments in multiple lineages. Evolutionarily conserved genomic regions were extracted as regions with overall sequence conservation (regions with fewer base substitutions) across all lineages compared with the neutral model. Genomic regions that underwent accelerated evolution in the lineage of interest were identified as those with more base substitutions in the target branch than in the entire background branch. Conserved elements across all branches were relatively abundant in "intergenic" genomic regions among noncoding regions. Accelerated regions (ARs) of each lineage contained a significantly greater proportion of noncoding RNA genes than the entire multiple alignment. Common genes containing ARs within 5 kb of their vicinity in lineages with similar thermal habitats were identified. Many genes associated with circadian rhythms and behavior were found in hot-open and cool-shaded habitat lineages. These genes might play a role in contributing to thermal adaptation and assist future studies examining the function of genes involved in thermal adaptation via genome editing.
Affiliation(s)
- Fuku Sakamoto
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
- Luis M. Díaz
- National Museum of Natural History of Cuba, Havana, Cuba
- Antonio Cádiz
- Faculty of Biology, University of Havana, Havana, Cuba
- Present address: Department of Biology, University of Miami, Coral Gables, Florida, USA
- Yuu Ishii
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
- Shuji Shigenobu
- Trans-Omics Facility, National Institute for Basic Biology, Okazaki, Japan
- Department of Basic Biology, School of Life Science, The Graduate University for Advanced Studies, SOKENDAI, Okazaki, Japan
- Takuro Nakayama
- Division of Life Sciences, Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
- Takashi Makino
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
- Masakado Kawata
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
4
Lopez Soriano V, Dueñas Rey A, Mukherjee R, Coppieters F, Bauwens M, Willaert A, De Baere E. Multi-omics analysis in human retina uncovers ultraconserved cis-regulatory elements at rare eye disease loci. Nat Commun 2024; 15:1600. [PMID: 38383453 PMCID: PMC10881467 DOI: 10.1038/s41467-024-45381-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Accepted: 01/19/2024] [Indexed: 02/23/2024] Open
Abstract
Cross-species genome comparisons have revealed a substantial number of ultraconserved non-coding elements (UCNEs). Several of these elements have proved to be essential tissue- and cell type-specific cis-regulators of developmental gene expression. Here, we characterize a set of UCNEs as candidate CREs (cCREs) during retinal development and evaluate the contribution of their genomic variation to rare eye diseases, for which pathogenic non-coding variants are emerging. Integration of bulk and single-cell retinal multi-omics data reveals 594 genes under potential cis-regulatory control of UCNEs, of which 45 are implicated in rare eye disease. Mining of candidate cis-regulatory UCNEs in WGS data derived from the rare eye disease cohort of Genomics England reveals 178 ultrarare variants within 84 UCNEs associated with 29 disease genes. Overall, we provide a comprehensive annotation of ultraconserved non-coding regions acting as cCREs during retinal development which can be targets of non-coding variation underlying rare eye diseases.
Affiliation(s)
- Victor Lopez Soriano
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Alfredo Dueñas Rey
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Frauke Coppieters
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Department of Pharmaceutics, Ghent University, Ghent, Belgium
- Miriam Bauwens
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Andy Willaert
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Elfride De Baere
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium.
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium.
5
Liu A, Wang N, Xie G, Li Y, Yan X, Li X, Zhu Z, Li Z, Yang J, Meng F, Dou M, Chen W, Ma N, Jiang Y, Gao Y, Wang Y. GC-biased gene conversion drives accelerated evolution of ultraconserved elements in mammalian and avian genomes. Genome Res 2023; 33:1673-1689. [PMID: 37884342 PMCID: PMC10691551 DOI: 10.1101/gr.277784.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 08/23/2023] [Indexed: 10/28/2023]
Abstract
Ultraconserved elements (UCEs) are the most conserved regions among the genomes of evolutionarily distant species and are thought to play critical biological functions. However, some UCEs rapidly evolved in specific lineages, and whether they contributed to adaptive evolution is still controversial. Here, using an increased number of sequenced genomes with high taxonomic coverage, we identified 2191 mammalian UCEs and 5938 avian UCEs from 95 mammal and 94 bird genomes, respectively. Our results show that these UCEs are functionally constrained and that their adjacent genes are prone to widespread expression with low expression diversity across tissues. Functional enrichment of mammalian and avian UCEs shows different trends indicating that UCEs may contribute to adaptive evolution of taxa. Focusing on lineage-specific accelerated evolution, we discover that the proportion of fast-evolving UCEs in nine mammalian and 10 avian test lineages range from 0.19% to 13.2%. Notably, up to 62.1% of fast-evolving UCEs in test lineages are much more likely to result from GC-biased gene conversion (gBGC). A single cervid-specific gBGC region embracing the uc.359 allele significantly alters the expression of Nova1 and other neural-related genes in the rat brain. Combined with the altered regulatory activity of ancient gBGC-induced fast-evolving UCEs in eutherians, our results provide evidence that synergy between gBGC and selection shaped lineage-specific substitution patterns, even in the most constrained regulatory elements. In summary, our results show that gBGC played an important role in facilitating lineage-specific accelerated evolution of UCEs, and further support the idea that a combination of multiple evolutionary forces shapes adaptive evolution.
Affiliation(s)
- Anguo Liu
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Nini Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Faculty of Mathematics and Natural Sciences, University of Cologne, and Cologne Excellence Cluster for Cellular Stress Responses in Aging-Associated Diseases (CECAD), University Hospital Cologne, Cologne 50931, Germany
- Guoxiang Xie
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yang Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xixi Yan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xinmei Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhenliang Zhu
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhuohui Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jing Yang
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Fanxin Meng
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Mingle Dou
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Weihuang Chen
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Nange Ma
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Center for Functional Genomics, Institute of Future Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yuanpeng Gao
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
6
Cicconardi F, Milanetti E, Pinheiro de Castro EC, Mazo-Vargas A, Van Belleghem SM, Ruggieri AA, Rastas P, Hanly J, Evans E, Jiggins CD, Owen McMillan W, Papa R, Di Marino D, Martin A, Montgomery SH. Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies. Nat Commun 2023; 14:5620. [PMID: 37699868 PMCID: PMC10497600 DOI: 10.1038/s41467-023-41412-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/25/2022] [Accepted: 08/30/2023] [Indexed: 09/14/2023] Open
Abstract
Heliconius butterflies, a speciose genus of Müllerian mimics, represent a classic example of an adaptive radiation that includes a range of derived dietary, life history, physiological and neural traits. However, key lineages within the genus, and across the broader Heliconiini tribe, lack genomic resources, limiting our understanding of how adaptive and neutral processes shaped genome evolution during their radiation. Here, we generate highly contiguous genome assemblies for nine Heliconiini, 29 additional reference-assembled genomes, and improve 10 existing assemblies. Altogether, we provide a dataset of annotated genomes for a total of 63 species, including 58 species within the Heliconiini tribe. We use this extensive dataset to generate a robust and dated heliconiine phylogeny, describe major patterns of introgression, explore the evolution of genome architecture, and the genomic basis of key innovations in this enigmatic group, including an assessment of the evolution of putative regulatory regions at the Heliconius stem. Our work illustrates how the increased resolution provided by such dense genomic sampling improves our power to generate and test gene-phenotype hypotheses, and precisely characterize how genomes evolve.
Affiliation(s)
- Francesco Cicconardi
- School of Biological Sciences, Bristol University, Bristol, United Kingdom.
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom.
- Edoardo Milanetti
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy
- Center for Life Nano- & Neuro-Science, Italian Institute of Technology, Viale Regina Elena 291, 00161, Rome, Italy
- Anyi Mazo-Vargas
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853, USA
- Steven M Van Belleghem
- Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
- Ecology, Evolution and Conservation Biology, Biology Department, KU Leuven, Leuven, Belgium
- Pasi Rastas
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Joseph Hanly
- Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA
- Smithsonian Tropical Research Institute, Panama City, Panama
- Elizabeth Evans
- Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
- Chris D Jiggins
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- W Owen McMillan
- Smithsonian Tropical Research Institute, Panama City, Panama
- Riccardo Papa
- Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
- Molecular Sciences and Research Center, University of Puerto Rico, San Juan, PR, Puerto Rico
- Comprehensive Cancer Center, University of Puerto Rico, San Juan, PR, Puerto Rico
- Daniele Di Marino
- Department of Life and Environmental Sciences, New York-Marche Structural Biology Center (NY-MaSBiC), Polytechnic University of Marche, Via Brecce Bianche, 60131, Ancona, Italy
- Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Via Mario Negri 2, 20156, Milano, Italy
- National Biodiversity Future Center (NBFC), Palermo, Italy
- Arnaud Martin
- Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA
- Stephen H Montgomery
- School of Biological Sciences, Bristol University, Bristol, United Kingdom.
- Smithsonian Tropical Research Institute, Panama City, Panama.
7
Fedorova L, Crossley ER, Mulyar OA, Qiu S, Freeman R, Fedorov A. Profound Non-Randomness in Dinucleotide Arrangements within Ultra-Conserved Non-Coding Elements and the Human Genome. Biology 2023; 12:1125. [PMID: 37627009 PMCID: PMC10452674 DOI: 10.3390/biology12081125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 08/09/2023] [Accepted: 08/11/2023] [Indexed: 08/27/2023]
Abstract
Long human ultra-conserved non-coding elements (UCNEs) do not have any sequence similarity to each other or other characteristics that make them unalterable during vertebrate evolution. We hypothesized that UCNEs have unique dinucleotide (DN) composition and arrangements compared to the rest of the genome. A total of 4272 human UCNE sequences were analyzed computationally and compared with the whole genomes of human, chicken, zebrafish, and fly. Statistical analysis was performed to assess the non-randomness in DN spacing arrangements within the entire human genome and within UCNEs. Significant non-randomness in DN spacing arrangements was observed in the entire human genome. Additionally, UCNEs exhibited distinct patterns in DN arrangements compared to the rest of the genome. Approximately 83% of all DN pairs within UCNEs showed significant (>10%) non-random genomic arrangements at short distances (2-6 nucleotides) relative to each other. At the extremes, non-randomness in DN spacing distances deviated up to 40% from expected values and were frequently associated with GpC, CpG, ApT, and GpG/CpC dinucleotides. The described peculiarities in DN arrangements have persisted for hundreds of millions of years in vertebrates. These distinctive patterns may suggest that UCNEs have specific DNA conformations.
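The kind of spacing deviation reported in this abstract can be sketched as an observed-versus-expected comparison for one dinucleotide followed by another at a fixed distance. The Python below is an illustration only: the function name `spacing_deviation` and the independence-based expectation (number of position pairs times the product of the two dinucleotide frequencies) are assumptions, since the abstract does not specify the statistical model the authors used.

```python
def spacing_deviation(seq: str, first: str, second: str, distance: int) -> float:
    """Relative deviation (observed - expected) / expected for `first` starting at
    position i and `second` starting at position i + distance.

    Assumption: the expected count treats the two dinucleotides as independent.
    """
    seq = seq.upper()
    n_starts = len(seq) - 1          # number of dinucleotide start positions
    n_pairs = n_starts - distance    # start positions i with i + distance still valid
    if distance < 1 or n_pairs <= 0:
        raise ValueError("distance must be positive and shorter than the sequence")

    freq_first = sum(seq[i:i + 2] == first for i in range(n_starts)) / n_starts
    freq_second = sum(seq[i:i + 2] == second for i in range(n_starts)) / n_starts
    expected = n_pairs * freq_first * freq_second
    if expected == 0:
        return float("nan")

    observed = sum(
        seq[i:i + 2] == first and seq[i + distance:i + distance + 2] == second
        for i in range(n_pairs)
    )
    return (observed - expected) / expected
```

A value of 0.4 from this function would correspond to a 40% excess over the independence expectation, and a negative value to a depletion. On real data the calculation would be repeated for every dinucleotide pair and for each distance of interest, for example 2 to 6 nucleotides as in the abstract.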
Affiliation(s)
- Larisa Fedorova
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Emily R. Crossley
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Oleh A. Mulyar
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Shuhao Qiu
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
- Ryan Freeman
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Alexei Fedorov
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
8
Fan K, Pfister E, Weng Z. Toward a comprehensive catalog of regulatory elements. Hum Genet 2023; 142:1091-1111. [PMID: 36935423 DOI: 10.1007/s00439-023-02519-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 01/03/2023] [Indexed: 03/21/2023]
Abstract
Regulatory elements are the genomic regions that interact with transcription factors to control cell-type-specific gene expression in different cellular environments. A precise and complete catalog of functional elements encoded by the human genome is key to understanding mammalian gene regulation. Here, we review the current state of regulatory element annotation. We first provide an overview of assays for characterizing functional elements, including genome, epigenome, transcriptome, three-dimensional chromatin interaction, and functional validation assays. We then discuss computational methods for defining regulatory elements, including peak-calling and other statistical modeling methods. Finally, we introduce several high-quality lists of regulatory element annotations and suggest potential future directions.
Affiliation(s)
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA
- Edith Pfister
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA.
9
Catta-Preta R, Lindtner S, Ypsilanti A, Price J, Abnousi A, Su-Feher L, Wang Y, Juric I, Jones IR, Akiyama JA, Hu M, Shen Y, Visel A, Pennacchio LA, Dickel D, Rubenstein JLR, Nord AS. Combinatorial transcription factor binding encodes cis-regulatory wiring of forebrain GABAergic neurogenesis. bioRxiv 2023:2023.06.28.546894. [PMID: 37425940 PMCID: PMC10327028 DOI: 10.1101/2023.06.28.546894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Transcription factors (TFs) bind combinatorially to genomic cis-regulatory elements (cREs), orchestrating transcription programs. While studies of chromatin state and chromosomal interactions have revealed dynamic neurodevelopmental cRE landscapes, parallel understanding of the underlying TF binding lags. To elucidate the combinatorial TF-cRE interactions driving mouse basal ganglia development, we integrated ChIP-seq for twelve TFs, H3K4me3-associated enhancer-promoter interactions, chromatin and transcriptional state, and transgenic enhancer assays. We identified TF-cREs modules with distinct chromatin features and enhancer activity that have complementary roles driving GABAergic neurogenesis and suppressing other developmental fates. While the majority of distal cREs were bound by one or two TFs, a small proportion were extensively bound, and these enhancers also exhibited exceptional evolutionary conservation, motif density, and complex chromosomal interactions. Our results provide new insights into how modules of combinatorial TF-cRE interactions activate and repress developmental expression programs and demonstrate the value of TF binding data in modeling gene regulatory wiring.
Affiliation(s)
- Rinaldo Catta-Preta
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
- Current Address: Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Susan Lindtner
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Athena Ypsilanti
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
- James Price
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Armen Abnousi
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
- Current Address: NovaSignal, Los Angeles, CA 90064, USA
- Linda Su-Feher
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
- Yurong Wang
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
- Ivan Juric
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
- Ian R Jones
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurology, University of California, San Francisco, CA 94143, USA
- Jennifer A Akiyama
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Ming Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
- Yin Shen
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurology, University of California, San Francisco, CA 94143, USA
- Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- School of Natural Sciences, University of California, Merced, Merced, CA 95343, USA
- Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA 94720, USA
- Diane Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- John L R Rubenstein
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Alex S Nord
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
10
Smith GD, Ching WH, Cornejo-Páramo P, Wong ES. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol 2023; 24:116. [PMID: 37173718 PMCID: PMC10176946 DOI: 10.1186/s13059-023-02955-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/18/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open
Abstract
Enhancers are genomic DNA elements controlling spatiotemporal gene expression. Their flexible organization and functional redundancies make deciphering their sequence-function relationships challenging. This article provides an overview of the current understanding of enhancer organization and evolution, with an emphasis on factors that influence these relationships. Technological advancements, particularly in machine learning and synthetic biology, are discussed in light of how they provide new ways to understand this complexity. Exciting opportunities lie ahead as we continue to unravel the intricacies of enhancer function.
Affiliation(s)
- Gabrielle D Smith
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
- Wan Hern Ching
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- Paola Cornejo-Páramo
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
- Emily S Wong
- Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
11
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023; 380:eabn3943. [PMID: 37104599 PMCID: PMC10250106 DOI: 10.1126/science.abn3943] [Citation(s) in RCA: 49] [Impact Index Per Article: 49.0] [Received: 11/23/2021] [Accepted: 12/16/2022] [Indexed: 04/29/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Affiliation(s)
- Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Irene M. Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Allyson G. Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
- Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Joel C. Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Ana M. Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
- Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
- Nicole M. Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Daniel B. Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
- Kathleen C. Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Abigail L. Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Lucas R. Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Ruby W. Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
- Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Ashley R. Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Sergey V. Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Alyssa J. Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Kathleen M. Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
- Nicole S. Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
- Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Daniel E. Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Megan A. Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Aryn P. Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Morgan E. Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
- James R. Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Bruce W. Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian’s National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
- Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
- Wynn K. Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
- Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
- Pardis C. Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
- Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Mark S. Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
- Emma C. Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Harris A. Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
- William J. Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Katherine S. Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- David A. Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
- Oliver A. Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
- Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Elinor K. Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
12
Zhou X, Zhu T, Fang W, Yu R, He Z, Chen D. Systematic annotation of conservation states provides insights into regulatory regions in rice. J Genet Genomics 2022; 49:1127-1137. [PMID: 35470092 DOI: 10.1016/j.jgg.2022.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/18/2021] [Revised: 04/08/2022] [Accepted: 04/12/2022] [Indexed: 01/14/2023]
Abstract
Plant genomes contain a large fraction of noncoding sequences. The discovery and annotation of conserved noncoding sequences (CNSs) in plants is an ongoing challenge. Here we report the application of comparative genomics to systematically identify CNSs in 50 well-annotated Gramineae genomes using rice (Oryza sativa) as the reference. We conduct multiple-way whole-genome alignments to the rice genome. The rice genome is annotated as 20 conservation states (CSs) at single-nucleotide resolution using a multivariate hidden Markov model (ConsHMM) based on the multiple-genome alignments. Different states show distinct enrichments for various genomic features, and the conservation scores of CSs are highly correlated with the level of associated chromatin accessibility. We find that at least 33.5% of the rice genome is under strong selection, with more than 70% of the sequence lying outside of coding regions. A catalog of 855,366 regulatory CNSs is generated, and they significantly overlap with putative active regulatory elements such as promoters, enhancers, and transcription factor binding sites. Collectively, our study provides a resource for elucidating functional noncoding regions of the rice genome and an evolutionary aspect of regulatory sequences in higher plants.
Affiliation(s)
- Xinkai Zhou
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Tao Zhu
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Wen Fang
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Ranran Yu
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Zhaohui He
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Dijun Chen
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
13
Fedorova L, Mulyar OA, Lim J, Fedorov A. Nucleotide Composition of Ultra-Conserved Elements Shows Excess of GpC and Depletion of GG and CC Dinucleotides. Genes (Basel) 2022; 13:2053. [PMID: 36360290 PMCID: PMC9690913 DOI: 10.3390/genes13112053] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2022] [Revised: 10/25/2022] [Accepted: 11/03/2022] [Indexed: 08/27/2023] Open
Abstract
The public UCNEbase database, comprising 4273 human ultra-conserved noncoding elements (UCNEs), was thoroughly investigated with the aim to find any nucleotide signals or motifs that have made these DNA sequences practically unchanged over three hundred million years of evolution. Each UCNE comprises over 200 nucleotides and has at least 95% identity between humans and chickens. A total of 31,046 SNPs were found within the UCNE database. We demonstrated that every human has over 300 mutations within 4273 UCNEs. No association of UCNEs with non-coding RNAs, nor preference of a particular meiotic recombination rate within them were found. No sequence motifs associated with UCNEs nor their flanking regions have been found. However, we demonstrated that UCNEs have strong nucleotide and dinucleotide sequence abnormalities compared to genome averages. Specifically, UCNEs are depleted for CC and GG dinucleotides, while GC dinucleotides are in excess of 28%. Importantly, GC dinucleotides have extraordinarily strong stacking free-energy inside the DNA helix and unique resistance to dissociation. Based on the adjacent nucleotide stacking abnormalities within UCNEs, we conjecture that peculiarities in dinucleotide distribution within UCNEs may create unique 3D conformation and specificity to bind proteins. We also discuss the strange dynamics of multiple SNPs inside UCNEs and reasons why these sequences are extraordinarily conserved.
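As an illustration of the kind of dinucleotide comparison this abstract describes, the following is a minimal Python sketch, not the authors' pipeline. It counts overlapping dinucleotides on the given strand only and skips any pair containing a non-ACGT character; both choices are assumptions, since the abstract does not say how counting was done. The function names `dinucleotide_frequencies` and `gc_to_gg_cc_ratio` are hypothetical, and the ratio mirrors the GpC : (GpG + CpC) feature named in the Crossley et al. entry of this list.

```python
from collections import Counter


def dinucleotide_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each overlapping dinucleotide in seq.

    Assumption: only the given strand is counted, and pairs containing
    anything other than A, C, G or T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if set(seq[i:i + 2]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {di: n / total for di, n in counts.items()} if total else {}


def gc_to_gg_cc_ratio(seq: str) -> float:
    """Frequency of GC (5'-G then C-3') divided by the summed GG and CC frequencies."""
    freqs = dinucleotide_frequencies(seq)
    denom = freqs.get("GG", 0.0) + freqs.get("CC", 0.0)
    if denom == 0:
        raise ValueError("sequence contains no GG or CC dinucleotides")
    return freqs.get("GC", 0.0) / denom
```

Applying both functions to a set of UCNE sequences and to a size-matched set of random genomic sequences would give the per-sequence values that a comparison against genome averages needs; the choice of control set is left to the reader.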
Affiliation(s)
- Jan Lim
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Alexei Fedorov
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
14
Further Delineation of Duplications of ARX Locus Detected in Male Patients with Varying Degrees of Intellectual Disability. Int J Mol Sci 2022; 23:3084. [PMID: 35328505 PMCID: PMC8955779 DOI: 10.3390/ijms23063084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Revised: 03/08/2022] [Accepted: 03/10/2022] [Indexed: 11/20/2022] Open
Abstract
The X-linked gene encoding aristaless-related homeobox (ARX) is a bi-functional transcription factor capable of activating or repressing gene transcription, whose mutations have been found in a wide spectrum of neurodevelopmental disorders (NDDs); these include cortical malformations, paediatric epilepsy, intellectual disability (ID) and autism. In addition to point mutations, duplications of the ARX locus have been detected in male patients with ID. These rearrangements include telencephalon ultraconserved enhancers, whose structural alterations can interfere with the control of ARX expression in the developing brain. Here, we review the structural features of 15 gain copy-number variants (CNVs) of the ARX locus found in patients presenting wide-ranging phenotypic variations including ID, speech delay, hypotonia and psychiatric abnormalities. We also report on a further novel Xp21.3 duplication detected in a male patient with moderate ID and carrying a fully duplicated copy of the ARX locus and the ultraconserved enhancers. As consequences of this rearrangement, the patient-derived lymphoblastoid cell line shows abnormal activity of the ARX-KDM5C-SYN1 regulatory axis. Moreover, the three-dimensional (3D) structure of the Arx locus, both in mouse embryonic stem cells and cortical neurons, provides new insight for the functional consequences of ARX duplications. Finally, by comparing the clinical features of the 16 CNVs affecting the ARX locus, we conclude that—depending on the involvement of tissue-specific enhancers—the ARX duplications are ID-associated risk CNVs with variable expressivity and penetrance.
15
Transcriptional Regulation and Implications for Controlling Hox Gene Expression. J Dev Biol 2022; 10:4. [PMID: 35076545 PMCID: PMC8788451 DOI: 10.3390/jdb10010004] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/17/2021] [Revised: 01/04/2022] [Accepted: 01/06/2022] [Indexed: 02/06/2023] Open
Abstract
Hox genes play key roles in axial patterning and regulating the regional identity of cells and tissues in a wide variety of animals from invertebrates to vertebrates. Nested domains of Hox expression generate a combinatorial code that provides a molecular framework for specifying the properties of tissues along the A–P axis. Hence, it is important to understand the regulatory mechanisms that coordinately control the precise patterns of the transcription of clustered Hox genes required for their roles in development. New insights are emerging about the dynamics and molecular mechanisms governing transcriptional regulation, and there is interest in understanding how these may play a role in contributing to the regulation of the expression of the clustered Hox genes. In this review, we summarize some of the recent findings, ideas and emerging mechanisms underlying the regulation of transcription in general and consider how they may be relevant to understanding the transcriptional regulation of Hox genes.