1
Saenger T, Schulte MF, Vordenbäumen S, Hermann FC, Bertelsbeck J, Meier K, Bleck E, Schneider M, Jose J. Structural Analysis of Breast-Milk αS1-Casein: An α-Helical Conformation Is Required for TLR4-Stimulation. Int J Mol Sci 2024; 25:1743. [PMID: 38339021 PMCID: PMC10855866 DOI: 10.3390/ijms25031743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 01/19/2024] [Accepted: 01/21/2024] [Indexed: 02/12/2024] Open
Abstract
Breast-milk αS1-casein is a Toll-like receptor 4 (TLR4) agonist, whereas phosphorylated αS1-casein does not bind TLR4. The objective of this study was to analyse the structural requirements for these effects. In silico analysis of αS1-casein indicated high α-helical content with coiled-coil characteristics. This was confirmed by CD-spectroscopy, showing the α-helical conformation to be stable between pH 2 and 7.4. After in vitro phosphorylation, the α-helical content was significantly reduced, similar to what it was after incubation at 80 °C. This conformation showed no in vitro induction of IL-8 secretion via TLR4. A synthetic peptide corresponding to V77-E92 of αS1-casein induced an IL-8 secretion of 0.95 ng/mL via TLR4. Our results indicate that αS1-casein appears in two distinct conformations, an α-helical TLR4-agonistic and a less α-helical TLR4 non-agonistic conformation induced by phosphorylation. This indicates that the immunomodulatory role of αS1-casein, as described before, could be regulated by conformational changes induced by phosphorylation.
Affiliation(s)
- Thorsten Saenger
- Institute for Pharmaceutical and Medicinal Chemistry, University of Münster, PharmaCampus, Correnstr. 48, 48149 Münster, Germany
- Marten F. Schulte
- Institute for Pharmaceutical and Medicinal Chemistry, University of Münster, PharmaCampus, Correnstr. 48, 48149 Münster, Germany
- Stefan Vordenbäumen
- Department of Rheumatology and Hiller Research Unit Rheumatology, Medical Faculty, Heinrich-Heine-University Düsseldorf, Moorenstr. 5, 40225 Düsseldorf, Germany
- Fabian C. Hermann
- Institute for Pharmaceutical Biology and Phytochemie, University of Münster, PharmaCampus, Correnstr. 48, 48149 Münster, Germany
- Juliana Bertelsbeck
- Institute for Pharmaceutical and Medicinal Chemistry, University of Münster, PharmaCampus, Correnstr. 48, 48149 Münster, Germany
- Kathrin Meier
- Institute for Pharmaceutical and Medicinal Chemistry, University of Münster, PharmaCampus, Correnstr. 48, 48149 Münster, Germany
- Ellen Bleck
- Department of Rheumatology and Hiller Research Unit Rheumatology, Medical Faculty, Heinrich-Heine-University Düsseldorf, Moorenstr. 5, 40225 Düsseldorf, Germany
- Matthias Schneider
- Department of Rheumatology and Hiller Research Unit Rheumatology, Medical Faculty, Heinrich-Heine-University Düsseldorf, Moorenstr. 5, 40225 Düsseldorf, Germany
- Joachim Jose
- Institute for Pharmaceutical and Medicinal Chemistry, University of Münster, PharmaCampus, Correnstr. 48, 48149 Münster, Germany
2
Kuderna LFK, Ulirsch JC, Rashid S, Ameen M, Sundaram L, Hickey G, Cox AJ, Gao H, Kumar A, Aguet F, Christmas MJ, Clawson H, Haeussler M, Janiak MC, Kuhlwilm M, Orkin JD, Bataillon T, Manu S, Valenzuela A, Bergman J, Rouselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, Schraiber JG, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, Valsecchi J, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin AD, Guschanski K, Schierup MH, Beck RMD, Karakikes I, Wang KC, Umapathy G, Roos C, Boubli JP, Siepel A, Kundaje A, Paten B, Lindblad-Toh K, Rogers J, Marques Bonet T, Farh KKH. Identification of constrained sequence elements across 239 primate genomes. Nature 2024; 625:735-742. [PMID: 38030727 PMCID: PMC10808062 DOI: 10.1038/s41586-023-06798-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 10/30/2023] [Indexed: 12/01/2023]
Abstract
Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3-9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.
Affiliation(s)
- Lukas F K Kuderna
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Jacob C Ulirsch
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Sabrina Rashid
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Mohamed Ameen
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Laksshman Sundaram
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Glenn Hickey
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Anthony J Cox
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Arvind Kumar
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Francois Aguet
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Matthew J Christmas
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Hiram Clawson
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Mareike C Janiak
- School of Science, Engineering and Environment, University of Salford, Salford, UK
- Martin Kuhlwilm
- Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
- Joseph D Orkin
- Département d'Anthropologie, Université de Montréal, Montréal, Quebec, Canada
- Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- Alejandro Valenzuela
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Section for Ecoinformatics and Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark
- Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Tefé, Brazil
- Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Brussels, Belgium
- Lidia Agueda
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Julie Blanc
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Marta Gut
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Dorien de Vries
- School of Science, Engineering and Environment, University of Salford, Salford, UK
- Ian Goodhead
- School of Science, Engineering and Environment, University of Salford, Salford, UK
- R Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Julie E Horvath
- North Carolina Museum of Natural Sciences, Raleigh, NC, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- David Juan
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Joshua G Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Fabrício Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
- Hazel Byrne
- Department of Anthropology, University of Utah, Salt Lake City, UT, USA
- Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
- João Valsecchi
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Brazil
- Rede de Pesquisa em Diversidade, Conservação e Uso da Fauna da Amazônia - RedeFauna, Manaus, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica-ComFauna, Iquitos, Peru
- Malu Messias
- Universidade Federal de Rondônia, Porto Velho, Brazil
- Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- Rogerio Rossi
- Instituto de Biociências, Universidade Federal do Mato Grosso, Cuiabá, Brazil
- Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
- Department of Biology, Trinity University, San Antonio, TX, USA
- Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Clément J Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Clifford J Jolly
- Department of Anthropology, New York University, New York, NY, USA
- Jane Phillips-Conroy
- Department of Neuroscience, Washington University School of Medicine in St Louis, St Louis, MO, USA
- Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
- Christian Abee
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
- Joe H Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
- Sree Kanthaswamy
- School of Interdisciplinary Forensics, Arizona State University, Phoenix, AZ, USA
- California National Primate Research Center, University of California, Davis, CA, USA
- Fekadu Shiferaw
- Guinea Worm Eradication Program, The Carter Center Ethiopia, Addis Ababa, Ethiopia
- Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Long Zhou
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
- Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Guojie Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, China
- Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Julius D Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Arusha, Tanzania
- Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald-Insel Riems, Germany
- Professorship for International Animal Health/One Health, Faculty of Veterinary Medicine, Justus Liebig University, Giessen, Germany
- Minh D Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi, Vietnam
- Esther Lizano
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
- Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart, Stuttgart, Germany
- Arcadi Navarro
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Barcelonaβeta Brain Research Center, Pasqual Maragall Foundation, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Tilo Nadler
- Cuc Phuong Commune, Nho Quan District, Vietnam
- Chiea Chuen Khor
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Patrick Tan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore, Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
- Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore, Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore
- Andrew C Kitchener
- Department of Natural Sciences, National Museums Scotland, Edinburgh, UK
- School of Geosciences, Edinburgh, UK
- Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen, Göttingen, Germany
- Leibniz ScienceCampus Primate Cognition, Göttingen, Germany
- Ivo Gut
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Amanda D Melin
- Department of Anthropology and Archaeology, University of Calgary, Calgary, Alberta, Canada
- Department of Medical Genetics, University of Calgary, Calgary, Alberta, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
- Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Robin M D Beck
- School of Science, Engineering and Environment, University of Salford, Salford, UK
- Ioannis Karakikes
- Cardiovascular Institute, Stanford University, Stanford, CA, USA
- Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, USA
- Kevin C Wang
- Department of Cancer Biology, Stanford University, Stanford, CA, USA
- Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA
- Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
- Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- Jean P Boubli
- School of Science, Engineering and Environment, University of Salford, Salford, UK
- Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
- Tomas Marques Bonet
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain.
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain.
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain.
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
- Universitat Pompeu Fabra, Barcelona, Spain.
- Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA.
3
Butruille L, Sébillot A, Ávila K, Vancamp P, Demeneix BA, Pifferi F, Remaud S. Increased oligodendrogenesis and myelination in the subventricular zone of aged mice and gray mouse lemurs. Stem Cell Reports 2023; 18:534-554. [PMID: 36669492 PMCID: PMC9969077 DOI: 10.1016/j.stemcr.2022.12.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Revised: 12/16/2022] [Accepted: 12/20/2022] [Indexed: 01/20/2023] Open
Abstract
The adult rodent subventricular zone (SVZ) generates neural stem cells (NSCs) throughout life that migrate to the olfactory bulbs (OBs) and differentiate into olfactory interneurons. Few SVZ NSCs generate oligodendrocyte precursor cells (OPCs). We investigated how neurogliogenesis is regulated during aging in mice and in a non-human primate (NHP) model, the gray mouse lemur. In both species, neuronal commitment decreased with age, while OPC generation and myelin content unexpectedly increased. In the OBs, more tyrosine hydroxylase interneurons in old mice, but fewer in lemurs, marked a surprising interspecies difference that could relate to our observation of a continuous ventricle in lemurs. In the corpus callosum, aging promoted maturation of OPCs into mature oligodendrocytes in mice but blocked it in lemurs. The present study highlights similarities and dissimilarities between rodents and NHPs, revealing that NHPs are a more relevant model than mice to study the evolution of biomarkers of aging.
Affiliation(s)
- Lucile Butruille
- Laboratory Molecular Physiology and Adaptation, CNRS UMR 7221, Department Adaptations of Life, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75005 Paris, France.
- Anthony Sébillot
- Laboratory Molecular Physiology and Adaptation, CNRS UMR 7221, Department Adaptations of Life, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75005 Paris, France
- Katia Ávila
- Laboratory Molecular Physiology and Adaptation, CNRS UMR 7221, Department Adaptations of Life, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75005 Paris, France
- Pieter Vancamp
- Laboratory Molecular Physiology and Adaptation, CNRS UMR 7221, Department Adaptations of Life, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75005 Paris, France
- Barbara A Demeneix
- Laboratory Molecular Physiology and Adaptation, CNRS UMR 7221, Department Adaptations of Life, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75005 Paris, France
- Fabien Pifferi
- UMR 7179 Mecadev, CNRS/Muséum National d'Histoire Naturelle, 1 Avenue du Petit Château, 91800 Brunoy, France
- Sylvie Remaud
- Laboratory Molecular Physiology and Adaptation, CNRS UMR 7221, Department Adaptations of Life, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75005 Paris, France.
4
Identification and characterization of constrained non-exonic bases lacking predictive epigenomic and transcription factor binding annotations. Nat Commun 2020; 11:6168. [PMID: 33268804 PMCID: PMC7710766 DOI: 10.1038/s41467-020-19962-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/29/2020] [Accepted: 11/06/2020] [Indexed: 12/12/2022] Open
Abstract
Annotations of evolutionary sequence constraint based on multi-species genome alignments and genome-wide maps of epigenomic marks and transcription factor binding provide important complementary information for understanding the human genome and genetic variation. Here we developed the Constrained Non-Exonic Predictor (CNEP) to quantify the evidence of each base in the genome being in an evolutionarily constrained non-exonic element from an input of over 60,000 epigenomic and transcription factor binding features. We find that the CNEP score outperforms baseline and related existing scores at predicting evolutionarily constrained non-exonic bases from such data. However, a subset of them are still not well predicted by CNEP. We developed a complementary Conservation Signature Score by CNEP (CSS-CNEP) that is predictive of those bases. We further characterize the nature of constrained non-exonic bases with low CNEP scores using additional types of information. CNEP and CSS-CNEP are resources for analyzing constrained non-exonic bases in the genome. Genome-wide maps of evolutionary constraint and large-scale compendia of epigenomic and transcription factor data provide complementary information for genome annotation. Here, the authors develop the Constrained Non-Exonic Predictor (CNEP) that enables better understanding of their relationship.
5
Groß C, Bortoluzzi C, de Ridder D, Megens HJ, Groenen MAM, Reinders M, Bosse M. Prioritizing sequence variants in conserved non-coding elements in the chicken genome using chCADD. PLoS Genet 2020; 16:e1009027. [PMID: 32966296 PMCID: PMC7535126 DOI: 10.1371/journal.pgen.1009027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/18/2020] [Revised: 10/05/2020] [Accepted: 08/05/2020] [Indexed: 11/30/2022] Open
Abstract
The availability of genomes for many species has advanced our understanding of the non-protein-coding fraction of the genome. Comparative genomics has proven itself to be an invaluable approach for the systematic, genome-wide identification of conserved non-protein-coding elements (CNEs). However, for many non-mammalian model species, including chicken, our capability to interpret the functional importance of variants overlapping CNEs has been limited by current genomic annotations, which rely on a single information type (e.g. conservation). We here studied CNEs in chicken using a combination of population genomics and comparative genomics. To investigate the functional importance of variants found in CNEs, we developed a ch(icken) Combined Annotation-Dependent Depletion (chCADD) model, a variant effect prediction tool first introduced for humans and later on for mouse and pig. We show that 73 Mb of the chicken genome has been conserved across more than 280 million years of vertebrate evolution. The vast majority of the conserved elements are in non-protein-coding regions, which display SNP densities and allele frequency distributions characteristic of genomic regions constrained by purifying selection. By annotating SNPs with the chCADD score we are able to pinpoint specific subregions of the CNEs to be of higher functional importance, as supported by the finding that SNPs in these subregions are associated with known disease genes in humans, mice, and rats. Taken together, our findings indicate that CNEs harbor variants of functional significance that should be the object of further investigation along with protein-coding mutations. We therefore anticipate chCADD to be of great use to the scientific community and breeding companies in future functional studies in chicken.
Affiliation(s)
- Christian Groß
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
- Delft Bioinformatics Lab, University of Technology Delft, 2600 GA, Delft, The Netherlands
- Chiara Bortoluzzi
- Animal Breeding and Genomics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
- Dick de Ridder
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
- Hendrik-Jan Megens
- Animal Breeding and Genomics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
- Martien A. M. Groenen
- Animal Breeding and Genomics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
- Marcel Reinders
- Delft Bioinformatics Lab, University of Technology Delft, 2600 GA, Delft, The Netherlands
- Mirte Bosse
- Animal Breeding and Genomics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
6
Rubin BER, Jones BM, Hunt BG, Kocher SD. Rate variation in the evolution of non-coding DNA associated with social evolution in bees. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180247. [PMID: 31154980 PMCID: PMC6560270 DOI: 10.1098/rstb.2018.0247] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Accepted: 02/14/2019] [Indexed: 11/12/2022] Open
Abstract
The evolutionary origins of eusociality represent increases in complexity from individual to caste-based, group reproduction. These behavioural transitions have been hypothesized to go hand in hand with an increased ability to regulate when and where genes are expressed. Bees have convergently evolved eusociality up to five times, providing a framework to test this hypothesis. To examine potential links between putative gene regulatory elements and social evolution, we compare alignable, non-coding sequences in 11 diverse bee species, encompassing three independent origins of reproductive division of labour and two elaborations of eusocial complexity. We find that rates of evolution in a number of non-coding sequences correlate with key social transitions in bees. Interestingly, while we find little evidence for convergent rate changes associated with independent origins of social behaviour, a number of molecular pathways exhibit convergent rate changes in conjunction with subsequent elaborations of social organization. We also present evidence that many novel non-coding regions may have been recruited alongside the origin of sociality in corbiculate bees; these loci could represent gene regulatory elements associated with division of labour within this group. Thus, our findings are consistent with the hypothesis that gene regulatory innovations are associated with the evolution of eusociality and illustrate how a thorough examination of both coding and non-coding sequence can provide a more complete understanding of the molecular mechanisms underlying behavioural evolution. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Benjamin E. R. Rubin
- Department of Ecology and Evolutionary Biology; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Beryl M. Jones
- Program in Ecology, Evolution, and Conservation Biology, University of Illinois, Urbana, IL, USA
- Brendan G. Hunt
- Department of Entomology, University of Georgia, Griffin, GA, USA
- Sarah D. Kocher
- Department of Ecology and Evolutionary Biology; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
7
Prasad MS, Charney RM, García-Castro MI. Specification and formation of the neural crest: Perspectives on lineage segregation. Genesis 2019; 57:e23276. [PMID: 30576078 PMCID: PMC6570420 DOI: 10.1002/dvg.23276] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 11/03/2018] [Revised: 12/17/2018] [Accepted: 12/18/2018] [Indexed: 12/21/2022]
Abstract
The neural crest is a fascinating embryonic population unique to vertebrates that is endowed with remarkable differentiation capacity. Thought to originate from ectodermal tissue, neural crest cells generate neurons and glia of the peripheral nervous system, and melanocytes throughout the body. However, the neural crest also generates many ectomesenchymal derivatives in the cranial region, including cell types considered to be of mesodermal origin such as cartilage, bone, and adipose tissue. These ectomesenchymal derivatives play a critical role in the formation of the vertebrate head, and are thought to be a key attribute at the center of vertebrate evolution and diversity. Further, aberrant neural crest cell development and differentiation is the root cause of many human pathologies, including cancers, rare syndromes, and birth malformations. In this review, we discuss the current findings of neural crest cell ontogeny, and consider tissue, cell, and molecular contributions toward neural crest formation. We further provide current perspectives into the molecular network involved during the segregation of the neural crest lineage.
Affiliation(s)
- Maneeshi S Prasad
- Division of Biomedical Sciences, School of Medicine, University of California, Riverside, California
| | - Rebekah M Charney
- Division of Biomedical Sciences, School of Medicine, University of California, Riverside, California
| | - Martín I García-Castro
- Division of Biomedical Sciences, School of Medicine, University of California, Riverside, California
| |
|
8
|
Abstract
Whole-genome alignment (WGA) is the prediction of evolutionary relationships at the nucleotide level between two or more genomes. It combines aspects of both colinear sequence alignment and gene orthology prediction and is typically more challenging to address than either of these tasks due to the size and complexity of whole genomes. Despite the difficulty of this problem, numerous methods have been developed for its solution because WGAs are valuable for genome-wide analyses such as phylogenetic inference, genome annotation, and function prediction. In this chapter, we discuss the meaning and significance of WGA and present an overview of the methods that address it. We also examine the problem of evaluating whole-genome aligners and offer a set of methodological challenges that need to be tackled in order to make most effective use of our rapidly growing databases of whole genomes.
Affiliation(s)
- Colin N Dewey
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA.
| |
|
9
|
Abstract
Humans have a close phylogenetic relationship with nonhuman primates (NHPs) and share many physiological parallels, such as highly similar immune systems, with them. Importantly, NHPs can be infected with many human or related simian viruses. In many cases, viruses replicate in the same cell types as in humans, and infections are often associated with the same pathologies. In addition, many reagents that are used to study the human immune response cross-react with NHP molecules. As such, NHPs are often used as models to study viral vaccine efficacy and antiviral therapeutic safety and efficacy and to understand aspects of viral pathogenesis. With several emerging viral infections becoming epidemic, NHPs are proving to be a very beneficial benchmark for investigating human viral infections.
Affiliation(s)
- Jacob D Estes
- AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Frederick, MD, USA
- Vaccine and Gene Therapy Institute, Oregon Health and Science University, Beaverton, OR, USA
| | - Scott W Wong
- Vaccine and Gene Therapy Institute, Oregon Health and Science University, Beaverton, OR, USA
| | - Jason M Brenchley
- Barrier Immunity Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA.
| |
|
10
|
Yadav DK, Shrestha S, Dadhwal G, Chandak GR. Identification and characterization of cis-regulatory elements 'insulator and repressor' in PPARD gene. Epigenomics 2018; 10:613-627. [PMID: 29583017 DOI: 10.2217/epi-2017-0139] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023] Open
Abstract
AIM: Identification and functional characterization of cis-regulatory elements in human PPARD gene. METHODS: We used various bioinformatic tools on the publicly available human genome and Encyclopedia of DNA Elements databases to explore potential cis-regulatory elements in PPARD gene region. RESULTS: We predicted an insulator and an enhancer element in intron 2 of PPARD gene. Functional characterization using transient transfection, reporter assay and CTCF binding confirmed the insulator status. However, the predicted enhancer element showed repressor/silencer activity. Finally, we observed a potential interaction between these two cis-regulatory elements which is in agreement with 5C-Encyclopedia of DNA Elements data. CONCLUSION: We report two functionally validated cis-regulatory elements in PPARD gene which will aid in understanding its regulation and role in metabolic functions.
Affiliation(s)
- Dilip K Yadav
- Genomic Research on Complex Diseases (GRC Group), CSIR-Centre for Cellular & Molecular Biology, Hyderabad, Telangana, 500 007, India
| | - Smeeta Shrestha
- Genomic Research on Complex Diseases (GRC Group), CSIR-Centre for Cellular & Molecular Biology, Hyderabad, Telangana, 500 007, India.,Building No.7, School of Basic & Applied Sciences, Dayananda Sagar University, Shavige Malleshwara Hills, Kumaraswamy Layout, Bangalore 560 078, Karnataka, India
| | - Gunjan Dadhwal
- Genomic Research on Complex Diseases (GRC Group), CSIR-Centre for Cellular & Molecular Biology, Hyderabad, Telangana, 500 007, India.,Departement de Biochimie et Medecine Moleculaire, Universite de Montreal, Montreal, Quebec H3T 1J4, Canada
| | - Giriraj R Chandak
- Genomic Research on Complex Diseases (GRC Group), CSIR-Centre for Cellular & Molecular Biology, Hyderabad, Telangana, 500 007, India
| |
|
11
|
Liang P, Saqib HSA, Zhang X, Zhang L, Tang H. Single-Base Resolution Map of Evolutionary Constraints and Annotation of Conserved Elements across Major Grass Genomes. Genome Biol Evol 2018; 10:473-488. [PMID: 29378032 PMCID: PMC5798027 DOI: 10.1093/gbe/evy006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Accepted: 01/08/2018] [Indexed: 12/20/2022] Open
Abstract
Conserved noncoding sequences (CNSs) are evolutionarily conserved DNA sequences that do not encode proteins but may have regulatory roles in gene expression. CNSs in crop genomes could be linked to many important agronomic traits and ecological adaptations. Compared with the relatively mature exon annotation protocols, efficient methods are lacking to predict the location of noncoding sequences in plant genomes. We implemented a computational pipeline that is tailored to the comparison of plant genomes, yielding a large number of conserved sequences using the rice genome as the reference. In this study, we used 17 published grass genomes, along with five monocot genomes as well as the basal angiosperm genome of Amborella trichopoda. Genome alignments among these genomes suggest that at least 12.05% of the rice genome appears to be evolving under constraints in the Poaceae lineage, with close to half of the evolutionarily constrained sequences located outside protein-coding regions. We found evidence for purifying selection acting on the conserved sequences by analyzing segregating SNPs within the rice population. Furthermore, we found that known functional motifs were significantly enriched within CNSs, with many motifs associated with the preferred binding of ubiquitous transcription factors. The conserved elements that we have curated are accessible through our public database and the JBrowse server. In-depth functional annotations and evolutionary dynamics of the identified conserved sequences provide a solid foundation for studying gene regulation and genome evolution, and for informing gene isolation by cereal biologists.
Affiliation(s)
- Pingping Liang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, China
| | - Hafiz Sohaib Ahmed Saqib
- Institute of Applied Ecology, Fujian Agriculture and Forestry University, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xingtan Zhang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Liangsheng Zhang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Haibao Tang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| |
|
12
|
Abstract
The idea that much of our genome is irrelevant to fitness (that is, not the product of positive natural selection at the organismal level) remains viable. Claims to the contrary, and specifically that the notion of "junk DNA" should be abandoned, are based on conflating meanings of the word "function". Recent estimates suggest that perhaps 90% of our DNA, though biochemically active, does not contribute to fitness in any sequence-dependent way, and possibly in no way at all. Comparisons to vertebrates with much larger and smaller genomes (the lungfish and the pufferfish) strongly align with such a conclusion, as they have done for the last half-century.
Affiliation(s)
- W Ford Doolittle
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.
| | - Tyler D P Brunet
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of History and Philosophy of Science, University of Cambridge, Cambridge, UK
| |
|
13
|
Ezran C, Karanewsky CJ, Pendleton JL, Sholtz A, Krasnow MR, Willick J, Razafindrakoto A, Zohdy S, Albertelli MA, Krasnow MA. The Mouse Lemur, a Genetic Model Organism for Primate Biology, Behavior, and Health. Genetics 2017; 206:651-664. [PMID: 28592502 PMCID: PMC5499178 DOI: 10.1534/genetics.116.199448] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 12/26/2016] [Accepted: 04/08/2017] [Indexed: 01/24/2023] Open
Abstract
Systematic genetic studies of a handful of diverse organisms over the past 50 years have transformed our understanding of biology. However, many aspects of primate biology, behavior, and disease are absent or poorly modeled in any of the current genetic model organisms including mice. We surveyed the animal kingdom to find other animals with advantages similar to mice that might better exemplify primate biology, and identified mouse lemurs (Microcebus spp.) as the outstanding candidate. Mouse lemurs are prosimian primates, roughly half the genetic distance between mice and humans. They are the smallest, fastest developing, and among the most prolific and abundant primates in the world, distributed throughout the island of Madagascar, many in separate breeding populations due to habitat destruction. Their physiology, behavior, and phylogeny have been studied for decades in laboratory colonies in Europe and in field studies in Malagasy rainforests, and a high quality reference genome sequence has recently been completed. To initiate a classical genetic approach, we developed a deep phenotyping protocol and have screened hundreds of laboratory and wild mouse lemurs for interesting phenotypes and begun mapping the underlying mutations, in collaboration with leading mouse lemur biologists. We also seek to establish a mouse lemur gene "knockout" library by sequencing the genomes of thousands of mouse lemurs to identify null alleles in most genes from the large pool of natural genetic variants. As part of this effort, we have begun a citizen science project in which students across Madagascar explore the remarkable biology around their schools, including longitudinal studies of the local mouse lemurs. We hope this work spawns a new model organism and cultivates a deep genetic understanding of primate biology and health. 
We also hope it establishes a new and ethical method of genetics that bridges biological, behavioral, medical, and conservation disciplines, while providing an example of how hands-on science education can help transform developing countries.
Affiliation(s)
- Camille Ezran
- Department of Biochemistry
- Howard Hughes Medical Institute, and
| | | | | | - Alex Sholtz
- Department of Biochemistry
- Howard Hughes Medical Institute, and
| | - Maya R Krasnow
- Department of Biochemistry
- Howard Hughes Medical Institute, and
| | - Jason Willick
- Department of Biochemistry
- Howard Hughes Medical Institute, and
| | - Andriamahery Razafindrakoto
- Department of Animal Biology, Faculty of Science, University of Antananarivo, Antananarivo 101, BP 566, Madagascar, and
| | - Sarah Zohdy
- School of Forestry and Wildlife Sciences and College of Veterinary Medicine, Auburn University, Alabama 36849
| | - Megan A Albertelli
- Department of Comparative Medicine, Stanford University School of Medicine, California 94305
| | - Mark A Krasnow
- Department of Biochemistry
- Howard Hughes Medical Institute, and
| |
|
14
|
Brittain EL, Chan SY. Integration of complex data sources to provide biologic insight into pulmonary vascular disease (2015 Grover Conference Series). Pulm Circ 2016; 6:251-60. [PMID: 27683602 DOI: 10.1086/686995] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023] Open
Abstract
The application of complex data sources to pulmonary vascular diseases is an emerging and promising area of investigation. The use of -omics platforms, in silico modeling of gene networks, and linkage of large human cohorts with DNA biobanks are beginning to yield biologic insight into pulmonary hypertension. These approaches to high-throughput molecular phenotyping offer the possibility of discovering new therapeutic targets and identifying variability in response to therapy that can be leveraged to improve clinical care. Optimizing the methods for analyzing complex data sources and accruing large, well-phenotyped human cohorts linked to biologic data remain significant challenges. Here, we discuss two specific types of complex data sources (gene regulatory networks and DNA-linked electronic medical record cohorts) that illustrate the promise, challenges, and current limitations of these approaches to understanding and managing pulmonary vascular disease.
Affiliation(s)
- Evan L Brittain
- Division of Cardiovascular Medicine and Vanderbilt Translational and Clinical Cardiovascular Center, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Stephen Y Chan
- Division of Cardiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA; and Center for Pulmonary Vascular Biology and Medicine, Pittsburgh Heart, Lung, and Blood Vascular Medicine Institute, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
| |
|
15
|
Abstract
Human genome sequencing is routine and will soon be a staple in research and clinical genetics. However, the promise of sequencing is often just that, with genome data routinely failing to reveal useful insights about disease in general or a person's health in particular. Nowhere is this chasm between promise and progress more evident than in the designation, "variant of uncertain significance" (VUS). Although it serves an important role, careful consideration of VUS reveals it to be a nebulous description of genomic information and its relationship to disease, symptomatic of our inability to make even crude quantitative assertions about the disease risks conferred by many genetic variants. In this perspective, I discuss the challenge of "variant interpretation" and the value of comparative and functional genomic information in meeting that challenge. Although already essential, genomic annotations will become even more important as our analytical focus widens beyond coding exons. Combined with more genotype and phenotype data, they will help facilitate more quantitative and insightful assessments of the contributions of genetic variants to disease.
Affiliation(s)
- Gregory M Cooper
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
| |
|
16
|
Silbereis JC, Pochareddy S, Zhu Y, Li M, Sestan N. The Cellular and Molecular Landscapes of the Developing Human Central Nervous System. Neuron 2016; 89:248-68. [PMID: 26796689 DOI: 10.1016/j.neuron.2015.12.008] [Citation(s) in RCA: 440] [Impact Index Per Article: 55.0] [Indexed: 12/11/2022]
Abstract
The human CNS follows a pattern of development typical of all mammals, but certain neurodevelopmental features are highly derived. Building the human CNS requires the precise orchestration and coordination of myriad molecular and cellular processes across a staggering array of cell types and over a long period of time. Dysregulation of these processes affects the structure and function of the CNS and can lead to neurological or psychiatric disorders. Recent technological advances and increased focus on human neurodevelopment have enabled a more comprehensive characterization of the human CNS and its development in both health and disease. The aim of this review is to highlight recent advancements in our understanding of the molecular and cellular landscapes of the developing human CNS, with focus on the cerebral neocortex, and the insights these findings provide into human neural evolution, function, and dysfunction.
Affiliation(s)
- John C Silbereis
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
| | - Sirisha Pochareddy
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
| | - Ying Zhu
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
| | - Mingfeng Li
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
| | - Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Department of Genetics and Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA; Program in Cellular Neuroscience, Neurodegeneration and Repair, Yale School of Medicine, New Haven, CT 06510, USA; Section of Comparative Medicine, Yale School of Medicine, New Haven, CT 06510, USA; Yale Child Study Center, Yale School of Medicine, New Haven, CT 06510, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA.
| |
|
17
|
Lowdon RF, Jang HS, Wang T. Evolution of Epigenetic Regulation in Vertebrate Genomes. Trends Genet 2016; 32:269-283. [PMID: 27080453 PMCID: PMC4842087 DOI: 10.1016/j.tig.2016.03.001] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 11/03/2015] [Revised: 03/02/2016] [Accepted: 03/03/2016] [Indexed: 12/31/2022]
Abstract
Empirical models of sequence evolution have spurred progress in the field of evolutionary genetics for decades. We are now realizing the importance and complexity of the eukaryotic epigenome. While epigenome analysis has been applied to genomes from single-cell eukaryotes to human, comparative analyses are still relatively few and computational algorithms to quantify epigenome evolution remain scarce. Accordingly, a quantitative model of epigenome evolution remains to be established. We review here the comparative epigenomics literature and synthesize its overarching themes. We also suggest one mechanism, transcription factor binding site (TFBS) turnover, which relates sequence evolution to epigenetic conservation or divergence. Lastly, we propose a framework for how the field can move forward to build a coherent quantitative model of epigenome evolution.
Affiliation(s)
- Rebecca F Lowdon
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA.
| | - Hyo Sik Jang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Ting Wang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA.
| |
|
18
|
Pinheiro A, Neves F, Lemos de Matos A, Abrantes J, van der Loo W, Mage R, Esteves PJ. An overview of the lagomorph immune system and its genetic diversity. Immunogenetics 2015; 68:83-107. [PMID: 26399242 DOI: 10.1007/s00251-015-0868-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 08/14/2015] [Accepted: 08/31/2015] [Indexed: 01/11/2023]
Abstract
Our knowledge of the lagomorph immune system remains largely based upon studies of the European rabbit (Oryctolagus cuniculus), a major model for studies of immunology. Two important and devastating viral diseases, rabbit hemorrhagic disease and myxomatosis, are affecting European rabbit populations. In this context, we discuss the genetic diversity of the European rabbit immune system and extend to available information about other lagomorphs. Regarding innate immunity, we review the most recent advances in identifying interleukins, chemokines and chemokine receptors, Toll-like receptors, antiviral proteins (RIG-I and Trim5), and the genes encoding fucosyltransferases that are utilized by rabbit hemorrhagic disease virus as a portal for invading host respiratory and gut epithelial cells. Evolutionary studies showed that several genes of innate immunity are evolving by strong natural selection. Studies of the leporid CCR5 gene revealed a very dramatic change unique in mammals at the second extracellular loop of CCR5 resulting from a gene conversion event with the paralogous CCR2. For the adaptive immune system, we review genetic diversity at the loci encoding antibody variable and constant regions, the major histocompatibility complex (RLA) and T cells. Studies of IGHV and IGKC genes expressed in leporids are two of the few examples of trans-species polymorphism observed outside of the major histocompatibility complex. In addition, we review some endogenous viruses of lagomorph genomes, the importance of the European rabbit as a model for human disease studies, and the anticipated role of next-generation sequencing in extending knowledge of lagomorph immune systems and their evolution.
Affiliation(s)
- Ana Pinheiro
- InBIO-Research Network in Biodiversity and Evolutionary Biology, CIBIO, Universidade do Porto, Campus Agrário de Vairão, Rua Padre Armando Quintas, nr. 7, 4485-661, Vairão, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4169-007, Porto, Portugal
- SaBio-IREC (CSIC-UCLM-JCCM), Ronda de Toledo s/n, 13071, Ciudad Real, Spain
| | - Fabiana Neves
- InBIO-Research Network in Biodiversity and Evolutionary Biology, CIBIO, Universidade do Porto, Campus Agrário de Vairão, Rua Padre Armando Quintas, nr. 7, 4485-661, Vairão, Portugal
- UMIB/UP-Unidade Multidisciplinar de Investigação Biomédica, Universidade do Porto, Porto, Portugal
| | - Ana Lemos de Matos
- Department of Molecular Genetics & Microbiology, College of Medicine, University of Florida, Gainesville, FL, 32610, USA
| | - Joana Abrantes
- InBIO-Research Network in Biodiversity and Evolutionary Biology, CIBIO, Universidade do Porto, Campus Agrário de Vairão, Rua Padre Armando Quintas, nr. 7, 4485-661, Vairão, Portugal
| | - Wessel van der Loo
- InBIO-Research Network in Biodiversity and Evolutionary Biology, CIBIO, Universidade do Porto, Campus Agrário de Vairão, Rua Padre Armando Quintas, nr. 7, 4485-661, Vairão, Portugal
| | - Rose Mage
- NIAID, NIH, Bethesda, MD, 20892, USA
| | - Pedro José Esteves
- InBIO-Research Network in Biodiversity and Evolutionary Biology, CIBIO, Universidade do Porto, Campus Agrário de Vairão, Rua Padre Armando Quintas, nr. 7, 4485-661, Vairão, Portugal.
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4169-007, Porto, Portugal.
- CITS-Centro de Investigação em Tecnologias de Saúde, CESPU, Gandra, Portugal.
| |
|
19
|
Lipovich L, Hou ZC, Jia H, Sinkler C, McGowen M, Sterner KN, Weckle A, Sugalski AB, Pipes L, Gatti DL, Mason CE, Sherwood CC, Hof PR, Kuzawa CW, Grossman LI, Goodman M, Wildman DE. High-throughput RNA sequencing reveals structural differences of orthologous brain-expressed genes between western lowland gorillas and humans. J Comp Neurol 2015; 524:288-308. [PMID: 26132897 DOI: 10.1002/cne.23843] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/30/2015] [Revised: 06/20/2015] [Accepted: 06/23/2015] [Indexed: 12/22/2022]
Abstract
The human brain and human cognitive abilities are strikingly different from those of other great apes despite relatively modest genome sequence divergence. However, little is presently known about the interspecies divergence in gene structure and transcription that might contribute to these phenotypic differences. To date, most comparative studies of gene structure in the brain have examined humans, chimpanzees, and macaque monkeys. To add to this body of knowledge, we analyze here the brain transcriptome of the western lowland gorilla (Gorilla gorilla gorilla), an African great ape species that is phylogenetically closely related to humans, but with a brain that is approximately one-third the size. Manual transcriptome curation from a sample of the planum temporale region of the neocortex revealed 12 protein-coding genes and one noncoding-RNA gene with exons in the gorilla unmatched by public transcriptome data from the orthologous human loci. These interspecies gene structure differences accounted for a total of 134 amino acids in proteins found in the gorilla that were absent from protein products of the orthologous human genes. Proteins varying in structure between human and gorilla were involved in immunity and energy metabolism, suggesting their relevance to phenotypic differences. This gorilla neocortical transcriptome comprises an empirical, not homology- or prediction-driven, resource for orthologous gene comparisons between human and gorilla. These findings provide a unique repository of the sequences and structures of thousands of genes transcribed in the gorilla brain, pointing to candidate genes that may contribute to the traits distinguishing humans from other closely related great apes.
Affiliation(s)
- Leonard Lipovich
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201.,Department of Neurology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
| | - Zhuo-Cheng Hou
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201.,Department of Animal Genetics, China Agricultural University, Beijing, China
| | - Hui Jia
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
| | - Christopher Sinkler
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
| | - Michael McGowen
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201.,School of Biological and Chemical Sciences, Queen Mary, University of London, London, United Kingdom
| | - Kirstin N Sterner
- Department of Anthropology, University of Oregon, Eugene, Oregon, 97403
| | - Amy Weckle
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201.,Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, 61801.,Department of Molecular and Integrative Physiology, University of Illinois, Urbana, Illinois, 61801
| | - Amara B Sugalski
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
| | - Lenore Pipes
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, 10021
| | - Domenico L Gatti
- Department of Biochemistry and Molecular Biology, School of Medicine, Wayne State University, Detroit, Michigan, 48201.,Cardiovascular Research Institute, School of Medicine, Wayne State University, Detroit, Michigan, 48201
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, 10021
| | - Chet C Sherwood
- Department of Anthropology and the Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, 20052
| | - Patrick R Hof
- Fishberg Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029.,New York Consortium in Evolutionary Primatology, New York, New York, 10024
| | | | - Lawrence I Grossman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
| | - Morris Goodman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201.,Department of Anatomy and Cell Biology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
| | - Derek E Wildman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201.,Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, 61801.,Department of Molecular and Integrative Physiology, University of Illinois, Urbana, Illinois, 61801
| |
|
20
|
Taher L, Narlikar L, Ovcharenko I. Identification and computational analysis of gene regulatory elements. Cold Spring Harb Protoc 2015; 2015:pdb.top083642. [PMID: 25561628 PMCID: PMC5885252 DOI: 10.1101/pdb.top083642] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 06/04/2023]
Abstract
Over the last two decades, advances in experimental and computational technologies have greatly facilitated genomic research. Next-generation sequencing technologies have made de novo sequencing of large genomes affordable, and powerful computational approaches have enabled accurate annotations of genomic DNA sequences. Charting functional regions in genomes must account for not only the coding sequences, but also noncoding RNAs, repetitive elements, chromatin states, epigenetic modifications, and gene regulatory elements. A mix of comparative genomics, high-throughput biological experiments, and machine learning approaches has played a major role in this truly global effort. Here we describe some of these approaches and provide an account of our current understanding of the complex landscape of the human genome. We also present overviews of different publicly available, large-scale experimental data sets and computational tools, which we hope will prove beneficial for researchers working with large and complex genomes.
Affiliation(s)
- Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, 18051 Rostock, Germany
| | - Leelavati Narlikar
- Chemical Engineering and Process Development Division, National Chemical Laboratory, CSIR, Pune 411008, India
| | - Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
| |
|
21
|
Systematic dissection of coding exons at single nucleotide resolution supports an additional role in cell-specific transcriptional regulation. PLoS Genet 2014; 10:e1004592. [PMID: 25340400 PMCID: PMC4207465 DOI: 10.1371/journal.pgen.1004592] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 01/13/2014] [Accepted: 07/09/2014] [Indexed: 12/22/2022] Open
Abstract
In addition to their protein coding function, exons can also serve as transcriptional enhancers. Mutations in these exonic-enhancers (eExons) could alter both protein function and transcription. However, the functional consequence of eExon mutations is not well known. Here, using massively parallel reporter assays, we dissect the enhancer activity of three liver eExons (SORL1 exon 17, TRAF3IP2 exon 2, PPARG exon 6) at single nucleotide resolution in the mouse liver. We find that both synonymous and non-synonymous mutations have similar effects on enhancer activity and many of the deleterious mutation clusters overlap known liver-associated transcription factor binding sites. Carrying out a similar massively parallel reporter assay in HeLa cells with these three eExons revealed differences in their mutation profiles compared to the liver, suggesting that enhancers could have distinct operating profiles in different tissues. Our results demonstrate that eExon mutations could lead to multiple phenotypes by disrupting both the protein sequence and enhancer activity and that enhancers can have distinct mutation profiles in different cell types.
Exons that code for protein can also have additional functions, such as regulating gene transcription through enhancer activity. Here, we changed every nucleotide in three different exons that also function as enhancers, and examined their enhancer activity to test whether nucleotide changes in these exons can affect both the protein sequence and enhancer function. We found that mutations with a significant effect on enhancer function can reside both in regions that change the protein sequence (non-synonymous) and regions that do not change it (synonymous). When we conducted a similar analysis in a different cell type, we observed a difference in the nucleotide changes that cause a significant effect on enhancer activity, suggesting that the enhancer functional units can differ between tissues.
|
22
|
Abstract
Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.
|
23
|
Algama M, Oldmeadow C, Tasker E, Mengersen K, Keith JM. Drosophila 3' UTRs are more complex than protein-coding sequences. PLoS One 2014; 9:e97336. [PMID: 24824035 PMCID: PMC4019593 DOI: 10.1371/journal.pone.0097336] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/11/2013] [Accepted: 04/18/2014] [Indexed: 01/03/2023] Open
Abstract
The 3′ UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3′ UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3′ UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3′ UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3′ UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.
Affiliation(s)
- Manjula Algama
- School of Mathematical Sciences, Monash University, Clayton, Victoria, Australia
| | - Christopher Oldmeadow
- School of Medicine and Public Health, University of Newcastle, Newcastle, New South Wales, Australia
| | - Edward Tasker
- School of Mathematical Sciences, Monash University, Clayton, Victoria, Australia
| | - Kerrie Mengersen
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Jonathan M. Keith
- School of Mathematical Sciences, Monash University, Clayton, Victoria, Australia
| |
|
24
|
Horvath JE, Ramachandran GL, Fedrigo O, Nielsen WJ, Babbitt CC, St Clair EM, Pfefferle LW, Jernvall J, Wray GA, Wall CE. Genetic comparisons yield insight into the evolution of enamel thickness during human evolution. J Hum Evol 2014; 73:75-87. [PMID: 24810709 DOI: 10.1016/j.jhevol.2014.01.005] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 06/06/2013] [Revised: 10/29/2013] [Accepted: 01/09/2014] [Indexed: 12/29/2022]
Abstract
Enamel thickness varies substantially among extant hominoids and is a key trait with significance for interpreting dietary adaptation, life history trajectory, and phylogenetic relationships. There is a strong link in humans between enamel formation and mutations in the exons of the four genes that code for the enamel matrix proteins and the associated protease. The evolution of thick enamel in humans may have included changes in the regulation of these genes during tooth development. The cis-regulatory region in the 5' flank (upstream non-coding region) of MMP20, which codes for enamelysin, the predominant protease active during enamel secretion, has previously been shown to be under strong positive selection in the lineages leading to both humans and chimpanzees. Here we examine evidence for positive selection in the 5' flank and 3' flank of AMELX, AMBN, ENAM, and MMP20. We contrast the human sequence changes with other hominoids (chimpanzees, gorillas, orangutans, gibbons) and rhesus macaques (outgroup), a sample comprising a range of enamel thickness. We find no evidence for positive selection in the protein-coding regions of any of these genes. In contrast, we find strong evidence for positive selection in the 5' flank region of MMP20 and ENAM along the lineage leading to humans, and in both the 5' flank and 3' flank regions of MMP20 along the lineage leading to chimpanzees. We also identify putative transcription factor binding sites overlapping some of the species-specific nucleotide sites and we refine which sections of the up- and downstream putative regulatory regions are most likely to harbor important changes. These non-coding changes and their potential for differential regulation by transcription factors known to regulate tooth development may offer insight into the mechanisms that allow for rapid evolutionary changes in enamel thickness across closely-related species, and contribute to our understanding of the enamel phenotype in hominoids.
Affiliation(s)
- Julie E Horvath
- North Carolina Museum of Natural Sciences, Nature Research Center, Raleigh, NC 27601, USA; Department of Biology, North Carolina Central University, Durham, NC 27707, USA; Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA; Duke Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
| | | | - Olivier Fedrigo
- Duke Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
| | | | - Courtney C Babbitt
- Duke Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA; Department of Biology, Duke University, Durham, NC 27708, USA
| | | | | | - Jukka Jernvall
- Institute for Biotechnology, University of Helsinki, Helsinki, Finland
| | - Gregory A Wray
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA; Duke Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA; Department of Biology, Duke University, Durham, NC 27708, USA
| | - Christine E Wall
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA.
| |
|
25
|
Hoeppner MP, Lundquist A, Pirun M, Meadows JRS, Zamani N, Johnson J, Sundström G, Cook A, FitzGerald MG, Swofford R, Mauceli E, Moghadam BT, Greka A, Alföldi J, Abouelleil A, Aftuck L, Bessette D, Berlin A, Brown A, Gearin G, Lui A, Macdonald JP, Priest M, Shea T, Turner-Maier J, Zimmer A, Lander ES, di Palma F, Lindblad-Toh K, Grabherr MG. An improved canine genome and a comprehensive catalogue of coding genes and non-coding transcripts. PLoS One 2014; 9:e91172. [PMID: 24625832 PMCID: PMC3953330 DOI: 10.1371/journal.pone.0091172] [Citation(s) in RCA: 163] [Impact Index Per Article: 16.3] [Received: 11/25/2013] [Accepted: 02/08/2014] [Indexed: 12/22/2022] Open
Abstract
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant, particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 MB of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes, ∼7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs), and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.
Affiliation(s)
- Marc P. Hoeppner
- Science for Life Laboratories, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Andrew Lundquist
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Division of Nephrology, Massachusetts General Hospital and Harvard Medical School, Charlestown, Massachusetts, United States of America
| | - Mono Pirun
- Memorial Sloan-Kettering Cancer Center, New York, New York, United States of America
| | - Jennifer R. S. Meadows
- Science for Life Laboratories, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Neda Zamani
- Science for Life Laboratories, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Görel Sundström
- Science for Life Laboratories, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - April Cook
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Michael G. FitzGerald
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Evan Mauceli
- Boston Children's Hospital, Boston, Massachusetts, United States of America
| | | | - Anna Greka
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Jessica Alföldi
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Amr Abouelleil
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Lynne Aftuck
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Daniel Bessette
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Aaron Berlin
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Adam Brown
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Gary Gearin
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Annie Lui
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | | | - Margaret Priest
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Terrance Shea
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Jason Turner-Maier
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Andrew Zimmer
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Eric S. Lander
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Federica di Palma
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Vertebrate and Health Genomics, The Genome Analysis Centre, Norwich, United Kingdom
| | - Kerstin Lindblad-Toh
- Science for Life Laboratories, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- * E-mail: (KL-T); (MGG)
| | - Manfred G. Grabherr
- Science for Life Laboratories, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- * E-mail: (KL-T); (MGG)
| |
|
26
|
Capra JA, Erwin GD, McKinsey G, Rubenstein JLR, Pollard KS. Many human accelerated regions are developmental enhancers. Philos Trans R Soc Lond B Biol Sci 2013; 368:20130025. [PMID: 24218637 PMCID: PMC3826498 DOI: 10.1098/rstb.2013.0025] [Citation(s) in RCA: 137] [Impact Index Per Article: 12.5] [Indexed: 01/04/2023] Open
Abstract
The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology.
Affiliation(s)
- John A Capra
- Gladstone Institutes, University of California, San Francisco, CA 94158, USA
| |
|
27
|
Abstract
When the human genome project started, the major challenge was how to sequence a 3 billion letter code in an organized and cost-effective manner. When completed, the project had laid the foundation for a huge variety of biomedical fields through the production of a complete human genome sequence, but also had driven the development of laboratory and analytical methods that could produce large amounts of sequencing data cheaply. These technological developments made possible the sequencing of many more vertebrate genomes, which have been necessary for the interpretation of the human genome. They have also enabled large-scale studies of vertebrate genome evolution, as well as comparative and human medicine. In this review, we give examples of evolutionary analysis using a wide variety of time frames—from the comparison of populations within a species to the comparison of species separated by at least 300 million years. Furthermore, we anticipate discoveries related to evolutionary mechanisms, adaptation, and disease to quickly accelerate in the coming years.
Affiliation(s)
- Jessica Alföldi
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| |
|
28
|
Smith MA, Gesell T, Stadler PF, Mattick JS. Widespread purifying selection on RNA structure in mammals. Nucleic Acids Res 2013; 41:8220-36. [PMID: 23847102 PMCID: PMC3783177 DOI: 10.1093/nar/gkt596] [Citation(s) in RCA: 130] [Impact Index Per Article: 11.8] [Received: 01/30/2013] [Revised: 05/29/2013] [Accepted: 06/16/2013] [Indexed: 12/14/2022] Open
Abstract
Evolutionarily conserved RNA secondary structures are a robust indicator of purifying selection and, consequently, molecular function. Evaluating their genome-wide occurrence through comparative genomics has consistently been plagued by high false-positive rates and divergent predictions. We present a novel benchmarking pipeline aimed at calibrating the precision of genome-wide scans for consensus RNA structure prediction. The benchmarking data obtained from two refined structure prediction algorithms, RNAz and SISSIz, were then analyzed to fine-tune the parameters of an optimized workflow for genomic sliding window screens. When applied to consistency-based multiple genome alignments of 35 mammals, our approach confidently identifies >4 million evolutionarily constrained RNA structures using a conservative sensitivity threshold that entails historically low false discovery rates for such analyses (5-22%). These predictions comprise 13.6% of the human genome, 88% of which fall outside any known sequence-constrained element, suggesting that a large proportion of the mammalian genome is functional. As an example, our findings identify both known and novel conserved RNA structure motifs in the long noncoding RNA MALAT1. This study provides an extensive set of functional transcriptomic annotations that will assist researchers in uncovering the precise mechanisms underlying the developmental ontologies of higher eukaryotes.
Affiliation(s)
- Martin A. Smith
- RNA Biology and Plasticity Laboratory, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, NSW 2010 Australia, Genomics and Computational Biology Division, Institute for Molecular Bioscience, 306 Carmody Rd, University of Queensland, Brisbane, 4067 Australia, Department of Structural and Computational Biology; and Center for Integrative Bioinformatics Vienna (CIBIV), Max F. Perutz Laboratories (MFPL), University of Vienna, Medical University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria, Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16–18, D-04107 Leipzig, Germany, Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany, Center for Non-coding RNA in Technology and Health, Department of Basic Veterinary and Animal Sciences, Faculty of Life Sciences University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C Denmark, Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA and St Vincent’s Clinical School, University of New South Wales, Level 5, de Lacy, Victoria St, St Vincent's Hospital, Sydney, NSW 2010 Australia
| | - Tanja Gesell
- RNA Biology and Plasticity Laboratory, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, NSW 2010 Australia, Genomics and Computational Biology Division, Institute for Molecular Bioscience, 306 Carmody Rd, University of Queensland, Brisbane, 4067 Australia, Department of Structural and Computational Biology; and Center for Integrative Bioinformatics Vienna (CIBIV), Max F. Perutz Laboratories (MFPL), University of Vienna, Medical University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria, Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16–18, D-04107 Leipzig, Germany, Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany, Center for Non-coding RNA in Technology and Health, Department of Basic Veterinary and Animal Sciences, Faculty of Life Sciences University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C Denmark, Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA and St Vincent’s Clinical School, University of New South Wales, Level 5, de Lacy, Victoria St, St Vincent's Hospital, Sydney, NSW 2010 Australia
| | - Peter F. Stadler
- RNA Biology and Plasticity Laboratory, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, NSW 2010 Australia, Genomics and Computational Biology Division, Institute for Molecular Bioscience, 306 Carmody Rd, University of Queensland, Brisbane, 4067 Australia, Department of Structural and Computational Biology; and Center for Integrative Bioinformatics Vienna (CIBIV), Max F. Perutz Laboratories (MFPL), University of Vienna, Medical University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria, Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16–18, D-04107 Leipzig, Germany, Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany, Center for Non-coding RNA in Technology and Health, Department of Basic Veterinary and Animal Sciences, Faculty of Life Sciences University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C Denmark, Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA and St Vincent’s Clinical School, University of New South Wales, Level 5, de Lacy, Victoria St, St Vincent's Hospital, Sydney, NSW 2010 Australia
| | - John S. Mattick
- RNA Biology and Plasticity Laboratory, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, NSW 2010 Australia, Genomics and Computational Biology Division, Institute for Molecular Bioscience, 306 Carmody Rd, University of Queensland, Brisbane, 4067 Australia, Department of Structural and Computational Biology; and Center for Integrative Bioinformatics Vienna (CIBIV), Max F. Perutz Laboratories (MFPL), University of Vienna, Medical University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria, Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16–18, D-04107 Leipzig, Germany, Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany, Center for Non-coding RNA in Technology and Health, Department of Basic Veterinary and Animal Sciences, Faculty of Life Sciences University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C Denmark, Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA and St Vincent’s Clinical School, University of New South Wales, Level 5, de Lacy, Victoria St, St Vincent's Hospital, Sydney, NSW 2010 Australia
| |
|
29
|
De S, Pedersen BS, Kechris K. The dilemma of choosing the ideal permutation strategy while estimating statistical significance of genome-wide enrichment. Brief Bioinform 2013; 15:919-28. [PMID: 23956260 DOI: 10.1093/bib/bbt053] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022] Open
Abstract
Integrative analyses of genomic, epigenomic and transcriptomic features for human and various model organisms have revealed that many such features are nonrandomly distributed in the genome. Significant enrichment (or depletion) of genomic features is anticipated to be biologically important. Detection of genomic regions having enrichment of certain features and estimation of corresponding statistical significance rely on the expected null distribution generated by a permutation model. We discuss different genome-wide permutation approaches, present examples where the permutation strategy affects the null model and show that the confidence in estimating statistical significance of genome-wide enrichment might depend on the choice of the permutation approach. In those cases, where biologically relevant constraints are unclear, it is preferable to examine whether key conclusions are consistent, irrespective of the choice of the randomization strategy.
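The dependence of significance on the permutation strategy can be illustrated with a minimal sketch. Everything in it is a hypothetical toy setup rather than the authors' method: the coordinates, the function names `count_hits` and `permutation_p_value`, and the two null models (redrawing features anywhere on a 1,000 bp sequence versus only inside an assumed "allowed" interval) are illustrative assumptions.

```python
import random

def count_hits(points, regions):
    """Count points that fall inside any half-open region [start, end)."""
    return sum(any(s <= p < e for s, e in regions) for p in points)

def draw_point(rng, allowed, lengths):
    """Draw one position uniformly from the union of the allowed intervals."""
    s, e = rng.choices(allowed, weights=lengths)[0]
    return rng.randrange(s, e)

def permutation_p_value(points, regions, allowed, n_perm=1000, seed=0):
    """One-sided empirical p-value for enrichment of points in regions.

    Null model: every point is redrawn uniformly from `allowed`.
    """
    rng = random.Random(seed)
    observed = count_hits(points, regions)
    lengths = [e - s for s, e in allowed]
    at_least = 0
    for _ in range(n_perm):
        shuffled = [draw_point(rng, allowed, lengths) for _ in points]
        if count_hits(shuffled, regions) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least + 1) / (n_perm + 1)

# Hypothetical toy data: 8 of 10 features lie in one region of interest.
regions = [(100, 200)]
points = [110, 120, 130, 140, 150, 160, 170, 180, 250, 290]

# Null 1: features may land anywhere on the 1,000 bp sequence.
p_whole = permutation_p_value(points, regions, allowed=[(0, 1000)])
# Null 2: features may only land in an assumed permissible interval.
p_restricted = permutation_p_value(points, regions, allowed=[(100, 300)])

print(p_whole, p_restricted)
```

With the same observed data, the restricted null places far more random features inside the region, so the enrichment looks much less surprising than under the whole-sequence null. Which constraint is appropriate is a biological judgment that the code cannot make.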
|
30
|
Quast C, Cuboni S, Bader D, Altmann A, Weber P, Arloth J, Röh S, Brückl T, Ising M, Kopczak A, Erhardt A, Hausch F, Lucae S, Binder EB. Functional coding variants in SLC6A15, a possible risk gene for major depression. PLoS One 2013; 8:e68645. [PMID: 23874702 PMCID: PMC3712998 DOI: 10.1371/journal.pone.0068645] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/08/2013] [Accepted: 05/30/2013] [Indexed: 11/18/2022] Open
Abstract
SLC6A15 is a neuron-specific neutral amino acid transporter that belongs to the solute carrier 6 gene family. This gene family is responsible for presynaptic re-uptake of the majority of neurotransmitters. Convergent data from human studies, animal models and pharmacological investigations suggest a possible role of SLC6A15 in major depressive disorder. In this work, we explored potential functional variants in this gene that could influence the activity of the amino acid transporter and thus downstream neuronal function and possibly the risk for stress-related psychiatric disorders. DNA from 400 depressed patients and 400 controls was screened for genetic variants using a pooled targeted re-sequencing approach. Results were verified by individual re-genotyping and validated non-synonymous coding variants were tested in an independent sample (N = 1934). Nine variants altering the amino acid sequence were then assessed for their functional effects by measuring SLC6A15 transporter activity in a cellular uptake assay. In total, we identified 405 genetic variants, including twelve non-synonymous variants. While none of the non-synonymous coding variants showed significant differences in case-control associations, two rare non-synonymous variants were associated with a significantly increased maximal (3)H proline uptake as compared to the wildtype sequence. Our data suggest that genetic variants in the SLC6A15 locus change the activity of the amino acid transporter and might thus influence its neuronal function and the risk for stress-related psychiatric disorders. As statistically significant association for rare variants might only be achieved in extremely large samples (N >70,000) functional exploration may shed light on putatively disease-relevant variants.
Affiliation(s)
- Carina Quast
- Max Planck Institute of Psychiatry, Munich, Germany.
| |
|
31
|
Moncaut N, Rigby PWJ, Carvajal JJ. Dial M(RF) for myogenesis. FEBS J 2013; 280:3980-90. [DOI: 10.1111/febs.12379] [Citation(s) in RCA: 80] [Impact Index Per Article: 7.3] [Received: 04/17/2013] [Revised: 05/31/2013] [Accepted: 06/04/2013] [Indexed: 12/21/2022]
Affiliation(s)
- Natalia Moncaut
- Division of Cancer Biology; The Institute of Cancer Research; London; UK
| | - Peter W. J. Rigby
- Division of Cancer Biology; The Institute of Cancer Research; London; UK
| | - Jaime J. Carvajal
- Molecular Embryology Team; Centro Andaluz de Biología del Desarrollo; Seville; Spain
|
32
|
Haudry A, Platts AE, Vello E, Hoen DR, Leclercq M, Williamson RJ, Forczek E, Joly-Lopez Z, Steffen JG, Hazzouri KM, Dewar K, Stinchcombe JR, Schoen DJ, Wang X, Schmutz J, Town CD, Edger PP, Pires JC, Schumaker KS, Jarvis DE, Mandáková T, Lysak MA, van den Bergh E, Schranz ME, Harrison PM, Moses AM, Bureau TE, Wright SI, Blanchette M. An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat Genet 2013; 45:891-8. [PMID: 23817568 DOI: 10.1038/ng.2684] [Citation(s) in RCA: 219] [Impact Index Per Article: 19.9] [Received: 10/13/2012] [Accepted: 06/04/2013] [Indexed: 12/17/2022]
Abstract
Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum) and their joint analysis with six previously sequenced crucifer genomes. Conservation across orthologous bases suggests that at least 17% of the Arabidopsis thaliana genome is under selection, with nearly one-quarter of the sequence under selection lying outside of coding regions. Much of this sequence can be localized to approximately 90,000 conserved noncoding sequences (CNSs) that show evidence of transcriptional and post-transcriptional regulation. Population genomics analyses of two crucifer species, A. thaliana and Capsella grandiflora, confirm that most of the identified CNSs are evolving under medium to strong purifying selection. Overall, these CNSs highlight both similarities and several key differences between the regulatory DNA of plants and other species.
Affiliation(s)
- Annabelle Haudry
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
|
33
|
Diogo R, Peng Z, Wood B. First comparative study of primate morphological and molecular evolutionary rates including muscle data: implications for the tempo and mode of primate and human evolution. J Anat 2013; 222:410-8. [PMID: 23320764 PMCID: PMC3610034 DOI: 10.1111/joa.12024] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Accepted: 12/18/2012] [Indexed: 12/27/2022] Open
Abstract
Here we provide the first report about the rates of muscle evolution derived from Bayesian and parsimony cladistic analyses of primate higher-level phylogeny, and compare these rates with published rates of molecular evolution. It is commonly accepted that there is a 'general molecular slow-down of hominoids', but interestingly the rates of muscle evolution at the nodes leading to and within the hominoid clade are higher than those in the vast majority of other primate clades. The rate of muscle evolution at the node leading to Homo (1.77) is higher than that at the nodes leading to Pan (0.89) and particularly to Gorilla (0.28). Notably, the rates of muscle evolution at the major euarchontan and primate nodes are different, but within each major primate clade (Strepsirrhini, Platyrrhini, Cercopithecidae and Hominoidea) the rates at the various nodes, and particularly at the nodes leading to the higher groups (i.e. those including more than one genus), are strikingly similar. We explore the implications of these new data for the tempo and mode of primate and human evolution.
Affiliation(s)
- Rui Diogo
- Department of Anatomy, Howard University College of Medicine, Washington, DC 20059, USA.
|
34
|
Sanges R, Hadzhiev Y, Gueroult-Bellone M, Roure A, Ferg M, Meola N, Amore G, Basu S, Brown ER, De Simone M, Petrera F, Licastro D, Strähle U, Banfi S, Lemaire P, Birney E, Müller F, Stupka E. Highly conserved elements discovered in vertebrates are present in non-syntenic loci of tunicates, act as enhancers and can be transcribed during development. Nucleic Acids Res 2013; 41:3600-18. [PMID: 23393190 PMCID: PMC3616699 DOI: 10.1093/nar/gkt030] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 09/29/2012] [Revised: 12/21/2012] [Accepted: 01/03/2013] [Indexed: 01/17/2023] Open
Abstract
Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remain elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis-regulatory modules affirmed during chordate evolution, we have searched for conserved elements in tunicate and vertebrate (Olfactores) genomes. We identified, for the first time, 183 non-coding sequences that are highly conserved between the two groups. Our results show that all but one element are conserved in non-syntenic regions between vertebrate and tunicate genomes, while being syntenic among vertebrates. Nevertheless, in all the groups, they are significantly associated with transcription factors showing specific functions fundamental to animal development, such as multicellular organism development and sequence-specific DNA binding. The majority of these regions map onto ultraconserved elements and we demonstrate that they can act as functional enhancers within the organism of origin, as well as in cross-transgenesis experiments, and that they are transcribed in extant species of Olfactores. We refer to the elements as 'Olfactores conserved non-coding elements'.
Affiliation(s)
- Remo Sanges
- Laboratory of Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy, Centre for Rare Diseases and Personalised Medicine, School of Clinical and Experimental Medicine, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK, Institut de Biologie du Développement de Marseille Luminy, UMR 6216 CNRS/Université de la Méditerranée, F-13288 Marseille cedex 9, France, Centre de Recherche de Biochimie Macromoléculaire (CRBM), UMR5237 CNRS/Universités Montpellier 1, 2, 1919 route de Mende, F-34293 Montpellier cedex 5, France, Karlsruhe Institute of Technology (KIT), Institute of Toxicology and Genetics and University of Heidelberg, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany, Telethon Institute of Genetics and Medicine, 80131 Naples, Italy, School of Engineering and Physical Sciences, Heriot Watt University, Edinburgh EH14 4AS, UK, CBM Scrl, AREA Science Park, Basovizza, 34149 Trieste, Italy, Medical Genetics, Department of Biochemistry, Biophysics and General Pathology, Second University of Naples, 80138 Naples, Italy, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK and Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina 58, 20132 Milano, Italy
| | - Yavor Hadzhiev
| | - Marion Gueroult-Bellone
| | - Agnes Roure
| | - Marco Ferg
| | - Nicola Meola
| | - Gabriele Amore
| | - Swaraj Basu
| | - Euan R. Brown
| | - Marco De Simone
| | - Francesca Petrera
| | - Danilo Licastro
| | - Uwe Strähle
| | - Sandro Banfi
| | - Patrick Lemaire
| | - Ewan Birney
| | - Ferenc Müller
| | - Elia Stupka
|
35
|
Su MY, Steiner LA, Bogardus H, Mishra T, Schulz VP, Hardison RC, Gallagher PG. Identification of biologically relevant enhancers in human erythroid cells. J Biol Chem 2013; 288:8433-8444. [PMID: 23341446 DOI: 10.1074/jbc.m112.413260] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 01/15/2023] Open
Abstract
Identification of cell type-specific enhancers is important for understanding the regulation of programs controlling cellular development and differentiation. Enhancers are typically marked by the co-transcriptional activator protein p300 or by groups of cell-expressed transcription factors. We hypothesized that a unique set of enhancers regulates gene expression in human erythroid cells, a highly specialized cell type evolved to provide adequate amounts of oxygen throughout the body. Using chromatin immunoprecipitation followed by massively parallel sequencing, genome-wide maps of candidate enhancers were constructed for p300 and four transcription factors, GATA1, NF-E2, KLF1, and SCL, using primary human erythroid cells. These data were combined with gene expression analyses, and candidate enhancers were identified. Consistent with their predicted function as candidate enhancers, there was statistically significant enrichment of p300 and combinations of co-localizing erythroid transcription factors within 1-50 kb of the transcriptional start site (TSS) of genes highly expressed in erythroid cells. Candidate enhancers were also enriched near genes with known erythroid cell function or phenotype. Candidate enhancers exhibited moderate conservation with mouse and minimal conservation with nonplacental vertebrates. Candidate enhancers were mapped to a set of erythroid-associated, biologically relevant, SNPs from the genome-wide association studies (GWAS) catalogue of NHGRI, National Institutes of Health. Fourteen candidate enhancers, representing 10 genetic loci, mapped to sites associated with biologically relevant erythroid traits. Fragments from these loci directed statistically significant expression in reporter gene assays. Identification of enhancers in human erythroid cells will allow a better understanding of erythroid cell development, differentiation, structure, and function and provide insights into inherited and acquired hematologic disease.
Affiliation(s)
- Mack Y Su
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520
- Laurie A Steiner
- Department of Pediatrics, University of Rochester, Rochester, New York 14642
- Hannah Bogardus
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520
- Tejaswini Mishra
- Department of Biochemistry and Molecular Biology, Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, Pennsylvania 16802
- Vincent P Schulz
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520
- Ross C Hardison
- Department of Biochemistry and Molecular Biology, Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, Pennsylvania 16802
- Patrick G Gallagher
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520; Departments of Pathology and Genetics, Yale University School of Medicine, New Haven, Connecticut 06520.
|
36
|
Blanchette M. Exploiting ancestral mammalian genomes for the prediction of human transcription factor binding sites. BMC Bioinformatics 2012; 13 Suppl 19:S2. [PMID: 23281809 PMCID: PMC3526440 DOI: 10.1186/1471-2105-13-s19-s2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
Background: The computational prediction of Transcription Factor Binding Sites (TFBS) remains a challenge due to their short length and low information content. Comparative genomics approaches that simultaneously consider several related species and favor sites that have been conserved throughout evolution improve the accuracy (specificity) of the predictions but are limited due to a phenomenon called binding site turnover, where sequence evolution causes one TFBS to replace another in the same region. In parallel to this development, an increasing number of mammalian genomes are now sequenced and it is becoming possible to infer, to a surprisingly high degree of accuracy, ancestral mammalian sequences. Results: We propose a TFBS prediction approach that makes use of the availability of inferred ancestral mammalian genomes to improve its accuracy. This method aims to identify binding loci, which are regions of a few hundred base pairs that have preserved their potential to bind a given transcription factor over evolutionary time. After proposing a neutral evolutionary model of predicted TFBS counts in a DNA region of a given length, we use it to identify regions that have preserved the number of predicted TFBS they contain to an unexpected degree given their divergence. The approach is applied to human chromosome 1 and shows significant gains in accuracy as compared to both existing single-species and multi-species TFBS prediction approaches, in particular for transcription factors that are subject to high turnover rates. Availability: The source code and predictions made by the program are available at http://www.cs.mcgill.ca/~blanchem/bindingLoci.
Affiliation(s)
- Mathieu Blanchette
- McGill Centre for Bioinformatics and School of Computer Science, McGill University, H3C 2B4 Québec, Canada.
|
37
|
Samstein RM, Josefowicz SZ, Arvey A, Treuting PM, Rudensky AY. Extrathymic generation of regulatory T cells in placental mammals mitigates maternal-fetal conflict. Cell 2012; 150:29-38. [PMID: 22770213 DOI: 10.1016/j.cell.2012.05.031] [Citation(s) in RCA: 467] [Impact Index Per Article: 38.9] [Received: 04/18/2012] [Revised: 04/26/2012] [Accepted: 05/10/2012] [Indexed: 12/13/2022]
Abstract
Regulatory T (Treg) cells, whose differentiation and function are controlled by X chromosome-encoded transcription factor Foxp3, are generated in the thymus (tTreg) and extrathymically (peripheral, pTreg), and their deficiency results in fatal autoimmunity. Here, we demonstrate that a Foxp3 enhancer, conserved noncoding sequence 1 (CNS1), essential for pTreg but dispensable for tTreg cell generation, is present only in placental mammals. CNS1 is largely composed of mammalian-wide interspersed repeats (MIR) that have undergone retrotransposition during early mammalian radiation. During pregnancy, pTreg cells specific to a model paternal alloantigen were generated in a CNS1-dependent manner and accumulated in the placenta. Furthermore, when mated with allogeneic, but not syngeneic, males, CNS1-deficient females showed increased fetal resorption accompanied by increased immune cell infiltration and defective remodeling of spiral arteries. Our results suggest that, during evolution, a CNS1-dependent mechanism of extrathymic differentiation of Treg cells emerged in placental animals to enforce maternal-fetal tolerance.
Affiliation(s)
- Robert M Samstein
- Howard Hughes Medical Institute and Immunology Program, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
|
38
|
Popadin KY, Nikolaev SI, Junier T, Baranova M, Antonarakis SE. Purifying selection in mammalian mitochondrial protein-coding genes is highly effective and congruent with evolution of nuclear genes. Mol Biol Evol 2012; 30:347-55. [PMID: 22983951 DOI: 10.1093/molbev/mss219] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022]
Abstract
The mammalian mitochondrial genomes differ from the nuclear genomes by maternal inheritance, absence of recombination, and higher mutation rate. All these differences decrease the effective population size of mitochondrial genome and make it more susceptible to accumulation of slightly deleterious mutations. It was hypothesized that mitochondrial genes, especially in species with low effective population size, irreversibly degrade leading to decrease of organismal fitness and even to extinction of species through the mutational meltdown. To interrogate this hypothesis, we compared the purifying selections acting on the representative set of mitochondrial (potentially degrading) and nuclear (potentially not degrading) protein-coding genes in species with different effective population size. For 21 mammalian species, we calculated the ratios of accumulation of slightly deleterious mutations approximated by Kn/Ks separately for mitochondrial and nuclear genomes. The 75% of variation in Kn/Ks is explained by two independent variables: type of a genome (mitochondrial or nuclear) and effective population size of species approximated by generation time. First, we observed that purifying selection is more effective in mitochondria than in the nucleus that implies strong evolutionary constraints of mitochondrial genome. Mitochondrial de novo nonsynonymous mutations have at least 5-fold more harmful effect when compared with nuclear. Second, Kn/Ks of mitochondrial and nuclear genomes is positively correlated with generation time of species, indicating relaxation of purifying selection with decrease of species-specific effective population size. Most importantly, the linear regression lines of mitochondrial and nuclear Kn/Ks's from generation times of species are parallel, indicating congruent relaxation of purifying selection in both genomes. 
Thus, our results reveal that the distribution of selection coefficients of de novo nonsynonymous mitochondrial mutations has a similar shape with the distribution of de novo nonsynonymous nuclear mutations, but its mean is five times smaller. The harmful effect of mitochondrial de novo nonsynonymous mutations triggers highly effective purifying selection, which maintains the fitness of the mammalian mitochondrial genome.
Affiliation(s)
- Konstantin Yu Popadin
- Department of Genetic Medicine and Development, University of Geneva Medical School and iGE3 Institute of Genetics and Genomics of Geneva, Switzerland.
|
39
|
Elsharawy A, Forster M, Schracke N, Keller A, Thomsen I, Petersen BS, Stade B, Stähler P, Schreiber S, Rosenstiel P, Franke A. Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing. BMC Genomics 2012; 13:417. [PMID: 22913592 PMCID: PMC3563481 DOI: 10.1186/1471-2164-13-417] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/10/2011] [Accepted: 08/10/2012] [Indexed: 11/10/2022]
Abstract
Background: Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions). Results: We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent ‘read-backmapping’ to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per Hiseq2000 exome sample and detected ~5% more SNPs than the conventional whole genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach.
Conclusions: We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results.
Affiliation(s)
- Abdou Elsharawy
- Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
|
40
|
Mutational signatures of de-differentiation in functional non-coding regions of melanoma genomes. PLoS Genet 2012; 8:e1002871. [PMID: 22912592 PMCID: PMC3415438 DOI: 10.1371/journal.pgen.1002871] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/16/2011] [Accepted: 06/11/2012] [Indexed: 11/23/2022]
Abstract
Much emphasis has been placed on the identification, functional characterization, and therapeutic potential of somatic variants in tumor genomes. However, the majority of somatic variants lie outside coding regions and their role in cancer progression remains to be determined. In order to establish a system to test the functional importance of non-coding somatic variants in cancer, we created a low-passage cell culture of a metastatic melanoma tumor sample. As a foundation for interpreting functional assays, we performed whole-genome sequencing and analysis of this cell culture, the metastatic tumor from which it was derived, and the patient-matched normal genomes. When comparing somatic mutations identified in the cell culture and tissue genomes, we observe concordance at the majority of single nucleotide variants, whereas copy number changes are more variable. To understand the functional impact of non-coding somatic variation, we leveraged functional data generated by the ENCODE Project Consortium. We analyzed regulatory regions derived from multiple different cell types and found that melanocyte-specific regions are among the most depleted for somatic mutation accumulation. Significant depletion in other cell types suggests the metastatic melanoma cells de-differentiated to a more basal regulatory state. Experimental identification of genome-wide regulatory sites in two different melanoma samples supports this observation. Together, these results show that mutation accumulation in metastatic melanoma is nonrandom across the genome and that a de-differentiated regulatory architecture is common among different samples. Our findings enable identification of the underlying genetic components of melanoma and define the differences between a tissue-derived tumor sample and the cell culture created from it. Such information helps establish a broader mechanistic understanding of the linkage between non-coding genomic variations and the cellular evolution of cancer. 
Here we investigate the relationship between somatic variants and non-coding regulatory regions. To do this, we develop a new algorithm for identifying single nucleotide somatic variants in whole-genome sequencing data and apply it to a metastatic melanoma sample and a cell culture derived from this sample. Our results show that the two genomes are similar at the level of single nucleotide changes and more variable at larger copy number changes. We further observe that patterns of somatic mutation accumulation in non-coding regulatory regions suggests that the metastatic melanoma cells de-differentiated into a more basal regulatory state. That is, by simply looking at mutation accumulation across cell-type-specific non-coding functional regions, one can clearly see patterns that are indicative of cell state de-differentiation. Results from genome-wide functional regulatory region experimental mapping support this observation.
|
41
|
Abstract
Whole-genome alignment (WGA) is the prediction of evolutionary relationships at the nucleotide level between two or more genomes. It combines aspects of both colinear sequence alignment and gene orthology prediction, and is typically more challenging to address than either of these tasks due to the size and complexity of whole genomes. Despite the difficulty of this problem, numerous methods have been developed for its solution because WGAs are valuable for genome-wide analyses, such as phylogenetic inference, genome annotation, and function prediction. In this chapter, we discuss the meaning and significance of WGA and present an overview of the methods that address it. We also examine the problem of evaluating whole-genome aligners and offer a set of methodological challenges that need to be tackled in order to make the most effective use of our rapidly growing databases of whole genomes.
Affiliation(s)
- Colin N Dewey
- Biostatistics and Medical Informatics and Computer Sciences, Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI, USA.
|
42
|
Hemberg M, Gray JM, Cloonan N, Kuersten S, Grimmond S, Greenberg ME, Kreiman G. Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites. Nucleic Acids Res 2012; 40:7858-69. [PMID: 22684627 PMCID: PMC3439890 DOI: 10.1093/nar/gks477] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/23/2023]
Abstract
More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNAse-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10 to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements.
Affiliation(s)
- Martin Hemberg
- Department of Ophthalmology, Children's Hospital Boston, Harvard Medical School, Boston, MA 02215, USA.
|
43
|
Hou ZC, Sterner KN, Romero R, Than NG, Gonzalez JM, Weckle A, Xing J, Benirschke K, Goodman M, Wildman DE. Elephant transcriptome provides insights into the evolution of eutherian placentation. Genome Biol Evol 2012; 4:713-25. [PMID: 22546564 PMCID: PMC3381679 DOI: 10.1093/gbe/evs045] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
The chorioallantoic placenta connects mother and fetus in eutherian pregnancies. In order to understand the evolution of the placenta and provide further understanding of placenta biology, we sequenced the transcriptome of a term placenta of an African elephant (Loxodonta africana) and compared these data with RNA sequence and microarray data from other eutherian placentas including human, mouse, and cow. We characterized the composition of 55,910 expressed sequence tag (i.e., cDNA) contigs using our custom annotation pipeline. A Markov algorithm was used to cluster orthologs of human, mouse, cow, and elephant placenta transcripts. We found 2,963 genes are commonly expressed in the placentas of these eutherian mammals. Gene ontology categories previously suggested to be important for placenta function (e.g., estrogen receptor signaling pathway, cell motion and migration, and adherens junctions) were significantly enriched in these eutherian placenta–expressed genes. Genes duplicated in different lineages and also specifically expressed in the placenta contribute to the great diversity observed in mammalian placenta anatomy. We identified 1,365 human lineage–specific, 1,235 mouse lineage–specific, 436 cow lineage–specific, and 904 elephant-specific placenta-expressed (PE) genes. The most enriched clusters of human-specific PE genes are signal/glycoprotein and immunoglobulin, and humans possess a deeply invasive human hemochorial placenta that comes into direct contact with maternal immune cells. Inference of phylogenetically conserved and derived transcripts demonstrates the power of comparative transcriptomics to trace placenta evolution and variation across mammals and identified candidate genes that may be important in the normal function of the human placenta, and their dysfunction may be related to human pregnancy complications.
Affiliation(s)
- Zhuo-Cheng Hou
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development/NIH/DHHS, Detroit, Michigan, USA
|
44
|
Westesson O, Lunter G, Paten B, Holmes I. Accurate reconstruction of insertion-deletion histories by statistical phylogenetics. PLoS One 2012; 7:e34572. [PMID: 22536326 PMCID: PMC3335033 DOI: 10.1371/journal.pone.0034572] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 01/01/2012] [Accepted: 03/05/2012] [Indexed: 11/24/2022]
Abstract
The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.
Affiliation(s)
- Oscar Westesson
- University of California Berkeley and University of California San Francisco Graduate Program in Bioengineering, University of California, Berkeley, California, United States of America
- Gerton Lunter
- Wellcome Trust Center for Human Genetics, Oxford, Oxford, United Kingdom
- Benedict Paten
- Baskin School of Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Ian Holmes
- University of California Berkeley and University of California San Francisco Graduate Program in Bioengineering, University of California, Berkeley, California, United States of America
|
45
|
Song G, Riemer C, Dickins B, Kim HL, Zhang L, Zhang Y, Hsu CH, Hardison RC, NISC Comparative Sequencing Program, Green ED, Miller W. Revealing mammalian evolutionary relationships by comparative analysis of gene clusters. Genome Biol Evol 2012; 4:586-601. [PMID: 22454131 PMCID: PMC3342878 DOI: 10.1093/gbe/evs032] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Accepted: 03/19/2012] [Indexed: 12/13/2022]
Abstract
Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events. We developed a computational method for automatically mapping both types of orthology on a per-nucleotide basis in gene cluster regions studied by comparative sequencing, and we make this mapping accessible by visualizing the output. All of these steps are incorporated into our newly extended CHAP 2 package. We evaluate our method using both simulated data and real gene clusters (including the well-characterized α-globin and β-globin clusters). We also illustrate use of CHAP 2 by analyzing four more loci: CCL (chemokine ligand), IFN (interferon), CYP2abf (part of cytochrome P450 family 2), and KIR (killer cell immunoglobulin-like receptors). These new methods facilitate and extend our understanding of evolution at these and other loci by adding automated accurate evolutionary inference to the biologist's toolkit. The CHAP 2 package is freely available from http://www.bx.psu.edu/miller_lab.
Affiliation(s)
- Giltae Song
- Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, PA, USA.
|
46
|
Patwardhan RP, Hiatt JB, Witten DM, Kim MJ, Smith RP, May D, Lee C, Andrie JM, Lee SI, Cooper GM, Ahituv N, Pennacchio LA, Shendure J. Massively parallel functional dissection of mammalian enhancers in vivo. Nat Biotechnol 2012; 30:265-70. [PMID: 22371081 PMCID: PMC3402344 DOI: 10.1038/nbt.2136] [Citation(s) in RCA: 372] [Impact Index Per Article: 31.0] [Received: 09/15/2011] [Accepted: 01/23/2012] [Indexed: 01/01/2023]
Abstract
The functional consequences of genetic variation in mammalian regulatory elements are poorly understood. We report the in vivo dissection of three mammalian enhancers at single-nucleotide resolution through a massively parallel reporter assay. For each enhancer, we synthesized a library of >100,000 mutant haplotypes with 2-3% divergence from the wild-type sequence. Each haplotype was linked to a unique sequence tag embedded within a transcriptional cassette. We introduced each enhancer library into mouse liver and measured the relative activities of individual haplotypes en masse by sequencing the transcribed tags. Linear regression analysis yielded highly reproducible estimates of the effect of every possible single-nucleotide change on enhancer activity. The functional consequence of most mutations was modest, with ∼22% affecting activity by >1.2-fold and ∼3% by >2-fold. Several, but not all, positions with higher effects showed evidence for purifying selection, or co-localized with known liver-associated transcription factor binding sites, demonstrating the value of empirical high-resolution functional analysis.
Affiliation(s)
- Rupali P Patwardhan
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
|
47
|
Fontanesi L, Mazzoni G, Bovo S, Frabetti A, Fornasini D, Dall'Olio S, Russo V. Association between a polymorphism in the IGF2 gene and finishing weight in a commercial rabbit population. Anim Genet 2012; 43:651-2. [DOI: 10.1111/j.1365-2052.2012.02318.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Accepted: 11/15/2011] [Indexed: 11/30/2022]
Affiliation(s)
- L. Fontanesi
- Department of Agro-Food Science and Technology; Sezione di Allevamenti Zootecnici; Faculty of Agriculture; University of Bologna; Viale Fanin 48; Bologna; 40127; Italy
- G. Mazzoni
- Department of Agro-Food Science and Technology; Sezione di Allevamenti Zootecnici; Faculty of Agriculture; University of Bologna; Viale Fanin 48; Bologna; 40127; Italy
- S. Bovo
- Department of Agro-Food Science and Technology; Sezione di Allevamenti Zootecnici; Faculty of Agriculture; University of Bologna; Viale Fanin 48; Bologna; 40127; Italy
- A. Frabetti
- Gruppo Martini S.p.A.; Centro Genetica Conigli (Rabbit Genetic Center); Longiano; 47020; FC; Italy
- D. Fornasini
- Gruppo Martini S.p.A.; Centro Genetica Conigli (Rabbit Genetic Center); Longiano; 47020; FC; Italy
- S. Dall'Olio
- Department of Agro-Food Science and Technology; Sezione di Allevamenti Zootecnici; Faculty of Agriculture; University of Bologna; Viale Fanin 48; Bologna; 40127; Italy
- V. Russo
- Department of Agro-Food Science and Technology; Sezione di Allevamenti Zootecnici; Faculty of Agriculture; University of Bologna; Viale Fanin 48; Bologna; 40127; Italy
|
48
|
Devillers H, Chiapello H, Schbath S, Karoui ME. Robustness assessment of whole bacterial genome segmentations. J Comput Biol 2012; 18:1155-65. [PMID: 21899422 DOI: 10.1089/cmb.2011.0115] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
Comparison of closely related bacterial genomes has revealed the presence of highly conserved sequences forming a "backbone" that is interrupted by numerous, less conserved, DNA fragments. Segmentation of bacterial genomes into backbone and variable regions is particularly useful to investigate, among other things, bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these software tools and to compare the results obtained with them. To fill this gap, we have developed two local scores to measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The two scores described in this article provide useful information and are easy to implement, and their interpretation is intuitive. We show that they are suited to discriminate between robust and non-robust segmentations when genome aligners such as MAUVE and MGA are used.
Affiliation(s)
- Hugo Devillers
- Mathématique, Informatique et Génome, INRA, UR1077, Jouy-en-Josas, France.
|
49
|
Use of comparative genomics approaches to characterize interspecies differences in response to environmental chemicals: challenges, opportunities, and research needs. Toxicol Appl Pharmacol 2011; 271:372-85. [PMID: 22142766 DOI: 10.1016/j.taap.2011.11.011] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 06/08/2011] [Revised: 11/11/2011] [Accepted: 11/16/2011] [Indexed: 01/12/2023]
Abstract
A critical challenge for environmental chemical risk assessment is the characterization and reduction of uncertainties introduced when extrapolating inferences from one species to another. The purpose of this article is to explore the challenges, opportunities, and research needs surrounding the issue of how genomics data and computational and systems level approaches can be applied to inform differences in response to environmental chemical exposure across species. We propose that the data, tools, and evolutionary framework of comparative genomics be adapted to inform interspecies differences in chemical mechanisms of action. We compare and contrast existing approaches, from disciplines as varied as evolutionary biology, systems biology, mathematics, and computer science, that can be used, modified, and combined in new ways to discover and characterize interspecies differences in chemical mechanism of action which, in turn, can be explored for application to risk assessment. We consider how genetic, protein, pathway, and network information can be interrogated from an evolutionary biology perspective to effectively characterize variations in biological processes of toxicological relevance among organisms. We conclude that comparative genomics approaches show promise for characterizing interspecies differences in mechanisms of action, and further, for improving our understanding of the uncertainties inherent in extrapolating inferences across species in both ecological and human health risk assessment. To achieve long-term relevance and consistent use in environmental chemical risk assessment, improved bioinformatics tools, computational methods robust to data gaps, and quantitative approaches for conducting extrapolations across species are critically needed. Specific areas ripe for research to address these needs are recommended.
|
50
|
Mertes F, Elsharawy A, Sauer S, van Helvoort JMLM, van der Zaag PJ, Franke A, Nilsson M, Lehrach H, Brookes AJ. Targeted enrichment of genomic DNA regions for next-generation sequencing. Brief Funct Genomics 2011; 10:374-86. [PMID: 22121152 PMCID: PMC3245553 DOI: 10.1093/bfgp/elr033] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Indexed: 01/03/2023]
Abstract
In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings.
Affiliation(s)
- Florian Mertes
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
|