1
|
Barts N, Bhatt RH, Toner C, Meyer WK, Durrant JD, Kohl KD. Functional convergence in gastric lysozymes of foregut-fermenting rodents, ruminants, and primates is not attributed to convergent molecular evolution. Comp Biochem Physiol B Biochem Mol Biol 2024; 271:110949. [PMID: 38341948 DOI: 10.1016/j.cbpb.2024.110949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2023] [Revised: 01/28/2024] [Accepted: 01/28/2024] [Indexed: 02/13/2024]
Abstract
Convergent evolution is a widespread phenomenon. While there are many examples of convergent evolution at the phenotypic scale, convergence at the molecular level has been more difficult to identify. A classic example of convergent evolution across scales is that of the digestive lysozyme found in ruminants and Colobine monkeys. These herbivorous species rely on foregut fermentation, which has evolved to function more optimally under acidic conditions. Here, we explored if rodents with similar dietary strategies and digestive morphologies have convergently evolved a lysozyme with digestive functions. At the phenotypic level, we find that rodents with bilocular stomach morphologies exhibited a lysozyme that maintained higher relative activities at low pH values, similar to the lysozymes of ruminants and Colobine monkeys. Additionally, the lysozyme of Peromyscus leucopus shared a similar predicted protonation state as that observed in previously identified digestive lysozymes. However, we found limited evidence of positive selection acting on the lysozyme gene in foregut-fermenting species and did not identify patterns of convergent molecular evolution in this gene. This study emphasizes that phenotypic convergence need not be the result of convergent genetic modifications, and we encourage further exploration into the mechanisms regulating convergence across biological scales.
Collapse
Affiliation(s)
- Nick Barts
- Department of Biological and Clinical Sciences, University of Central Missouri, Warrensburg, MO, USA; Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
| | - Roshni H Bhatt
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA. https://twitter.com/RoshniBhatt3
| | - Chelsea Toner
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA. https://twitter.com/sorrywm
| | - Jacob D Durrant
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Kevin D Kohl
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA. https://twitter.com/KevinDKohl
| |
Collapse
|
2
|
Wirthlin ME, Schmid TA, Elie JE, Zhang X, Kowalczyk A, Redlich R, Shvareva VA, Rakuljic A, Ji MB, Bhat NS, Kaplow IM, Schäffer DE, Lawler AJ, Wang AZ, Phan BN, Annaldasula S, Brown AR, Lu T, Lim BK, Azim E, Clark NL, Meyer WK, Pond SLK, Chikina M, Yartsev MM, Pfenning AR, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Vocal learning-associated convergent evolution in mammalian proteins and regulatory elements. Science 2024; 383:eabn3263. [PMID: 38422184 DOI: 10.1126/science.abn3263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2021] [Accepted: 02/20/2024] [Indexed: 03/02/2024]
Abstract
Vocal production learning ("vocal learning") is a convergently evolved trait in vertebrates. To identify brain genomic elements associated with mammalian vocal learning, we integrated genomic, anatomical, and neurophysiological data from the Egyptian fruit bat (Rousettus aegyptiacus) with analyses of the genomes of 215 placental mammals. First, we identified a set of proteins evolving more slowly in vocal learners. Then, we discovered a vocal motor cortical region in the Egyptian fruit bat, an emergent vocal learner, and leveraged that knowledge to identify active cis-regulatory elements in the motor cortex of vocal learners. Machine learning methods applied to motor cortex open chromatin revealed 50 enhancers robustly associated with vocal learning whose activity tended to be lower in vocal learners. Our research implicates convergent losses of motor cortex regulatory elements in mammalian vocal learning evolution.
Collapse
Affiliation(s)
- Morgan E Wirthlin
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Tobias A Schmid
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Julie E Elie
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94708, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Xiaomeng Zhang
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Amanda Kowalczyk
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Ruby Redlich
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Varvara A Shvareva
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Ashley Rakuljic
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Maria B Ji
- Department of Psychology, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Ninad S Bhat
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Irene M Kaplow
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
| | - Daniel E Schäffer
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Alyssa J Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Andrew Z Wang
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - BaDoi N Phan
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Siddharth Annaldasula
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Ashley R Brown
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Tianyu Lu
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Byung Kook Lim
- Neurobiology section, Division of Biological Science, University of California, San Diego, La Jolla, CA 92093, USA
| | - Eiman Azim
- Molecular Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Nathan L Clark
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | | | - Maria Chikina
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
| | - Michael M Yartsev
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94708, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94708, USA
| | - Andreas R Pfenning
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
3
|
Redlich R, Kowalczyk A, Tene M, Sestili HH, Foley K, Saputra E, Clark N, Chikina M, Meyer WK, Pfenning A. RERconverge Expansion: Using Relative Evolutionary Rates to Study Complex Categorical Trait Evolution. bioRxiv 2023:2023.12.06.570425. [PMID: 38106136 PMCID: PMC10723433 DOI: 10.1101/2023.12.06.570425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/19/2023]
Abstract
Comparative genomics approaches seek to associate evolutionary genetic changes with the evolution of phenotypes across a phylogeny. Many of these methods, including our evolutionary rates based method, RERconverge, lack the capability of analyzing non-ordinal, multicategorical traits. To address this limitation, we introduce an expansion to RERconverge that associates shifts in evolutionary rates with the convergent evolution of multi-categorical traits. The categorical RERconverge expansion includes methods for performing categorical ancestral state reconstruction, statistical tests for associating relative evolutionary rates with categorical variables, and a new method for performing phylogenetic permulations on multi-categorical traits. In addition to demonstrating our new method on a three-category diet phenotype, we compare its performance to naive pairwise binary RERconverge analyses and two existing methods for comparative genomic analyses of categorical traits: phylogenetic simulations and a phylogenetic signal based method. We also present a diagnostic analysis of the new permulations approach demonstrating how the method scales with the number of species and the number of categories included in the analysis. Our results show that our new categorical method outperforms phylogenetic simulations at identifying genes and enriched pathways significantly associated with the diet phenotype and that the new ancestral reconstruction drives an improvement in our ability to capture diet-related enriched pathways. Our categorical permulations were able to account for non-uniform null distributions and correct for non-independence in gene rank during pathway enrichment analysis. The categorical expansion to RERconverge will provide a strong foundation for applying the comparative method to categorical traits on larger data sets with more species and more complex trait evolution.
Collapse
|
4
|
Graham AM, Jamison JM, Bustos M, Cournoyer C, Michaels A, Presnell JS, Richter R, Crocker DE, Fustukjian A, Hunter ME, Rea LD, Marsillach J, Furlong CE, Meyer WK, Clark NL. Reduction of Paraoxonase Expression Followed by Inactivation across Independent Semiaquatic Mammals Suggests Stepwise Path to Pseudogenization. Mol Biol Evol 2023; 40:msad104. [PMID: 37146172 PMCID: PMC10202596 DOI: 10.1093/molbev/msad104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Revised: 03/27/2023] [Accepted: 04/17/2023] [Indexed: 05/07/2023] Open
Abstract
Convergent adaptation to the same environment by multiple lineages frequently involves rapid evolutionary change at the same genes, implicating these genes as important for environmental adaptation. Such adaptive molecular changes may yield either change or loss of protein function; loss of function can eliminate newly deleterious proteins or reduce energy necessary for protein production. We previously found a striking case of recurrent pseudogenization of the Paraoxonase 1 (Pon1) gene among aquatic mammal lineages-Pon1 became a pseudogene with genetic lesions, such as stop codons and frameshifts, at least four times independently in aquatic and semiaquatic mammals. Here, we assess the landscape and pace of pseudogenization by studying Pon1 sequences, expression levels, and enzymatic activity across four aquatic and semiaquatic mammal lineages: pinnipeds, cetaceans, otters, and beavers. We observe in beavers and pinnipeds an unexpected reduction in expression of Pon3, a paralog with similar expression patterns but different substrate preferences. Ultimately, in all lineages with aquatic/semiaquatic members, we find that preceding any coding-level pseudogenization events in Pon1, there is a drastic decrease in expression, followed by relaxed selection, thus allowing accumulation of disrupting mutations. The recurrent loss of Pon1 function in aquatic/semiaquatic lineages is consistent with a benefit to Pon1 functional loss in aquatic environments. Accordingly, we examine diving and dietary traits across pinniped species as potential driving forces of Pon1 functional loss. We find that loss is best associated with diving activity and likely results from changes in selective pressures associated with hypoxia and hypoxia-induced inflammation.
Collapse
Affiliation(s)
- Allie M Graham
- Department of Human Genetics, University of Utah, Salt Lake City, UT
| | - Jerrica M Jamison
- Department of Biological Sciences, University of Toronto—Scarborough, Scarborough, Ontario, Canada
| | - Marisol Bustos
- Department of Biomedical Engineering, University of Texas—San Antonio, San Antonio, TX
| | | | - Alexa Michaels
- Graduate School of Biomedical Sciences, Tufts University, Boston, MA
- The Jackson Laboratory, Bar Harbor, ME
| | - Jason S Presnell
- Department of Human Genetics, University of Utah, Salt Lake City, UT
| | - Rebecca Richter
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA
| | - Daniel E Crocker
- Department of Biology, Sonoma State University, Rohnert Park, CA
| | | | - Margaret E Hunter
- U.S. Geological Survey, Wetland and Aquatic Research Center, Gainesville, FL
| | - Lorrie D Rea
- Water and Environmental Research Center, Institute of Northern Engineering, University of Alaska—Fairbanks, Fairbanks, AK
| | - Judit Marsillach
- Department of Environmental & Occupational Health Sciences, University of Washington School of Public Health, Seattle, WA
| | - Clement E Furlong
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA
- Department of Genome Sciences, University of Washington, Seattle, WA
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA
| | - Nathan L Clark
- Department of Human Genetics, University of Utah, Salt Lake City, UT
| |
Collapse
|
5
|
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023. [PMID: 37104599 DOI: 0.1126/science.abn3943] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Collapse
Affiliation(s)
- Matthew J Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Irene M Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | | | - Michael X Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Graham M Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Patrick F Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Allyson G Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Joel C Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ana M Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Daniel B Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
| | | | - Kathleen C Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | - Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Abigail L Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Jennifer R S Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Lucas R Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Ruby W Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
| | - Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ashley R Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
| | - Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Sergey V Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Alyssa J Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Voichita D Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Kathleen M Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Nicole S Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - BaDoi N Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Steven K Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Daniel E Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Aryn P Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Morgan E Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - James R Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Bruce W Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | | | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
| | - Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
| | - Pardis C Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Mark S Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
| | - Emma C Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | | | - Harris A Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
| | - Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Katherine S Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - David A Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
| | - Oliver A Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
| | - Andreas R Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
6
|
Kaplow IM, Lawler AJ, Schäffer DE, Srinivasan C, Sestili HH, Wirthlin ME, Phan BN, Prasad K, Brown AR, Zhang X, Foley K, Genereux DP, Karlsson EK, Lindblad-Toh K, Meyer WK, Pfenning AR, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Relating enhancer genetic variation across mammals to complex phenotypes using machine learning. Science 2023; 380:eabm7993. [PMID: 37104615 DOI: 10.1126/science.abm7993] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/29/2023]
Abstract
Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.
Collapse
Affiliation(s)
- Irene M Kaplow
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Alyssa J Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Daniel E Schäffer
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Chaitanya Srinivasan
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Heather H Sestili
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Morgan E Wirthlin
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - BaDoi N Phan
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Kavya Prasad
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ashley R Brown
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Xiaomeng Zhang
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Kathleen Foley
- Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA
| | - Diane P Genereux
- Broad Institute, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Elinor K Karlsson
- Broad Institute, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Kerstin Lindblad-Toh
- Broad Institute, Cambridge, MA, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA
| | - Andreas R Pfenning
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
7
|
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023; 380:eabn3943. [PMID: 37104599 PMCID: PMC10250106 DOI: 10.1126/science.abn3943] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Accepted: 12/16/2022] [Indexed: 04/29/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Collapse
Affiliation(s)
- Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Irene M. Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | | | - Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Allyson G. Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Joel C. Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ana M. Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Nicole M. Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Daniel B. Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
| | | | - Kathleen C. Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | - Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Abigail L. Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Lucas R. Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Ruby W. Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
| | - Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ashley R. Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
| | - Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Sergey V. Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Alyssa J. Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Kathleen M. Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Nicole S. Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Daniel E. Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A. Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Aryn P. Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Morgan E. Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - James R. Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | | | - Bruce W. Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | | | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian’s National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
| | - Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Wynn K. Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
| | - Pardis C. Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Mark S. Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
| | - Emma C. Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | | | - Harris A. Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
| | - William J. Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
| | - Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Katherine S. Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - David A. Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
| | - Oliver A. Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
| | - Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K. Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| |
Collapse
|
8
|
Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales AE, Ahmed AW, Kontopoulos DG, Hilgers L, Lindblad-Toh K, Karlsson EK, Hiller M, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Integrating gene annotation with orthology inference at scale. Science 2023; 380:eabn3107. [PMID: 37104600 DOI: 10.1126/science.abn3107] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/29/2023]
Abstract
Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.
Collapse
Affiliation(s)
- Bogdan M Kirilenko
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Chetan Munegowda
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Ekaterina Osipova
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - David Jebb
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
| | - Virag Sharma
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
| | - Moritz Blumer
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
| | - Ariadna E Morales
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Alexis-Walid Ahmed
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Dimitrios-Georgios Kontopoulos
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Leon Hilgers
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
9
|
Wilder AP, Supple MA, Subramanian A, Mudide A, Swofford R, Serres-Armero A, Steiner C, Koepfli KP, Genereux DP, Karlsson EK, Lindblad-Toh K, Marques-Bonet T, Munoz Fuentes V, Foley K, Meyer WK, Ryder OA, Shapiro B, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. The contribution of historical processes to contemporary extinction risk in placental mammals. Science 2023; 380:eabn5856. [PMID: 37104572 PMCID: PMC10184782 DOI: 10.1126/science.abn5856] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/29/2023]
Abstract
Species persistence can be influenced by the amount, type, and distribution of diversity across the genome, suggesting a potential relationship between historical demography and resilience. In this study, we surveyed genetic variation across single genomes of 240 mammals that compose the Zoonomia alignment to evaluate how historical effective population size (Ne) affects heterozygosity and deleterious genetic load and how these factors may contribute to extinction risk. We find that species with smaller historical Ne carry a proportionally larger burden of deleterious alleles owing to long-term accumulation and fixation of genetic load and have a higher risk of extinction. This suggests that historical demography can inform contemporary resilience. Models that included genomic data were predictive of species' conservation status, suggesting that, in the absence of adequate census or ecological data, genomic information may provide an initial risk assessment.
Collapse
Affiliation(s)
- Aryn P Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A Supple
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA 95064, USA
| | | | | | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Aitor Serres-Armero
- Institute of Evolutionary Biology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona 08003, Spain
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Klaus-Peter Koepfli
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC 30008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
| | | | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala 751 32, Sweden
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Catalan Institution of Research and Advanced Studies, Barcelona 08010, Spain
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona 08028, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
| | - Violeta Munoz Fuentes
- European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Kathleen Foley
- College of Law, University of Iowa, Iowa City, IA 52242, USA
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Oliver A Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, Division of Biology, University of California, San Diego, La Jolla, CA 92039, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA 95064, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
10
|
Andrews G, Fan K, Pratt HE, Phalke N, Karlsson EK, Lindblad-Toh K, Gazal S, Moore JE, Weng Z, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Mammalian evolution of human cis-regulatory elements and transcription factor binding sites. Science 2023; 380:eabn7930. [PMID: 37104580 DOI: 10.1126/science.abn7930] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/29/2023]
Abstract
Understanding the regulatory landscape of the human genome is a long-standing objective of modern biology. Using the reference-free alignment across 241 mammalian genomes produced by the Zoonomia Consortium, we charted evolutionary trajectories for 0.92 million human candidate cis-regulatory elements (cCREs) and 15.6 million human transcription factor binding sites (TFBSs). We identified 439,461 cCREs and 2,024,062 TFBSs under evolutionary constraint. Genes near constrained elements perform fundamental cellular processes, whereas genes near primate-specific elements are involved in environmental interaction, including odor perception and immune response. About 20% of TFBSs are transposable element-derived and exhibit intricate patterns of gains and losses during primate evolution whereas sequence variants associated with complex traits are enriched in constrained TFBSs. Our annotations illuminate the regulatory functions of the human genome.
Collapse
Affiliation(s)
- Gregory Andrews
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Henry E Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Nishigandha Phalke
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Elinor K Karlsson
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132 Uppsala, Sweden
| | - Steven Gazal
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
11
|
Foley NM, Mason VC, Harris AJ, Bredemeyer KR, Damas J, Lewin HA, Eizirik E, Gatesy J, Karlsson EK, Lindblad-Toh K, Springer MS, Murphy WJ, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. A genomic timescale for placental mammal evolution. Science 2023; 380:eabl8189. [PMID: 37104581 DOI: 10.1126/science.abl8189] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/29/2023]
Abstract
The precise pattern and timing of speciation events that gave rise to all living placental mammals remain controversial. We provide a comprehensive phylogenetic analysis of genetic variation across an alignment of 241 placental mammal genome assemblies, addressing prior concerns regarding limited genomic sampling across species. We compared neutral genome-wide phylogenomic signals using concatenation and coalescent-based approaches, interrogated phylogenetic variation across chromosomes, and analyzed extensive catalogs of structural variants. Interordinal relationships exhibit relatively low rates of phylogenomic conflict across diverse datasets and analytical methods. Conversely, X-chromosome versus autosome conflicts characterize multiple independent clades that radiated during the Cenozoic. Genomic time trees reveal an accumulation of cladogenic events before and immediately after the Cretaceous-Paleogene (K-Pg) boundary, implying important roles for Cretaceous continental vicariance and the K-Pg extinction in the placental radiation.
Collapse
Affiliation(s)
- Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
| | - Victor C Mason
- Institute of Cell Biology, University of Bern, Bern, Switzerland
| | - Andrew J Harris
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | - Kevin R Bredemeyer
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | - Joana Damas
- The Genome Center, University of California, Davis, CA, USA
| | - Harris A Lewin
- The Genome Center, University of California, Davis, CA, USA
- Department of Evolution and Ecology, University of California, Davis, CA, USA
| | - Eduardo Eizirik
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, USA
| | - Elinor K Karlsson
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Molecular Medicine, University of Massachussetts Chan Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Mark S Springer
- Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, USA
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
12
|
Xue JR, Mackay-Smith A, Mouri K, Garcia MF, Dong MX, Akers JF, Noble M, Li X, Lindblad-Toh K, Karlsson EK, Noonan JP, Capellini TD, Brennand KJ, Tewhey R, Sabeti PC, Reilly SK, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. The functional and evolutionary impacts of human-specific deletions in conserved elements. Science 2023; 380:eabn2253. [PMID: 37104592 DOI: 10.1126/science.abn2253] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/29/2023]
Abstract
Conserved genomic sequences disrupted in humans may underlie uniquely human phenotypic traits. We identified and characterized 10,032 human-specific conserved deletions (hCONDELs). These short (average 2.56 base pairs) deletions are enriched for human brain functions across genetic, epigenomic, and transcriptomic datasets. Using massively parallel reporter assays in six cell types, we discovered 800 hCONDELs conferring significant differences in regulatory activity, half of which enhance rather than disrupt regulatory function. We highlight several hCONDELs with putative human-specific effects on brain development, including HDAC5, CPEB4, and PPP2CA. Reverting an hCONDEL to the ancestral sequence alters the expression of LOXL2 and developmental genes involved in myelination and synaptic function. Our data provide a rich resource to investigate the evolutionary mechanisms driving new traits in humans and other species.
Collapse
Affiliation(s)
- James R Xue
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for System Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Ava Mackay-Smith
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | | | | | - Michael X Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Jared F Akers
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | - Mark Noble
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | - Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA, USA
| | - James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
| | - Terence D Capellini
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Kristen J Brennand
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Department of Psychiatry, Yale University, New Haven, CT, USA
| | - Ryan Tewhey
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Graduate School of Biomedical Sciences Tufts University School of Medicine, Boston, MA, USA
| | - Pardis C Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for System Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Steven K Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
13
|
Martí-Carreras J, Gener AR, Miller SD, Brito AF, Camacho CE, Connor R, Deboutte W, Glickman C, Kristensen DM, Meyer WK, Modha S, Norris AL, Saha S, Belford AK, Biederstedt E, Brister JR, Buchmann JP, Cooley NP, Edwards RA, Javkar K, Muchow M, Muralidharan HS, Pepe-Ranney C, Shah N, Shakya M, Tisza MJ, Tully BJ, Vanmechelen B, Virta VC, Weissman JL, Zalunin V, Efremov A, Busby B. NCBI's Virus Discovery Codeathon: Building "FIVE" -The Federated Index of Viral Experiments API Index. Viruses 2020; 12:v12121424. [PMID: 33322070 PMCID: PMC7764237 DOI: 10.3390/v12121424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2020] [Accepted: 12/02/2020] [Indexed: 02/05/2023] Open
Abstract
Viruses represent important test cases for data federation due to their genome size and the rapid increase in sequence data in publicly available databases. However, some consequences of previously decentralized (unfederated) data are lack of consensus or comparisons between feature annotations. Unifying or displaying alternative annotations should be a priority both for communities with robust entry representation and for nascent communities with burgeoning data sources. To this end, during this three-day continuation of the Virus Hunting Toolkit codeathon series (VHT-2), a new integrated and federated viral index was elaborated. This Federated Index of Viral Experiments (FIVE) integrates pre-existing and novel functional and taxonomy annotations and virus–host pairings. Variability in the context of viral genomic diversity is often overlooked in virus databases. As a proof-of-concept, FIVE was the first attempt to include viral genome variation for HIV, the most well-studied human pathogen, through viral genome diversity graphs. As per the publication of this manuscript, FIVE is the first implementation of a virus-specific federated index of such scope. FIVE is coded in BigQuery for optimal access of large quantities of data and is publicly accessible. Many projects of database or index federation fail to provide easier alternatives to access or query information. To this end, a Python API query system was developed to enhance the accessibility of FIVE.
Collapse
Affiliation(s)
- Joan Martí-Carreras
- Laboratory of Clinical and Epidemiological Virology, KU Leuven Department of Microbiology, Immunology and Transplantation, Rega Institute, BE3000 Leuven, Belgium; (W.D.); (C.G.); (B.V.)
- Correspondence: (J.M.-C); (A.R.G.); (R.C.); (B.B.)
| | - Alejandro Rafael Gener
- Integrative Molecular and Biomedical Sciences Program, Baylor College of Medicine, Houston, TX 77030, USA
- Margaret M. and Albert B. Alkek Department of Medicine, Nephrology, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Genetics, MD Anderson Cancer Center, Houston, TX 77030, USA
- School of Medicine, Universidad Central del Caribe, Bayamón, PR 00960, USA
- Correspondence: (J.M.-C); (A.R.G.); (R.C.); (B.B.)
| | - Sierra D. Miller
- Genetics & Molecular Biology, Millersville University, 40 Dilworth Rd, Millersville, PA 17551, USA;
| | - Anderson F. Brito
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health (YSPH), 60 College Street, New Haven, CT 06510, USA;
| | - Christiam E. Camacho
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20894, USA; (C.E.C.); (J.R.B.); (V.Z.); (A.E.)
| | - Ryan Connor
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20894, USA; (C.E.C.); (J.R.B.); (V.Z.); (A.E.)
- Correspondence: (J.M.-C); (A.R.G.); (R.C.); (B.B.)
| | - Ward Deboutte
- Laboratory of Clinical and Epidemiological Virology, KU Leuven Department of Microbiology, Immunology and Transplantation, Rega Institute, BE3000 Leuven, Belgium; (W.D.); (C.G.); (B.V.)
| | - Cody Glickman
- Laboratory of Clinical and Epidemiological Virology, KU Leuven Department of Microbiology, Immunology and Transplantation, Rega Institute, BE3000 Leuven, Belgium; (W.D.); (C.G.); (B.V.)
| | - David M. Kristensen
- Computational Bioscience Program, University of Colorado Anschutz, Aurora, CO 80045, USA;
| | - Wynn K. Meyer
- AAAS Science and Technology Policy Fellow, Office of Data Science Strategy, Division of Program Coordination, Planning, and Strategic Initiatives, Office of the Director, National Institutes of Health, 31 Center Dr., Bethesda, MD 20894, USA;
| | - Sejal Modha
- MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK;
| | - Alexis L. Norris
- Biotechnology Graduate Program, University of Maryland Global Campus, 1616 McCormick Drive, Largo, MD 20774, USA;
| | - Surya Saha
- Boyce Thompson Institute, Ithaca, NY 14850, USA;
- School of Animal and Comparative Biomedical Sciences, The University of Arizona, Tucson, AZ 85721, USA
| | - Anna K. Belford
- Laboratory of Cellular Oncology, National Cancer Institute, 37 Convent Dr., Bethesda, MD 20894, USA; (A.K.B.); (M.J.T.)
| | - Evan Biederstedt
- Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA;
| | - James Rodney Brister
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20894, USA; (C.E.C.); (J.R.B.); (V.Z.); (A.E.)
| | - Jan P. Buchmann
- School of Life and Environmental Sciences and School of Medical Sciences, Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, Australia;
| | - Nicholas P. Cooley
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15260, USA;
| | - Robert A. Edwards
- College of Science and Engineering, Flinders University, Bedford Park, SA 5042, Australia;
| | - Kiran Javkar
- Department of Computer Science, University of Maryland, College Park, MD 20740, USA; (K.J.); (H.S.M.); (N.S.)
- Joint Institute for Food Safety and Applied Nutrition, University of Maryland, College Park, MD 20740, USA
| | - Michael Muchow
- Novel Microdevices, Nucleic Acids, Baltimore, MD 21202, USA;
| | - Harihara Subrahmaniam Muralidharan
- Department of Computer Science, University of Maryland, College Park, MD 20740, USA; (K.J.); (H.S.M.); (N.S.)
- Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20740, USA
| | | | - Nidhi Shah
- Department of Computer Science, University of Maryland, College Park, MD 20740, USA; (K.J.); (H.S.M.); (N.S.)
| | - Migun Shakya
- Bioscience Division, Bikini Atoll Road, Los Alamos National Laboratory, Los Alamos, NM 87545, USA;
| | - Michael J. Tisza
- Laboratory of Cellular Oncology, National Cancer Institute, 37 Convent Dr., Bethesda, MD 20894, USA; (A.K.B.); (M.J.T.)
| | - Benjamin J. Tully
- Center for Dark Energy Biosphere Investigations, University of Southern California, Los Angeles, CA 90089, USA;
| | - Bert Vanmechelen
- Laboratory of Clinical and Epidemiological Virology, KU Leuven Department of Microbiology, Immunology and Transplantation, Rega Institute, BE3000 Leuven, Belgium; (W.D.); (C.G.); (B.V.)
| | - Valerie C. Virta
- AAAS Science & Technology Policy Fellow, National Institutes of Health, Center for Information Technology, 6555 Rock Spring Drive, Bethesda, MD 20817, USA;
| | - JL Weissman
- Department of Marine and Environmental Biology, University of Southern California, Los Angeles, CA 90089, USA;
| | - Vadim Zalunin
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20894, USA; (C.E.C.); (J.R.B.); (V.Z.); (A.E.)
| | - Alexandre Efremov
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20894, USA; (C.E.C.); (J.R.B.); (V.Z.); (A.E.)
| | - Ben Busby
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20894, USA; (C.E.C.); (J.R.B.); (V.Z.); (A.E.)
- DNANexus, 1975 W El Camino Real #204, Mountain View, CA 94040, USA
- Correspondence: (J.M.-C); (A.R.G.); (R.C.); (B.B.)
| |
Collapse
|
14
|
Kowalczyk A, Meyer WK, Partha R, Mao W, Clark NL, Chikina M. RERconverge: an R package for associating evolutionary rates with convergent traits. Bioinformatics 2020; 35:4815-4817. [PMID: 31192356 DOI: 10.1093/bioinformatics/btz468] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2018] [Revised: 04/08/2019] [Accepted: 06/06/2019] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION When different lineages of organisms independently adapt to similar environments, selection often acts repeatedly upon the same genes, leading to signatures of convergent evolutionary rate shifts at these genes. With the increasing availability of genome sequences for organisms displaying a variety of convergent traits, the ability to identify genes with such convergent rate signatures would enable new insights into the molecular basis of these traits. RESULTS Here we present the R package RERconverge, which tests for association between relative evolutionary rates of genes and the evolution of traits across a phylogeny. RERconverge can perform associations with binary and continuous traits, and it contains tools for visualization and enrichment analyses of association results. AVAILABILITY AND IMPLEMENTATION RERconverge source code, documentation and a detailed usage walk-through are freely available at https://github.com/nclark-lab/RERconverge. Datasets for mammals, Drosophila and yeast are available at https://bit.ly/2J2QBnj. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Amanda Kowalczyk
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA.,Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
| | - Wynn K Meyer
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Raghavendran Partha
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA.,Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
| | - Weiguang Mao
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA.,Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
| | - Nathan L Clark
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA.,Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
| | - Maria Chikina
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA.,Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
| |
Collapse
|
15
|
Meyer WK, Jamison J, Richter R, Woods SE, Partha R, Kowalczyk A, Kronk C, Chikina M, Bonde RK, Crocker DE, Gaspard J, Lanyon JM, Marsillach J, Furlong CE, Clark NL. Ancient convergent losses of Paraoxonase 1 yield potential risks for modern marine mammals. Science 2018; 361:591-594. [PMID: 30093596 DOI: 10.1126/science.aap7714] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2017] [Accepted: 06/29/2018] [Indexed: 01/17/2023]
Abstract
Mammals diversified by colonizing drastically different environments, with each transition yielding numerous molecular changes, including losses of protein function. Though not initially deleterious, these losses could subsequently carry deleterious pleiotropic consequences. We have used phylogenetic methods to identify convergent functional losses across independent marine mammal lineages. In one extreme case, Paraoxonase 1 (PON1) accrued lesions in all marine lineages, while remaining intact in all terrestrial mammals. These lesions coincide with PON1 enzymatic activity loss in marine species' blood plasma. This convergent loss is likely explained by parallel shifts in marine ancestors' lipid metabolism and/or bloodstream oxidative environment affecting PON1's role in fatty acid oxidation. PON1 loss also eliminates marine mammals' main defense against neurotoxicity from specific man-made organophosphorus compounds, implying potential risks in modern environments.
Collapse
Affiliation(s)
- Wynn K Meyer
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Jerrica Jamison
- Dietrich School of Arts and Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Rebecca Richter
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Stacy E Woods
- Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
| | - Raghavendran Partha
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Amanda Kowalczyk
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Charles Kronk
- Dietrich School of Arts and Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Maria Chikina
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Robert K Bonde
- Wetland and Aquatic Research Center, U.S. Geological Survey, Gainesville, FL, USA
| | - Daniel E Crocker
- Department of Biology, Sonoma State University, Rohnert Park, CA, USA
| | | | - Janet M Lanyon
- School of Biological Sciences, The University of Queensland, St. Lucia, QLD 4072, Australia
| | - Judit Marsillach
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Clement E Furlong
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA.,Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Nathan L Clark
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA. .,Pittsburgh Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA, USA
| |
Collapse
|
16
|
Meyer WK, Venkat A, Kermany AR, van de Geijn B, Zhang S, Przeworski M. Evolutionary history inferred from the de novo assembly of a nonmodel organism, the blue-eyed black lemur. Mol Ecol 2015. [PMID: 26198179 PMCID: PMC4557055 DOI: 10.1111/mec.13327] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Abstract
Lemurs, the living primates most distantly related to humans, demonstrate incredible diversity in behaviour, life history patterns and adaptive traits. Although many lemur species are endangered within their native Madagascar, there is no high-quality genome assembly from this taxon, limiting population and conservation genetic studies. One critically endangered lemur is the blue-eyed black lemur Eulemur flavifrons. This species is fixed for blue irises, a convergent trait that evolved at least four times in primates and was subject to positive selection in humans, where 5′ regulatory variation of OCA2 explains most of the brown/blue eye colour differences. We built a de novo genome assembly for E. flavifrons, providing the most complete lemur genome to date, and a high confidence consensus sequence for close sister species E. macaco, the (brown-eyed) black lemur. From diversity and divergence patterns across the genomes, we estimated a recent split time of the two species (160 Kya) and temporal fluctuations in effective population sizes that accord with known environmental changes. By looking for regions of unusually low diversity, we identified potential signals of directional selection in E. flavifrons at MITF, a melanocyte development gene that regulates OCA2 and has previously been associated with variation in human iris colour, as well as at several other genes involved in melanin biosynthesis in mammals. Our study thus illustrates how whole-genome sequencing of a few individuals can illuminate the demographic and selection history of nonmodel species.
Collapse
Affiliation(s)
- Wynn K Meyer
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
| | - Aarti Venkat
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
| | - Amir R Kermany
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA.,Howard Hughes Medical Institute, University of Chicago, Chicago, IL, 60637, USA
| | - Bryce van de Geijn
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, 60637, USA
| | - Sidi Zhang
- Biological Sciences Collegiate Division, University of Chicago, Chicago, IL, 60637, USA
| | - Molly Przeworski
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA.,Howard Hughes Medical Institute, University of Chicago, Chicago, IL, 60637, USA.,Department of Ecology and Evolution, University of Chicago, Chicago, IL, 60637, USA
| |
Collapse
|
17
|
Meyer WK, Zhang S, Hayakawa S, Imai H, Przeworski M. The convergent evolution of blue iris pigmentation in primates took distinct molecular paths. Am J Phys Anthropol 2013; 151:398-407. [PMID: 23640739 PMCID: PMC3746105 DOI: 10.1002/ajpa.22280] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/03/2012] [Accepted: 03/24/2013] [Indexed: 12/18/2022]
Abstract
How many distinct molecular paths lead to the same phenotype? One approach to this question has been to examine the genetic basis of convergent traits, which likely evolved repeatedly under a shared selective pressure. We investigated the convergent phenotype of blue iris pigmentation, which has arisen independently in four primate lineages: humans, blue-eyed black lemurs, Japanese macaques, and spider monkeys. Characterizing the phenotype across these species, we found that the variation within the blue-eyed subsets of each species occupies strongly overlapping regions of CIE L*a*b* color space. Yet whereas Japanese macaques and humans display continuous variation, the phenotypes of blue-eyed black lemurs and their sister species (whose irises are brown) occupy more clustered subspaces. Variation in an enhancer of OCA2 is primarily responsible for the phenotypic difference between humans with blue and brown irises. In the orthologous region, we found no variant that distinguishes the two lemur species or associates with quantitative phenotypic variation in Japanese macaques. Given the high similarity between the blue iris phenotypes in these species and that in humans, this finding implies that evolution has used different molecular paths to reach the same end. Am J Phys Anthropol 151:398–407, 2013.© 2013 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Wynn K Meyer
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
| | | | | | | | | |
Collapse
|
18
|
Leffler EM, Bullaughey K, Matute DR, Meyer WK, Ségurel L, Venkat A, Andolfatto P, Przeworski M. Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol 2012; 10:e1001388. [PMID: 22984349 PMCID: PMC3439417 DOI: 10.1371/journal.pbio.1001388] [Citation(s) in RCA: 317] [Impact Index Per Article: 26.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
Understanding why some species have more genetic diversity than others is central to the study of ecology and evolution, and carries potentially important implications for conservation biology. Yet not only does this question remain unresolved, it has largely fallen into disregard. With the rapid decrease in sequencing costs, we argue that it is time to revive it.
Collapse
Affiliation(s)
- Ellen M. Leffler
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (EML); (MP)
| | - Kevin Bullaughey
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
| | - Daniel R. Matute
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Wynn K. Meyer
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Laure Ségurel
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois, United States of America
| | - Aarti Venkat
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Peter Andolfatto
- Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
| | - Molly Przeworski
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (EML); (MP)
| |
Collapse
|
19
|
Herchenröder O, Hahne JC, Meyer WK, Thiesen HJ, Schneider J. Repression of the human immunodeficiency virus type 1 promoter by the human KRAB domain results in inhibition of virus production. Biochim Biophys Acta 1999; 1445:216-23. [PMID: 10320774 DOI: 10.1016/s0167-4781(99)00046-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
The Krüppel-associated box (KRAB) domain has been described as a eukaryotic repressor of transcription. We show that fusion of KRAB to DNA-binding-domains provides a novel approach to inhibit expression of a replication-competent human immunodeficiency virus (HIV) genome. The KRAB domain from the human zinc finger protein KOX1 was combined with the DNA binding domain of the Escherichia coli tetracycline repressor (TetR). Constitutive expression of the TetR-KRAB protein in HeLa cells inhibited virus production from an HIV genome encoding TetR target sequences by 80%. The same inhibition was observed with HIV-promoter-driven reporter plasmids. The specificity of inhibition was shown with informative KRAB mutants, plasmids lacking the respective target sequences, and by reversal of the TetR-KRAB-mediated inhibition with tetracycline. Virus production was suppressed by binding of TetR-KRAB at a distance of 6 kbp to the promoter. We therefore conclude that any site of the genuine HIV genome could serve as target of a chimeric KRAB repressor protein. Specific targeting of the KRAB domain by artificially selected binding domains may be generally applicable to control transcription in mammalian cells.
Collapse
Affiliation(s)
- O Herchenröder
- Abteilung Virologie, Institut für Medizinische Mikrobiologie und Hygiene, Universität Freiburg, Hermann-Herder-Str. 11, P.O. Box 820, 79008, Freiburg, Germany
| | | | | | | | | |
Collapse
|
20
|
Meyer WK, Reichenbach P, Schindler U, Soldaini E, Nabholz M. Interaction of STAT5 dimers on two low affinity binding sites mediates interleukin 2 (IL-2) stimulation of IL-2 receptor alpha gene transcription. J Biol Chem 1997; 272:31821-8. [PMID: 9395528 DOI: 10.1074/jbc.272.50.31821] [Citation(s) in RCA: 68] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open
Abstract
Stimulation of the interleukin 2 receptor alpha (IL-2Ralpha) gene by IL-2 is important for the proliferation of antigen-activated T lymphocytes. IL-2 regulates IL-2Ralpha transcription via a conserved 51-nucleotide IL-2 responsive enhancer. Mouse enhancer function depends on cooperative activity of three distinct sites. Two of these are weak binding sites for IL-2-activated STAT5 (signal transducer and activator of transcription) proteins, and mutational analysis indicates that binding of STAT5 to both sites is required for IL-2 responsiveness of the enhancer. The STAT5 dimers interact to form a STAT5 tetramer. The efficiency of tetramerization depends on the relative rotational orientation of the two STAT motifs on the DNA helix. STAT5 tetramerization on enhancer mutants correlates well with the IL-2 responsiveness of these mutants. This provides strong evidence that interactions between STAT dimers binding to a pair of weak binding sites play a biological role by controlling the activity of a well characterized, complex cytokine-responsive enhancer.
Collapse
Affiliation(s)
- W K Meyer
- Lymphocyte Biology Unit, Swiss Institute for Experimental Cancer Research (ISREC), 155 Chemin des Boveresses, CH-1066 Epalinges, Switzerland
| | | | | | | | | |
Collapse
|
21
|
Vissing H, Meyer WK, Aagaard L, Tommerup N, Thiesen HJ. Repression of transcriptional activity by heterologous KRAB domains present in zinc finger proteins. FEBS Lett 1995; 369:153-7. [PMID: 7649249 DOI: 10.1016/0014-5793(95)00728-r] [Citation(s) in RCA: 126] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
We report the characterization of three novel members of the KRAB-domain containing C2-H2 zinc finger family (ZNF133, 136 and 140). KRAB (Krüppel-associated box) is an evolutionarily conserved protein domain found N-terminally with respect to the zinc finger repeats that encodes the DNA binding domain. ZNF133 and ZNF140 have both the KRAB A- and KRAB B-boxes present at their N-terminus, whereas ZNF136 contains only the KRAB A-box. We have previously demonstrated that the KRAB domains derived from ZNF133 and ZNF140 are potent transcriptional repression domains [Margolin et al. (1994) Proc. Natl. Acad. Sci. USA 91, 4509-4513]. The KRAB domain from ZNF136, containing only subdomain A, is a considerable weaker suppression domain; however, when fused to the heterologous KRAB B subdomain of ZNF10 (KOX1) the two subdomains from a KRAB domain which induces repression as potently as previously reported KRAB domains.
Collapse
Affiliation(s)
- H Vissing
- Basel Institute for Immunology, Switzerland
| | | | | | | | | |
Collapse
|
22
|
Abstract
A tetracycline-controlled transrepressor protein has been engineered to silence transcriptional activities of eukaryotic promoters that are stably integrated into the chromatin of human cells. By fusing the KRAB domain of human Kox1 to the Tet repressor derived from Tn10 of Escherichia coli, a tetracycline-controlled hybrid protein (TetR-KRAB) was generated and constitutively expressed in HeLa cells. The TetR-KRAB protein binds to tet operator (tetO) sequences in the absence but not in the presence of tetracycline. When TetR-KRAB bound to tetO sequences upstream of the immediate-early promoter-enhancer of human cytomegalovirus (CMV), the expression of a CMV-driven luciferase reporter construct (ptetO7-CMV-L) was repressed in transient transfection experiments. This silencing was found to operate on different promoters and from tetO sequences placed more than 3 kb from the transcriptional start site. We constructed a stable, doubly transfected cell line (TIS-10) carrying a chromosomally integrated ptetO7-CMV-L reporter construct and expressing the TetR-KRAB protein. Upon addition of tetracycline, luciferase expression was induced more than 50-fold above the baseline level, with half-maximal induction by 2 days. Furthermore, a protein of around 110 kDa was found to coimmunoprecipitate with the TetR-KRAB fusion protein. This protein might play a role as an adaptor protein mediating the silencing exerted by the TetR-KRAB protein. The TetR-KRAB silencing system should be useful as a genetic switch for regulating the expression of chromosomally integrated heterologous and endogenous genes present in mammalian genomes.
Collapse
Affiliation(s)
- U Deuschle
- Basel Institute for Immunology, Switzerland
| | | | | |
Collapse
|
23
|
Margolin JF, Friedman JR, Meyer WK, Vissing H, Thiesen HJ, Rauscher FJ. Krüppel-associated boxes are potent transcriptional repression domains. Proc Natl Acad Sci U S A 1994; 91:4509-13. [PMID: 8183939 PMCID: PMC43815 DOI: 10.1073/pnas.91.10.4509] [Citation(s) in RCA: 485] [Impact Index Per Article: 16.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open
Abstract
The Krüppel-associated box (KRAB) is a highly conserved, 75-aa region containing two predicted amphipathic alpha-helices. The KRAB domain is present in the amino-terminal regions of more than one-third of all Krüppel-class Cys2His2 zinc finger proteins and is conserved from yeast to man; however, its function is unknown. Here it is shown that the KRAB domain functions as a DNA binding-dependent transcriptional repressor when fused to a heterologous DNA-binding domain from the yeast GAL4 protein. A 45-aa segment containing one of the predicted KRAB amphipathic helices was necessary and sufficient for repression. Amino acid substitutions in the predicted helix abolished the repression function. These results assign a function, transcriptional repression, to the highly conserved KRAB box and define a minimal repression domain which may aid in identifying mechanisms of repression.
Collapse
|