3
Meadows JRS, Kidd JM, Wang GD, Parker HG, Schall PZ, Bianchi M, Christmas MJ, Bougiouri K, Buckley RM, Hitte C, Nguyen AK, Wang C, Jagannathan V, Niskanen JE, Frantz LAF, Arumilli M, Hundi S, Lindblad-Toh K, Ginja C, Agustina KK, André C, Boyko AR, Davis BW, Drögemüller M, Feng XY, Gkagkavouzis K, Iliopoulos G, Harris AC, Hytönen MK, Kalthoff DC, Liu YH, Lymberakis P, Poulakakis N, Pires AE, Racimo F, Ramos-Almodovar F, Savolainen P, Venetsani S, Tammen I, Triantafyllidis A, vonHoldt B, Wayne RK, Larson G, Nicholas FW, Lohi H, Leeb T, Zhang YP, Ostrander EA. Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture. Genome Biol 2023; 24:187. [PMID: 37582787 PMCID: PMC10426128 DOI: 10.1186/s13059-023-03023-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 11/23/2022] [Accepted: 07/25/2023] [Indexed: 08/17/2023]
Abstract
BACKGROUND: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20× data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function.
RESULTS: We report the analysis of >48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection.
CONCLUSIONS: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.
Affiliation(s)
- Jennifer R S Meadows
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48107, USA
- Guo-Dong Wang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Heidi G Parker
  - National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
- Peter Z Schall
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48107, USA
- Matteo Bianchi
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
- Matthew J Christmas
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
- Katia Bougiouri
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen, Denmark
- Reuben M Buckley
  - National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
- Christophe Hitte
  - University of Rennes, CNRS, Institute Genetics and Development Rennes - UMR6290, 35000, Rennes, France
- Anthony K Nguyen
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48107, USA
- Chao Wang
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
- Vidhya Jagannathan
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3001, Bern, Switzerland
- Julia E Niskanen
  - Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
- Laurent A F Frantz
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London E14NS, UK and Palaeogenomics Group, Department of Veterinary Sciences, Ludwig Maximilian University, D-80539, Munich, Germany
- Meharji Arumilli
  - Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
- Sruthi Hundi
  - Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
- Kerstin Lindblad-Toh
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Catarina Ginja
  - BIOPOLIS-CIBIO-InBIO-Centro de Investigação Em Biodiversidade E Recursos Genéticos - ArchGen Group, Universidade Do Porto, 4485-661, Vairão, Portugal
- Catherine André
  - University of Rennes, CNRS, Institute Genetics and Development Rennes - UMR6290, 35000, Rennes, France
- Adam R Boyko
  - Department of Biomedical Sciences, Cornell University, 930 Campus Road, Ithaca, NY, 14853, USA
- Brian W Davis
  - Department of Veterinary Integrative Biosciences, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX, 77843, USA
- Michaela Drögemüller
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3001, Bern, Switzerland
- Xin-Yao Feng
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Konstantinos Gkagkavouzis
  - Department of Genetics, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Macedonia 54124, Greece and Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI-AUTH), Balkan Center, Thessaloniki, Greece
- Giorgos Iliopoulos
  - NGO "Callisto", Wildlife and Nature Conservation Society, 54621, Thessaloniki, Greece
- Alexander C Harris
  - National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
- Marjo K Hytönen
  - Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
- Daniela C Kalthoff
  - NGO "Callisto", Wildlife and Nature Conservation Society, 54621, Thessaloniki, Greece
- Yan-Hu Liu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Petros Lymberakis
  - Natural History Museum of Crete & Department of Biology, University of Crete, 71202, Irakleio, Greece
  - Biology Department, School of Sciences and Engineering, University of Crete, Heraklion, Greece
  - Palaeogenomics and Evolutionary Genetics Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
- Nikolaos Poulakakis
  - Natural History Museum of Crete & Department of Biology, University of Crete, 71202, Irakleio, Greece
  - Biology Department, School of Sciences and Engineering, University of Crete, Heraklion, Greece
  - Palaeogenomics and Evolutionary Genetics Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
- Ana Elisabete Pires
  - BIOPOLIS-CIBIO-InBIO-Centro de Investigação Em Biodiversidade E Recursos Genéticos - ArchGen Group, Universidade Do Porto, 4485-661, Vairão, Portugal
- Fernando Racimo
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen, Denmark
- Peter Savolainen
  - Department of Gene Technology, Science for Life Laboratory, KTH - Royal Institute of Technology, 17121, Solna, Sweden
- Semina Venetsani
  - Department of Genetics, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Macedonia, Greece
- Imke Tammen
  - Sydney School of Veterinary Science, The University of Sydney, Sydney, NSW, 2570, Australia
- Alexandros Triantafyllidis
  - Department of Genetics, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Macedonia 54124, Greece and Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI-AUTH), Balkan Center, Thessaloniki, Greece
- Bridgett vonHoldt
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, 08544, USA
- Robert K Wayne
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, 90095-7246, USA
- Greger Larson
  - Palaeogenomics and Bio-Archaeology Research Network, School of Archaeology, University of Oxford, Oxford, OX1 3TG, UK
- Frank W Nicholas
  - Sydney School of Veterinary Science, The University of Sydney, Sydney, NSW, 2570, Australia
- Hannes Lohi
  - Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
- Tosso Leeb
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3001, Bern, Switzerland
- Ya-Ping Zhang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Elaine A Ostrander
  - National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
5
Batcher K, Varney S, York D, Blacksmith M, Kidd JM, Rebhun R, Dickinson P, Bannasch D. Recent, full-length gene retrocopies are common in canids. Genome Res 2022; 32:1602-1611. [PMID: 35961775 PMCID: PMC9435743 DOI: 10.1101/gr.276828.122] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/08/2022] [Accepted: 07/19/2022] [Indexed: 02/03/2023]
Abstract
Gene retrocopies arise from the reverse transcription and insertion into the genome of processed mRNA transcripts. Although many retrocopies have acquired mutations that render them functionally inactive, most mammals retain active LINE-1 sequences capable of producing new retrocopies. New retrocopies, referred to as retro copy number variants (retroCNVs), may not be identified by standard variant calling techniques in high-throughput sequencing data. Although multiple functional FGF4 retroCNVs have been associated with skeletal dysplasias in dogs, the full landscape of canid retroCNVs has not been characterized. Here, retroCNV discovery was performed on a whole-genome sequencing data set of 293 canids from 76 breeds. We identified retroCNV parent genes via the presence of mRNA-specific 30-mers, and then identified retroCNV insertion sites through discordant read analysis. In total, we resolved insertion sites for 1911 retroCNVs from 1179 parent genes, 1236 of which appeared identical to their parent genes. Dogs had on average 54.1 total retroCNVs and 1.4 private retroCNVs. We found evidence of expression in testes for 12% (14/113) of the retroCNVs identified in six Golden Retrievers, including four chimeric transcripts, and 97 retroCNVs also had significantly elevated FST across dog breeds, possibly indicating selection. We applied our approach to a subset of human genomes and detected an average of 4.2 retroCNVs per sample, highlighting a 13-fold relative increase of retroCNV frequency in dogs. Particularly in canids, retroCNVs are a largely unexplored source of genetic variation that can contribute to genome plasticity and that should be considered when investigating traits and diseases.
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
- Scarlett Varney
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
- Daniel York
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Matthew Blacksmith
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Robert Rebhun
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Peter Dickinson
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
6
Chen J, Zhong J, He X, Li X, Ni P, Safner T, Šprem N, Han J. The de novo assembly of a European wild boar genome revealed unique patterns of chromosomal structural variations and segmental duplications. Anim Genet 2022; 53:281-292. [PMID: 35238061 PMCID: PMC9314987 DOI: 10.1111/age.13181] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/22/2021] [Revised: 02/12/2022] [Accepted: 02/12/2022] [Indexed: 02/05/2023]
Abstract
The rapid progress of sequencing technology has greatly facilitated the de novo genome assembly of pig breeds. However, the assembly of the wild boar genome is still lacking, hampering our understanding of chromosomal and genomic evolution during domestication from wild boars into domestic pigs. Here, we sequenced and de novo assembled a European wild boar genome (ASM2165605v1) using the long‐range information provided by 10× Linked‐Reads sequencing. We achieved a high‐quality assembly with contig N50 of 26.09 Mb. Additionally, 1.64% of the contigs (222) with lengths from 107.65 kb to 75.36 Mb covered 90.3% of the total genome size of ASM2165605v1 (~2.5 Gb). Mapping analysis revealed that the contigs can fill 24.73% (93/376) of the gaps present in the orthologous regions of the updated pig reference genome (Sscrofa11.1). We further improved the contigs to chromosome level with a reference‐assisted scaffolding method. Using the ‘assembly‐to‐assembly’ approach, we identified intra‐chromosomal large structural variations (SVs, length >1 kb) between the ASM2165605v1 and Sscrofa11.1 assemblies. Interestingly, we found that the number of SV events on the X chromosome deviated significantly from the linear models fitting autosomes (R² > 0.64, p < 0.001). Specifically, deletions and insertions were deficient on the X chromosome by 66.14 and 58.41% respectively, whereas duplications and inversions were excessive on the X chromosome by 71.96 and 107.61% respectively. We further used large segmental duplication (SD, >1 kb) events as a proxy to understand the large‐scale inter‐chromosomal evolution, by resolving parental‐derived relationships for SD pairs. We revealed a significant excess of SD movements from the X chromosome to autosomes (p < 0.001), consistent with the expectation of meiotic sex chromosome inactivation. Enrichment analyses indicated that the genes within derived SD copies on autosomes were significantly related to biological processes involving nervous system, lipid biosynthesis and sperm motility (p < 0.01). Together, our analyses of the de novo assembly of ASM2165605v1 provide insight into the SVs between European wild boar and domestic pig, in addition to the ongoing process of meiotic sex chromosome inactivation in driving inter‐chromosomal interaction between the sex chromosome and autosomes.
Affiliation(s)
- Jianhai Chen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Jie Zhong
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Xuefei He
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Xiaoyu Li
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Pan Ni
  - Animal Husbandry and Veterinary Institute of Keqiao District, Shaoxing, Zhejiang, China
- Toni Safner
  - Faculty of Agriculture, University of Zagreb, Zagreb, Croatia
  - Centre of Excellence for Biodiversity and Molecular Plant Breeding (CoE CroP-BioDiv), Zagreb, Croatia
- Nikica Šprem
  - Faculty of Agriculture, University of Zagreb, Zagreb, Croatia
- Jianlin Han
  - International Livestock Research Institute, Nairobi, Kenya
  - CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China