1
Murphy WJ, Harris AJ. Toward telomere-to-telomere cat genomes for precision medicine and conservation biology. Genome Res 2024; 34:655-664. [PMID: 38849156 PMCID: PMC11216403 DOI: 10.1101/gr.278546.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2024]
Abstract
Genomic data from species of the cat family Felidae promise to stimulate veterinary and human medical advances, and clarify the coherence of genome organization. We describe how interspecies hybrids have been instrumental in the genetic analysis of cats, from the first genetic maps to propelling cat genomes toward the T2T standard set by the human genome project. Genotype-to-phenotype mapping in cat models has revealed dozens of health-related genetic variants, the molecular basis for mammalian pigmentation and patterning, and species-specific adaptations. Improved genomic surveillance of natural and captive populations across the cat family tree will increase our understanding of the genetic architecture of traits, population dynamics, and guide a future of genome-enabled biodiversity conservation.
Affiliation(s)
- William J Murphy
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77843-4458, USA
  - Department of Biology, Texas A&M University, College Station, Texas 77843-4458, USA
  - Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, Texas 77843-4458, USA
- Andrew J Harris
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77843-4458, USA
  - Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, Texas 77843-4458, USA
2
Kalleberg J, Rissman J, Schnabel RD. Overcoming Limitations to Deep Learning in Domesticated Animals with TrioTrain. bioRxiv 2024:2024.04.15.589602. [PMID: 38659907 PMCID: PMC11042298 DOI: 10.1101/2024.04.15.589602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Variant calling across diverse species remains challenging as most bioinformatics tools default to assumptions based on human genomes. DeepVariant (DV) excels without joint genotyping while offering fewer implementation barriers. However, the growing appeal of a "universal" algorithm has magnified the unknown impacts when used with non-human genomes. Here, we use bovine genomes to assess the limits of human-genome-trained models in other species. We introduce the first multi-species DV model that achieves a lower Mendelian Inheritance Error (MIE) rate during single-sample genotyping. Our novel approach, TrioTrain, automates extending DV for species without Genome In A Bottle (GIAB) resources and uses region shuffling to mitigate barriers for SLURM-based clusters. To offset imperfect truth labels for animal genomes, we remove Mendelian discordant variants before training, where models are tuned to genotype the offspring correctly. With TrioTrain, we use cattle, yak, and bison trios to build 30 model iterations across five phases. We observe remarkable performance across phases when testing the GIAB human trios with a mean SNP F1 score >0.990. In HG002, our phase 4 bovine model identifies more variants at a lower MIE rate than DeepTrio. In bovine F1-hybrid genomes, our model substantially reduces inheritance errors with a mean MIE rate of 0.03 percent. Although constrained by imperfect labels, we find that multi-species, trio-based training produces a robust variant calling model. Our research demonstrates that exclusively training with human genomes restricts the application of deep-learning approaches for comparative genomics.
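The Mendelian Inheritance Error (MIE) rate mentioned in the abstract above can be illustrated with a minimal sketch. It assumes diploid genotypes given as pairs of allele indices and counts a site as an error when the offspring genotype cannot be formed from one maternal and one paternal allele. The function names and the toy data are hypothetical, and the paper's own pipeline may define and filter sites differently.

```python
from typing import Iterable, Tuple

# A diploid genotype as a pair of allele indices, e.g. (0, 1) for a heterozygote.
Genotype = Tuple[int, int]


def is_mendelian_consistent(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """Return True if the child could inherit one allele from each parent."""
    a, b = child
    return (a in mother and b in father) or (a in father and b in mother)


def mie_rate(trios: Iterable[Tuple[Genotype, Genotype, Genotype]]) -> float:
    """Fraction of sites whose (child, mother, father) genotypes violate Mendelian inheritance."""
    total = 0
    errors = 0
    for child, mother, father in trios:
        total += 1
        if not is_mendelian_consistent(child, mother, father):
            errors += 1
    # Avoid division by zero when no sites are supplied.
    return errors / total if total else 0.0


# Toy example (hypothetical data): two of the four sites are discordant.
sites = [
    ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from mother, 1 from father
    ((1, 1), (0, 0), (0, 1)),  # error: mother cannot contribute allele 1
    ((0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 1), (0, 0), (0, 0)),  # error: neither parent carries allele 1
]
rate = mie_rate(sites)
```

In this sketch sites with missing genotypes are assumed to have been removed beforehand; a real analysis would also need to handle multi-allelic sites and sex chromosomes.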
Affiliation(s)
- Jenna Kalleberg
  - University of Missouri, Division of Animal Sciences, Columbia, MO, 65201 USA
- Jacob Rissman
  - University of Missouri, Division of Animal Sciences, Columbia, MO, 65201 USA
- Robert D Schnabel
  - University of Missouri, Division of Animal Sciences, Columbia, MO, 65201 USA
  - University of Missouri, Genetics Area Program, Columbia, MO, 65201 USA
3
Nie F, Ni P, Huang N, Zhang J, Wang Z, Xiao C, Luo F, Wang J. De novo diploid genome assembly using long noisy reads. Nat Commun 2024; 15:2964. [PMID: 38580638 PMCID: PMC10997618 DOI: 10.1038/s41467-024-47349-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 03/25/2024] [Indexed: 04/07/2024]
Abstract
The high sequencing error rate has impeded the application of long noisy reads for diploid genome assembly. Most existing assemblers failed to generate high-quality phased assemblies using long noisy reads. Here, we present PECAT, a Phased Error Correction and Assembly Tool, for reconstructing diploid genomes from long noisy reads. We design a haplotype-aware error correction method that can retain heterozygote alleles while correcting sequencing errors. We combine a corrected read SNP caller and a raw read SNP caller to further improve the identification of inconsistent overlaps in the string graph. We use a grouping method to assign reads to different haplotype groups. PECAT efficiently assembles diploid genomes using Nanopore R9, PacBio CLR or Nanopore R10 reads only. PECAT generates more contiguous haplotype-specific contigs compared to other assemblers. Especially, PECAT achieves nearly haplotype-resolved assembly on B. taurus (Bison×Simmental) using Nanopore R9 reads and phase block NG50 with 59.4/58.0 Mb for HG002 using Nanopore R10 reads.
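The NG50 statistic quoted in the abstract above follows a simple general definition: the length of the block at which the cumulative length, taken from longest to shortest, first reaches half of an assumed genome size. The sketch below illustrates that definition only; the function name and example values are hypothetical, and it is not the evaluation code used by the authors.

```python
from typing import Iterable


def ng50(lengths: Iterable[int], genome_size: int) -> int:
    """Length of the block at which the cumulative sum (longest first)
    first reaches half of genome_size; 0 if half is never reached."""
    half = genome_size / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


# Toy phase-block lengths (hypothetical values, arbitrary units).
blocks = [50, 40, 30, 20, 10]
value = ng50(blocks, genome_size=200)

# N50 is the same calculation with the total assembled length as the denominator.
n50 = ng50(blocks, genome_size=sum(blocks))
```

Using a fixed genome size rather than the assembly's own total length makes values comparable between assemblies of different completeness.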
Affiliation(s)
- Fan Nie
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - National Center for Applied Mathematics in Hunan and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan, 411105, China
- Peng Ni
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Neng Huang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Jun Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Zhenyu Wang
  - Institute of Nanfan & Seed Industry, Guangdong Academy of Sciences, Guangdong, 510316, China
- Chuanle Xiao
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University #7 Jinsui Road, Tianhe District, Guangzhou, China
- Feng Luo
  - School of Computing, Clemson University, Clemson, SC, 29634-0974, USA
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
4
Jevit MJ, Castaneda C, Paria N, Das PJ, Miller D, Antczak DF, Kalbfleisch TS, Davis BW, Raudsepp T. Trio-binning of a hinny refines the comparative organization of the horse and donkey X chromosomes and reveals novel species-specific features. Sci Rep 2023; 13:20180. [PMID: 37978222 PMCID: PMC10656420 DOI: 10.1038/s41598-023-47583-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 11/14/2023] [Indexed: 11/19/2023]
Abstract
We generated single haplotype assemblies from a hinny hybrid which significantly improved the gapless contiguity for horse and donkey autosomal genomes and the X chromosomes. We added over 15 Mb of missing sequence to both X chromosomes, 60 Mb to donkey autosomes and corrected numerous errors in donkey and some in horse reference genomes. We resolved functionally important X-linked repeats: the DXZ4 macrosatellite and ampliconic Equine Testis Specific Transcript Y7 (ETSTY7). We pinpointed the location of the pseudoautosomal boundaries (PAB) and determined the size of the horse (1.8 Mb) and donkey (1.88 Mb) pseudoautosomal regions (PARs). We discovered distinct differences in horse and donkey PABs: a testis-expressed gene, XKR3Y, spans horse PAB with exons1-2 located in Y and exon3 in the X-Y PAR, whereas the donkey XKR3Y is Y-specific. DXZ4 had a similar ~ 8 kb monomer in both species with 10 copies in horse and 20 in donkey. We assigned hundreds of copies of ETSTY7, a sequence horizontally transferred from Parascaris and massively amplified in equids, to horse and donkey X chromosomes and three autosomes. The findings and products contribute to molecular studies of equid biology and advance research on X-linked conditions, sex chromosome regulation and evolution in equids.
Affiliation(s)
- Matthew J Jevit
  - School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA
- Caitlin Castaneda
  - School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA
- Nandina Paria
  - Texas Scottish Rite Hospital for Children, Dallas, TX, 75219, USA
- Pranab J Das
  - ICAR-National Research Centre on Pig, Rani, Guwahati, Assam, 781131, India
- Donald Miller
  - Baker Institute for Animal Health, Cornell University, Ithaca, NY, 14853, USA
- Douglas F Antczak
  - Baker Institute for Animal Health, Cornell University, Ithaca, NY, 14853, USA
- Theodore S Kalbfleisch
  - Maxwell H. Gluck Equine Research Center, University of Kentucky, Lexington, KY, 40546, USA
- Brian W Davis
  - School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA
- Terje Raudsepp
  - School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA
5
Zhang Y, Lu HW, Ruan J. GAEP: a comprehensive genome assembly evaluating pipeline. J Genet Genomics 2023; 50:747-754. [PMID: 37245652 DOI: 10.1016/j.jgg.2023.05.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/09/2023] [Revised: 05/19/2023] [Accepted: 05/23/2023] [Indexed: 05/30/2023]
Abstract
With the rapid development of sequencing technologies, especially the maturity of third-generation sequencing technologies, there has been a significant increase in the number and quality of published genome assemblies. The emergence of these high-quality genomes has raised higher requirements for genome evaluation. Although numerous computational methods have been developed to evaluate assembly quality from various perspectives, the selective use of these evaluation methods can be arbitrary and inconvenient for fairly comparing the assembly quality. To address this issue, we have developed the Genome Assembly Evaluating Pipeline (GAEP), which provides a comprehensive assessment pipeline for evaluating genome quality from multiple perspectives, including continuity, completeness, and correctness. Additionally, GAEP includes new functions for detecting misassemblies and evaluating the assembly redundancy, which performs well in our testing. GAEP is publicly available at https://github.com/zy-optimistic/GAEP under the GPL3.0 License. With GAEP, users can quickly obtain accurate and reliable evaluation results, facilitating the comparison and selection of high-quality genome assemblies.
Affiliation(s)
- Yong Zhang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Hong-Wei Lu
  - State Key Laboratory of Rice Biology and Breeding, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou, Zhejiang 311401, China
- Jue Ruan
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
6
Stroupe S, Martone C, McCann B, Juras R, Kjöllerström HJ, Raudsepp T, Beard D, Davis BW, Derr JN. Chromosome-level reference genome for North American bison (Bison bison) and variant database aids in identifying albino mutation. G3 (Bethesda) 2023; 13:jkad156. [PMID: 37481261 PMCID: PMC10542314 DOI: 10.1093/g3journal/jkad156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Revised: 07/05/2023] [Accepted: 07/07/2023] [Indexed: 07/24/2023]
Abstract
We developed a highly contiguous chromosome-level reference genome for North American bison to provide a platform to evaluate the conservation, ecological, evolutionary, and population genomics of this species. Generated from a F1 hybrid between a North American bison dam and a domestic cattle bull, completeness and contiguity exceed that of other published bison genome assemblies. To demonstrate the utility for genome-wide variant frequency estimation, we compiled a genomic variant database consisting of 3 true albino bison and 44 wild-type pelage color bison. Through the examination of genomic variants fixed in the albino cohort and absent in the controls, we identified a nonsynonymous single nucleotide polymorphism (SNP) mutation on chromosome 29 in exon 3 of the tyrosinase gene (c.1114C>T). A TaqMan SNP Genotyping Assay was developed to genotype this SNP in a total of 283 animals across 29 herds. This assay confirmed the absence of homozygous variants in all animals except 7 true albino bison included in this study. In addition, the only heterozygous animals identified were 2 wild-type pelage color dams of albino offspring. Therefore, we propose that this new high-quality bison genome assembly and incipient variant database provides a highly robust and informative resource for genomics investigations for this iconic North American species.
Affiliation(s)
- Sam Stroupe
  - Department of Veterinary Pathobiology, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
- Carly Martone
  - Department of Veterinary Pathobiology, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
- Blake McCann
  - National Park Service, Theodore Roosevelt National Park, Medora, ND 58645, USA
- Rytis Juras
  - Department of Veterinary Integrative Biosciences, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
- Helena Josefina Kjöllerström
  - Department of Veterinary Integrative Biosciences, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
- Terje Raudsepp
  - Department of Veterinary Integrative Biosciences, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
- Donald Beard
  - Texas Parks and Wildlife, Caprock Canyons State Park & Trailway, Quitaque, TX 79255, USA
- Brian W Davis
  - Department of Veterinary Integrative Biosciences, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
  - Department of Small Animal Clinical Sciences, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
- James N Derr
  - Department of Veterinary Pathobiology, Texas A&M University School of Veterinary Medicine and Biomedical Science, College Station, TX 77843, USA
7
Liu X, Liu W, Lenstra JA, Zheng Z, Wu X, Yang J, Li B, Yang Y, Qiu Q, Liu H, Li K, Liang C, Guo X, Ma X, Abbott RJ, Kang M, Yan P, Liu J. Evolutionary origin of genomic structural variations in domestic yaks. Nat Commun 2023; 14:5617. [PMID: 37726270 PMCID: PMC10509194 DOI: 10.1038/s41467-023-41220-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/12/2023] [Accepted: 08/23/2023] [Indexed: 09/21/2023]
Abstract
Yak has been subject to natural selection, human domestication and interspecific introgression during its evolution. However, genetic variants favored by each of these processes have not been distinguished previously. We constructed a graph-genome for 47 genomes of 7 cross-fertile bovine species. This allowed detection of 57,432 high-resolution structural variants (SVs) within and across the species, which were genotyped in 386 individuals. We distinguished the evolutionary origins of diverse SVs in domestic yaks by phylogenetic analyses. We further identified 334 genes overlapping with SVs in domestic yaks that bore potential signals of selection from wild yaks, plus an additional 686 genes introgressed from cattle. Nearly 90% of the domestic yaks were introgressed by cattle. Introgression of an SV spanning the KIT gene triggered the breeding of white domestic yaks. We validated a significant association of the selected stratified SVs with gene expression, which contributes to phenotypic variations. Our results highlight that SVs of different origins contribute to the phenotypic diversity of domestic yaks.
Affiliation(s)
- Xinfeng Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
  - Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou, 730050, China
  - Academy of Plateau Science and Sustainability, Qinghai Normal University, Xining, 810016, China
- Wenyu Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Johannes A Lenstra
  - Faculty of Veterinary Medicine, Utrecht University, Utrecht, 3508 TD, The Netherlands
- Zeyu Zheng
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Xiaoyun Wu
  - Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou, 730050, China
- Jiao Yang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Bowen Li
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Yongzhi Yang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Qiang Qiu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Hongyu Liu
  - Anhui Provincial Laboratory of Local Livestock and Poultry Genetical Resource Conservation and Breeding, College of Animal Science and Technology, Anhui Agricultural University, Hefei, 230036, China
- Kexin Li
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Chunnian Liang
  - Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou, 730050, China
- Xian Guo
  - Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou, 730050, China
- Xiaoming Ma
  - Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou, 730050, China
- Richard J Abbott
  - School of Biology, University of St Andrews, St Andrews, KY16 9AJ, UK
- Minghui Kang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
- Ping Yan
  - Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou, 730050, China
- Jianquan Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystem, College of Ecology, Lanzhou University, Lanzhou, 730000, China
  - Academy of Plateau Science and Sustainability, Qinghai Normal University, Xining, 810016, China
8
Sheng H, Zhang J, Pan C, Wang S, Gu S, Li F, Ma Y, Ma Y. Genome-wide identification of bovine ADAMTS gene family and analysis of its expression profile in the inflammatory process of mammary epithelial cells. Int J Biol Macromol 2023:125304. [PMID: 37315674 DOI: 10.1016/j.ijbiomac.2023.125304] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Revised: 05/29/2023] [Accepted: 06/04/2023] [Indexed: 06/16/2023]
Abstract
ADAM metallopeptidase with thrombospondin type 1 motif (ADAMTS) are secreted, multi-domain matrix-related zinc endopeptidases that play a role in organogenesis, assembly and degradation of extracellular matrix (ECM), cancer and inflammation. Genome-wide identification and analysis of the bovine ADAMTS gene family has not yet been carried out. In this study, 19 ADAMTS family genes were identified in Bos taurus by genome-wide bioinformatics analysis, and they were unevenly distributed on 12 chromosomes. Phylogenetic analysis shows that the Bos taurus ADAMTS are divided into eight subfamilies, with highly consistent gene structures and motifs within the same subfamily. Collinearity analysis showed that the Bos taurus ADAMTS gene family is homologous to other bovine subfamily species, and many ADAMTS genes may be derived from tandem replication and segmental replication. In addition, based on the analysis of RNA-seq data, we found the expression pattern of ADAMTS gene in different tissues. Meanwhile, we also analyzed the expression profile of ADAMTS gene in the inflammatory response of bovine mammary epithelial cells (BMECs) stimulated by LPS by qRT-PCR. The results can provide ideas for understanding the evolutionary relationship and expression pattern of ADAMTS gene in Bovidae, and clarify the theoretical basis of the function of ADAMTS in inflammation.
Affiliation(s)
- Hui Sheng
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Junxing Zhang
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Cuili Pan
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Shuzhe Wang
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Shuaifeng Gu
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Fen Li
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Yanfen Ma
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
- Yun Ma
  - School of Agriculture, Ningxia University, Yinchuan 750021, China
  - Key Laboratory of Ruminant Molecular and Cellular Breeding, School of Agriculture, Ningxia University, Yinchuan 750021, China
9
Harenčár J, Vargas OM, Escalona M, Schemske DW, Kay KM. Genome assemblies and comparison of two Neotropical spiral gingers: Costus pulverulentus and C. lasius. J Hered 2023; 114:286-293. [PMID: 36928286 PMCID: PMC10212132 DOI: 10.1093/jhered/esad018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 03/15/2023] [Indexed: 03/18/2023]
Abstract
The spiral gingers (Costus L.) are a pantropical genus of herbaceous perennial monocots; the Neotropical clade of Costus radiated rapidly in the past few million years into over 60 species. The Neotropical spiral gingers have a rich history of evolutionary and ecological research that can motivate and inform modern genetic investigations. Here, we present the first 2 chromosome-level genome assemblies in the genus, for C. pulverulentus and C. lasius, and briefly compare their synteny. We assembled the C. pulverulentus genome from a combination of short-read data, Chicago and Dovetail Hi-C chromatin-proximity sequencing, and alignment with a linkage map. We annotated the genome by mapping a C. pulverulentus transcriptome and querying mapped transcripts against a protein database. We assembled the C. lasius genome with Pacific Biosciences HiFi long reads and alignment to the C. pulverulentus genome. These 2 assemblies are the first published genomes for non-cultivated tropical plants. These genomes solidify the spiral gingers as a model system and will facilitate research on the poorly understood genetic basis of tropical plant diversification.
Affiliation(s)
- Julia Harenčár
  - Ecology and Evolutionary Biology Department, University of California, Santa Cruz, Santa Cruz, CA, United States
- Oscar M Vargas
  - Department of Biological Sciences, California State Polytechnic University, Humboldt, Arcata, CA, United States
- Merly Escalona
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, United States
- Douglas W Schemske
  - Department of Plant Biology, Michigan State University, East Lansing, MI, United States
- Kathleen M Kay
  - Ecology and Evolutionary Biology Department, University of California, Santa Cruz, Santa Cruz, CA, United States
10
Leonard AS, Crysnanto D, Mapel XM, Bhati M, Pausch H. Graph construction method impacts variation representation and analyses in a bovine super-pangenome. Genome Biol 2023; 24:124. [PMID: 37217946 PMCID: PMC10204317 DOI: 10.1186/s13059-023-02969-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/17/2022] [Accepted: 05/10/2023] [Indexed: 05/24/2023]
Abstract
BACKGROUND: Several models and algorithms have been proposed to build pangenomes from multiple input assemblies, but their impact on variant representation, and consequently downstream analyses, is largely unknown. RESULTS: We create multi-species super-pangenomes using pggb, cactus, and minigraph with the Bos taurus taurus reference sequence and eleven haplotype-resolved assemblies from taurine and indicine cattle, bison, yak, and gaur. We recover 221 k nonredundant structural variations (SVs) from the pangenomes, of which 135 k (61%) are common to all three. SVs derived from assembly-based calling show high agreement with the consensus calls from the pangenomes (96%), but validate only a small proportion of variations private to each graph. Pggb and cactus, which also incorporate base-level variation, have approximately 95% exact matches with assembly-derived small variant calls, which significantly improves the edit rate when realigning assemblies compared to minigraph. We use the three pangenomes to investigate 9566 variable number tandem repeats (VNTRs), finding 63% have identical predicted repeat counts in the three graphs, while minigraph can over or underestimate the count given its approximate coordinate system. We examine a highly variable VNTR locus and show that repeat unit copy number impacts the expression of proximal genes and non-coding RNA. CONCLUSIONS: Our findings indicate good consensus between the three pangenome methods but also show their individual strengths and weaknesses that need to be considered when analysing different types of variants from multiple input assemblies.
Affiliation(s)
- Alexander S Leonard
  - Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Danang Crysnanto
  - Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Xena M Mapel
  - Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Meenu Bhati
  - Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Hubert Pausch
  - Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
11
Zhou Y, Yang L, Han X, Han J, Hu Y, Li F, Xia H, Peng L, Boschiero C, Rosen BD, Bickhart DM, Zhang S, Guo A, Van Tassell CP, Smith TPL, Yang L, Liu GE. Assembly of a pangenome for global cattle reveals missing sequences and novel structural variations, providing new insights into their diversity and evolutionary history. Genome Res 2022; 32:gr.276550.122. [PMID: 35977842 PMCID: PMC9435747 DOI: 10.1101/gr.276550.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/02/2022] [Accepted: 07/21/2022] [Indexed: 02/03/2023]
Abstract
A cattle pangenome representation was created based on the genome sequences of 898 cattle representing 57 breeds. The pangenome identified 83 Mb of sequence not found in the cattle reference genome, representing 3.1% novel sequence compared with the 2.71-Gb reference. A catalog of structural variants developed from this cattle population identified 3.3 million deletions, 0.12 million inversions, and 0.18 million duplications. Estimates of breed ancestry and hybridization between cattle breeds using insertion/deletions as markers were similar to those produced by single nucleotide polymorphism-based analysis. Hundreds of deletions were observed to have stratification based on subspecies and breed. For example, an insertion of a Bov-tA1 repeat element was identified in the first intron of the APPL2 gene and correlated with cattle breed geographic distribution. This insertion falls within a segment overlapping predicted enhancer and promoter regions of the gene, and could affect important traits such as immune response, olfactory functions, cell proliferation, and glucose metabolism in muscle. The results indicate that pangenomes are a valuable resource for studying diversity and evolutionary history, and help to delineate how domestication, trait-based breeding, and adaptive introgression have shaped the cattle genome.
Affiliation(s)
- Yang Zhou
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Lv Yang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Xiaotao Han
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Jiazheng Han
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Yan Hu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Fan Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Han Xia
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Lingwei Peng
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Clarissa Boschiero
- Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
- Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
- Derek M Bickhart
- Dairy Forage Research Center, ARS USDA, Madison, Wisconsin 53706, USA
- Shujun Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Aizhen Guo
- The State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan 430070, China
- Curtis P Van Tassell
- Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
- Timothy P L Smith
- U.S. Meat Animal Research Center, ARS USDA, Clay Center, Nebraska 68933, USA
- Liguo Yang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- George E Liu
- Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
12
Leonard AS, Crysnanto D, Fang ZH, Heaton MP, Vander Ley BL, Herrera C, Bollwein H, Bickhart DM, Kuhn KL, Smith TPL, Rosen BD, Pausch H. Structural variant-based pangenome construction has low sensitivity to variability of haplotype-resolved bovine assemblies. Nat Commun 2022; 13:3012. [PMID: 35641504 PMCID: PMC9156671 DOI: 10.1038/s41467-022-30680-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/06/2021] [Accepted: 05/10/2022] [Indexed: 12/12/2022]
Abstract
Advantages of pangenomes over linear reference assemblies for genome research have recently been established. However, potential effects of sequence platform and assembly approach, or of combining assemblies created by different approaches, on pangenome construction have not been investigated. Here we generate haplotype-resolved assemblies from the offspring of three bovine trios representing increasing levels of heterozygosity that each demonstrate a substantial improvement in contiguity, completeness, and accuracy over the current Bos taurus reference genome. Diploid coverage as low as 20x for HiFi or 60x for ONT is sufficient to produce two haplotype-resolved assemblies meeting standards set by the Vertebrate Genomes Project. Structural variant-based pangenomes created from the haplotype-resolved assemblies demonstrate significant consensus regardless of sequence platform, assembler algorithm, or coverage. Inspecting pangenome topologies identifies 90 thousand structural variants including 931 overlapping with coding sequences; this approach reveals variants affecting QRICH2, PRDM9, HSPA1A, TAS2R46, and GC that have potential to affect phenotype.
Affiliation(s)
- Alexander S Leonard
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
- Danang Crysnanto
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
- Zih-Hua Fang
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
- Michael P Heaton
- U.S. Meat Animal Research Center, USDA-ARS, 844 Road 313, Clay Center, NE, 68933, USA
- Brian L Vander Ley
- Great Plains Veterinary Educational Center, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA
- Carolina Herrera
- Clinic of Reproductive Medicine, Department for Farm Animals, University of Zurich, 8057, Zurich, Switzerland
- Heinrich Bollwein
- Clinic of Reproductive Medicine, Department for Farm Animals, University of Zurich, 8057, Zurich, Switzerland
- Derek M Bickhart
- Dairy Forage Research Center, USDA-ARS, 1925 Linden Drive, Madison, WI, 53706, USA
- Kristen L Kuhn
- U.S. Meat Animal Research Center, USDA-ARS, 844 Road 313, Clay Center, NE, 68933, USA
- Timothy P L Smith
- U.S. Meat Animal Research Center, USDA-ARS, 844 Road 313, Clay Center, NE, 68933, USA
- Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, USDA-ARS, 10300 Baltimore Ave, Beltsville, MD, 20705, USA
- Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
13
Qanbari S, Schnabel RD, Wittenburg D. Evidence of rare misassemblies in the bovine reference genome revealed by population genetic metrics. Anim Genet 2022; 53:498-505. [DOI: 10.1111/age.13205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 04/01/2022] [Accepted: 04/09/2022] [Indexed: 11/30/2022]
Affiliation(s)
- Saber Qanbari
- Research Institute for Farm Animal Biology (FBN), Institute of Genetics and Biometry, Dummerstorf, Germany
- Robert D. Schnabel
- Division of Animal Sciences, Institute for Data Science and Informatics, 162 Animal Science Research Center, University of Missouri, Columbia, Missouri, USA
- Dörte Wittenburg
- Research Institute for Farm Animal Biology (FBN), Institute of Genetics and Biometry, Dummerstorf, Germany
14
Genomic evaluation of hybridization in historic and modern North American Bison (Bison bison). Sci Rep 2022; 12:6397. [PMID: 35430616 PMCID: PMC9013353 DOI: 10.1038/s41598-022-09828-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/05/2021] [Accepted: 03/24/2022] [Indexed: 11/17/2022]
Abstract
During the late nineteenth century North American bison underwent a significant population bottleneck resulting in a reduction in population size of over 99% and a species-level near-extinction event. Factors responsible for this destruction included indiscriminate killing, loss of access to suitable habitat, and diseases. At the nadir of this population crash, very few wild plains bison survived and were restricted to Yellowstone National Park, USA, and a small number of wild wood bison remained in Wood Buffalo National Park, Canada. However, most surviving bison in the late 1800s were maintained by cattle ranchers in private herds where hybridization between bison and various breeds of domestic cattle was often encouraged. Over the last 20 years, the legacy of this introgression has been identified using mitochondrial DNA and limited nuclear microsatellite analyses. However, no genome-wide assessment has been performed, and some herds were believed to be free of introgression based on current genetic testing strategies. Herein, we report detailed analyses using whole genome sequencing from nineteen modern and six historical bison, chosen to represent the major lineages of bison, to identify and quantitate signatures of nuclear introgression in their recent (within 200 years) history. Both low and high coverage genomes provided evidence for recent introgression, including animals from Yellowstone, Wind Cave, and Elk Island National Parks which were previously thought to be free from hybridization with domestic cattle. We employed multiple approaches, including one developed for this work, to identify putative cattle haplotypes in each bison genome. These regions vary greatly in size and frequency by sample and herd, though we detected domestic cattle introgression in all bison genomes tested. Since our sampling strategy spanned the diversity of modern bison populations, these findings are best explained by multiple historical hybridization events between these two species with significant genetic recombination over the last 200 years. Our results demonstrate that whole genome sequencing approaches are required to accurately quantitate cattle introgression in bison.
15
Davenport KM, Bickhart DM, Worley K, Murali SC, Salavati M, Clark EL, Cockett NE, Heaton MP, Smith TPL, Murdoch BM, Rosen BD. An improved ovine reference genome assembly to facilitate in-depth functional annotation of the sheep genome. Gigascience 2022; 11:giab096. [PMID: 35134925 PMCID: PMC8848310 DOI: 10.1093/gigascience/giab096] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/17/2021] [Revised: 10/28/2021] [Accepted: 12/25/2021] [Indexed: 11/30/2022]
Abstract
BACKGROUND: The domestic sheep (Ovis aries) is an important agricultural species raised for meat, wool, and milk across the world. A high-quality reference genome for this species enhances the ability to discover genetic mechanisms influencing biological traits. Furthermore, a high-quality reference genome allows for precise functional annotation of gene regulatory elements. The rapid advances in genome assembly algorithms and emergence of sequencing technologies with increasingly long reads provide the opportunity for an improved de novo assembly of the sheep reference genome. FINDINGS: Short-read Illumina (55× coverage), long-read Pacific Biosciences (75× coverage), and Hi-C data from this ewe retrieved from public databases were combined with an additional 50× coverage of Oxford Nanopore data and assembled with canu v1.9. The assembled contigs were scaffolded using Hi-C data with Salsa v2.2, gaps filled with PBsuite v15.8.24, and polished with Nanopolish v0.12.5. After duplicate contig removal with PurgeDups v1.0.1, chromosomes were oriented and polished with 2 rounds of a pipeline that consisted of freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus fasta. The ARS-UI_Ramb_v2.0 assembly is 2.63 Gb in length and has improved continuity (contig NG50 of 43.18 Mb), with a 19- and 38-fold decrease in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0. ARS-UI_Ramb_v2.0 has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequence than previous assemblies. CONCLUSIONS: The ARS-UI_Ramb_v2.0 assembly is a substantial improvement in contiguity that will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits in sheep.
Affiliation(s)
- Kimberly M Davenport
- Department of Animal, Veterinary, and Food Sciences, University of Idaho, 875 Perimeter Dr, Moscow, ID 83843, USA
- Derek M Bickhart
- US Dairy Forage Research Center, USDA-ARS, 1925 Linden Drive, Madison, WI 53706, USA
- Kim Worley
- Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Shwetha C Murali
- Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Mazdak Salavati
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Emily L Clark
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Michael P Heaton
- US Meat Animal Research Center, USDA-ARS, State Spur 18D, Clay Center, NE 68933, USA
- Timothy P L Smith
- US Meat Animal Research Center, USDA-ARS, State Spur 18D, Clay Center, NE 68933, USA
- Brenda M Murdoch
- Department of Animal, Veterinary, and Food Sciences, University of Idaho, 875 Perimeter Dr, Moscow, ID 83843, USA
- Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, USDA-ARS, 10300 Baltimore Ave, Beltsville, MD 20705, USA