76
|
Yang L, Xu L, Zhou Y, Liu M, Wang L, Kijas JW, Zhang H, Li L, Liu GE. Diversity of copy number variation in a worldwide population of sheep. Genomics 2018; 110:143-148. [DOI: 10.1016/j.ygeno.2017.09.005] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 03/31/2017] [Revised: 08/24/2017] [Accepted: 09/11/2017] [Indexed: 01/14/2023]
|
77
|
Shen B, Jiang J, Seroussi E, Liu GE, Ma L. Characterization of recombination features and the genetic basis in multiple cattle breeds. BMC Genomics 2018; 19:304. [PMID: 29703147 PMCID: PMC5923192 DOI: 10.1186/s12864-018-4705-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/03/2017] [Accepted: 04/22/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Crossover generated by meiotic recombination is a fundamental event that facilitates meiosis and sexual reproduction. Comparative studies have shown wide variation in recombination rate among species, but recombination features have not yet been characterized across cattle breeds. Cattle populations in North America number in the millions, and the dairy industry has genotyped millions of individuals with pedigree information, providing a unique opportunity to study breed-level variation in recombination. RESULTS Based on large pedigrees of Jersey, Ayrshire and Brown Swiss cattle with genotype data, we identified over 3.4 million maternal and paternal crossover events from 161,309 three-generation families. We constructed six breed- and sex-specific genome-wide recombination maps using 58,982 autosomal SNPs for two sexes in the three dairy cattle breeds. A comparative analysis of the six recombination maps revealed similar global recombination patterns between cattle breeds but with significant differences between sexes. We confirmed that the male recombination map is 10% longer than the female map in all three cattle breeds, consistent with previously reported results in Holstein cattle. When comparing recombination hotspot regions between cattle breeds, we found that 30% and 10% of the hotspots were shared between breeds in males and females, respectively, with each breed exhibiting some breed-specific hotspots. Finally, our multiple-breed GWAS found that SNPs in eight loci affected recombination rate and that the PRDM9 gene was associated with hotspot usage in multiple cattle breeds, indicating a shared genetic basis for recombination across dairy cattle breeds.
CONCLUSIONS Collectively, our results generated breed- and sex-specific recombination maps for multiple cattle breeds, provided a comprehensive characterization and comparison of recombination patterns between breeds, and expanded our understanding of the breed-level variations in recombination features within an important livestock species.
|
78
|
Zhou Y, Shen B, Jiang J, Padhi A, Park KE, Oswalt A, Sattler CG, Telugu BP, Chen H, Cole JB, Liu GE, Ma L. Construction of PRDM9 allele-specific recombination maps in cattle using large-scale pedigree analysis and genome-wide single sperm genomics. DNA Res 2018; 25:183-194. [PMID: 29186399 PMCID: PMC5909443 DOI: 10.1093/dnares/dsx048] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/12/2017] [Accepted: 11/09/2017] [Indexed: 11/23/2022] Open
Abstract
PRDM9 contributes to hybrid sterility and species evolution. However, its role has yet to be confirmed in cattle, a major domesticated livestock species. We previously found an association near PRDM9 with cattle recombination features, but the causative variants are still unknown. Using millions of genotyped cattle with pedigree information, we characterized five PRDM9 alleles and generated allele-specific recombination maps. By examining allele-specific recombination patterns, we observed the impact of PRDM9 on the global distribution of recombination, especially at the two ends of chromosomes. We also showed strong associations between recombination hotspot regions and functional mutations within the PRDM9 zinc finger domain. More importantly, we found one allele of PRDM9 to be very different from the others in both protein composition and recombination landscape, indicating a causative role for this allele in the association between PRDM9 and cattle recombination. When comparing recombination maps from sperm and pedigree data, we observed similar genome-wide recombination patterns, validating the quality of the pedigree-based results. Collectively, this evidence supports PRDM9 alleles as causal variants for the reported association with cattle recombination. Our study comprehensively surveyed the bovine PRDM9 alleles, generated allele-specific recombination maps, and expanded our understanding of the role of PRDM9 in the genomic distribution of recombination.
|
79
|
Li W, Bickhart DM, Ramunno L, Iamartino D, Williams JL, Liu GE. Comparative sequence alignment reveals River Buffalo genomic structural differences compared with cattle. Genomics 2018; 111:418-425. [PMID: 29501677 DOI: 10.1016/j.ygeno.2018.02.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/27/2017] [Revised: 02/12/2018] [Accepted: 02/28/2018] [Indexed: 10/17/2022]
Abstract
This study sought to characterize differences in gene content, regulation and structure between taurine cattle and river buffalo (one subspecies of domestic water buffalo) using the extensively annotated UMD3.1 cattle reference genome as a basis for comparisons. We identified 127 deletion CNV regions in river buffalo representing 5 annotated cattle genes. We also characterized 583 merged mobile element insertion (MEI) events within the upstream regions of annotated cattle genes. Transcriptome analysis of various river buffalo tissue types confirmed the absence of four cattle genes. Four genes, which may be related to phenotypic differences in meat quality and color, had upstream MEI predictions and were found to have significantly elevated expression in river buffalo compared with cattle. Our comparative alignment approach and gene expression analyses suggested a functional role for many genomic structural variations, which may contribute to the unique phenotypes of river buffalo.
|
80
|
Xu L, Haasl RJ, Sun J, Zhou Y, Bickhart DM, Li J, Song J, Sonstegard TS, Van Tassell CP, Lewin HA, Liu GE. Systematic Profiling of Short Tandem Repeats in the Cattle Genome. Genome Biol Evol 2018; 9:20-31. [PMID: 28172841 PMCID: PMC5381564 DOI: 10.1093/gbe/evw256] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Accepted: 10/21/2016] [Indexed: 12/13/2022] Open
Abstract
Short tandem repeats (STRs), or microsatellites, are genetic variants with repetitive 2–6 base pair motifs in many mammalian genomes. Using high-throughput sequencing and experimental validations, we systematically profiled STRs in five Holsteins. We identified a total of 60,106 microsatellites and generated the first high-resolution STR map, representing a substantial pool of polymorphism in dairy cattle. We observed significant overlap of STRs with functional genes and quantitative trait loci (QTL). We performed evolutionary and population genetic analyses using over 20,000 common dinucleotide STRs. Besides corroborating the well-established positive correlation between allele size and variance in allele size, these analyses also identified dozens of outlier STRs based on two anomalous relationships that counter expected characteristics of neutral evolution. In addition, one STR locus overlaps with a significant region of a summary statistic designed to detect STR-related selection. Our results also showed that only 57.1% of STRs were located within SNP-based linkage disequilibrium (LD) blocks, whereas the other 42.9% were outside of blocks. Therefore, a substantial number of STRs are not tagged by SNPs in the cattle genome, likely due to the distinct mutation mechanism and elevated polymorphism of STRs. This study provides the foundation for future STR-based studies of cattle genome evolution and selection.
|
81
|
Zhou Y, Li M, Hu X, Cai H, Hua L, Wang J, Huang Y, Lan X, Lei C, Liu GE, Li C, Plath M, Chen H. Characterization of candidate genes for bovine adipogenesis reveals differences of TUSC5 isoforms caused by novel alternative splicing. Oncotarget 2018. [DOI: 10.18632/oncotarget.23482] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022] Open
|
82
|
Xu L, Yang L, Bickhart DM, Li J, Liu GE. Analysis of Population-Genetic Properties of Copy Number Variations. Methods Mol Biol 2018; 1833:179-186. [PMID: 30039373 DOI: 10.1007/978-1-4939-8666-8_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/02/2023]
Abstract
While single nucleotide polymorphisms (SNPs) are typically the variant of choice for population genetics, copy number variations (CNVs), which comprise insertions, deletions and duplications of genomic sequences, are also an informative type of genetic variation. CNVs have been shown to be both common in mammals and important for understanding the relationship between genotype and phenotype. Moreover, population-specific CNVs are candidate regions under selection and are potentially responsible for diverse phenotypes.
|
83
|
Mei C, Wang H, Liao Q, Wang L, Cheng G, Wang H, Zhao C, Zhao S, Song J, Guang X, Liu GE, Li A, Wu X, Wang C, Fang X, Zhao X, Smith SB, Yang W, Tian W, Gui L, Zhang Y, Hill RA, Jiang Z, Xin Y, Jia C, Sun X, Wang S, Yang H, Wang J, Zhu W, Zan L. Genetic Architecture and Selection of Chinese Cattle Revealed by Whole Genome Resequencing. Mol Biol Evol 2017; 35:688-699. [PMID: 29294071 DOI: 10.1093/molbev/msx322] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Indexed: 01/26/2023] Open
Abstract
The bovine genetic resources in China are diverse, but their value and potential are yet to be discovered. To determine the genetic diversity and population structure of Chinese cattle, we analyzed the whole genomes of 46 cattle from six phenotypically and geographically representative Chinese cattle breeds, together with 18 Red Angus cattle genomes, 11 Japanese black cattle genomes and taurine and indicine genomes available from previous studies. Our results showed that Chinese cattle originated from hybridization between Bos taurus and Bos indicus. Moreover, we found that the level of genetic variation in Chinese cattle depends upon the degree of indicine content. We also discovered many potential selective sweep regions associated with domestication related to breed-specific characteristics, with selective sweep regions including genes associated with coat color (ERCC2, MC1R, ZBTB17, and MAP2K1), dairy traits (NCAPG, MAPK7, FST, ITFG1, SETMAR, PAG1, CSN3, and RPL37A), and meat production/quality traits (such as BBS2, R3HDM1, IGFBP2, IGFBP5, MYH9, MYH4, and MC5R). These findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.
|
84
|
Yang L, Xu L, Zhu B, Niu H, Zhang W, Miao J, Shi X, Zhang M, Chen Y, Zhang L, Gao X, Gao H, Li L, Liu GE, Li J. Genome-wide analysis reveals differential selection involved with copy number variation in diverse Chinese Cattle. Sci Rep 2017; 7:14299. [PMID: 29085051 PMCID: PMC5662686 DOI: 10.1038/s41598-017-14768-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/10/2017] [Accepted: 10/12/2017] [Indexed: 12/20/2022] Open
Abstract
Copy number variations (CNVs) are defined as deletions, insertions, and duplications between two individuals of a species. To investigate the diversity and population-genetic properties of CNVs and their diverse selection patterns, we performed a genome-wide CNV analysis using high density SNP array in Chinese native cattle. In this study, we detected a total of 13,225 CNV events and 3,356 CNV regions (CNVRs), overlapping with 1,522 annotated genes. Among them, approximately 71.43 Mb of novel CNVRs were detected in the Chinese cattle population for the first time, representing the unique genomic resources in cattle. A new Vi statistic was proposed to estimate the region-specific divergence in CNVR for each group based on unbiased estimates of pairwise VST. We obtained 12 and 62 candidate CNVRs at the top 1% and top 5% of genome-wide Vi value thresholds for each of four groups (North, Northwest, Southwest and South). Moreover, we identified many lineage-differentiated CNV genes across four groups, which were associated with several important molecular functions and biological processes, including metabolic process, response to stimulus, immune system, and others. Our findings provide some insights into understanding lineage-differentiated CNVs under divergent selection in the Chinese native cattle.
|
85
|
Hay EHA, Choi I, Xu L, Zhou Y, Rowland RRR, Lunney JK, Liu GE. CNV Analysis of Host Responses to Porcine Reproductive and Respiratory Syndrome Virus Infection. J Genomics 2017; 5:58-63. [PMID: 28611852 PMCID: PMC5457943 DOI: 10.7150/jgen.20358] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022] Open
Abstract
Porcine reproductive and respiratory syndrome (PRRS) is a devastating disease with a significant impact on the swine industry, causing major economic losses. The objective of this study is to examine copy number variations (CNVs) associated with the group-specific host responses to PRRS virus infection. We performed a genome-wide CNV analysis using 660 animals genotyped on the porcine SNP60 BeadChip and discovered 7097 CNVs and 271 CNV regions (CNVRs). For this study, we used two established traits related to host response to the virus, i.e. viral load (VL, area under the curve of log-transformed serum viremia from 0 to 21 days post infection) and weight gain (WG42 from 0 to 42 days post infection). To investigate the effects of CNVs on differential host responses to PRRS, we compared groups of animals with extremely high and low estimated breeding values (EBVs) for both traits using a case-control study design. For VL, we identified 163 CNVRs (84 Mb) from the high group and 159 CNVRs (76 Mb) from the low group. For WG42, we detected 126 (68 Mb) and 156 (79 Mb) CNVRs for high and low groups, respectively. Based on gene annotation within group-specific CNVRs, we performed network analyses and observed some potential candidate genes. Our results revealed that these group-specific genes are involved in regulating innate and acquired immune response pathways. Specifically, molecules like interferons and interleukins are closely related to host responses to PRRS virus infection.
|
86
|
Gao Y, Jiang J, Yang S, Hou Y, Liu GE, Zhang S, Zhang Q, Sun D. CNV discovery for milk composition traits in dairy cattle using whole genome resequencing. BMC Genomics 2017; 18:265. [PMID: 28356085 PMCID: PMC5371188 DOI: 10.1186/s12864-017-3636-3] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 10/08/2016] [Accepted: 03/17/2017] [Indexed: 01/08/2023] Open
Abstract
Background Copy number variations (CNVs) are important and widely distributed in the genome. CNV detection opens a new avenue for exploring genes associated with complex traits in humans, animals and plants. Herein, we present a genome-wide assessment of CNVs that are potentially associated with milk composition traits in dairy cattle. Results In this study, CNVs were detected based on whole genome re-sequencing data of eight Holstein bulls from four half- and/or full-sib families, with extremely high and low estimated breeding values (EBVs) of milk protein percentage and fat percentage. The range of coverage depth per individual was 8.2–11.9×. Using CNVnator, we identified a total of 14,821 CNVs, including 5025 duplications and 9796 deletions. Among them, 487 differential CNV regions (CNVRs) comprising ~8.23 Mb of the cattle genome were observed between the high and low groups. Annotation of these differential CNVRs was performed based on the cattle genome reference assembly (UMD3.1), and a total of 235 functional genes were found within the CNVRs. By Gene Ontology and KEGG pathway analyses, we found that genes were significantly enriched for specific biological functions related to protein and lipid metabolism, insulin/IGF pathway-protein kinase B signaling cascade, prolactin signaling pathway and AMPK signaling pathways. These genes included INS, IGF2, FOXO3, TH, SCD5, GALNT18, GALNT16, ART3, SNCA and WNT7A, implying their potential association with milk protein and fat traits. In addition, 95 CNVRs overlapped with 75 known QTLs that are associated with milk protein and fat traits of dairy cattle (Cattle QTLdb). Conclusions In conclusion, based on NGS of 8 Holstein bulls with extremely high and low EBVs for milk PP and FP, we identified a total of 14,821 CNVs, 487 differential CNVRs between groups, and 10 genes, which were suggested as promising candidate genes for milk protein and fat traits.
|
87
|
Padhi A, Shen B, Jiang J, Zhou Y, Liu GE, Ma L. Ruminant-specific multiple duplication events of PRDM9 before speciation. BMC Evol Biol 2017; 17:79. [PMID: 28292260 PMCID: PMC5351255 DOI: 10.1186/s12862-017-0892-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/01/2016] [Accepted: 01/26/2017] [Indexed: 11/30/2022] Open
Abstract
Background Understanding the genetic and evolutionary mechanisms of speciation genes in sexually reproducing organisms would provide important insights into mammalian reproduction and fitness. PRDM9, a widely known speciation gene, has recently gained attention for its important role in meiotic recombination and hybrid incompatibility. Despite the fact that PRDM9 is a key regulator of recombination and plays a dominant role in hybrid incompatibility, little is known about the underlying genetic and evolutionary mechanisms that generated multiple copies of PRDM9 in many metazoan lineages. Results The present study reports (1) evidence of ruminant-specific multiple gene duplication events, which likely occurred after the ancestral ruminant population diverged from its most recent common ancestor and before the ruminant speciation events, (2) the presence of three copies of PRDM9, one copy (lineage I) in chromosome 1 (chr1) and two copies (lineages II & III) in chromosome X (chrX), thus indicating the possibility of ancient inter- and intra-chromosomal unequal crossing over and gene conversion events, (3) that while lineages I and II are characterized by the presence of variable tandemly repeated C2H2 zinc finger (ZF) arrays, lineage III lost these arrays, and (4) that the C2H2 ZFs of lineages I and II, particularly the amino acid residues located at positions −1, 3, and 6, have evolved under strong positive selection. Conclusions Our results demonstrated two gene duplication events of PRDM9 in ruminants: an inter-chromosomal duplication that occurred between chr1 and chrX, and an intra-chromosomal X-linked duplication, which resulted in two additional copies of PRDM9 in ruminants. The observation of such duplication between chrX and chr1 is rare and may possibly have happened due to unequal crossing-over millions of years ago when sex chromosomes were independently derived from a pair of ancestral autosomes. Two copies (lineages I & II) are characterized by the presence of variable sized tandem-repeated C2H2 ZFs and evolved under strong positive selection and concerted evolution, supporting the notion of the well-established Red Queen hypothesis. Collectively, gene duplication, concerted evolution, and positive selection are the likely driving forces for the expansion of the ruminant PRDM9 sub-family.
|
88
|
Zhou Y, Xu L, Bickhart DM, Abdel Hay EH, Schroeder SG, Connor EE, Alexander LJ, Sonstegard TS, Van Tassell CP, Chen H, Liu GE. Reduced representation bisulphite sequencing of ten bovine somatic tissues reveals DNA methylation patterns and their impacts on gene expression. BMC Genomics 2016; 17:779. [PMID: 27716143 PMCID: PMC5053184 DOI: 10.1186/s12864-016-3116-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 02/16/2016] [Accepted: 09/23/2016] [Indexed: 01/16/2023] Open
Abstract
Background As a major epigenetic component, DNA methylation plays important roles in individual development and various diseases. DNA methylation has been well studied in human and model organisms, but only limited data exist in economically important animals like cattle. Results Using reduced representation bisulphite sequencing (RRBS), we obtained single-base-resolution maps of bovine DNA methylation from ten somatic tissues. In total, we evaluated 1,868,049 cytosines in CG-enriched regions. While we found slightly low methylation levels (29.87 to 38.06%) in cattle, the methylation contexts (CGs and non-CGs) of cattle showed similar methylation patterns to other species. Non-CG methylation was detected but methylation levels in somatic tissues were significantly lower than in pluripotent cells. To study the potential function of the methylation, we detected 10,794 differentially methylated cytosines (DMCs) and 836 differentially methylated CG islands (DMIs). Further analyses in the same tissues revealed many DMCs (including non-CGs) and DMIs, which were highly correlated with the expression of genes involved in tissue development. Conclusions In summary, our study provides a baseline dataset and essential information for DNA methylation profiles of cattle.
|
89
|
Zhou Y, Utsunomiya YT, Xu L, Hay EHA, Bickhart DM, Alexandre PA, Rosen BD, Schroeder SG, Carvalheiro R, de Rezende Neves HH, Sonstegard TS, Van Tassell CP, Ferraz JBS, Fukumasu H, Garcia JF, Liu GE. Genome-wide CNV analysis reveals variants associated with growth traits in Bos indicus. BMC Genomics 2016; 17:419. [PMID: 27245577 PMCID: PMC4888316 DOI: 10.1186/s12864-016-2461-4] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 08/17/2015] [Accepted: 02/11/2016] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Apart from single nucleotide polymorphism (SNP), copy number variation (CNV) is another important type of genetic variation, which may affect growth traits and play key roles in the production of beef cattle. To date, no genome-wide association study (GWAS) for CNV and body traits in beef cattle has been reported, so the present study aimed to investigate this type of association in one of the most important cattle subspecies: Bos indicus (Nellore breed). RESULTS We used intensity data from over 700,000 SNP probes across the bovine genome to detect common CNVs in a sample of 2230 Nellore cattle, and performed GWAS between the detected CNVs and nine growth traits. After filtering for frequency and length, a total of 231 CNVs ranging from 894 bp to 4,855,088 bp were kept and tested as predictors for each growth trait using linear regression analysis with principal components correction. There were 49 significant associations identified among 17 CNVs and seven body traits after false discovery rate correction (P < 0.05). Among the 17 CNVs, three were significant or marginally significant for all the traits. We compared the locations of associated CNVs with known quantitative trait loci (QTLs) and the RefGene database, and found two sets of 9 CNVs overlapping with either known QTLs or genes, respectively. The gene overlapping with CNV100, KCNJ12, is a functional candidate for muscle development and plays critical roles in muscling traits. CONCLUSION This study presents the first CNV-based GWAS of growth traits using high density SNP microarray data in cattle. We detected 17 CNVs significantly associated with seven growth traits and one of them (CNV100) may be involved in growth traits through KCNJ12.
|
90
|
Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, Schnabel RD, Taylor JF, Lewin HA, Liu GE. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res 2016; 23:253-62. [PMID: 27085184 PMCID: PMC4909312 DOI: 10.1093/dnares/dsw013] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 11/05/2015] [Accepted: 02/29/2016] [Indexed: 11/14/2022] Open
Abstract
The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage to identify 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, representing a significant increase over previous estimates of the area of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the cattle taurine and indicine breeds and uncovered potential diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance play roles in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future.
|
91
|
Xu L, Hou Y, Bickhart DM, Zhou Y, Hay EHA, Song J, Sonstegard TS, Van Tassell CP, Liu GE. Population-genetic properties of differentiated copy number variations in cattle. Sci Rep 2016; 6:23161. [PMID: 27005566 PMCID: PMC4804293 DOI: 10.1038/srep23161] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 06/26/2015] [Accepted: 02/25/2016] [Indexed: 01/24/2023] Open
Abstract
While single nucleotide polymorphism (SNP) is typically the variant of choice for population genetics, copy number variation (CNV), which comprises insertion, deletion and duplication of genomic sequence, is also an informative type of genetic variation. CNVs have been shown to be both common in mammals and important for understanding the relationship between genotype and phenotype. However, CNV differentiation, selection and its population genetic properties are not well understood across diverse populations. We performed a population genetics survey based on CNVs derived from the BovineHD SNP array data of eight distinct cattle breeds. We generated high resolution results that show geographical patterns of variations and genome-wide admixture proportions within and among breeds. Similar to the previous SNP-based studies, our CNV-based results displayed a strong correlation between population structure and geographical location. By conducting three pairwise comparisons among European taurine, African taurine, and indicine groups, we further identified 78 unique CNV regions that were highly differentiated, some of which might be due to selection. These CNV regions overlapped with genes involved in traits related to parasite resistance, immune response, body size, fertility, and milk production. Our results characterize CNV diversity among cattle populations and provide a list of lineage-differentiated CNVs.
|
92
|
Bickhart DM, Hutchison JL, Xu L, Schnabel RD, Taylor JF, Reecy JM, Schroeder S, Van Tassell CP, Sonstegard TS, Liu GE. RAPTR-SV: a hybrid method for the detection of structural variants. Bioinformatics 2015; 31:2084-90. [PMID: 25686638 DOI: 10.1093/bioinformatics/btv086] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/23/2014] [Accepted: 02/09/2015] [Indexed: 12/30/2022] Open
Abstract
MOTIVATION Identification of structural variants (SVs) in sequence data results in a large number of false positive calls using existing software, which overburdens subsequent validation. RESULTS Simulations using RAPTR-SV and other, similar algorithms for SV detection revealed that RAPTR-SV had superior sensitivity and precision, as it recovered 66.4% of simulated tandem duplications with a precision of 99.2%. When compared with calls made by Delly and LUMPY on available datasets from the 1000 genomes project, RAPTR-SV showed superior sensitivity for tandem duplications, as it identified 2-fold more duplications than Delly, while making ∼85% fewer duplication predictions. AVAILABILITY AND IMPLEMENTATION RAPTR-SV is written in Java and uses new features in the collections framework in the latest release of the Java version 8 language specifications. A compiled version of the software, instructions for usage and test results files are available on the GitHub repository page: https://github.com/njdbickhart/RAPTR-SV. CONTACT derek.bickhart@ars.usda.gov.
|
93
|
Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CPV, Sonstegard TS, Liu GE. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol 2014; 32:711-25. [PMID: 25431480 DOI: 10.1093/molbev/msu333] [Citation(s) in RCA: 110] [Impact Index Per Article: 11.0] [Indexed: 12/30/2022] Open
Abstract
We investigated diverse genomic selections using high-density single nucleotide polymorphism data of five distinct cattle breeds. Based on allele frequency differences, we detected hundreds of candidate regions under positive selection across Holstein, Angus, Charolais, Brahman, and N'Dama. In addition to well-known genes such as KIT, MC1R, ASIP, GHR, LCORL, NCAPG, WIF1, and ABCA12, we found evidence for a variety of novel and less-known genes under selection in cattle, such as LAP3, SAR1B, LRIG3, FGF5, and NUDCD3. Selective sweeps near LAP3 were then validated by next-generation sequencing. Genome-wide association analysis involving 26,362 Holsteins confirmed that LAP3 and SAR1B were related to milk production traits, suggesting that our candidate regions were likely functional. In addition, haplotype network analyses further revealed distinct selective pressures and evolution patterns across these five cattle breeds. Our results provided a glimpse into diverse genomic selection during cattle domestication, breed formation, and recent genetic improvement. These findings will facilitate genome-assisted breeding to improve animal production and health.
|
94
|
Xu L, Zhao F, Ren H, Li L, Lu J, Liu J, Zhang S, Liu GE, Song J, Zhang L, Wei C, Du L. Co-expression analysis of fetal weight-related genes in ovine skeletal muscle during mid and late fetal development stages. Int J Biol Sci 2014; 10:1039-50. [PMID: 25285036 PMCID: PMC4183924 DOI: 10.7150/ijbs.9737] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/25/2014] [Accepted: 08/16/2014] [Indexed: 11/05/2022]
Abstract
BACKGROUND Muscle development and lipid metabolism play important roles during fetal development stages. The commercial Texel sheep are more muscular than the indigenous Ujumqin sheep. RESULTS We performed serial transcriptomics assays and systems biology analyses to investigate the dynamics of gene expression changes associated with fetal longissimus muscles during different fetal stages in two sheep breeds. In total, we identified 1,472 differentially expressed genes during various fetal stages using time-series expression analysis. A systems biology approach, weighted gene co-expression network analysis (WGCNA), was used to detect modules of correlated genes among these 1,472 genes. Dramatically different gene modules were identified in four merged datasets, corresponding to the mid fetal stage and the late fetal stage in each of Texel and Ujumqin sheep. We further detected gene modules significantly correlated with fetal weight, and constructed networks and pathways using genes with high significance. In these gene modules, we identified genes such as TADA3, LMNB1, TGF-β3, EEF1A2, FGFR1, MYOZ1, and FBP2 that were correlated with fetal weight. CONCLUSION Our study revealed the complex network characteristics involved in muscle development and lipid metabolism during fetal development stages. Diverse patterns of the network connections observed between breeds and fetal stages could involve some hub genes, which play central roles in fetal development, correlating with fetal weight. Our findings could provide potentially valuable biomarkers for selection of body weight-related traits in sheep and other livestock.
|
95
|
Xu L, Cole JB, Bickhart DM, Hou Y, Song J, VanRaden PM, Sonstegard TS, Van Tassell CP, Liu GE. Genome wide CNV analysis reveals additional variants associated with milk production traits in Holsteins. BMC Genomics 2014; 15:683. [PMID: 25128478 PMCID: PMC4152564 DOI: 10.1186/1471-2164-15-683] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 03/24/2014] [Accepted: 07/31/2014] [Indexed: 12/21/2022]
Abstract
Background Milk production is an economically important sector of global agriculture. Much attention has been paid to the identification of quantitative trait loci (QTL) associated with milk, fat, and protein yield and the genetic and molecular mechanisms underlying them. Copy number variation (CNV) is an emerging class of variants which may be associated with complex traits. Results In this study, we performed a genome-wide association between CNVs and milk production traits in 26,362 Holstein bulls and cows. A total of 99 candidate CNVs were identified using Illumina BovineSNP50 array data, and association tests for each production trait were performed using a linear regression analysis with PCA correction. A total of 34 CNVs on 22 chromosomes were significantly associated with at least one milk production trait after false discovery rate (FDR) correction. Some of those CNVs were located within or near known QTL for milk production traits. We further investigated the relationship between associated CNVs and neighboring SNPs. For all 82 combinations of traits and CNVs (less than 400 kb in length), we found 17 cases where CNVs directly overlapped with tag SNPs and 40 cases where CNVs were adjacent to tag SNPs. In 5 cases, CNVs were in strong linkage disequilibrium with tag SNPs, either within or adjacent to the same haplotype block. There were an additional 20 cases where CNVs did not have a significant association with SNPs, suggesting that the effects of those CNVs were probably not captured by tag SNPs. Conclusion We conclude that combining CNV with SNP analyses reveals more genetic variations underlying milk production traits than those revealed by SNPs alone.
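The abstract mentions false discovery rate (FDR) correction without naming the procedure. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up method; the function name and the example p-values are hypothetical and are not taken from the paper.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four CNV-trait association tests.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
significant = [q <= 0.05 for q in qvals]
print(qvals, significant)
```

Under this assumed procedure, a test is declared significant when its adjusted value falls at or below the chosen FDR level (0.05 in the sketch).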
|
96
|
Shin JH, Xu L, Li RW, Gao Y, Bickhart D, Liu GE, Baldwin R, Li CJ. A high-resolution whole-genome map of the distinctive epigenomic landscape induced by butyrate in bovine cells. Anim Genet 2014; 45 Suppl 1:40-50. [PMID: 24990294 DOI: 10.1111/age.12147] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Accepted: 02/14/2014] [Indexed: 12/11/2022]
Abstract
This report presents a study utilizing next-generation sequencing technology, combined with chromatin immunoprecipitation (ChIP-seq) technology to analyze histone modification induced by butyrate and to construct a high-definition map of the epigenomic landscape with normal histone H3 and H4 and their variants in bovine cells at the whole-genome scale. A total of 10 variants of histone H3 and H4 modifications were mapped at the whole-genome scale (acetyl-H3K18-ChIP-seq, trimethyl-H3K9, histone H4 ChIP-seq, acetyl-H4K5 ChIP-seq, acetyl-H4K12 ChIP-seq, acetyl-H4K16 ChIP-seq, histone H3 ChIP-seq, acetyl H3K9 ChIP-seq, acetyl H3K27 ChIP-seq and tetra-acetyl H4 ChIP-seq). Integrated experiential data and an analysis of histone and histone modification at a single base resolution across the entire genome are presented. We analyzed the enriched binding regions in the proximal promoter (within 5 kb upstream or at the 5'-untranslated region from the transcriptional start site (TSS)), and the exon, intron and intergenic regions (defined by regions 25 kb upstream and 10 kb downstream from the TSS). A de novo search for the binding motif of the 10 ChIP-seq datasets discovered numerous motifs from each of the ChIP-seq datasets. These consensus sequences indicated that histone modification at different locations changes the histone H3 and H4 binding preferences. Nevertheless, a high degree of conservation in histone binding was also present in these motifs. This first extensive epigenomic landscape mapping in bovine cells offers a new framework and a great resource for testing the role of epigenomes in cell function and transcriptomic regulation.
|
97
|
Xu L, Hou Y, Bickhart DM, Song J, Van Tassell CP, Sonstegard TS, Liu GE. A genome-wide survey reveals a deletion polymorphism associated with resistance to gastrointestinal nematodes in Angus cattle. Funct Integr Genomics 2014; 14:333-9. [PMID: 24718732 DOI: 10.1007/s10142-014-0371-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/28/2013] [Revised: 03/26/2014] [Accepted: 03/27/2014] [Indexed: 01/17/2023]
Abstract
Gastrointestinal (GI) nematode infections are a worldwide threat to human health and animal production. In this study, we performed a genome-wide association study between copy number variations (CNVs) and resistance to GI nematodes in an Angus cattle population. Using a linear regression analysis, we identified one deletion CNV which reaches genome-wide significance after Bonferroni correction. With multiple mapped human olfactory receptor genes but no annotated bovine genes in the region, this significantly associated CNV displays a high population frequency (58.26%) with a length of 104.8 kb on chr7. We further investigated the linkage disequilibrium (LD) relationships between this CNV and its nearby single nucleotide polymorphisms (SNPs) and genes. The underlying haplotype blocks contain immune-related genes such as ZNF496 and NLRP3. As this CNV co-segregates with linked SNPs and associated genes, we suspect that it could contribute to the detected variations in gene expression and thus differences in host parasite resistance.
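As a reminder of what Bonferroni correction means in an association scan like this one, the sketch below divides the family-wise significance level by the number of tests to obtain a per-test threshold. The number of tests and the p-values are hypothetical; the abstract does not report them.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that controls the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical scan (assumption): 1000 CNVs tested at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 1000)

# Hypothetical p-values; only those at or below the threshold pass.
pvals = {"cnv_a": 2e-6, "cnv_b": 4e-4, "cnv_c": 0.03}
passing = sorted(name for name, p in pvals.items() if p <= threshold)
print(threshold, passing)
```

Bonferroni is conservative, so finding a single CNV that survives it, as reported above, is consistent with a strict genome-wide threshold.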
|
98
|
Cui X, Hou Y, Yang S, Xie Y, Zhang S, Zhang Y, Zhang Q, Lu X, Liu GE, Sun D. Transcriptional profiling of mammary gland in Holstein cows with extremely different milk protein and fat percentage using RNA sequencing. BMC Genomics 2014; 15:226. [PMID: 24655368 PMCID: PMC3998192 DOI: 10.1186/1471-2164-15-226] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Received: 02/06/2013] [Accepted: 03/18/2014] [Indexed: 11/15/2022]
Abstract
Background Recently, RNA sequencing (RNA-seq) has rapidly emerged as a major transcriptome profiling system. Elucidation of the bovine mammary gland transcriptome by RNA-seq is essential for identifying candidate genes that contribute to milk composition traits in dairy cattle. Results We used massively parallel, high-throughput RNA-seq to generate the bovine transcriptome from the mammary glands of four lactating Holstein cows with extremely high and low phenotypic values of milk protein and fat percentage. In total, we obtained 48,967,376–75,572,578 uniquely mapped reads that covered 82.25% of the current annotated transcripts, which represented 15,549 mRNA transcripts, across all four mammary gland samples. Among them, 31 differentially expressed genes (p < 0.05, false discovery rate q < 0.05) between the high and low groups of cows were revealed. Gene ontology and pathway analysis demonstrated that the 31 differentially expressed genes were enriched in specific biological processes with regard to protein metabolism, fat metabolism, and mammary gland development (p < 0.05). Integrated analysis of differential gene expression, previously reported quantitative trait loci, and genome-wide association studies indicated that TRIB3, SAA (SAA1, SAA3, and M-SAA3.2), VEGFA, PTHLH, and RPL23A were the most promising candidate genes affecting milk protein and fat percentage. Conclusions This study investigated the complexity of the mammary gland transcriptome in dairy cattle using RNA-seq. Integrated analysis of differential gene expression and the reported quantitative trait loci and genome-wide association study data permitted the identification of candidate key genes for milk composition traits.
|
99
|
Pérez O'Brien AM, Utsunomiya YT, Mészáros G, Bickhart DM, Liu GE, Van Tassell CP, Sonstegard TS, Da Silva MVB, Garcia JF, Sölkner J. Assessing signatures of selection through variation in linkage disequilibrium between taurine and indicine cattle. Genet Sel Evol 2014; 46:19. [PMID: 24592996 PMCID: PMC4014805 DOI: 10.1186/1297-9686-46-19] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 03/27/2013] [Accepted: 01/09/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Signatures of selection are regions in the genome that have been preferentially increased in frequency and fixed in a population because of their functional importance in specific processes. These regions can be detected because of their lower genetic variability and specific regional linkage disequilibrium (LD) patterns. METHODS By comparing the differences in regional LD variation between dairy and beef cattle types, and between indicine and taurine subspecies, we aimed to find signatures of selection for production and adaptation in cattle breeds. The VarLD method was applied to compare the LD variation in the autosomal genome between breeds, including Angus and Brown Swiss, representing taurine breeds, and Nelore and Gir, representing indicine breeds. Genomic regions containing the top 0.01 and 0.1 percentile of signals were characterized using the UMD3.1 Bos taurus genome assembly to identify genes in those regions and compared with previously reported selection signatures and regions with copy number variation. RESULTS For all comparisons, the top 0.01 and 0.1 percentile included 26 and 165 signals and 17 and 125 genes, respectively, including TECRL, BT.23182 or FPPS, CAST, MYOM1, UVRAG and DNAJA1. CONCLUSIONS The VarLD method is a powerful tool to identify differences in linkage disequilibrium between cattle populations and putative signatures of selection with potential adaptive and productive importance.
|
100
|
Bickhart DM, Liu GE. The challenges and importance of structural variation detection in livestock. Front Genet 2014; 5:37. [PMID: 24600474 PMCID: PMC3927395 DOI: 10.3389/fgene.2014.00037] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Received: 11/30/2013] [Accepted: 01/31/2014] [Indexed: 01/25/2023]
Abstract
Recent studies in humans and other model organisms have demonstrated that structural variants (SVs) comprise a substantial proportion of variation among individuals of each species. Many of these variants have been linked to debilitating diseases in humans, thereby cementing the importance of refining methods for their detection. Despite progress in the field, reliable detection of SVs remains a problem even for human subjects. Many of the underlying problems that make SVs difficult to detect in humans are amplified in livestock species, whose lower quality genome assemblies and incomplete gene annotation can often give rise to false positive SV discoveries. Regardless of the challenges, SV detection is just as important for livestock researchers as it is for human researchers, given that several productive traits and diseases have been linked to copy number variations (CNVs) in cattle, sheep, and pigs. Already, there is evidence that many beneficial SVs have been artificially selected in livestock, such as a duplication of the agouti signaling protein gene that causes white coat color in sheep. In this review, we will list current SV and CNV discoveries in livestock and discuss the problems that hinder routine discovery and tracking of these polymorphisms. We will also discuss the impacts of selective breeding on CNV and SV frequencies and mention how SV genotyping could be used in the future to improve genetic selection.
|