1
Dallaire X, Bouchard R, Hénault P, Ulmo-Diaz G, Normandeau E, Mérot C, Bernatchez L, Moore JS. Widespread Deviant Patterns of Heterozygosity in Whole-Genome Sequencing Due to Autopolyploidy, Repeated Elements, and Duplication. Genome Biol Evol 2023; 15:evad229. [PMID: 38085037 PMCID: PMC10752349 DOI: 10.1093/gbe/evad229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/30/2023] [Indexed: 12/28/2023] Open
Abstract
Most population genomic tools rely on accurate single nucleotide polymorphism (SNP) calling and filtering to meet their underlying assumptions. However, genomic complexity, resulting from structural variants, paralogous sequences, and repetitive elements, presents significant challenges in assembling contiguous reference genomes. Consequently, short-read resequencing studies can encounter mismapping issues, leading to SNPs that deviate from Mendelian expected patterns of heterozygosity and allelic ratio. In this study, we employed the ngsParalog software to identify such deviant SNPs in whole-genome sequencing (WGS) data with low (1.5×) to intermediate (4.8×) coverage for four species: Arctic Char (Salvelinus alpinus), Lake Whitefish (Coregonus clupeaformis), Atlantic Salmon (Salmo salar), and the American Eel (Anguilla rostrata). The analyses revealed that deviant SNPs accounted for 22% to 62% of all SNPs in salmonid datasets and approximately 11% in the American Eel dataset. These deviant SNPs were particularly concentrated within repetitive elements and genomic regions that had recently undergone rediploidization in salmonids. Additionally, narrow peaks of elevated coverage were ubiquitous along all four reference genomes, encompassed most deviant SNPs, and could be partially associated with transposons and tandem repeats. Including these deviant SNPs in genomic analyses led to highly distorted site frequency spectra, underestimated pairwise FST values, and overestimated nucleotide diversity. Considering the widespread occurrence of deviant SNPs arising from a variety of sources, their important impact in estimating population parameters, and the availability of effective tools to identify them, we propose that excluding deviant SNPs from WGS datasets is required to improve genomic inferences for a wide range of taxa and sequencing depths.
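The two symptoms named in this abstract, excess heterozygosity and skewed allelic ratios, can be illustrated with a minimal Python sketch. This is not the ngsParalog likelihood method used in the study; it is a simplified stand-in that assumes hard-called genotypes and pooled read counts from heterozygous individuals. The function names and the thresholds (`max_het`, `max_abs_z`) are hypothetical.

```python
from math import sqrt

def heterozygote_proportion(genotypes):
    """Fraction of called genotypes that are heterozygous.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return float("nan")
    return sum(1 for g in called if g == 1) / len(called)

def read_ratio_z(ref_reads, alt_reads):
    """z-score of the reference read count against a Binomial(n, 0.5) expectation."""
    n = ref_reads + alt_reads
    if n == 0:
        return float("nan")
    return (ref_reads - 0.5 * n) / sqrt(0.25 * n)

def looks_deviant(genotypes, ref_reads, alt_reads, max_het=0.5, max_abs_z=3.0):
    """Flag a SNP with too many heterozygotes or a skewed read ratio.

    Under Hardy-Weinberg proportions a biallelic site cannot exceed 50%
    heterozygotes; both thresholds are illustrative assumptions.
    """
    het = heterozygote_proportion(genotypes)
    z = read_ratio_z(ref_reads, alt_reads)
    return het > max_het or abs(z) > max_abs_z

# A site where 3 of 4 individuals are heterozygous and reads are skewed 75:25
print(looks_deviant([1, 1, 1, 0], ref_reads=75, alt_reads=25))
```

At low coverage, hard genotype calls are unreliable, which is one reason a likelihood-based tool is preferable in practice to a threshold rule like this one.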
Affiliation(s)
- Xavier Dallaire
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Centre d'Études Nordiques, Université Laval, Québec, Canada
- Raphael Bouchard
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Philippe Hénault
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Gabriela Ulmo-Diaz
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Eric Normandeau
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
  - Plateforme de bio-informatique de l’IBIS, Université Laval, Québec, Canada
- Claire Mérot
  - CNRS, UMR 6553 ECOBIO, Université de Rennes, Rennes, France
- Louis Bernatchez
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Jean-Sébastien Moore
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Centre d'Études Nordiques, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
2
Bernard M, Dehaullon A, Gao G, Paul K, Lagarde H, Charles M, Prchal M, Danon J, Jaffrelo L, Poncet C, Patrice P, Haffray P, Quillet E, Dupont-Nivet M, Palti Y, Lallias D, Phocas F. Development of a High-Density 665 K SNP Array for Rainbow Trout Genome-Wide Genotyping. Front Genet 2022; 13:941340. [PMID: 35923696 PMCID: PMC9340366 DOI: 10.3389/fgene.2022.941340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 06/24/2022] [Indexed: 12/02/2022] Open
Abstract
Single nucleotide polymorphism (SNP) arrays, also called "SNP chips", enable very large numbers of individuals to be genotyped at a targeted set of thousands of genome-wide identified markers. We used preexisting variant datasets from USDA, a French commercial line and 30X-coverage whole genome sequencing of INRAE isogenic lines to develop an Affymetrix 665 K SNP array (HD chip) for rainbow trout. In total, we identified 32,372,492 SNPs that were polymorphic in the USDA or INRAE databases. A subset of the identified SNPs was selected for inclusion on the chip, prioritizing SNPs whose flanking sequence aligned uniquely to the Swanson reference genome, with a homogeneous distribution across the genome and the highest minor allele frequency in both the USDA and French databases. Of the 664,531 SNPs that passed the Affymetrix quality filters and were manufactured on the HD chip, 65.3% and 60.9% passed filtering metrics and were polymorphic in two other distinct French commercial populations in which, respectively, 288 and 175 sampled fish were genotyped. Only 576,118 SNPs mapped uniquely on both the Swanson and Arlee reference genomes, and 12,071 SNPs did not map at all on the Arlee reference genome. Among those 576,118 SNPs, 38,948 SNPs were kept from the commercially available medium-density 57 K SNP chip. We demonstrate the utility of the HD chip by describing the high rates of linkage disequilibrium at 2–10 kb in the rainbow trout genome in comparison to the linkage disequilibrium observed at 50–100 kb, which are the usual distances between markers of the medium-density chip.
Affiliation(s)
- Maria Bernard
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  - INRAE, SIGENAE, Jouy-en-Josas, France
- Audrey Dehaullon
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Guangtu Gao
  - USDA, REE, ARS, NEA, NCCCWA, Kearneysville, WV, United States
- Katy Paul
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Henri Lagarde
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Mathieu Charles
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  - INRAE, SIGENAE, Jouy-en-Josas, France
- Martin Prchal
  - South Bohemian Research Center of Aquaculture and Biodiversity of Hydrocenoses, Faculty of Fisheries and Protection of Waters, University of South Bohemia, Vodňany, Czechia
- Jeanne Danon
  - INRAE-UCA, Plateforme Gentyane, UMR GDEC, Clermont-Ferrand, France
- Lydia Jaffrelo
  - INRAE-UCA, Plateforme Gentyane, UMR GDEC, Clermont-Ferrand, France
- Charles Poncet
  - INRAE-UCA, Plateforme Gentyane, UMR GDEC, Clermont-Ferrand, France
- Edwige Quillet
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Yniv Palti
  - USDA, REE, ARS, NEA, NCCCWA, Kearneysville, WV, United States
- Delphine Lallias
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Florence Phocas
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  - *Correspondence: Florence Phocas
3
Sundaray JK, Dixit S, Rather A, Rasal KD, Sahoo L. Aquaculture omics: An update on the current status of research and data analysis. Mar Genomics 2022; 64:100967. [PMID: 35779450 DOI: 10.1016/j.margen.2022.100967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2021] [Revised: 05/26/2022] [Accepted: 06/15/2022] [Indexed: 11/28/2022]
Abstract
Aquaculture is a fast-growing agricultural sector and can help meet the growing demand for protein and nutritional security of the future population. Aquaculture is expected to become the major source of fish protein, as capture fisheries have reached their maximum. However, several challenges need to be overcome, such as the lack of genetically improved strains/varieties, the lack of species-specific and functional feeds, year-round availability of quality fish seed, pollution of ecosystems, and the increased frequency of disease occurrence. In recent years, the continuous development of high-throughput sequencing technology has revolutionized the biological sciences and provided the necessary tools. 'Omics' approaches have been successfully applied in aquaculture research to resolve several productive and reproductive issues and thus help ensure its sustainability and profitability. To date, high-quality draft genomes of over fifty fish species have been generated and used to develop large numbers of single nucleotide polymorphism (SNP) markers, marker panels, and other genomic resources in several aquaculture species. Similarly, transcriptome profiling and miRNA analysis have been used in aquaculture research to identify key transcripts and to analyze the expression of candidate genes/miRNAs involved in reproduction, immunity, growth, development, stress toxicology, and disease. Metagenome analysis has emerged as a promising tool to analyze the complex genomes contained within microbial communities. Metagenomics has been successfully used in the aquaculture sector to identify novel and potential pathogens, antibiotic resistance genes, microbial roles in microcosms, microbial communities forming biofloc, and probiotics. In the current review, we discuss the application of high-throughput sequencing technologies (NGS) in the aquaculture sector.
Affiliation(s)
- Jitendra Kumar Sundaray
  - ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar 751002, Odisha, India
- Sangita Dixit
  - Centre for Biotechnology, School of Pharmaceutical Sciences, Siksha 'O' Anusandhan University (Deemed to be University), Bhubaneswar 751003, Odisha, India
- Ashraf Rather
  - Division of Fish Genetics and Biotechnology, College of Fisheries, Sher-e-Kashmir University of Agricultural Science and Technology, Rangil-Ganderbal 190006, Jammu and Kashmir, India
- Kiran D Rasal
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Fisheries Education, Versova, Mumbai 400 061, Maharashtra, India
- Lakshman Sahoo
  - ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar 751002, Odisha, India
4
Zhang HY, Zhao ZX, Xu J, Xu P, Bai QL, Yang SY, Jiang LK, Chen BH. Population genetic analysis of aquaculture salmonid populations in China using a 57K rainbow trout SNP array. PLoS One 2018; 13:e0202582. [PMID: 30118517 PMCID: PMC6097679 DOI: 10.1371/journal.pone.0202582] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/30/2018] [Accepted: 08/05/2018] [Indexed: 12/21/2022] Open
Abstract
Various salmonid species are cultivated in cold water aquaculture. However, due to limited genomic data resources, specific high-throughput genotyping tools are not available to many of the salmonid species. In this study, a 57K single nucleotide polymorphism (SNP) array for rainbow trout (Oncorhynchus mykiss) was utilized to detect polymorphisms in seven salmonid species, including Hucho taimen, Oncorhynchus masou, Salvelinus fontinalis, Brachymystax lenok, Salvelinus leucomaenis, O. kisutch, and O. mykiss. The number of polymorphic markers per population ranged from 3,844 (O. kisutch) to 53,734 (O. mykiss), indicating that the rainbow trout SNP array was applicable as a universal genotyping tool for other salmonid species. Among the six other salmonid populations from four genera, 28,882 SNPs were shared, whereas 525 SNPs were polymorphic in all four genera. The genetic diversity and population relationships of the seven salmonid species were studied by principal component analysis (PCA). The phylogenetic relationships among populations were analyzed using the maximum likelihood method, which indicated that the shared SNP markers provide reliable genomic information for population genetic analyses in common aquaculture salmonid fishes. Furthermore, this obtained genomic information may be applicable for population genetic evaluation, marker-assisted breeding, and propagative parent selection in fry production.
Affiliation(s)
- Han-Yuan Zhang
  - Key Laboratory of Aquatic Genomics, Ministry of Agriculture, Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, China
- Zi-Xia Zhao
  - Key Laboratory of Aquatic Genomics, Ministry of Agriculture, Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, China
  - * E-mail: (ZXZ); (PX)
- Jian Xu
  - Key Laboratory of Aquatic Genomics, Ministry of Agriculture, Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, China
- Peng Xu
  - Key Laboratory of Aquatic Genomics, Ministry of Agriculture, Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, China
  - Fujian Collaborative Innovation Center for Exploitation and Utilization of Marine Biological Resources, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
  - * E-mail: (ZXZ); (PX)
- Qing-Li Bai
  - Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin, China
- Shi-Yong Yang
  - College of Animal Science and Technology, Sichuan Agricultural University, Yaan, China
- Li-Kun Jiang
  - Key Laboratory of Aquatic Genomics, Ministry of Agriculture, Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, China
- Bao-Hua Chen
  - Key Laboratory of Aquatic Genomics, Ministry of Agriculture, Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, China
  - Fujian Collaborative Innovation Center for Exploitation and Utilization of Marine Biological Resources, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
5
Gao G, Nome T, Pearse DE, Moen T, Naish KA, Thorgaard GH, Lien S, Palti Y. A New Single Nucleotide Polymorphism Database for Rainbow Trout Generated Through Whole Genome Resequencing. Front Genet 2018; 9:147. [PMID: 29740479 PMCID: PMC5928233 DOI: 10.3389/fgene.2018.00147] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 02/13/2018] [Accepted: 04/09/2018] [Indexed: 11/13/2022] Open
Abstract
Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss), SNP discovery has previously been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL) and RNA sequencing. Recently we performed high-coverage whole genome resequencing of 61 unrelated samples representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway) that we previously used for SNP discovery. Of the 49 new samples, 11 were doubled haploid lines from Washington State University (WSU) and 38 represented wild and hatchery populations from a wide geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1), which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth at the locus, and number of genotyped samples. Results from the two variant calling programs were compared, and genotypes of the doubled haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs) and multi-sequence variants (MSVs). Overall, 30,302,087 SNPs were identified on the 29 chromosomes of the rainbow trout genome and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25). The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. Intra-population polymorphism assessment revealed a high level of polymorphism and heterozygosity within each population. We also provide functional annotation based on the genome position of each SNP and evaluate the use of clonal lines for filtering of PSVs and MSVs. These SNPs form a new database, which provides an important resource for a new high density SNP array design and for other SNP genotyping platforms used for genetic and genomics studies of this iconic salmonid fish species.
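The two density figures quoted in this abstract are the same quantity in different units, which a short Python check makes explicit. The function names are hypothetical, and the implied sequence span is only a rough back-calculation from the rounded "one SNP per 64 bp" figure, not a number reported by the authors.

```python
def snps_per_kb(bp_per_snp):
    """Convert average spacing (bp per SNP) into density (SNPs per 1 kb)."""
    return 1000.0 / bp_per_snp

def implied_span_bp(n_snps, bp_per_snp):
    """Approximate sequence length covered, given SNP count and average spacing."""
    return n_snps * bp_per_snp

# One SNP per 64 bp is 15.625 SNPs per kb, reported as 15.6
print(round(snps_per_kb(64), 1))

# 30,302,087 chromosomal SNPs at ~64 bp spacing imply roughly 1.94 Gb of sequence
print(implied_span_bp(30_302_087, 64) / 1e9)
```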
Affiliation(s)
- Guangtu Gao
  - National Center for Cool and Cold Water Aquaculture, ARS-USDA, Kearneysville, WV, United States
- Torfinn Nome
  - Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Centre of Integrative Genetics, Norwegian University of Life Sciences, Ås, Norway
- Devon E Pearse
  - Fisheries Ecology Division, Southwest Fisheries Science Center, National Marine Fisheries Service, Santa Cruz, CA, United States
- Kerry A Naish
  - School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, United States
- Gary H Thorgaard
  - School of Biological Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA, United States
- Sigbjørn Lien
  - Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Centre of Integrative Genetics, Norwegian University of Life Sciences, Ås, Norway
- Yniv Palti
  - National Center for Cool and Cold Water Aquaculture, ARS-USDA, Kearneysville, WV, United States
6
Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size. G3-GENES GENOMES GENETICS 2017; 7:3047-3058. [PMID: 28717047 PMCID: PMC5592930 DOI: 10.1534/g3.117.043083] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022]
Abstract
Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15-20 KYA.
7
A comparative integrated gene-based linkage and locus ordering by linkage disequilibrium map for the Pacific white shrimp, Litopenaeus vannamei. Sci Rep 2017; 7:10360. [PMID: 28871114 PMCID: PMC5583237 DOI: 10.1038/s41598-017-10515-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 05/09/2017] [Accepted: 08/09/2017] [Indexed: 11/23/2022] Open
Abstract
The Pacific whiteleg shrimp, Litopenaeus vannamei, is the most farmed aquaculture species worldwide with global production exceeding 3 million tonnes annually. Litopenaeus vannamei has been the focus of many selective breeding programs aiming to improve growth and disease resistance. However, these have been based primarily on phenotypic measurements and omit potential gains by integrating genetic selection into existing breeding programs. Such integration of genetic information has been hindered by the limited available genomic resources, background genetic parameters and knowledge on the genetic architecture of commercial traits for L. vannamei. This study describes the development of a comprehensive set of genomic gene-based resources including the identification and validation of 234,452 putative single nucleotide polymorphisms in-silico, of which 8,967 high value SNPs were incorporated into a commercially available Illumina Infinium ShrimpLD-24 v1.0 genotyping array. A framework genetic linkage map was constructed and combined with locus ordering by disequilibrium methodology to generate an integrated genetic map containing 4,817 SNPs, which spanned a total of 4552.5 cM and covered an estimated 98.12% of the genome. These gene-based genomic resources will not only be valuable for identifying regions underlying important L. vannamei traits, but also as a foundational resource in comparative and genome assembly activities.
8
Al-Tobasei R, Ali A, Leeds TD, Liu S, Palti Y, Kenney B, Salem M. Identification of SNPs associated with muscle yield and quality traits using allelic-imbalance analyses of pooled RNA-Seq samples in rainbow trout. BMC Genomics 2017; 18:582. [PMID: 28784089 PMCID: PMC5547479 DOI: 10.1186/s12864-017-3992-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/26/2016] [Accepted: 08/01/2017] [Indexed: 12/22/2022] Open
Abstract
Background: Coding/functional SNPs change the biological function of a gene and, therefore, could serve as “large-effect” genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic-imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait). Results: GATK detected 59,112 putative SNPs; of these SNPs, 4798 showed allelic imbalances (>2.0 as an amplification and <0.5 as loss of heterozygosity). SAMtools detected 87,066 putative SNPs; and of them, 4962 had allelic imbalances between the low- and high-ranked families. Only 1829 SNPs with allelic imbalances were common between the two datasets, indicating significant differences in algorithms. The two datasets contained 7930 non-redundant SNPs of which 4439 mapped to 1498 protein-coding genes (with 6.4% non-synonymous SNPs) and 684 mapped to 295 lncRNAs. Validation of a subset of 92 SNPs revealed 1) 86.7–93.8% success rate in calling polymorphic SNPs and 2) 95.4% consistent matching between DNA and cDNA genotypes indicating a high rate of identifying SNPs with allelic imbalances. In addition, 4.64% SNPs revealed random monoallelic expression. Genome distribution of the SNPs with allelic imbalances exhibited high density for all five traits in several chromosomes, especially chromosome 9, 20 and 28. Most of the SNP-harboring genes were assigned to important growth-related metabolic pathways. Conclusion: These results demonstrate utility of RNA-Seq in assessing phenotype-associated allelic imbalances in pooled RNA-Seq samples. The SNPs identified in this study were included in a new SNP-Chip design (available from Affymetrix) for genomic and genetic analyses in rainbow trout.
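The imbalance cut-offs quoted in this abstract (>2.0 as amplification, <0.5 as loss of heterozygosity) amount to a simple threshold rule, sketched below in Python. How the ratio itself is computed is not stated in the abstract, so the input is treated here as an already-computed allelic ratio between the high- and low-ranked pools; the function name and the "balanced" label for everything in between are assumptions made for illustration.

```python
def classify_allelic_ratio(ratio, high=2.0, low=0.5):
    """Label an allelic ratio using the two cut-offs quoted in the abstract.

    `ratio` is assumed to be precomputed; the "balanced" label is hypothetical.
    """
    if ratio > high:
        return "amplification"
    if ratio < low:
        return "loss of heterozygosity"
    return "balanced"

for r in (2.5, 1.0, 0.4):
    print(r, classify_allelic_ratio(r))
```

Values exactly at 2.0 or 0.5 fall in the middle class here, because the abstract states strict inequalities.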
Affiliation(s)
- Rafet Al-Tobasei
  - Computational Science Program, Middle Tennessee State University, Murfreesboro, TN, 37132, USA
- Ali Ali
  - Department of Biology and Molecular Biosciences Program, Middle Tennessee State University, Murfreesboro, TN, 37132, USA
- Timothy D Leeds
  - National Center for Cool and Cold Water Aquaculture, ARS-USDA, Kearneysville, WV, 25430, USA
- Sixin Liu
  - National Center for Cool and Cold Water Aquaculture, ARS-USDA, Kearneysville, WV, 25430, USA
- Yniv Palti
  - National Center for Cool and Cold Water Aquaculture, ARS-USDA, Kearneysville, WV, 25430, USA
- Brett Kenney
  - Division of Animal and Nutritional Sciences, West Virginia University, Morgantown, WV, 26506, USA
- Mohamed Salem
  - Computational Science Program, Middle Tennessee State University, Murfreesboro, TN, 37132, USA
  - Department of Biology and Molecular Biosciences Program, Middle Tennessee State University, Murfreesboro, TN, 37132, USA
9
Status and future perspectives of single nucleotide polymorphisms (SNPs) markers in farmed fishes: Way ahead using next generation sequencing. GENE REPORTS 2017. [DOI: 10.1016/j.genrep.2016.12.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/23/2022]
10
McKinney GJ, Waples RK, Seeb LW, Seeb JE. Paralogs are revealed by proportion of heterozygotes and deviations in read ratios in genotyping-by-sequencing data from natural populations. Mol Ecol Resour 2016; 17:656-669. [DOI: 10.1111/1755-0998.12613] [Citation(s) in RCA: 127] [Impact Index Per Article: 15.9] [Received: 05/11/2016] [Revised: 10/06/2016] [Accepted: 10/10/2016] [Indexed: 11/28/2022]
Affiliation(s)
- Garrett J. McKinney
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
- Ryan K. Waples
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
- Lisa W. Seeb
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
- James E. Seeb
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
11
Wang J, Xue DX, Zhang BD, Li YL, Liu BJ, Liu JX. Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus). PLoS One 2016; 11:e0157809. [PMID: 27336696 PMCID: PMC4919078 DOI: 10.1371/journal.pone.0157809] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/07/2016] [Accepted: 06/06/2016] [Indexed: 12/30/2022] Open
Abstract
Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow the identification of fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance that is widely distributed in the northwestern Pacific. A total of 22,648 SNPs were discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai. Shallow but significant genetic differentiation was detected between the two populations using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection covered a wide array of functions, including binding, catalytic and metabolic activities. The analyses with the SNPs developed in the present study highlight the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus.
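For readers unfamiliar with the FST values quoted in this abstract, the following Python sketch computes a basic heterozygosity-based FST, (HT − HS)/HT, for one biallelic SNP in two equally weighted populations. It is a textbook simplification and is not claimed to be the estimator used by the authors, which the abstract does not specify; the function name and example allele frequencies are hypothetical.

```python
def fst_two_pops(p1, p2):
    """Basic FST = (HT - HS) / HT for one biallelic locus in two populations.

    p1, p2 are the frequencies of the same allele in each population;
    both populations are weighted equally (an assumption of this sketch).
    """
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)                      # heterozygosity of the pooled population
    if ht == 0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation to measure
    return (ht - hs) / ht

print(fst_two_pops(0.4, 0.6))  # shallow differentiation
print(fst_two_pops(0.0, 1.0))  # fixed difference between populations
```

Genome-wide values are usually obtained by combining numerators and denominators across loci with a sample-size-corrected estimator, rather than by averaging per-locus ratios like this one.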
Affiliation(s)
- Juan Wang
  - Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
  - Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Dong-Xiu Xue
  - Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
  - Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Bai-Dong Zhang
  - Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
  - Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yu-Long Li
  - Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
  - Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
  - University of Chinese Academy of Sciences, Beijing, China
- Bing-Jian Liu
  - Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
  - Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jin-Xian Liu
  - Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
  - Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
  - * E-mail:
|
12
|
Sharpe RM, Koepke T, Harper A, Grimes J, Galli M, Satoh-Cruz M, Kalyanaraman A, Evans K, Kramer D, Dhingra A. CisSERS: Customizable In Silico Sequence Evaluation for Restriction Sites. PLoS One 2016; 11:e0152404. [PMID: 27071032 PMCID: PMC4829253 DOI: 10.1371/journal.pone.0152404] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/20/2015] [Accepted: 03/14/2016] [Indexed: 11/30/2022] Open
Abstract
High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzyme sites. Predicted agarose gel visualization of the custom analysis results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real-time agarose gel visualization. Using a simple FASTA-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a Java-based graphical user interface built around a Perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.
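The core idea of in silico restriction-site evaluation can be sketched in a few lines of Python. This is not CisSERS code and does not use REBASE; the function names, the toy sequence, and the hard-coded EcoRI site (recognition sequence GAATTC, cut after the first G) are assumptions made for illustration only.

```python
import re


def find_sites(seq, motif):
    """Return 0-based start positions of every occurrence of motif in seq.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


def digest(seq, motif, cut_offset):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the position of the cut within the motif
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = [pos + cut_offset for pos in find_sites(seq, motif)]
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Toy 19-bp sequence containing two EcoRI sites (hypothetical example data).
toy_seq = "AAGAATTCGGGGAATTCTT"
sites = find_sites(toy_seq, "GAATTC")
fragments = digest(toy_seq, "GAATTC", cut_offset=1)
print(sites)      # [2, 11]
print(fragments)  # [3, 9, 7]
```

Sorting the fragment lengths gives the band pattern one would expect on a gel; a SNP that destroys or creates a site changes that pattern, which is the principle behind CAPS markers.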
Affiliation(s)
- Richard M. Sharpe
- Molecular Plant Sciences Graduate Program, Washington State University, Pullman, Washington, United States of America
- School of Biological Sciences, Washington State University, Pullman, Washington, United States of America
| | - Tyson Koepke
- Molecular Plant Sciences Graduate Program, Washington State University, Pullman, Washington, United States of America
- Department of Horticulture, Washington State University, Pullman, Washington, United States of America
| | - Artemus Harper
- Department of Horticulture, Washington State University, Pullman, Washington, United States of America
| | - John Grimes
- Electrical Engineering and Computer Science, Washington State University, Pullman, Washington, United States of America
| | - Marco Galli
- Department of Horticulture, Washington State University, Pullman, Washington, United States of America
| | - Mio Satoh-Cruz
- MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, Michigan, United States of America
| | - Ananth Kalyanaraman
- Electrical Engineering and Computer Science, Washington State University, Pullman, Washington, United States of America
| | - Katherine Evans
- Department of Horticulture, Washington State University, Pullman, Washington, United States of America
| | - David Kramer
- MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, Michigan, United States of America
| | - Amit Dhingra
- Molecular Plant Sciences Graduate Program, Washington State University, Pullman, Washington, United States of America
- Department of Horticulture, Washington State University, Pullman, Washington, United States of America
- * E-mail:
|
13
|
Humble E, Martinez-Barrio A, Forcada J, Trathan PN, Thorne MAS, Hoffmann M, Wolf JBW, Hoffman JI. A draft fur seal genome provides insights into factors affecting SNP validation and how to mitigate them. Mol Ecol Resour 2016; 16:909-21. [DOI: 10.1111/1755-0998.12502] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 06/21/2015] [Revised: 12/01/2015] [Accepted: 12/15/2015] [Indexed: 01/19/2023]
Affiliation(s)
- E. Humble
- Department of Animal Behaviour; University of Bielefeld; Postfach 100131 33501 Bielefeld Germany
- British Antarctic Survey; High Cross, Madingley Road Cambridge CB3 OET UK
| | - A. Martinez-Barrio
- Science of Life Laboratories and Department of Cell and Molecular Biology; Uppsala University; Husargatan 3 75124 Uppsala Sweden
| | - J. Forcada
- British Antarctic Survey; High Cross, Madingley Road Cambridge CB3 OET UK
| | - P. N. Trathan
- British Antarctic Survey; High Cross, Madingley Road Cambridge CB3 OET UK
| | - M. A. S. Thorne
- British Antarctic Survey; High Cross, Madingley Road Cambridge CB3 OET UK
| | - M. Hoffmann
- Max Planck Institute for Developmental Biology; Spemannstrasse 35 72076 Tübingen Germany
| | - J. B. W. Wolf
- Science of Life Laboratories and Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Norbyvägen 18D 75236 Uppsala Sweden
| | - J. I. Hoffman
- Department of Animal Behaviour; University of Bielefeld; Postfach 100131 33501 Bielefeld Germany
|
14
|
Tsai HY, Hamilton A, Guy DR, Tinch AE, Bishop SC, Houston RD. Verification of SNPs Associated with Growth Traits in Two Populations of Farmed Atlantic Salmon. Int J Mol Sci 2015; 17:ijms17010005. [PMID: 26703584 PMCID: PMC4730252 DOI: 10.3390/ijms17010005] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 11/17/2015] [Revised: 12/11/2015] [Accepted: 12/11/2015] [Indexed: 01/17/2023] Open
Abstract
Understanding the relationship between genetic variants and traits of economic importance in aquaculture species is pertinent to selective breeding programmes. High-throughput sequencing technologies have enabled the discovery of large numbers of SNPs in Atlantic salmon, and high density SNP arrays now exist. A previous genome-wide association study (GWAS) using a high density SNP array (132K SNPs) has revealed the polygenic nature of early growth traits in salmon, but has also identified candidate SNPs showing suggestive associations with these traits. The aim of this study was to test the association of the candidate growth-associated SNPs in a separate population of farmed Atlantic salmon to verify their effects. Identifying SNP-trait associations in two populations provides evidence that the associations are true and robust. Using a large cohort (N = 1152), we successfully genotyped eight candidate SNPs from the previous GWAS, two of which were significantly associated with several growth and fillet traits measured at harvest. The genes proximal to these SNPs were identified by alignment to the salmon reference genome and are discussed in the context of their potential role in underpinning genetic variation in salmon growth.
Affiliation(s)
- Hsin Y Tsai
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, the University of Edinburgh, Midlothian EH25 9RG, UK.
| | - Alastair Hamilton
- Landcatch Natural Selection Ltd., 15 Beta Centre, Stirling University Innovation Park, Stirling FK9 4NF, UK.
| | - Derrick R Guy
- Landcatch Natural Selection Ltd., 15 Beta Centre, Stirling University Innovation Park, Stirling FK9 4NF, UK.
| | - Alan E Tinch
- Landcatch Natural Selection Ltd., 15 Beta Centre, Stirling University Innovation Park, Stirling FK9 4NF, UK.
| | - Steve C Bishop
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, the University of Edinburgh, Midlothian EH25 9RG, UK.
| | - Ross D Houston
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, the University of Edinburgh, Midlothian EH25 9RG, UK.
|
15
|
Ali SS, Shao J, Strem MD, Phillips-Mora W, Zhang D, Meinhardt LW, Bailey BA. Combination of RNAseq and SNP nanofluidic array reveals the center of genetic diversity of cacao pathogen Moniliophthora roreri in the upper Magdalena Valley of Colombia and its clonality. Front Microbiol 2015; 6:850. [PMID: 26379633 PMCID: PMC4550789 DOI: 10.3389/fmicb.2015.00850] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/26/2015] [Accepted: 08/04/2015] [Indexed: 01/11/2023] Open
Abstract
Moniliophthora roreri is the fungal pathogen that causes frosty pod rot (FPR) disease of Theobroma cacao L., the source of chocolate. FPR occurs in most of the cacao-producing countries in the Western Hemisphere, causing yield losses of up to 80%. Genetic diversity within the FPR pathogen population may allow the population to adapt to changing environmental conditions and to enhanced resistance in the host plant. The present study developed single nucleotide polymorphism (SNP) markers from RNASeq results for 13 M. roreri isolates and validated the markers for their ability to reveal genetic diversity in an international M. roreri collection. The SNP resources reported herein represent the first RNA sequencing (RNASeq)-derived SNP validation in M. roreri and demonstrate the utility of RNASeq as an approach for de novo SNP identification in M. roreri. A total of 88 polymorphic SNPs were used to evaluate the genetic diversity of 172 M. roreri cacao isolates, resulting in 37 distinct genotypes (including 14 synonymous groups). Absence of heterozygosity for the 88 SNP markers indicates that reproduction in M. roreri is clonal and likely due to a homothallic life style. The upper Magdalena Valley of Colombia showed the highest levels of genetic diversity, with 20 distinct genotypes of which 13 were limited to this region, indicating this region as the possible center of origin for M. roreri.
Affiliation(s)
- Shahin S Ali
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, United States Department of Agriculture/Agricultural Research Service, Beltsville Agricultural Research Center-West Beltsville, MD, USA
| | - Jonathan Shao
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, United States Department of Agriculture/Agricultural Research Service, Beltsville Agricultural Research Center-West Beltsville, MD, USA
| | - Mary D Strem
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, United States Department of Agriculture/Agricultural Research Service, Beltsville Agricultural Research Center-West Beltsville, MD, USA
| | - Wilberth Phillips-Mora
- Departamento de Agricultura y Agroforestería, Centro Agronómico Tropical de Investigación y Enseñanza, Turrialba, Costa Rica
| | - Dapeng Zhang
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, United States Department of Agriculture/Agricultural Research Service, Beltsville Agricultural Research Center-West Beltsville, MD, USA
| | - Lyndel W Meinhardt
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, United States Department of Agriculture/Agricultural Research Service, Beltsville Agricultural Research Center-West Beltsville, MD, USA
| | - Bryan A Bailey
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, United States Department of Agriculture/Agricultural Research Service, Beltsville Agricultural Research Center-West Beltsville, MD, USA
|
16
|
Starks HA, Clemento AJ, Garza JC. Discovery and characterization of single nucleotide polymorphisms in coho salmon, Oncorhynchus kisutch. Mol Ecol Resour 2015; 16:277-87. [PMID: 25965351 DOI: 10.1111/1755-0998.12430] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 11/19/2014] [Revised: 04/21/2015] [Accepted: 04/23/2015] [Indexed: 11/30/2022]
Abstract
Molecular population genetic analyses have become an integral part of ecological investigation and population monitoring for conservation and management. Microsatellites have been the molecular marker of choice for such applications over the last several decades, but single nucleotide polymorphism (SNP) markers are rapidly expanding beyond model organisms. Coho salmon (Oncorhynchus kisutch) is native to the north Pacific Ocean and its tributaries, where it is the focus of intensive fishery and conservation activities. As it is an anadromous species, coho salmon typically migrate across multiple jurisdictional boundaries, complicating management and requiring shared data collection methods. Here, we describe the discovery and validation of a suite of novel SNPs and associated genotyping assays that can be used in genetic analyses of this species. These assays include 91 that are polymorphic in the species and one that discriminates it from a sister species, Chinook salmon. We demonstrate the utility of these SNPs for population assignment and phylogeographic analyses, and map them against the draft trout genome. The markers constitute a large majority of all SNP markers described for coho salmon and will enable both population- and pedigree-based analyses across the southern part of the species' native range.
Affiliation(s)
- Hilary A Starks
- Southwest Fisheries Science Center, National Marine Fisheries Service and University of California, Santa Cruz, 110 Shaffer Rd, Santa Cruz, CA, 95060, USA
| | - Anthony J Clemento
- Southwest Fisheries Science Center, National Marine Fisheries Service and University of California, Santa Cruz, 110 Shaffer Rd, Santa Cruz, CA, 95060, USA
| | - John Carlos Garza
- Southwest Fisheries Science Center, National Marine Fisheries Service and University of California, Santa Cruz, 110 Shaffer Rd, Santa Cruz, CA, 95060, USA
|
17
|
She Z, Li L, Qi H, Song K, Que H, Zhang G. Candidate Gene Polymorphisms and their Association with Glycogen Content in the Pacific Oyster Crassostrea gigas. PLoS One 2015; 10:e0124401. [PMID: 25951187 PMCID: PMC4423957 DOI: 10.1371/journal.pone.0124401] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/31/2014] [Accepted: 03/13/2015] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND The Pacific oyster Crassostrea gigas is an important cultivated shellfish that is rich in nutrients. It contains high levels of glycogen, which is of high nutritional value. To investigate the genetic basis of this high glycogen content and its variation, we conducted a candidate gene association analysis using a wild population, and confirmed our results using an independent population, via targeted gene resequencing and mRNA expression analysis. RESULTS We validated 295 SNPs in the 90 candidate genes surveyed for association with glycogen content, 86 of which were ultimately genotyped in all 144 experimental individuals from Jiaonan (JN). In addition, 732 SNPs were genotyped via targeted gene resequencing. Two SNPs (Cg_SNP_TY202 and Cg_SNP_3021) in Cg_GD1 (glycogen debranching enzyme) and one SNP (Cg_SNP_4) in Cg_GP1 (glycogen phosphorylase) were identified as being associated with glycogen content. The glycogen content of individuals with genotypes TT and TC in Cg_SNP_TY202 was higher than that of individuals with genotype CC. Both glycogen-associated genes were differentially expressed between high- and low-glycogen-content individuals. CONCLUSIONS This study identified three polymorphisms in two genes associated with oyster glycogen content, via candidate gene association analysis. The transcript abundance differences in Cg_GD1 and Cg_GP1 between low- and high-glycogen-content individuals suggest that transcript regulation may be mediated by variation at Cg_SNP_TY202, Cg_SNP_3021, and Cg_SNP_4. These findings will not only provide insights into the genetic basis of oyster quality, but also promote research into the molecular breeding of oysters.
Affiliation(s)
- Zhicai She
- National & Local Joint Engineering Laboratory of Ecological Mariculture, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Li Li
- National & Local Joint Engineering Laboratory of Ecological Mariculture, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
| | - Haigang Qi
- National & Local Joint Engineering Laboratory of Ecological Mariculture, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
| | - Kai Song
- National & Local Joint Engineering Laboratory of Ecological Mariculture, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
| | - Huayong Que
- National & Local Joint Engineering Laboratory of Ecological Mariculture, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
| | - Guofan Zhang
- National & Local Joint Engineering Laboratory of Ecological Mariculture, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
|
18
|
Salem M, Paneru B, Al-Tobasei R, Abdouni F, Thorgaard GH, Rexroad CE, Yao J. Transcriptome assembly, gene annotation and tissue gene expression atlas of the rainbow trout. PLoS One 2015; 10:e0121778. [PMID: 25793877 PMCID: PMC4368115 DOI: 10.1371/journal.pone.0121778] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 09/06/2014] [Accepted: 02/04/2015] [Indexed: 11/25/2022] Open
Abstract
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler, yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome.
Affiliation(s)
- Mohamed Salem
- Department of Biology, Middle Tennessee State University, Murfreesboro, Tennessee, 37132, United States of America
- * E-mail:
| | - Bam Paneru
- Department of Biology, Middle Tennessee State University, Murfreesboro, Tennessee, 37132, United States of America
| | - Rafet Al-Tobasei
- Department of Biology, Middle Tennessee State University, Murfreesboro, Tennessee, 37132, United States of America
| | - Fatima Abdouni
- Department of Biology, Middle Tennessee State University, Murfreesboro, Tennessee, 37132, United States of America
| | - Gary H. Thorgaard
- School of Biological Sciences and Center for Reproductive Biology, Washington State University, Pullman, Washington 99164, United States of America
| | - Caird E. Rexroad
- The National Center for Cool and Cold Water Aquaculture, USDA Agricultural Research Service, Leetown, West Virginia 25430, United States of America
| | - Jianbo Yao
- Division of Animal and Nutritional Sciences, West Virginia University, Morgantown, West Virginia, 26506, United States of America
|
19
|
Zhang N, Zhang L, Tao Y, Guo L, Sun J, Li X, Zhao N, Peng J, Li X, Zeng L, Chen J, Yang G. Construction of a high density SNP linkage map of kelp (Saccharina japonica) by sequencing Taq I site associated DNA and mapping of a sex determining locus. BMC Genomics 2015; 16:189. [PMID: 25887315 PMCID: PMC4369078 DOI: 10.1186/s12864-015-1371-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 10/16/2014] [Accepted: 02/20/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Kelp (Saccharina japonica) has been intensively cultured in China for almost a century. Its genetic improvement is comparable with that of rice. However, the development of molecular tools for kelp has been extremely limited, and so has knowledge of its genes, genetics and genomics. Kelp has an alternating life cycle in which the sporophyte generation alternates with the gametophyte generation. The gametophytes of kelp can be cloned and crossed. Due to these characteristics, kelp may serve as a reference for the biological and genetic studies of Volvox, mosses and ferns. RESULTS We constructed a high density single nucleotide polymorphism (SNP) linkage map for kelp by restriction site associated DNA (RAD) sequencing. In total, 4,994 SNP-containing physical (tag-defined) RAD loci were mapped on 31 linkage groups. The map spanned a total genetic distance of 1,782.75 cM, covering 98.66% of the expected length (1,806.94 cM). The length of RAD tags (85 bp) was extended to 400-500 bp with the MiSeq method, which eases the development of SNP chips and the shift of SNP genotyping to a high-throughput track. The number of linkage groups agreed with that documented by cytological methods. In addition, we identified a set of microsatellites (99 in total) from the extended RAD tags. A gametophyte sex determining locus was mapped on linkage group 2 in a window about 9.0 cM wide, extending 2.66 cM up to marker_40567 and 6.42 cM down to marker_23595. CONCLUSIONS A high density SNP linkage map was constructed for kelp, an intensively cultured brown alga in China. The RAD tags were also extended so that a SNP chip could be developed. In addition, a set of microsatellites was identified among mapped loci, and a gametophyte sex determining locus was mapped. This map will facilitate genetic studies of kelp, including, for example, the evaluation of germplasm and the deciphering of the genetic bases of economic traits.
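The map-coverage and sex-locus window figures in the abstract can be checked with simple arithmetic. The short Python sketch below only re-derives numbers already quoted above; the variable names are illustrative and nothing here comes from the authors' own analysis code.

```python
# Figures quoted in the abstract.
observed_cm = 1782.75   # total genetic distance of the map
expected_cm = 1806.94   # expected genome length in cM
upstream_cm = 2.66      # distance from the sex locus up to marker_40567
downstream_cm = 6.42    # distance from the sex locus down to marker_23595

# Coverage is the observed map length as a fraction of the expected length.
coverage_pct = observed_cm / expected_cm * 100

# The window around the sex-determining locus is the sum of both flanks.
window_cm = upstream_cm + downstream_cm

print(f"coverage: {coverage_pct:.2f}%")          # coverage: 98.66%
print(f"sex-locus window: {window_cm:.2f} cM")   # sex-locus window: 9.08 cM
```

The summed flanking distances (9.08 cM) are consistent with the "about 9.0 cM" window stated in the abstract.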
Affiliation(s)
- Ning Zhang
- Laboratory of Marine Genetics and Breeding, Ocean University of China, Qingdao, 266003, China.
- Institute of Evolution and Marine Biodiversity, Ocean University of China, Qingdao, 266003, China.
- College of Marine Life Sciences, Ocean University of China, Qingdao, 266003, China.
| | - Linan Zhang
- National Engineering Science Research & Development Center of Algae and Sea Cucumbers of China; Provincial Key Laboratory of Genetic Improvement & Efficient Culture of Marine Algae of Shandong, Shandong Oriental Ocean Sci-tech Co., Ltd, Yantai, Shandong, 264003, China.
| | - Ye Tao
- Majorbio Pharm Technology Co., Ltd, Shanghai, 201203, China.
| | - Li Guo
- Laboratory of Marine Genetics and Breeding, Ocean University of China, Qingdao, 266003, China.
- Institute of Evolution and Marine Biodiversity, Ocean University of China, Qingdao, 266003, China.
- College of Marine Life Sciences, Ocean University of China, Qingdao, 266003, China.
| | - Juan Sun
- National Engineering Science Research & Development Center of Algae and Sea Cucumbers of China; Provincial Key Laboratory of Genetic Improvement & Efficient Culture of Marine Algae of Shandong, Shandong Oriental Ocean Sci-tech Co., Ltd, Yantai, Shandong, 264003, China.
| | - Xia Li
- National Engineering Science Research & Development Center of Algae and Sea Cucumbers of China; Provincial Key Laboratory of Genetic Improvement & Efficient Culture of Marine Algae of Shandong, Shandong Oriental Ocean Sci-tech Co., Ltd, Yantai, Shandong, 264003, China.
| | - Nan Zhao
- National Engineering Science Research & Development Center of Algae and Sea Cucumbers of China; Provincial Key Laboratory of Genetic Improvement & Efficient Culture of Marine Algae of Shandong, Shandong Oriental Ocean Sci-tech Co., Ltd, Yantai, Shandong, 264003, China.
| | - Jie Peng
- National Engineering Science Research & Development Center of Algae and Sea Cucumbers of China; Provincial Key Laboratory of Genetic Improvement & Efficient Culture of Marine Algae of Shandong, Shandong Oriental Ocean Sci-tech Co., Ltd, Yantai, Shandong, 264003, China.
| | - Xiaojie Li
- National Engineering Science Research & Development Center of Algae and Sea Cucumbers of China; Provincial Key Laboratory of Genetic Improvement & Efficient Culture of Marine Algae of Shandong, Shandong Oriental Ocean Sci-tech Co., Ltd, Yantai, Shandong, 264003, China.
| | - Liang Zeng
- Majorbio Pharm Technology Co., Ltd, Shanghai, 201203, China.
| | - Jinsa Chen
- Majorbio Pharm Technology Co., Ltd, Shanghai, 201203, China.
| | - Guanpin Yang
- Laboratory of Marine Genetics and Breeding, Ocean University of China, Qingdao, 266003, China.
- Institute of Evolution and Marine Biodiversity, Ocean University of China, Qingdao, 266003, China.
- College of Marine Life Sciences, Ocean University of China, Qingdao, 266003, China.
|
20
|
Sun Jian, He Shan, Liang Xufang, Li Ling, Wen Zhengyong, Zhu Tao, Shen Dan. Identification of SNPs in NPY and LEP and the association with food habit domestication traits in mandarin fish. J Genet 2014. [DOI: 10.1007/s12041-014-0442-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
|
21
|
Yáñez JM, Houston RD, Newman S. Genetics and genomics of disease resistance in salmonid species. Front Genet 2014; 5:415. [PMID: 25505486 PMCID: PMC4245001 DOI: 10.3389/fgene.2014.00415] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 09/26/2014] [Accepted: 11/06/2014] [Indexed: 11/15/2022] Open
Abstract
Infectious and parasitic diseases generate large economic losses in salmon farming. Genetic improvement for disease resistance may be a feasible and sustainable alternative for preventing disease outbreaks. To include disease resistance in the breeding goal, prior knowledge of the levels of genetic variation for these traits is required. Furthermore, information on the genetic architecture and molecular factors involved in resistance against diseases may be used to accelerate genetic progress for these traits. In this regard, marker-assisted selection and genomic selection are approaches that incorporate molecular information to increase accuracy when predicting the genetic merit of selection candidates. In this article we review and discuss key aspects related to disease resistance in salmonid species, from both a genetic and a genomic perspective, with emphasis on the applicability of disease resistance traits in salmonid breeding programs.
Affiliation(s)
- José M Yáñez
- Faculty of Veterinary and Animal Sciences, University of Chile, Santiago, Chile; Aquainnovo, Puerto Montt, Chile
| | - Ross D Houston
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, UK
|
22
|
Palti Y, Gao G, Liu S, Kent MP, Lien S, Miller MR, Rexroad CE, Moen T. The development and characterization of a 57K single nucleotide polymorphism array for rainbow trout. Mol Ecol Resour 2014; 15:662-72. [PMID: 25294387 DOI: 10.1111/1755-0998.12337] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Received: 07/06/2014] [Revised: 09/23/2014] [Accepted: 09/26/2014] [Indexed: 11/30/2022]
Abstract
In this study, we describe the development and characterization of the first high-density single nucleotide polymorphism (SNP) genotyping array for rainbow trout. The SNP array is publicly available from a commercial vendor (Affymetrix). The SNP genotyping quality was high, and the validation rate was close to 90%. This is comparable to other farm animals and is much higher than in previous smaller-scale SNP validation studies in rainbow trout. High quality and integrity of the genotypes are evident from sample reproducibility and from nearly 100% agreement with genotyping results from other methods. The array is very useful for rainbow trout aquaculture populations, with more than 40 900 polymorphic markers per population. For wild populations that were confounded by a smaller sample size, the number of polymorphic markers was between 10 577 and 24 330. Comparison between genotypes from individual populations suggests good potential for identifying candidate markers for population traceability. Linkage analysis and mapping of the SNPs to the reference genome assembly provide strong evidence for a wide distribution throughout the genome with good representation in all 29 chromosomes. A total of 68% of the genome scaffolds and contigs were anchored through linkage analysis using the SNP array genotypes, including ~20% of the genome assembly that had not been previously anchored to chromosomes.
Affiliation(s)
- Y Palti
- National Center for Cool and Cold Water Aquaculture, ARS-USDA, 11861 Leetown Road, Kearneysville, WV, 25430, USA
|
23
|
Ali A, Rexroad CE, Thorgaard GH, Yao J, Salem M. Characterization of the rainbow trout spleen transcriptome and identification of immune-related genes. Front Genet 2014; 5:348. [PMID: 25352861 PMCID: PMC4196580 DOI: 10.3389/fgene.2014.00348] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 07/08/2014] [Accepted: 09/16/2014] [Indexed: 11/13/2022] Open
Abstract
Resistance against diseases affects profitability of rainbow trout. Limited information is available about functions and mechanisms of teleost immune pathways. Immunogenomics provides powerful tools to determine disease resistance genes/gene pathways and develop genetic markers for genomic selection. RNA-Seq sequencing of the rainbow trout spleen yielded 93,532,200 reads (100 bp). High quality reads were assembled into 43,047 contigs. 26,333 (61.17%) of the contigs had hits to the NR protein database and 7024 (16.32%) had hits to the KEGG database. Gene ontology showed significant percentages of transcripts assigned to binding (51%), signaling (7%), response to stimuli (9%) and receptor activity (4%) suggesting existence of many immune-related genes. KEGG annotation revealed 2825 sequences belonging to "organismal systems" with the highest number of sequences, 842 (29.81%), assigned to immune system. A number of sequences were identified for the first time in rainbow trout belonging to Toll-like receptor signaling (35), B cell receptor signaling pathway (44), T cell receptor signaling pathway (56), chemokine signaling pathway (73), Fc gamma R-mediated phagocytosis (52), leukocyte transendothelial migration (60) and NK cell mediated cytotoxicity (42). In addition, 51 transcripts were identified as spleen-specific genes. The list includes 277 full-length cDNAs. The presence of a large number of immune-related genes and pathways similar to other vertebrates suggests that innate and adaptive immunity in fish are conserved. This study provides deep-sequence data of rainbow trout spleen transcriptome and identifies many new immune-related genes and full-length cDNAs. This data will help identify allelic variations suitable for genomic selection and genetic manipulation in aquaculture.
Affiliation(s)
- Ali Ali
- Department of Biology, Middle Tennessee State University Murfreesboro, TN, USA; Department of Zoology, Faculty of Science, Benha University Benha, Egypt
| | - Caird E Rexroad
- The National Center for Cool and Cold Water Aquaculture, United States Department of Agriculture Agricultural Research Service Leetown, WV USA
| | - Gary H Thorgaard
- School of Biological Sciences, Washington State University Pullman, WA, USA
| | - Jianbo Yao
- Division of Animal and Nutritional Science, West Virginia University Morgantown, WV, USA
| | - Mohamed Salem
- Department of Biology, Middle Tennessee State University Murfreesboro, TN, USA; Division of Animal and Nutritional Science, West Virginia University Morgantown, WV, USA
|
24
|
Ma RQ, He F, Wen HS, Li JF, Mu WJ, Liu M, Zhang YQ, Hu J, Qun L. Polymorphysims of CYP17-I Gene in the Exons Were Associated with the Reproductive Endocrine of Japanese Flounder (Paralichthys olivaceus). Asian-Australasian Journal of Animal Sciences 2014; 25:794-9. [PMID: 25049628 PMCID: PMC4093092 DOI: 10.5713/ajas.2011.11489] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/16/2011] [Revised: 02/03/2012] [Accepted: 01/17/2012] [Indexed: 11/27/2022]
Abstract
The cytochrome P450c17-I (CYP17-I) is one of the enzymes critical to gonadal development and the synthesis of androgens. Two single nucleotide polymorphisms (SNPs) were detected within the coding region of the CYP17-I gene in a population of 75 male Japanese flounder (Paralichthys olivaceus). They were SNP1 (c.C445T) located in exon2 and SNP2 (c.T980C (p.Phe307Leu)) located in exon5. Four physiological indices, which were serum testosterone (T), serum 17β-estradiol (E2), Hepatosomatic index (HSI), and Gonadosomatic index (GSI), were studied to examine the effect of the two SNPs on the reproductive endocrines of Japanese flounder. Multiple comparisons revealed that CT genotype of SNP1 had a much lower T level than CC genotype (p<0.05) and the GSI of individuals with CC genotype of SNP2 was higher than those with TT genotype (p<0.05). Four diplotypes were constructed based on the two SNPs and the diplotype D3 had a significantly lower T level and GSI. In conclusion, the two SNPs were significantly associated with reproductive traits of Japanese flounder.
|
25
|
Matala AP, Ackerman MW, Campbell MR, Narum SR. Relative contributions of neutral and non-neutral genetic differentiation to inform conservation of steelhead trout across highly variable landscapes. Evol Appl 2014; 7:682-701. [PMID: 25067950 PMCID: PMC4105918 DOI: 10.1111/eva.12174] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 09/09/2013] [Accepted: 05/06/2014] [Indexed: 12/25/2022] Open
Abstract
Mounting evidence of climatic effects on riverine environments and adaptive responses of fishes has elicited growing conservation concerns. Measures to rectify population declines include assessment of local extinction risk, population ecology, viability, and genetic differentiation. While conservation planning has been largely informed by neutral genetic structure, there has been a dearth of critical information regarding the role of non-neutral or functional genetic variation. We evaluated genetic variation among steelhead trout of the Columbia River Basin, which supports diverse populations distributed among dynamic landscapes. We categorized 188 SNP loci as either putatively neutral or candidates for divergent selection (non-neutral) using a multitest association approach. Neutral variation distinguished lineages and defined broad-scale population structure consistent with previous studies, but fine-scale resolution was also detected at levels not previously observed. Within distinct coastal and inland lineages, we identified nine and 22 candidate loci commonly associated with precipitation or temperature variables and putatively under divergent selection. Observed patterns of non-neutral variation suggest overall climate is likely to shape local adaptation (e.g., potential rapid evolution) of steelhead trout in the Columbia River region. Broad geographic patterns of neutral and non-neutral variation demonstrated here can be used to accommodate priorities for regional management and inform long-term conservation of this species.
Affiliation(s)
- Andrew P Matala
- Columbia River Inter-Tribal Fish Commission Hagerman, ID, USA
| | - Michael W Ackerman
- Eagle Fish Genetic Laboratory, Pacific States Marine Fisheries Commission Eagle, ID, USA
| | - Matthew R Campbell
- Eagle Fish Genetic Laboratory, Idaho Department of Fish and Game Eagle, ID, USA
| | - Shawn R Narum
- Columbia River Inter-Tribal Fish Commission Hagerman, ID, USA
|
26
|
Gomez-Uchida D, Seeb LW, Warheit KI, McKinney GJ, Seeb JE. Deep sequencing of the transcriptome and mining of single nucleotide polymorphisms (SNPs) provide genomic resources for applied studies in Chinook salmon (Oncorhynchus tshawytscha). Conserv Genet Resour 2014. [DOI: 10.1007/s12686-014-0235-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
|
27
|
Peñalba JV, Smith LL, Tonione MA, Sass C, Hykin SM, Skipwith PL, McGuire JA, Bowie RCK, Moritz C. Sequence capture using PCR-generated probes: a cost-effective method of targeted high-throughput sequencing for nonmodel organisms. Mol Ecol Resour 2014; 14:1000-10. [DOI: 10.1111/1755-0998.12249] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 07/08/2013] [Revised: 02/11/2014] [Accepted: 02/12/2014] [Indexed: 11/27/2022]
Affiliation(s)
- Joshua V. Peñalba
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
| | - Lydia L. Smith
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
- Department of Integrative Biology; University of California; 3060 Valley Life Sciences Building Berkeley CA 94720 USA
| | - Maria A. Tonione
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
- Department of Environmental Science Policy and Management; University of California; 130 Mulford Hall Berkeley CA 94720 USA
| | - Chodon Sass
- UC and Jepson Herbarium; University of California; 411 Koshland Hall MC 3102 Berkeley CA 94720 USA
| | - Sarah M. Hykin
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
| | - Phillip L. Skipwith
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
| | - Jimmy A. McGuire
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
- Department of Integrative Biology; University of California; 3060 Valley Life Sciences Building Berkeley CA 94720 USA
| | - Rauri C. K. Bowie
- Museum of Vertebrate Zoology; University of California; 3101 Valley Life Sciences Building Berkeley CA 94720 USA
- Department of Integrative Biology; University of California; 3060 Valley Life Sciences Building Berkeley CA 94720 USA
| | - Craig Moritz
- Department of Integrative Biology; University of California; 3060 Valley Life Sciences Building Berkeley CA 94720 USA
- Research School of Biology; The Australian National University; Building 116 Acton ACT 0200 Australia
|
28
|
Pearse DE, Miller MR, Abadía-Cardoso A, Garza JC. Rapid parallel evolution of standing variation in a single, complex, genomic region is associated with life history in steelhead/rainbow trout. Proc Biol Sci 2014; 281:20140012. [PMID: 24671976 DOI: 10.1098/rspb.2014.0012] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Indexed: 12/22/2022] Open
Abstract
Rapid adaptation to novel environments may drive changes in genomic regions through natural selection. Such changes may be population-specific or, alternatively, may involve parallel evolution of the same genomic region in multiple populations, if that region contains genes or co-adapted gene complexes affecting the selected trait(s). Both quantitative and population genetic approaches have identified associations between specific genomic regions and the anadromous (steelhead) and resident (rainbow trout) life-history strategies of Oncorhynchus mykiss. Here, we use genotype data from 95 single nucleotide polymorphisms and show that the distribution of variation in a large region of one chromosome, Omy5, is strongly associated with life-history differentiation in multiple above-barrier populations of rainbow trout and their anadromous steelhead ancestors. The associated loci are in strong linkage disequilibrium, suggesting the presence of a chromosomal inversion or other rearrangement limiting recombination. These results provide the first evidence of a common genomic basis for life-history variation in O. mykiss in a geographically diverse set of populations and extend our knowledge of the heritable basis of rapid adaptation of complex traits in novel habitats.
Affiliation(s)
- Devon E Pearse
- Fisheries Ecology Division, Southwest Fisheries Science Center, National Marine Fisheries Service, 110 Shaffer Road, Santa Cruz, CA 95060, USA; Institute of Marine Sciences, University of California, Santa Cruz, CA 95060, USA; Institute of Molecular Biology, University of Oregon, Eugene, OR 97403, USA; Department of Animal Science, University of California, One Shields Avenue, Davis, CA 95616, USA
|
29
|
Halley YA, Dowd SE, Decker JE, Seabury PM, Bhattarai E, Johnson CD, Rollins D, Tizard IR, Brightsmith DJ, Peterson MJ, Taylor JF, Seabury CM. A draft de novo genome assembly for the northern bobwhite (Colinus virginianus) reveals evidence for a rapid decline in effective population size beginning in the Late Pleistocene. PLoS One 2014; 9:e90240. [PMID: 24621616 PMCID: PMC3951200 DOI: 10.1371/journal.pone.0090240] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/01/2013] [Accepted: 01/27/2014] [Indexed: 11/20/2022] Open
Abstract
Wild populations of northern bobwhites (Colinus virginianus; hereafter bobwhite) have declined across nearly all of their U.S. range, and despite their importance as an experimental wildlife model for ecotoxicology studies, no bobwhite draft genome assembly currently exists. Herein, we present a bobwhite draft de novo genome assembly with annotation, comparative analyses including genome-wide analyses of divergence with the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata) genomes, and coalescent modeling to reconstruct the demographic history of the bobwhite for comparison to other birds currently in decline (i.e., scarlet macaw; Ara macao). More than 90% of the assembled bobwhite genome was captured within <40,000 final scaffolds (N50 = 45.4 Kb) despite evidence for approximately 3.22 heterozygous polymorphisms per Kb, and three annotation analyses produced evidence for >14,000 unique genes and proteins. Bobwhite analyses of divergence with the chicken and zebra finch genomes revealed many extremely conserved gene sequences, and evidence for lineage-specific divergence of noncoding regions. Coalescent models for reconstructing the demographic history of the bobwhite and the scarlet macaw provided evidence for population bottlenecks which were temporally coincident with human colonization of the New World, the late Pleistocene collapse of the megafauna, and the last glacial maximum. Demographic trends predicted for the bobwhite and the scarlet macaw also were concordant with how opposing natural selection strategies (i.e., skewness in the r-/K-selection continuum) would be expected to shape genome diversity and the effective population sizes in these species, which is directly relevant to future conservation efforts.
Affiliation(s)
- Yvette A. Halley
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas, United States of America
| | - Scot E. Dowd
- Molecular Research LP, Shallowater, Texas, United States of America
| | - Jared E. Decker
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Paul M. Seabury
- ElanTech Inc., Greenbelt, Maryland, United States of America
| | - Eric Bhattarai
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas, United States of America
| | - Charles D. Johnson
- Genomics and Bioinformatics Core, Texas A&M AgriLife Research, College Station, Texas, United States of America
| | - Dale Rollins
- Rolling Plains Quail Research Ranch, Rotan, Texas, United States of America
| | - Ian R. Tizard
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas, United States of America
| | - Donald J. Brightsmith
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas, United States of America
| | - Markus J. Peterson
- Department of Wildlife and Fisheries Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Jeremy F. Taylor
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Christopher M. Seabury
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas, United States of America
|
30
|
Liu S, Sun L, Li Y, Sun F, Jiang Y, Zhang Y, Zhang J, Feng J, Kaltenboeck L, Kucuktas H, Liu Z. Development of the catfish 250K SNP array for genome-wide association studies. BMC Res Notes 2014; 7:135. [PMID: 24618043 PMCID: PMC3995806 DOI: 10.1186/1756-0500-7-135] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 12/11/2013] [Accepted: 02/28/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Quantitative traits, such as disease resistance, are most often controlled by a set of genes involving a complex array of regulation. The dissection of the genetic basis of quantitative traits requires large numbers of genetic markers with good genome coverage. The application of next-generation sequencing technologies has allowed discovery of over eight million SNPs in catfish, but the challenge remains as to how to efficiently and economically use such SNP resources for genetic analysis. RESULTS: In this work, we developed a catfish 250K SNP array using Affymetrix Axiom genotyping technology. The SNPs were obtained from multiple sources including gene-associated SNPs, anonymous genomic SNPs, and inter-specific SNPs. A set of 640K high-quality SNPs obtained following specific requirements of array design were submitted. A panel of 250,113 SNPs was finalized for inclusion on the array. The performance evaluated by genotyping individuals from wild populations and backcross families suggested the good utility of the catfish 250K SNP array. CONCLUSIONS: This is the first high-density SNP array for catfish. The array should be a valuable resource for genome-wide association studies (GWAS), fine QTL mapping, high-density linkage map construction, haplotype analysis, and whole genome-based selection.
Affiliation(s)
- Zhanjiang Liu
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences, and Program of Cell and Molecular Biosciences, Auburn University, Auburn, AL 36849, USA.
|
31
|
Lapègue S, Harrang E, Heurtebise S, Flahauw E, Donnadieu C, Gayral P, Ballenghien M, Genestout L, Barbotte L, Mahla R, Haffray P, Klopp C. Development of SNP-genotyping arrays in two shellfish species. Mol Ecol Resour 2014; 14:820-30. [DOI: 10.1111/1755-0998.12230] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 10/10/2013] [Revised: 12/26/2013] [Accepted: 01/08/2014] [Indexed: 11/30/2022]
Affiliation(s)
- S. Lapègue
- Ifremer; SG2M-LGPMM; Laboratoire de Génétique et Pathologie des Mollusques Marins; La Tremblade France
| | - E. Harrang
- Ifremer; SG2M-LGPMM; Laboratoire de Génétique et Pathologie des Mollusques Marins; La Tremblade France
| | - S. Heurtebise
- Ifremer; SG2M-LGPMM; Laboratoire de Génétique et Pathologie des Mollusques Marins; La Tremblade France
| | - E. Flahauw
- Ifremer; SG2M-LGPMM; Laboratoire de Génétique et Pathologie des Mollusques Marins; La Tremblade France
| | - C. Donnadieu
- INRA UMR444; Laboratoire de Génétique Cellulaire; Plateforme GeT-PlaGe Genotoul; Castanet-Tolosan France
| | - P. Gayral
- CNRS UMR 5554; Institut des Sciences de l'Evolution de Montpellier; Université Montpellier 2; Montpellier France
- CNRS UMR 7261; Institut de Recherche sur la Biologie de l'Insecte; Faculté des Sciences et Techniques; Université François Rabelais; Tours France
| | - M. Ballenghien
- CNRS UMR 5554; Institut des Sciences de l'Evolution de Montpellier; Université Montpellier 2; Montpellier France
| | - L. Genestout
- LABOGENA; Domaine de Vilvert; Jouy-en-Josas France
| | - L. Barbotte
- LABOGENA; Domaine de Vilvert; Jouy-en-Josas France
| | - R. Mahla
- LABOGENA; Domaine de Vilvert; Jouy-en-Josas France
| | - P. Haffray
- SYSAAF; Station LPGP/INRA; Campus de Beaulieu; 35042 Rennes France
| | - C. Klopp
- INRA; Sigenae; UR875 Biométrie et Intelligence Artificielle; Castanet-Tolosan France
|
32
|
Everett MV, Seeb JE. Detection and mapping of QTL for temperature tolerance and body size in Chinook salmon (Oncorhynchus tshawytscha) using genotyping by sequencing. Evol Appl 2014; 7:480-92. [PMID: 24822082 PMCID: PMC4001446 DOI: 10.1111/eva.12147] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 11/04/2013] [Accepted: 12/16/2013] [Indexed: 01/07/2023] Open
Abstract
Understanding how organisms interact with their environments is increasingly important for conservation efforts in many species, especially in light of highly anticipated climate changes. One method for understanding this relationship is to use genetic maps and QTL mapping to detect genomic regions linked to phenotypic traits of importance for adaptation. We used high-throughput genotyping by sequencing (GBS) to both detect and map thousands of SNPs in haploid Chinook salmon (Oncorhynchus tshawytscha). We next applied this map to detect QTL related to temperature tolerance and body size in families of diploid Chinook salmon. Using these techniques, we mapped 3534 SNPs in 34 linkage groups which is consistent with the haploid chromosome number for Chinook salmon. We successfully detected three QTL for temperature tolerance and one QTL for body size at the experiment-wide level, as well as additional QTL significant at the chromosome-wide level. The use of haploids coupled with GBS provides a robust pathway to rapidly develop genomic resources in nonmodel organisms; these QTL represent preliminary progress toward linking traits of conservation interest to regions in the Chinook salmon genome.
Affiliation(s)
- Meredith V Everett
- School of Aquatic and Fishery Sciences, University of Washington Seattle, WA, USA
| | - James E Seeb
- School of Aquatic and Fishery Sciences, University of Washington Seattle, WA, USA
|
33
|
Houston RD, Taggart JB, Cézard T, Bekaert M, Lowe NR, Downing A, Talbot R, Bishop SC, Archibald AL, Bron JE, Penman DJ, Davassi A, Brew F, Tinch AE, Gharbi K, Hamilton A. Development and validation of a high density SNP genotyping array for Atlantic salmon (Salmo salar). BMC Genomics 2014; 15:90. [PMID: 24524230 PMCID: PMC3923896 DOI: 10.1186/1471-2164-15-90] [Citation(s) in RCA: 167] [Impact Index Per Article: 16.7] [Received: 11/13/2013] [Accepted: 01/27/2014] [Indexed: 12/30/2022] Open
Abstract
Background: Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array. Results: SNP discovery was performed using extensive deep sequencing of Reduced Representation (RR-Seq), Restriction site-Associated DNA (RAD-Seq) and mRNA (RNA-Seq) libraries derived from farmed and wild Atlantic salmon samples (n = 283) resulting in the discovery of > 400 K putative SNPs. An Affymetrix Axiom® myDesign Custom Array was created and tested on samples of animals of wild and farmed origin (n = 96) revealing a total of 132,033 polymorphic SNPs with high call rate, good cluster separation on the array and stable Mendelian inheritance in our sample. At least 38% of these SNPs are from transcribed genomic regions and therefore more likely to include functional variants. Linkage analysis utilising the lack of male recombination in salmonids allowed the mapping of 40,214 SNPs distributed across all 29 pairs of chromosomes, highlighting the extensive genome-wide coverage of the SNPs. An identity-by-state clustering analysis revealed that the array can clearly distinguish between fish of different origins, within and between farmed and wild populations. Finally, Y-chromosome-specific probes included on the array provide an accurate molecular genetic test for sex. Conclusions: This manuscript describes the first high-density SNP genotyping array for Atlantic salmon. This array will be publicly available and is likely to be used as a platform for high-resolution genetics research into traits of evolutionary and economic importance in salmonids and in aquaculture breeding programs via genomic selection.
Affiliation(s)
- Ross D Houston
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK.
|
34
|
Greminger MP, Stölting KN, Nater A, Goossens B, Arora N, Bruggmann R, Patrignani A, Nussberger B, Sharma R, Kraus RHS, Ambu LN, Singleton I, Chikhi L, van Schaik CP, Krützen M. Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms. BMC Genomics 2014; 15:16. [PMID: 24405840 PMCID: PMC3897891 DOI: 10.1186/1471-2164-15-16] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 04/08/2013] [Accepted: 12/21/2013] [Indexed: 12/30/2022] Open
Abstract
Background: High-throughput sequencing has opened up exciting possibilities in population and conservation genetics by enabling the assessment of genetic variation at genome-wide scales. One approach to reduce genome complexity, i.e. investigating only parts of the genome, is reduced-representation library (RRL) sequencing. Like similar approaches, RRL sequencing reduces ascertainment bias due to simultaneous discovery and genotyping of single-nucleotide polymorphisms (SNPs) and does not require reference genomes. Yet, generating such datasets remains challenging due to laboratory and bioinformatical issues. In the laboratory, current protocols require improvements with regards to sequencing homologous fragments to reduce the number of missing genotypes. From the bioinformatical perspective, the reliance of most studies on a single SNP caller disregards the possibility that different algorithms may produce disparate SNP datasets. Results: We present an improved RRL (iRRL) protocol that maximizes the generation of homologous DNA sequences, thus achieving improved genotyping-by-sequencing efficiency. Our modifications facilitate generation of single-sample libraries, enabling individual genotype assignments instead of pooled-sample analysis. We sequenced ~1% of the orangutan genome with 41-fold median coverage in 31 wild-born individuals from two populations. SNPs and genotypes were called using three different algorithms. We obtained substantially different SNP datasets depending on the SNP caller. Genotype validations revealed that the Unified Genotyper of the Genome Analysis Toolkit and SAMtools performed significantly better than a caller from CLC Genomics Workbench (CLC). Of all conflicting genotype calls, CLC was only correct in 17% of the cases. Furthermore, conflicting genotypes between two algorithms showed a systematic bias in that one caller almost exclusively assigned heterozygotes, while the other one almost exclusively assigned homozygotes. Conclusions: Our enhanced iRRL approach greatly facilitates genotyping-by-sequencing and thus direct estimates of allele frequencies. Our direct comparison of three commonly used SNP callers emphasizes the need to question the accuracy of SNP and genotype calling, as we obtained considerably different SNP datasets depending on caller algorithms, sequencing depths and filtering criteria. These differences affected scans for signatures of natural selection, but will also exert undue influences on demographic inferences. This study presents the first effort to generate a population genomic dataset for wild-born orangutans with known population provenance.
Affiliation(s)
- Maja P Greminger
- Evolutionary Genetics Group, Anthropological Institute and Museum, University of Zurich, Zurich, Switzerland.
|
35
|
Li MR, Wang XF, Zhang C, Wang HY, Shi FX, Xiao HX, Li LF. A simple strategy for development of single nucleotide polymorphisms from non-model species and its application in Panax. Int J Mol Sci 2013; 14:24581-91. [PMID: 24351835 PMCID: PMC3876129 DOI: 10.3390/ijms141224581] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/12/2013] [Revised: 12/09/2013] [Accepted: 12/13/2013] [Indexed: 11/23/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) are widely employed in studies of population genetics, molecular breeding and conservation genetics. In this study, we explored a simple route to develop SNPs from non-model species based on screening the library of single copy nuclear genes (SCNGs). Through application of this strategy in Panax, we identified 160 and 171 SNPs from P. quinquefolium and P. ginseng, respectively. Our results demonstrated that both P. ginseng and P. quinquefolium possessed a high level of nucleotide diversity. The number of haplotypes per locus ranged from 1 to 12 for P. ginseng and from 1 to 9 for P. quinquefolium. The nucleotide diversity of total sites (πT) varied between 0.000 and 0.023 for P. ginseng and between 0.000 and 0.035 for P. quinquefolium. These findings suggested that this approach is well suited for SNP discovery in non-model organisms and is easily employed in standard genetics laboratory studies.
Affiliation(s)
- Ming Rui Li
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
| | - Xin Feng Wang
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
| | - Cui Zhang
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
| | - Hua Ying Wang
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
| | - Feng Xue Shi
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
| | - Hong Xing Xiao
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
| | - Lin Feng Li
- Key Laboratory of Molecular Epigenetics of Ministry of Education, Northeast Normal University, Changchun 130024, China
|
36
|
Palti Y, Gao G, Miller MR, Vallejo RL, Wheeler PA, Quillet E, Yao J, Thorgaard GH, Salem M, Rexroad CE. A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids. Mol Ecol Resour 2013; 14:588-96. [DOI: 10.1111/1755-0998.12204] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 09/25/2013] [Revised: 11/12/2013] [Accepted: 11/13/2013] [Indexed: 11/29/2022]
Affiliation(s)
- Yniv Palti
- National Center for Cool and Cold Water Aquaculture; ARS-USDA; 11861 Leetown Road Kearneysville WV 25430 USA
- Guangtu Gao
- National Center for Cool and Cold Water Aquaculture; ARS-USDA; 11861 Leetown Road Kearneysville WV 25430 USA
- Michael R. Miller
- Institute of Molecular Biology; University of Oregon; Eugene OR 97403-1229 USA
- Department of Animal Science; University of California; Davis CA 95616 USA
- Roger L. Vallejo
- National Center for Cool and Cold Water Aquaculture; ARS-USDA; 11861 Leetown Road Kearneysville WV 25430 USA
- Paul A. Wheeler
- School of Biological Sciences and Center for Reproductive Biology; Washington State University; Pullman WA 99164-4236 USA
- Edwige Quillet
- INRA; UMR 1313 GABI; Génétique Animale et Biologie Intégrative; Jouy-en-Josas 78350 France
- Jianbo Yao
- Division of Animal and Nutritional Sciences; West Virginia University; Morgantown WV 26506 USA
- Gary H. Thorgaard
- School of Biological Sciences and Center for Reproductive Biology; Washington State University; Pullman WA 99164-4236 USA
- Mohamed Salem
- Division of Animal and Nutritional Sciences; West Virginia University; Morgantown WV 26506 USA
- Department of Biology; Middle Tennessee State University; Murfreesboro TN 37132 USA
- Caird E. Rexroad
- National Center for Cool and Cold Water Aquaculture; ARS-USDA; 11861 Leetown Road Kearneysville WV 25430 USA
|
37
|
Quillery E, Quenez O, Peterlongo P, Plantard O. Development of genomic resources for the tick Ixodes ricinus: isolation and characterization of single nucleotide polymorphisms. Mol Ecol Resour 2013; 14:393-400. [PMID: 24119113 DOI: 10.1111/1755-0998.12179] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/31/2013] [Revised: 09/17/2013] [Accepted: 09/20/2013] [Indexed: 12/12/2022]
Abstract
Assessing the genetic variability of the tick Ixodes ricinus, an important vector of pathogens in Europe, is an essential step for setting up antitick control methods. Here, we report the first identification of a set of SNPs isolated from the genome of I. ricinus, by applying a reduction in genomic complexity, pyrosequencing and new bioinformatics tools. Almost 1.4 million reads (average length: 528 nt) were generated with a full Roche 454 GS FLX run on two reduced representation libraries of I. ricinus. A newly developed bioinformatics tool (DiscoSnp), which isolates SNPs without requiring any reference genome, was used to obtain 321 088 putative SNPs. Stringent selection criteria were applied in a bioinformatics pipeline to select 1768 SNPs for the development of specific primers. Among 384 randomly selected SNPs tested by Fluidigm genotyping technology on 464 individual ticks, 368 SNP loci (96%) exhibited the presence of the two expected alleles. Hardy-Weinberg equilibrium tests conducted on six natural populations of ticks showed that 26 to 46 of the 384 loci exhibited significant heterozygote deficiency.
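The abstract does not say which Hardy-Weinberg test was used, so the following is only a minimal sketch of one common choice: a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts at a biallelic SNP, with the inbreeding coefficient F_IS indicating heterozygote deficiency when positive. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, p_value, f_is). A positive f_is means fewer
    heterozygotes were observed than expected.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        raise ValueError("locus is monomorphic; test is undefined")
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    f_is = 1.0 - n_ab / expected[1]
    return chi2, p_value, f_is


# Made-up counts with a clear heterozygote deficit (not data from the study).
chi2, p_value, f_is = hwe_chi_square(40, 20, 40)
```

With small samples or rare alleles an exact test is usually preferred over the chi-square approximation, and testing hundreds of loci calls for a multiple-testing correction.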
Affiliation(s)
- E Quillery
- INRA, UMR1300 Biology, Epidemiology and Risk Analysis in animal health, BP 40706, F-44307, Nantes, France; LUNAM Université, Oniris, Ecole nationale vétérinaire, agroalimentaire et de l'alimentation Nantes Atlantique, UMR BioEpAR, Nantes, 44307, France
|
38
|
SNP discovery in European anchovy (Engraulis encrasicolus, L) by high-throughput transcriptome and genome sequencing. PLoS One 2013; 8:e70051. [PMID: 23936375 PMCID: PMC3731364 DOI: 10.1371/journal.pone.0070051] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 03/20/2013] [Accepted: 06/13/2013] [Indexed: 11/29/2022] Open
Abstract
Increased throughput in sequencing technologies has facilitated the acquisition of detailed genomic information in non-model species. The focus of this research was to discover and validate SNPs derived from the transcriptome of the European anchovy (Engraulis encrasicolus), a species with no available reference genome, using next generation sequencing technologies. A cDNA library was constructed from four tissues of ten individual fish corresponding to three populations of E. encrasicolus, and Roche 454 GS FLX Titanium sequencing yielded 19,367 contigs. Additionally, the European anchovy genome was sequenced for the same ten individuals using an Illumina HiSeq2000. Using a computational pipeline for combining transcriptome and genome information, a total of 18,994 SNPs met the necessary minor allele frequency and depth filters. A series of further stringent filters were applied to identify those SNPs likely to succeed in genotyping assays, and to filter out those in potentially duplicated genome regions. A novel method for detecting potential intron-exon boundaries in areas of putative SNPs has also been applied in silico to improve genotyping success. In all, 2,317 filtered putative transcriptome SNPs suitable for genotyping primer design were identified. From those, a subset of 530 were selected, with the genotyping results showing the highest conversion and validation rates (91.3% and 83.2%, respectively) reported to date for a non-model species. This study represents a promising strategy to discover genotypable SNPs in the exome of non-model organisms. The genomic resource generated for E. encrasicolus, both in terms of sequences and novel markers, will be informative for research into this species with applications including traceability studies, population genetic analyses and aquaculture.
|
39
|
Hoffman JI, Thorne MAS, McEwing R, Forcada J, Ogden R. Cross-amplification and validation of SNPs conserved over 44 million years between seals and dogs. PLoS One 2013; 8:e68365. [PMID: 23874599 PMCID: PMC3712990 DOI: 10.1371/journal.pone.0068365] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 04/21/2013] [Accepted: 05/28/2013] [Indexed: 01/17/2023] Open
Abstract
High-density SNP arrays developed for humans and their companion species provide a rapid and convenient tool for generating SNP data in closely-related non-model organisms, but have not yet been widely applied to phylogenetically divergent taxa. Consequently, we used the CanineHD BeadChip to genotype 24 Antarctic fur seal (Arctocephalus gazella) individuals. Despite seals and dogs having diverged around 44 million years ago, 33,324 out of 173,662 loci (19.2%) could be genotyped, of which 173 were polymorphic and clearly interpretable. Two SNPs were validated using KASP genotyping assays, with the resulting genotypes being 100% concordant with those obtained from the high-density array. Two loci were also confirmed through in silico visualisation after mapping them to the fur seal transcriptome. Polymorphic SNPs were distributed broadly throughout the dog genome and did not differ significantly in proximity to genes from either monomorphic SNPs or those that failed to cross-amplify in seals. However, the nearest genes to polymorphic SNPs were significantly enriched for functional annotations relating to energy metabolism, suggesting a possible bias towards conserved regions of the genome.
Affiliation(s)
- Joseph I. Hoffman
- Department of Animal Behaviour, University of Bielefeld, Bielefeld, North Rhine-Westphalia, Germany
- Michael A. S. Thorne
- British Antarctic Survey, Natural Environment Research Council, High Cross, Cambridge, United Kingdom
- Rob McEwing
- Wildgenes Laboratory, Royal Zoological Society of Scotland, Edinburgh, United Kingdom
- Jaume Forcada
- British Antarctic Survey, Natural Environment Research Council, High Cross, Cambridge, United Kingdom
- Rob Ogden
- Wildgenes Laboratory, Royal Zoological Society of Scotland, Edinburgh, United Kingdom
|
40
|
Garvin MR, Saitoh K, Gharrett AJ. Application of single nucleotide polymorphisms to non-model species: a technical review. Mol Ecol Resour 2013; 10:915-34. [PMID: 21565101 DOI: 10.1111/j.1755-0998.2010.02891.x] [Citation(s) in RCA: 159] [Impact Index Per Article: 14.5] [Indexed: 01/23/2023]
Abstract
Single nucleotide polymorphisms (SNPs) have gained wide use in humans and model species and are becoming the marker of choice for applications in other species. Technology that was developed for work in model species may provide useful tools for SNP discovery and genotyping in non-model organisms. However, SNP discovery can be expensive, labour intensive, and introduce ascertainment bias. In addition, the most efficient approaches to SNP discovery will depend on the research questions that the markers are to resolve as well as the focal species. We discuss advantages and disadvantages of several past and recent technologies for SNP discovery and genotyping and summarize a variety of SNP discovery and genotyping studies in ecology and evolution.
Affiliation(s)
- M R Garvin
- Fisheries Division, School of Fisheries and Ocean Sciences, University of Alaska Fairbanks, 17101 Point Lena Loop Road, Juneau, AK 99801, USA; National Research Institute of Fisheries Science, Fukuura, Kanazawa, Yokohama 236-8648, Japan
|
41
|
Pujolar JM, Jacobsen MW, Frydenberg J, Als TD, Larsen PF, Maes GE, Zane L, Jian JB, Cheng L, Hansen MM. A resource of genome-wide single-nucleotide polymorphisms generated by RAD tag sequencing in the critically endangered European eel. Mol Ecol Resour 2013; 13:706-14. [PMID: 23656721 DOI: 10.1111/1755-0998.12117] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 12/20/2012] [Revised: 03/22/2013] [Accepted: 03/24/2013] [Indexed: 01/25/2023]
Abstract
Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the European eel using the RAD sequencing approach, with markers simultaneously identified and scored in a genome-wide scan of 30 individuals. Whereas genomic resources are increasingly becoming available for this species, including the recent release of a draft genome, no genome-wide set of SNP markers was available until now. The generated SNPs were widely distributed across the eel genome, aligning to 4779 different contigs and 19,703 different scaffolds. Significant variation was identified, with an average nucleotide diversity of 0.00529 across individuals. Nucleotide diversity varied widely across the genome, ranging from 0.00048 to 0.00737 per locus. Based on the average nucleotide diversity across all loci, long-term effective population size was estimated to range between 132,000 and 1,320,000, which is much higher than previous estimates based on microsatellite loci. The generated SNP resource consisting of 82,425 loci and 376,918 associated SNPs provides a valuable tool for future population genetics and genomics studies and allows for targeting specific genes and particularly interesting regions of the eel genome.
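The step from nucleotide diversity to long-term effective population size can be sketched with the standard neutral expectation for a diploid population, π ≈ 4·Ne·μ, so that Ne ≈ π / (4μ). The abstract does not state the mutation rates used. The two per-site, per-generation rates below (1e-8 and 1e-9) are assumptions chosen because they reproduce the reported range, and the function name is hypothetical.

```python
def effective_population_size(pi, mu):
    """Long-term Ne of a diploid population from pi = 4 * Ne * mu."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


PI_MEAN = 0.00529            # average nucleotide diversity from the abstract
ASSUMED_MU = (1e-8, 1e-9)    # assumed mutation rates, not given in the abstract

estimates = [effective_population_size(PI_MEAN, mu) for mu in ASSUMED_MU]
```

Under these assumed rates the estimates are about 132,000 and 1,320,000, matching the bounds quoted above. The tenfold spread comes entirely from uncertainty in the mutation rate.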
Affiliation(s)
- J M Pujolar
- Department of Bioscience, Aarhus University, Aarhus C, Denmark.
|
42
|
A multi-platform draft de novo genome assembly and comparative analysis for the Scarlet Macaw (Ara macao). PLoS One 2013; 8:e62415. [PMID: 23667475 PMCID: PMC3648530 DOI: 10.1371/journal.pone.0062415] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 01/07/2013] [Accepted: 03/21/2013] [Indexed: 12/31/2022] Open
Abstract
Data deposition to NCBI Genomes: This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly). The version described in this paper is the first version (AMXX01000000). The scaffolded assembly (SMACv1.1) has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000). Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, new world parrot Ara macao (Scarlet Macaw). Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb) includes more than 997 Mb of unambiguous sequence data (excluding N's). Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7), which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity) which were independently supported by the results of previous human GWAS studies. 
We also observed evidence for genes and noncoding loci that displayed extreme conservation across the three avian lineages, thereby reflecting their likely biological and developmental importance among birds.
|
43
|
Krück NC, Innes DI, Ovenden JR. New SNPs for population genetic analysis reveal possible cryptic speciation of eastern Australian sea mullet (Mugil cephalus). Mol Ecol Resour 2013; 13:715-25. [DOI: 10.1111/1755-0998.12112] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 01/05/2013] [Revised: 03/27/2013] [Accepted: 03/29/2013] [Indexed: 01/25/2023]
Affiliation(s)
- Nils C. Krück
- School of Biological Sciences The University of Queensland St Lucia Campus Brisbane Qld 4072 Australia
- Molecular Fisheries Laboratory Queensland Government PO Box 6097 Brisbane Qld 4072 Australia
- David I. Innes
- Molecular Fisheries Laboratory Queensland Government PO Box 6097 Brisbane Qld 4072 Australia
- Jennifer R. Ovenden
- School of Biological Sciences The University of Queensland St Lucia Campus Brisbane Qld 4072 Australia
- Molecular Fisheries Laboratory Queensland Government PO Box 6097 Brisbane Qld 4072 Australia
|
44
|
Carlsson J, Gauthier DT, Carlsson JEL, Coughlan JP, Dillane E, Fitzgerald RD, Keating U, McGinnity P, Mirimin L, Cross TF. Rapid, economical single-nucleotide polymorphism and microsatellite discovery based on de novo assembly of a reduced representation genome in a non-model organism: a case study of Atlantic cod Gadus morhua. J Fish Biol 2013; 82:944-958. [PMID: 23464553 DOI: 10.1111/jfb.12034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/10/2011] [Accepted: 11/21/2012] [Indexed: 06/01/2023]
Abstract
By combining next-generation sequencing technology (454) and reduced representation library (RRL) construction, the rapid and economical isolation of over 25 000 potential single-nucleotide polymorphisms (SNP) and >6000 putative microsatellite loci from c. 2% of the genome of the non-model teleost, Atlantic cod Gadus morhua from the Celtic Sea, south of Ireland, was demonstrated. A small-scale validation of markers indicated that 80% (11 of 14) of SNP loci and 40% (6 of 15) of the microsatellite loci could be amplified and showed variability. The results clearly show that small-scale next-generation sequencing of RRL genomes is an economical and rapid approach for simultaneous SNP and microsatellite discovery that is applicable to any species. The low cost and relatively small investment in time allows for positive exploitation of ascertainment bias to design markers applicable to specific populations and study questions.
Affiliation(s)
- J Carlsson
- Beaufort Genetics Research Programme, School of Biological, Earth and Environmental Sciences/Aquaculture and Fisheries Development Centre, University College Cork, Distillery Fields, North Mall, Cork, Ireland.
|
45
|
McCormack JE, Hird SM, Zellmer AJ, Carstens BC, Brumfield RT. Applications of next-generation sequencing to phylogeography and phylogenetics. Mol Phylogenet Evol 2013; 66:526-38. [DOI: 10.1016/j.ympev.2011.12.007] [Citation(s) in RCA: 445] [Impact Index Per Article: 40.5] [Received: 09/13/2011] [Revised: 12/02/2011] [Accepted: 12/05/2011] [Indexed: 01/09/2023]
|
46
|
Switchgrass genomic diversity, ploidy, and evolution: novel insights from a network-based SNP discovery protocol. PLoS Genet 2013; 9:e1003215. [PMID: 23349638 PMCID: PMC3547862 DOI: 10.1371/journal.pgen.1003215] [Citation(s) in RCA: 401] [Impact Index Per Article: 36.5] [Received: 06/04/2012] [Accepted: 11/19/2012] [Indexed: 01/01/2023] Open
Abstract
Switchgrass (Panicum virgatum L.) is a perennial grass that has been designated as an herbaceous model biofuel crop for the United States of America. To facilitate accelerated breeding programs of switchgrass, we developed both an association panel and linkage populations for genome-wide association study (GWAS) and genomic selection (GS). All of the 840 individuals were then genotyped using genotyping by sequencing (GBS), generating 350 GB of sequence in total. As a highly heterozygous polyploid (tetraploid and octoploid) species lacking a reference genome, switchgrass is highly intractable with earlier methodologies of single nucleotide polymorphism (SNP) discovery. To access the genetic diversity of species like switchgrass, we developed a SNP discovery pipeline based on a network approach called the Universal Network-Enabled Analysis Kit (UNEAK). Complexities that hinder single nucleotide polymorphism discovery, such as repeats, paralogs, and sequencing errors, are easily resolved with UNEAK. Here, 1.2 million putative SNPs were discovered in a diverse collection of primarily upland, northern-adapted switchgrass populations. Further analysis of this data set revealed the fundamentally diploid nature of tetraploid switchgrass. Taking advantage of the high conservation of genome structure between switchgrass and foxtail millet (Setaria italica (L.) P. Beauv.), two parent-specific, synteny-based, ultra high-density linkage maps containing a total of 88,217 SNPs were constructed. Also, our results showed clear patterns of isolation-by-distance and isolation-by-ploidy in natural populations of switchgrass. Phylogenetic analysis supported a general south-to-north migration path of switchgrass. In addition, this analysis suggested that upland tetraploid arose from upland octoploid. 
Altogether, this study provides unparalleled insights into the diversity, genomic complexity, population structure, phylogeny, phylogeography, ploidy, and evolutionary dynamics of switchgrass.

Recent advances in sequencing technologies have enabled large-scale surveys of genetic diversity in model species with a wholly or partly sequenced reference genome. However, thousands of key species, which are essential for food, health, energy, and ecology, do not have reference genomes. To accelerate their breeding cycle via marker assisted selection, high-throughput genotyping is required for these valuable species, in spite of the absence of reference genomes. Based on genotyping by sequencing (GBS) technology, we developed a new single nucleotide polymorphism (SNP) discovery protocol, the Universal Network-Enabled Analysis Kit (UNEAK), which can be widely used in any species, regardless of genome complexity or the availability of a reference genome. Here we test this protocol on switchgrass, currently the prime energy crop species in the United States of America. In addition to the discovery of over a million SNPs and construction of high-density linkage maps, we provide novel insights into the genome complexity, ploidy, phylogeny, and evolution of switchgrass. This is only the beginning: we believe UNEAK offers the key to the exploration and exploitation of the genetic diversity of thousands of non-model species.
|
47
|
Zhang Y, Wang S, Li J, Zhang X, Jiang L, Xu P, Lu C, Wan Y, Sun X. Primary genome scan for complex body shape-related traits in the common carp Cyprinus carpio. J Fish Biol 2013; 82:125-140. [PMID: 23331142 DOI: 10.1111/j.1095-8649.2012.03469.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 06/01/2023]
Abstract
To identify quantitative trait loci (QTL) that affect body shape in common carp Cyprinus carpio, a linkage map, 2159·23 cM long, was constructed with a total of 307 markers covering 51 linkage groups (LG). The map included 167 new single nucleotide polymorphism (SNP) markers derived from expressed sequence tags (EST) together with 140 microsatellite markers reported earlier. A primary genome scan was conducted for QTL for standard length (L(S)), head length (L(H)), body height (H(B)), body width (W(B)) and tail length (L(TAIL)) in an F1 line containing 92 offspring. A total of 15 suggestive QTL on six LGs were found to be associated with L(S), L(H), H(B), W(B) and L(TAIL), and these explained 10·7-17·4% of the variance. Five significant QTL were detected for body-shape related traits and were located on three LGs (LG1, LG12 and LG20). These QTL included: one associated with L(S) (21·1% variance explained), three for H(B) (almost 20% variance explained) and one for W(B) (20·7% variance explained).
Affiliation(s)
- Y Zhang
- The Centre for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing 100141, China
|
48
|
Lu F, Lipka AE, Glaubitz J, Elshire R, Cherney JH, Casler MD, Buckler ES, Costich DE. Switchgrass genomic diversity, ploidy, and evolution: novel insights from a network-based SNP discovery protocol. PLoS Genet 2013. [PMID: 23349638 DOI: 10.1371/journal.pgen.1003215] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 03/14/2023] Open
Affiliation(s)
- Fei Lu
- Institute for Genomic Diversity, Cornell University, Ithaca, New York, USA
|
49
|
SNP design from 454 sequencing of Podosphaera plantaginis transcriptome reveals a genetically diverse pathogen metapopulation with high levels of mixed-genotype infection. PLoS One 2012; 7:e52492. [PMID: 23300684 PMCID: PMC3531457 DOI: 10.1371/journal.pone.0052492] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Received: 09/05/2012] [Accepted: 11/14/2012] [Indexed: 01/11/2023] Open
Abstract
Background: Molecular tools may greatly improve our understanding of pathogen evolution and epidemiology but technical constraints have hindered the development of genetic resources for parasites compared to free-living organisms. This study aims at developing molecular tools for Podosphaera plantaginis, an obligate fungal pathogen of Plantago lanceolata. This interaction has been intensively studied in the Åland archipelago of Finland with epidemiological data collected from over 4,000 host populations annually since 2001. Principal Findings: A cDNA library of a pooled sample of fungal conidia was sequenced on the 454 GS-FLX platform. Over 549,411 reads were obtained and annotated into 45,245 contigs. Annotation data was acquired for 65.2% of the assembled sequences. The transcriptome assembly was screened for SNP loci, as well as for functionally important genes (mating-type genes and potential effector proteins). A genotyping assay of 27 SNP loci was designed and tested on 380 infected leaf samples from 80 populations within the Åland archipelago. With this panel we identified 85 multilocus genotypes (MLG) with uneven frequencies across the pathogen metapopulation. Approximately half of the sampled populations contain polymorphism. Our genotyping protocol revealed mixed-genotype infection within a single host leaf to be common. Mixed infection has been proposed as one of the main drivers of pathogen evolution, and hence may be an important process in this pathosystem. Significance: The developed SNP panel offers exciting research perspectives for future studies in this well-characterized pathosystem. Also, the transcriptome provides an invaluable novel genomic resource for powdery mildews, which cause significant yield losses on commercially important crops annually. 
Furthermore, the features that render genetic studies in this system a challenge are shared with the majority of obligate parasitic species, and hence our results provide methodological insights from SNP calling to field sampling protocols for a wide range of biological systems.
|
50
|
Wong MML, Cannon CH, Wickneswari R. Development of high-throughput SNP-based genotyping in Acacia auriculiformis x A. mangium hybrids using short-read transcriptome data. BMC Genomics 2012; 13:726. [PMID: 23265623 PMCID: PMC3556151 DOI: 10.1186/1471-2164-13-726] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/27/2012] [Accepted: 12/21/2012] [Indexed: 11/22/2022] Open
Abstract
Background: Next Generation Sequencing has provided comprehensive, affordable and high-throughput DNA sequences for Single Nucleotide Polymorphism (SNP) discovery in Acacia auriculiformis and Acacia mangium. As in other non-model species, SNP detection and genotyping in Acacia are challenging due to the lack of genome sequences. The main objective of this study is to develop the first high-throughput SNP genotyping assay for linkage map construction of A. auriculiformis x A. mangium hybrids. Results: We identified a total of 37,786 putative SNPs by aligning short read transcriptome data from four parents of two Acacia hybrid mapping populations using Bowtie against 7,839 de novo transcriptome contigs. Given a set of 10 validated SNPs from two lignin genes, our in silico SNP detection approach is highly accurate (100%) compared to the traditional in vitro approach (44%). Further validation of 96 SNPs using Illumina GoldenGate Assay gave an overall assay success rate of 89.6% and conversion rate of 37.5%. We explored possible factors lowering assay success rate by predicting exon-intron boundaries and paralogous genes of Acacia contigs using Medicago truncatula genome as reference. This assessment revealed that presence of exon-intron boundary is the main cause (50%) of assay failure. Subsequent SNP filtering and improved assay design resulted in assay success and conversion rates of 92.4% and 57.4%, respectively, based on genotyping of 768 SNPs. Analysis of clustering patterns revealed that 27.6% of the assays were not reproducible and flanking sequence might play a role in determining cluster compression. In addition, we identified a total of 258 and 319 polymorphic SNPs in A. auriculiformis and A. mangium natural germplasms, respectively. Conclusion: We have successfully discovered a large number of SNP markers in A. auriculiformis x A. mangium hybrids using next generation transcriptome sequencing. 
By using a reference genome from the most closely related species, we converted most SNPs to successful assays. We also demonstrated that Illumina GoldenGate genotyping together with manual clustering can provide high quality genotypes for a non-model species like Acacia. These SNP markers are not only important for linkage map construction, but will be very useful for hybrid discrimination and genetic diversity assessment of natural germplasms in the future.
Affiliation(s)
- Melissa M L Wong
- School of Environmental and Natural Resource Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, UKM Bangi 43600, Selangor, Malaysia
|