1
Liu Z, Xie Z, Li M. Comprehensive and deep evaluation of structural variation detection pipelines with third-generation sequencing data. Genome Biol 2024; 25:188. [PMID: 39010145 PMCID: PMC11247875 DOI: 10.1186/s13059-024-03324-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 06/26/2024] [Indexed: 07/17/2024]
Abstract
BACKGROUND Structural variation (SV) detection methods using third-generation sequencing data are widely employed, yet accurately detecting SVs remains challenging. Different methods often yield inconsistent results for certain SV types, complicating tool selection and revealing biases in detection. RESULTS This study comprehensively evaluates 53 SV detection pipelines using simulated and real data from PacBio (CLR: Continuous Long Read, CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms. We assess their performance in detecting various sizes and types of SVs, breakpoint biases, and genotyping accuracy with various sequencing depths. Notably, pipelines such as Minimap2-cuteSV2, NGMLR-SVIM, PBMM2-pbsv, Winnowmap-Sniffles2, and Winnowmap-SVision exhibit comparatively higher recall and precision. Our findings also show that combining multiple pipelines with the same aligner, like pbmm2 or winnowmap, can significantly enhance performance. The individual pipelines' detailed ranking and performance metrics can be viewed in a dynamic table: http://pmglab.top/SVPipelinesRanking. CONCLUSIONS This study comprehensively characterizes the strengths and weaknesses of numerous pipelines, providing valuable insights that can improve SV detection in third-generation sequencing data and inform SV annotation and function prediction.
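The recall and precision mentioned in this abstract are standard benchmarking quantities. As a minimal sketch (not the study's evaluation code; the function name and the counts are hypothetical illustration values), they can be computed from the numbers of true-positive, false-positive, and false-negative SV calls against a truth set:

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from call counts against a truth set."""
    # Precision: fraction of reported calls that are in the truth set.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of truth-set variants that were reported.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one pipeline, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

How a call is matched to a truth-set variant (breakpoint tolerance, size similarity, SV type) is a separate decision that this sketch does not model.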
Affiliation(s)
- Zhi Liu
  - Program in Bioinformatics, Zhongshan School of Medicine, The Fifth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
  - Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education, Guangzhou, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Miaoxin Li
  - Program in Bioinformatics, Zhongshan School of Medicine, The Fifth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
  - Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education, Guangzhou, China
  - Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
  - Department of Psychiatry, The University of Hong Kong, Hong Kong, SAR, China
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-Sen University, Zhuhai, China
2
Dallaire X, Bouchard R, Hénault P, Ulmo-Diaz G, Normandeau E, Mérot C, Bernatchez L, Moore JS. Widespread Deviant Patterns of Heterozygosity in Whole-Genome Sequencing Due to Autopolyploidy, Repeated Elements, and Duplication. Genome Biol Evol 2023; 15:evad229. [PMID: 38085037 PMCID: PMC10752349 DOI: 10.1093/gbe/evad229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/30/2023] [Indexed: 12/28/2023]
Abstract
Most population genomic tools rely on accurate single nucleotide polymorphism (SNP) calling and filtering to meet their underlying assumptions. However, genomic complexity, resulting from structural variants, paralogous sequences, and repetitive elements, presents significant challenges in assembling contiguous reference genomes. Consequently, short-read resequencing studies can encounter mismapping issues, leading to SNPs that deviate from Mendelian expected patterns of heterozygosity and allelic ratio. In this study, we employed the ngsParalog software to identify such deviant SNPs in whole-genome sequencing (WGS) data with low (1.5×) to intermediate (4.8×) coverage for four species: Arctic Char (Salvelinus alpinus), Lake Whitefish (Coregonus clupeaformis), Atlantic Salmon (Salmo salar), and the American Eel (Anguilla rostrata). The analyses revealed that deviant SNPs accounted for 22% to 62% of all SNPs in salmonid datasets and approximately 11% in the American Eel dataset. These deviant SNPs were particularly concentrated within repetitive elements and genomic regions that had recently undergone rediploidization in salmonids. Additionally, narrow peaks of elevated coverage were ubiquitous along all four reference genomes, encompassed most deviant SNPs, and could be partially associated with transposons and tandem repeats. Including these deviant SNPs in genomic analyses led to highly distorted site frequency spectra, underestimated pairwise FST values, and overestimated nucleotide diversity. Considering the widespread occurrence of deviant SNPs arising from a variety of sources, their important impact in estimating population parameters, and the availability of effective tools to identify them, we propose that excluding deviant SNPs from WGS datasets is required to improve genomic inferences for a wide range of taxa and sequencing depths.
Affiliation(s)
- Xavier Dallaire
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Centre d'Études Nordiques, Université Laval, Québec, Canada
- Raphael Bouchard
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Philippe Hénault
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Gabriela Ulmo-Diaz
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Eric Normandeau
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
  - Plateforme de bio-informatique de l’IBIS, Université Laval, Québec, Canada
- Claire Mérot
  - CNRS, UMR 6553 ECOBIO, Université de Rennes, Rennes, France
- Louis Bernatchez
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Jean-Sébastien Moore
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Centre d'Études Nordiques, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
3
Nguyen TN, Chen N, Cosgrove EJ, Bowman R, Fitzpatrick JW, Clark AG. Dynamics of reduced genetic diversity in increasingly fragmented populations of Florida scrub jays, Aphelocoma coerulescens. Evol Appl 2022; 15:1018-1027. [PMID: 35782006 PMCID: PMC9234620 DOI: 10.1111/eva.13421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 04/04/2022] [Accepted: 05/04/2022] [Indexed: 11/29/2022]
Abstract
Understanding the genomic consequences of population decline is important for predicting species' vulnerability to intensifying global change. Empirical information about genomic changes in populations in the early stages of decline, especially for those still experiencing immigration, remains scarce. We used 7834 autosomal SNPs and demographic data for 288 Florida scrub jays (Aphelocoma coerulescens; FSJ) sampled in 2000 and 2008 to compare levels of genetic diversity, inbreeding, relatedness, and lengths of runs of homozygosity (ROH) between two subpopulations that are within dispersal distance of one another but have experienced contrasting demographic trajectories. At Archbold Biological Station (ABS), the FSJ population has been stable because of consistent habitat protection and management, while at nearby Placid Lakes Estates (PLE), the population declined precipitously due to suburban development. By the onset of our sampling in 2000, birds in PLE were already less heterozygous, more inbred, and on average more related than birds in ABS. No significant changes occurred in heterozygosity or inbreeding across the 8-year sampling interval, but average relatedness among individuals decreased in PLE, so that by 2008 average relatedness did not differ between sites. PLE harbored a similar proportion of short ROH but a greater proportion of long ROH than ABS, suggesting one continuous population of shared demographic history in the past, which is now experiencing more recent inbreeding. These results broadly uphold the predictions of simple population genetic models based on inferred effective population sizes and rates of immigration. Our study highlights how, in just a few generations, formerly continuous populations can diverge in heterozygosity and levels of inbreeding with severe local population decline despite ongoing gene flow.
Affiliation(s)
- Tram N. Nguyen
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York, USA
  - Cornell Lab of Ornithology, Ithaca, New York, USA
- Nancy Chen
  - Department of Biology, University of Rochester, Rochester, New York, USA
- Elissa J. Cosgrove
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Reed Bowman
  - Avian Ecology Lab, Archbold Biological Station, Florida, USA
- John W. Fitzpatrick
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York, USA
  - Cornell Lab of Ornithology, Ithaca, New York, USA
- Andrew G. Clark
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
4
Wang L, Yang J, Zhang H, Tao Q, Zhang Y, Dang Z, Zhang F, Luo Z. Sequence coverage required for accurate genotyping by sequencing in polyploid species. Mol Ecol Resour 2021; 22:1417-1426. [PMID: 34826191 DOI: 10.1111/1755-0998.13558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2021] [Revised: 11/12/2021] [Accepted: 11/15/2021] [Indexed: 11/29/2022]
Abstract
Polyploidy plays an important role in the evolution of eukaryotes, especially flowering plants. Many ecologically or agronomically important plant and crop species are polyploids, including sycamore maple (tetraploid) and the world's second and third largest food crops, wheat (hexaploid) and potato (tetraploid), as are economically important aquaculture animals such as Atlantic salmon and trout. Next-generation sequencing data make it possible to assign a genotype at a sequence variant site, an approach known as genotyping by sequencing (GBS). GBS has stimulated enormous interest in population-based genomic studies in almost all diploid and many polyploid organisms. DNA sequence polymorphisms are codominant and thus fully informative about the underlying genotype at the polymorphic site, making GBS a straightforward task in diploids. However, sequence data are often uninformative in polyploid species, making GBS far more challenging in polyploids. This paper presents novel and rigorous statistical methods for predicting the number of sequence reads needed to ensure accurate GBS at a polymorphic site revealed by the reads in polyploids. It shows that a dozen reads can ensure a 95% probability of recovering all constituent alleles of any tetraploid genotype, but that several hundred reads are needed to accurately uncover the genotype with a probability of 90%, which challenges proposals in the literature to perform GBS with low-coverage sequence data. The theoretical prediction was tested using RAD-seq data from tetraploid potato cultivars. The paper provides polyploid experimentalists with theoretical guidance and methods for designing and conducting their sequence-based studies.
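The allele-recovery claim in this abstract can be illustrated with a deliberately simplified calculation. The sketch below is not the paper's method: it assumes a biallelic site, reads that sample each chromosome copy uniformly and independently, and no sequencing error. The function names are hypothetical.

```python
def prob_both_alleles_seen(n_reads: int, dosage: int, ploidy: int = 4) -> float:
    """P(both alleles appear at least once among n_reads) at a biallelic site.

    Simplifying assumptions: each read samples one of the `ploidy` chromosome
    copies uniformly and independently, with no sequencing error.
    `dosage` is the number of copies carrying the alternate allele.
    """
    if not 0 < dosage < ploidy:
        raise ValueError("dosage must be strictly between 0 and ploidy")
    q = dosage / ploidy
    # Complement of "all reads show the reference" or "all reads show the alternate".
    return 1.0 - (1.0 - q) ** n_reads - q ** n_reads


def min_reads(target: float, dosage: int, ploidy: int = 4) -> int:
    """Smallest read count whose recovery probability reaches `target`."""
    n = 1
    while prob_both_alleles_seen(n, dosage, ploidy) < target:
        n += 1
    return n


# Tetraploid heterozygotes: simplex (1), duplex (2), triplex (3).
for dosage in (1, 2, 3):
    print(dosage, min_reads(0.95, dosage))
```

Under these assumptions the simplex and triplex genotypes are the limiting cases and need 11 reads, the same order as the abstract's "a dozen". Seeing both alleles is much easier than determining their dosage, which is why the abstract cites far higher read counts for calling the genotype itself.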
Affiliation(s)
- Lin Wang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Jixuan Yang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Hong Zhang
  - Department of Statistics and Finance, University of Science and Technology of China, Hefei, China
- Qin Tao
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Yuxin Zhang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Zhenyu Dang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Fengjun Zhang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Zewei Luo
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
  - School of Biosciences, University of Birmingham, Birmingham, UK
5
Galla SJ, Brown L, Couch-Lewis Ngāi Tahu Te Hapū O Ngāti Wheke Ngāti Waewae Y, Cubrinovska I, Eason D, Gooley RM, Hamilton JA, Heath JA, Hauser SS, Latch EK, Matocq MD, Richardson A, Wold JR, Hogg CJ, Santure AW, Steeves TE. The relevance of pedigrees in the conservation genomics era. Mol Ecol 2021; 31:41-54. [PMID: 34553796 PMCID: PMC9298073 DOI: 10.1111/mec.16192] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 07/05/2021] [Revised: 09/12/2021] [Accepted: 09/17/2021] [Indexed: 01/21/2023]
Abstract
Over the past 50 years conservation genetics has developed a substantive toolbox to inform species management. One of the most long‐standing tools available to manage genetics—the pedigree—has been widely used to characterize diversity and maximize evolutionary potential in threatened populations. Now, with the ability to use high throughput sequencing to estimate relatedness, inbreeding, and genome‐wide functional diversity, some have asked whether it is warranted for conservation biologists to continue collecting and collating pedigrees for species management. In this perspective, we argue that pedigrees remain a relevant tool, and when combined with genomic data, create an invaluable resource for conservation genomic management. Genomic data can address pedigree pitfalls (e.g., founder relatedness, missing data, uncertainty), and in return robust pedigrees allow for more nuanced research design, including well‐informed sampling strategies and quantitative analyses (e.g., heritability, linkage) to better inform genomic inquiry. We further contend that building and maintaining pedigrees provides an opportunity to strengthen trusted relationships among conservation researchers, practitioners, Indigenous Peoples, and Local Communities.
Affiliation(s)
- Stephanie J Galla
  - Department of Biological Sciences, Boise State University, Boise, Idaho, USA
  - School of Biological Sciences, University of Canterbury, Christchurch, Canterbury, New Zealand
- Liz Brown
  - New Zealand Department of Conservation, Twizel, Canterbury, New Zealand
- Ilina Cubrinovska
  - School of Biological Sciences, University of Canterbury, Christchurch, Canterbury, New Zealand
- Daryl Eason
  - New Zealand Department of Conservation, Invercargill, Southland, New Zealand
- Rebecca M Gooley
  - Smithsonian-Mason School of Conservation, Front Royal, Maryland, USA
  - Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, District of Columbia, USA
- Jill A Hamilton
  - Department of Biological Sciences, North Dakota State University, Fargo, North Dakota, USA
- Julie A Heath
  - Department of Biological Sciences, Boise State University, Boise, Idaho, USA
- Samantha S Hauser
  - Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
- Emily K Latch
  - Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
- Marjorie D Matocq
  - Department of Natural Resources and Environmental Science, Program in Ecology, Evolution and Conservation Biology, University of Nevada Reno, Reno, Nevada, USA
- Anne Richardson
  - The Isaac Conservation and Wildlife Trust, Christchurch, Canterbury, New Zealand
- Jana R Wold
  - School of Biological Sciences, University of Canterbury, Christchurch, Canterbury, New Zealand
- Carolyn J Hogg
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
- Anna W Santure
  - School of Biological Sciences, University of Auckland, Auckland, Auckland, New Zealand
- Tammy E Steeves
  - School of Biological Sciences, University of Canterbury, Christchurch, Canterbury, New Zealand
6
Lind BM, Lu M, Obreht Vidakovic D, Singh P, Booker TR, Aitken SN, Yeaman S. Haploid, diploid, and pooled exome capture recapitulate features of biology and paralogy in two non-model tree species. Mol Ecol Resour 2021; 22:225-238. [PMID: 34270863 PMCID: PMC9292622 DOI: 10.1111/1755-0998.13474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2020] [Revised: 03/18/2021] [Accepted: 04/27/2021] [Indexed: 11/30/2022]
Abstract
Despite their suitability for studying evolution, many conifer species have large and repetitive giga‐genomes (16–31 Gbp) that create hurdles to producing high coverage SNP data sets that capture diversity from across the entirety of the genome. Due in part to multiple ancient whole genome duplication events, gene family expansion and subsequent evolution within Pinaceae, false diversity from the misalignment of paralog copies creates further challenges in accurately and reproducibly inferring evolutionary history from sequence data. Here, we leverage the cost‐saving benefits of pool‐seq and exome‐capture to discover SNPs in two conifer species, Douglas‐fir (Pseudotsuga menziesii var. menziesii (Mirb.) Franco, Pinaceae) and jack pine (Pinus banksiana Lamb., Pinaceae). We show, using minimal baseline filtering, that allele frequencies estimated from pooled individuals show a strong, positive correlation with those estimated by sequencing the same population as individuals (r > .948), on par with such comparisons made in model organisms. Further, we highlight the utility of haploid megagametophyte tissue for identifying sites that are probably due to misaligned paralogs. Together with additional minor filtering, we show that it is possible to remove many of the loci with large frequency estimate discrepancies between individual and pooled sequencing approaches, improving the correlation further (r > .973). Our work addresses bioinformatic challenges in non‐model organisms with large and complex genomes, highlights the use of megagametophyte tissue for the identification of paralogous artefacts, and suggests the combination of pool‐seq and exome capture to be robust for further evolutionary hypothesis testing in these systems.
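The agreement reported in this abstract is a Pearson correlation between allele frequencies estimated from pooled and from individually sequenced samples. A minimal sketch of that comparison follows; the frequency vectors are made-up illustration values, not data from the study, and the function name is hypothetical.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the centered values.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-SNP allele frequencies for the same population.
pooled = [0.10, 0.25, 0.40, 0.55, 0.80]
individual = [0.12, 0.22, 0.43, 0.50, 0.83]
print(f"r = {pearson_r(pooled, individual):.3f}")
```

Sites with large discrepancies between the two estimates lower r, which is why filtering likely paralog artefacts can raise the correlation.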
Affiliation(s)
- Brandon M Lind
  - Department of Forest and Conservation Sciences, Centre for Forest Conservation Genetics, University of British Columbia, Vancouver, BC, Canada
- Mengmeng Lu
  - Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Dragana Obreht Vidakovic
  - Department of Forest and Conservation Sciences, Centre for Forest Conservation Genetics, University of British Columbia, Vancouver, BC, Canada
- Pooja Singh
  - Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Tom R Booker
  - Department of Forest and Conservation Sciences, Centre for Forest Conservation Genetics, University of British Columbia, Vancouver, BC, Canada
  - Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Sally N Aitken
  - Department of Forest and Conservation Sciences, Centre for Forest Conservation Genetics, University of British Columbia, Vancouver, BC, Canada
- Sam Yeaman
  - Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
7
Dang Z, Yang J, Wang L, Tao Q, Zhang F, Zhang Y, Luo Z. Sampling Variation of RAD-Seq Data from Diploid and Tetraploid Potato (Solanum tuberosum L.). Plants (Basel) 2021; 10:319. [PMID: 33562246 PMCID: PMC7915145 DOI: 10.3390/plants10020319] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/27/2020] [Revised: 01/24/2021] [Accepted: 02/02/2021] [Indexed: 12/02/2022]
Abstract
The new sequencing technology enables identification of genome-wide sequence-based variants at a population level and at a competitively low cost. The sequence variant-based molecular markers have motivated enormous interest in population and quantitative genetic analyses. Generation of the sequence data involves a sophisticated experimental process embedded with rich non-biological variation. Statistically, the sequencing process involves sampling DNA fragments from an individual sequence. Adequate knowledge of the sampling variation in sequence data generation is essential for any downstream analysis of the data and for implementing statistically appropriate methods. This paper reports a thorough investigation into modeling the sampling variation of the sequence data from optimized RAD-seq (restriction site-associated DNA sequencing) experiments with two parents and their offspring of diploid and autotetraploid potato (Solanum tuberosum L.). The analysis shows significant dispersion in sampling variation of the sequence data over that expected under the multinomial distribution widely assumed in the literature, and provides statistical methods for modeling the variation and calculating the model parameters, which may be easily implemented in real sequence datasets. The optimized design of RAD-seq experiments enabled effective control of presentation of undesirable chloroplast DNA and RNA genes in the sequence data generated.
Affiliation(s)
- Zhenyu Dang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, Fudan University Shanghai, Shanghai 200433, China
- Jixuan Yang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, Fudan University Shanghai, Shanghai 200433, China
- Lin Wang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, Fudan University Shanghai, Shanghai 200433, China
- Qin Tao
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, Fudan University Shanghai, Shanghai 200433, China
- Fengjun Zhang
  - Qinghai Academy of Agricultural and Forestry Sciences, Xining 200433, China
- Yuxin Zhang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, Fudan University Shanghai, Shanghai 200433, China
- Zewei Luo
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, Fudan University Shanghai, Shanghai 200433, China
  - School of Biosciences, University of Birmingham, Birmingham B15 2TT, UK
8
Kanzi AM, San JE, Chimukangara B, Wilkinson E, Fish M, Ramsuran V, de Oliveira T. Next Generation Sequencing and Bioinformatics Analysis of Family Genetic Inheritance. Front Genet 2020; 11:544162. [PMID: 33193618 PMCID: PMC7649788 DOI: 10.3389/fgene.2020.544162] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 03/20/2020] [Accepted: 09/21/2020] [Indexed: 12/29/2022]
Abstract
Mendelian and complex genetic trait diseases continue to burden society both socially and economically. The lack of effective tests has hampered diagnosis; thus, those affected lack a proper prognosis. Mendelian diseases are caused by genetic mutations in a single gene, while complex trait diseases are caused by the accumulation of mutations in either linked or unlinked genomic regions. Significant advances have been made in identifying novel disease-associated mutations, especially with the introduction of next generation and third generation sequencing. Regardless, some diseases are still without diagnosis, as most tests rely on SNP genotyping panels developed from population-based genetic analyses. Analysis of family genetic inheritance using whole genomes, whole exomes or a panel of genes has been shown to be effective in identifying disease-causing mutations. In this review, we discuss next generation and third generation sequencing platforms, bioinformatic tools and genetic resources commonly used to analyze family-based genomic data, with a focus on identifying inherited or novel disease-causing mutations. We also highlight the analytical, ethical and regulatory challenges associated with analyzing personal genomes, which constitute the data used for family genetic inheritance.
Affiliation(s)
- Aquillah M. Kanzi
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
9
Development and validation of a RAD-Seq target-capture based genotyping assay for routine application in advanced black tiger shrimp (Penaeus monodon) breeding programs. BMC Genomics 2020; 21:541. [PMID: 32758142 PMCID: PMC7430818 DOI: 10.1186/s12864-020-06960-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/09/2020] [Accepted: 07/29/2020] [Indexed: 11/26/2022]
Abstract
Background The development of genome-wide genotyping resources has provided terrestrial livestock and crop industries with the unique ability to accurately assess genomic relationships between individuals, uncover the genetic architecture of commercial traits, as well as identify superior individuals for selection based on their specific genetic profile. Utilising recent advancements in de-novo genome-wide genotyping technologies, it is now possible to provide aquaculture industries with these same important genotyping resources, even in the absence of existing genome assemblies. Here, we present the development of a genome-wide SNP assay for the Black Tiger shrimp (Penaeus monodon) through utilisation of a reduced-representation whole-genome genotyping approach (DArTseq). Results Based on a single reduced-representation library, 31,262 polymorphic SNPs were identified across 650 individuals obtained from Australian wild stocks and commercial aquaculture populations. After filtering to remove SNPs with low read depth, low MAF, low call rate, deviation from HWE, and non-Mendelian inheritance, 7542 high-quality SNPs were retained. From these, 4236 high-quality genome-wide loci were selected for baits-probe development and 4194 SNPs were included within a finalized target-capture genotype-by-sequence assay (DArTcap). This assay was designed for routine and cost-effective commercial application in large-scale breeding programs, and demonstrates higher confidence in genotype calls through increased call rate (from 80.2% ± 14.7% to 93.0% ± 3.5%), increased read depth (from 20.4 ± 15.6 to 80.0 ± 88.7), as well as a 3-fold reduction in cost over traditional genotype-by-sequencing approaches. Conclusion Importantly, this assay equips the P. monodon industry with the ability to simultaneously assign parentage of communally reared animals, undertake genomic relationship analysis, manage mate pairings between cryptic family lines, as well as undertake advanced studies of genome and trait architecture. Critically, this assay can be cost-effectively applied as P. monodon breeding programs transition to undertaking genomic selection.
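Two of the SNP filters named in this abstract, call rate and minor allele frequency (MAF), can be sketched as follows. This is an illustration only: the thresholds, SNP names, and genotype encoding (0, 1, 2 copies of the alternate allele, `None` for a missing call) are assumptions, not the settings used in the study, and the read depth, HWE, and Mendelian-inheritance filters are omitted.

```python
from typing import Optional

Genotypes = list[Optional[int]]  # 0/1/2 alternate-allele copies, None = missing


def call_rate(genotypes: Genotypes) -> float:
    """Fraction of samples with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_freq(genotypes: Genotypes) -> float:
    """Minor allele frequency among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_filters(genotypes: Genotypes,
                   min_call_rate: float = 0.8,   # hypothetical threshold
                   min_maf: float = 0.05) -> bool:  # hypothetical threshold
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_freq(genotypes) >= min_maf)


# Made-up genotype data for three SNPs across five samples.
snps = {
    "snp_a": [0, 1, 2, 1, None],         # well called, polymorphic
    "snp_b": [0, 0, 0, 0, 0],            # monomorphic, fails the MAF filter
    "snp_c": [1, None, None, None, 2],   # mostly missing, fails call rate
}
kept = [name for name, g in snps.items() if passes_filters(g)]
print(kept)
```

In practice such filters are usually applied with established tools on VCF or genotype-matrix files; the sketch only shows the per-SNP logic.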
10
Perry A, Wachowiak W, Downing A, Talbot R, Cavers S. Development of a single nucleotide polymorphism array for population genomic studies in four European pine species. Mol Ecol Resour 2020; 20:1697-1705. [PMID: 32633888 DOI: 10.1111/1755-0998.13223] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/14/2019] [Revised: 06/03/2020] [Accepted: 06/25/2020] [Indexed: 02/06/2023]
Abstract
Pines are some of the most ecologically and economically important tree species in the world, and many have enormous natural distributions or have been extensively planted. However, a lack of rapid genotyping capability is hampering progress in understanding the molecular basis of genetic variation in these species. Here, we deliver an efficient tool for genotyping thousands of single nucleotide polymorphism (SNP) markers across the genome that can be applied to genetic studies in pines. Polymorphisms from resequenced candidate genes and transcriptome sequences of P. sylvestris, P. mugo, P. uncinata, P. uliginosa and P. radiata were used to design a 49,829 SNP array (Axiom_PineGAP, Thermo Fisher). Over a third (34.68%) of the unigenes identified from the P. sylvestris transcriptome were represented on the array, which was used to screen samples of four pine species. The conversion rate for the array on all samples was 42% (N = 20,795 SNPs) and was similar for SNPs sourced from resequenced candidate gene and transcriptome sequences. The broad representation of gene ontology terms by unigenes containing converted SNPs reflected their coverage across the full transcriptome. Over a quarter of successfully converted SNPs were polymorphic among all species, and the data were successful in discriminating among the species and some individual populations. The SNP array provides a valuable new tool to advance genetic studies in these species and demonstrates the effectiveness of the technology for rapid genotyping in species with large and complex genomes.
Affiliation(s)
- Annika Perry
  - UK Centre for Ecology & Hydrology Edinburgh, Penicuik, UK
- Witold Wachowiak
  - Institute of Environmental Biology, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Alison Downing
  - Edinburgh Genomics, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Richard Talbot
  - Edinburgh Genomics, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Stephen Cavers
  - UK Centre for Ecology & Hydrology Edinburgh, Penicuik, UK
11
Maldonado C, Mora F, Scapim CA, Coan M. Genome-wide haplotype-based association analysis of key traits of plant lodging and architecture of maize identifies major determinants for leaf angle: hapLA4. PLoS One 2019; 14:e0212925. [PMID: 30840677 PMCID: PMC6402688 DOI: 10.1371/journal.pone.0212925] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 09/01/2018] [Accepted: 02/12/2019] [Indexed: 11/18/2022]
Abstract
Traits related to plant lodging and architecture are important determinants of plant productivity in intensive maize cultivation systems. Motivated by the identification of genomic associations with the leaf angle, plant height (PH), ear height (EH) and the EH/PH ratio, we characterized approximately 7,800 haplotypes from a set of high-quality single nucleotide polymorphisms (SNPs), in an association panel consisting of tropical maize inbred lines. The proportion of the phenotypic variations explained by the individual SNPs varied between 7%, for the SNP S1_285330124 (located on chromosome 9 and associated with the EH/PH ratio), and 22%, for the SNP S1_317085830 (located on chromosome 6 and associated with the leaf angle). A total of 40 haplotype blocks were significantly associated with the traits of interest, explaining up to 29% of the phenotypic variation for the leaf angle, corresponding to the haplotype hapLA4.04, which was stable over two growing seasons. Overall, the associations for PH, EH and the EH/PH ratio were environment-specific, which was confirmed by performing a model comparison analysis using the information criteria of Akaike and Schwarz. In addition, five stable haplotypes (83%) and 15 SNPs (75%) were identified for the leaf angle. Finally, approximately 62% of the associated haplotypes (25/40) did not contain SNPs detected in the association study using individual SNP markers. This result confirms the advantage of haplotype-based genome-wide association studies for examining genomic regions that control the determining traits for architecture and lodging in maize plants.
Affiliation(s)
- Carlos Maldonado
  - Institute of Biological Sciences, University of Talca, Talca, Chile
- Freddy Mora
  - Institute of Biological Sciences, University of Talca, Talca, Chile
- Carlos A. Scapim
  - Universidade Estadual de Maringá, Departamento de Agronomia, Maringá, PR, Brazil
- Marlon Coan
  - Universidade Estadual de Maringá, Departamento de Agronomia, Maringá, PR, Brazil
|
12
|
de Villemereuil P, Rutschmann A, Lee KD, Ewen JG, Brekke P, Santure AW. Little Adaptive Potential in a Threatened Passerine Bird. Curr Biol 2019; 29:889-894.e3. [DOI: 10.1016/j.cub.2019.01.072] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 10/15/2018] [Revised: 12/18/2018] [Accepted: 01/28/2019] [Indexed: 11/29/2022]
|
13
|
Gerard D, Ferrão LFV, Garcia AAF, Stephens M. Genotyping Polyploids from Messy Sequencing Data. Genetics 2018; 210:789-807. [PMID: 30185430 PMCID: PMC6218231 DOI: 10.1534/genetics.118.301468] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Received: 08/03/2018] [Accepted: 08/21/2018] [Indexed: 12/30/2022] Open
Abstract
Detecting and quantifying the differences in individual genomes (i.e., genotyping) plays a fundamental role in most modern bioinformatics pipelines. Many scientists now use reduced representation next-generation sequencing (NGS) approaches for genotyping. Genotyping diploid individuals using NGS is a well-studied field, and similar methods for polyploid individuals are just emerging. However, there are many aspects of NGS data, particularly in polyploids, that remain unexplored by most methods. Our contributions in this paper are fourfold: (i) We draw attention to, and then model, common aspects of NGS data: sequencing error, allelic bias, overdispersion, and outlying observations. (ii) Many datasets feature related individuals, and so we use the structure of Mendelian segregation to build an empirical Bayes approach for genotyping polyploid individuals. (iii) We develop novel models to account for preferential pairing of chromosomes, and harness these for genotyping. (iv) We derive oracle genotyping error rates that may be used for read depth suggestions. We assess the accuracy of our method in simulations, and apply it to a dataset of hexaploid sweet potato (Ipomoea batatas). An R package implementing our method is available at https://cran.r-project.org/package=updog.
Affiliation(s)
- David Gerard
  - Department of Mathematics and Statistics, American University, Washington, DC 20016
- Antonio Augusto Franco Garcia
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, 13418-900, Brazil
- Matthew Stephens
  - Department of Human Genetics, University of Chicago, Illinois 60637
  - Department of Statistics, University of Chicago, Illinois 60637
|
14
|
Insights into the Structure of the Spruce Budworm (Choristoneura fumiferana) Genome, as Revealed by Molecular Cytogenetic Analyses and a High-Density Linkage Map. G3-GENES GENOMES GENETICS 2018; 8:2539-2549. [PMID: 29950429 PMCID: PMC6071596 DOI: 10.1534/g3.118.200263] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 02/07/2023]
Abstract
Genome structure characterization can contribute to a better understanding of processes such as adaptation, speciation, and karyotype evolution, and can provide useful information for refining genome assemblies. We studied the genome of an important North American boreal forest pest, the spruce budworm, Choristoneura fumiferana, through a combination of molecular cytogenetic analyses and construction of a high-density linkage map based on single nucleotide polymorphism (SNP) markers obtained through a genotyping-by-sequencing (GBS) approach. Cytogenetic analyses using fluorescence in situ hybridization methods confirmed the haploid chromosome number of n = 30 in both sexes of C. fumiferana and showed, for the first time, that this species has a WZ/ZZ sex chromosome system. Synteny analysis based on a comparison of the Bombyx mori genome and the C. fumiferana linkage map revealed the presence of a neo-Z chromosome in the latter species, as previously reported for other tortricid moths. In this neo-Z chromosome, we detected an ABC transporter C2 (ABCC2) gene that has been associated with insecticide resistance. Sex-linkage of the ABCC2 gene provides a genomic context favorable to selection and rapid spread of resistance against Bacillus thuringiensis serotype kurstaki (Btk), the main insecticide used in Canada to control spruce budworm populations. Ultimately, the linkage map we developed, which comprises 3586 SNP markers distributed over 30 linkage groups for a total length of 1720.41 cM, will be a valuable tool for refining our draft assembly of the spruce budworm genome.
|
15
|
Yu S, Li X, Liu X, Wang Y, Yu F, Xue Y, Mao Z, Wang C, Li W. Characteristic and influencing factors of Taqman genotyping calling error. J Clin Lab Anal 2018; 32:e22613. [PMID: 29943492 DOI: 10.1002/jcla.22613] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/25/2018] [Accepted: 06/08/2018] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND The Taqman fluorescent probe is frequently applied in single nucleotide polymorphism (SNP) genotyping. However, the characteristics of its calling errors and the factors influencing them remain unclear. METHOD Calling errors of Taqman genotyping were evaluated systematically based on Mendelian inheritance. Twenty-two SNPs were genotyped by Taqman probe for 419 pedigrees. Mendelian genetic errors were counted for every SNP and pedigree. Cluster analysis was applied to investigate the compatibility between Taqman probes and DNA samples. RESULTS Errors were found for all the SNPs. The error number ranged from 4 to 33, with a median of 10.5. Mendelian genetic errors showed features of both randomness and clustering: half of the pedigrees containing errors had only 1 Mendelian genetic error, but one pedigree contained up to 10. Furthermore, cluster analysis indicated that errors of different SNPs occurred in different pedigree clusters. CONCLUSION Calling errors are inevitable for Taqman genotyping of large samples. The quality of the Taqman probe and DNA sample, as well as their compatibility, may account for the error incidence.
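The Mendelian-inheritance check described in this abstract can be illustrated with a minimal sketch. It assumes biallelic SNP genotypes written as two-character strings and father-mother-child trios; the function names and the example genotypes are hypothetical and are not taken from the study.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed by taking one
    allele from each parent. Genotypes are 2-character strings such as "AG"."""
    child_sorted = sorted(child)
    # Enumerate every (paternal allele, maternal allele) combination.
    return any(sorted(pair) == child_sorted for pair in product(father, mother))


def count_mendelian_errors(trios):
    """Count trios whose genotypes violate Mendelian inheritance.
    `trios` is an iterable of (father, mother, child) genotype strings."""
    return sum(not is_mendelian_consistent(f, m, c) for f, m, c in trios)


# Hypothetical genotypes for one SNP in three trios (not data from the study).
example_trios = [
    ("AA", "AG", "AG"),  # consistent: A from father, G from mother
    ("AA", "AA", "AG"),  # error: neither parent carries G
    ("AG", "AG", "GG"),  # consistent: G from each parent
]
print(count_mendelian_errors(example_trios))  # prints 1
```

Running the same count once per SNP and once per pedigree would give the two tallies the abstract mentions. A check of this kind only detects errors that produce an impossible genotype combination, so it gives a lower bound on the true calling error rate.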
Affiliation(s)
- Songcheng Yu
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Xing Li
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Xinxin Liu
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Yan Wang
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Fei Yu
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Yuan Xue
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Zhenxing Mao
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Chongjian Wang
  - College of Public Health, Zhengzhou University, Zhengzhou, China
- Wenjie Li
  - College of Public Health, Zhengzhou University, Zhengzhou, China
|
16
|
Accounting for Errors in Low Coverage High-Throughput Sequencing Data When Constructing Genetic Maps Using Biparental Outcrossed Populations. Genetics 2018; 209:65-76. [PMID: 29487138 PMCID: PMC5937187 DOI: 10.1534/genetics.117.300627] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 12/13/2017] [Accepted: 02/25/2018] [Indexed: 01/06/2023] Open
Abstract
Next-generation sequencing is an efficient method that allows for substantially more markers than previous technologies, providing opportunities for building high-density genetic linkage maps, which facilitate the development of nonmodel species' genomic assemblies and the investigation of their genes. However, constructing genetic maps using data generated via high-throughput sequencing technology (e.g., genotyping-by-sequencing) is complicated by the presence of sequencing errors and genotyping errors resulting from missing parental alleles due to low sequencing depth. If unaccounted for, these errors lead to inflated genetic maps. In addition, map construction in many species is performed using full-sibling family populations derived from the outcrossing of two individuals, where unknown parental phase and varying segregation types further complicate construction. We present a new methodology for modeling low coverage sequencing data in the construction of genetic linkage maps using full-sibling populations of diploid species, implemented in a package called GUSMap. Our model is based on the Lander-Green hidden Markov model but extended to account for errors present in sequencing data. We were able to obtain accurate estimates of the recombination fractions and overall map distance using GUSMap, while most existing mapping packages produced inflated genetic maps in the presence of errors. Our results demonstrate the feasibility of using low coverage sequencing data to produce genetic maps without requiring extensive filtering of potentially erroneous genotypes, provided that the associated errors are correctly accounted for in the model.
|
17
|
Oh JH, Lee YJ, Byeon EJ, Kang BC, Kyeoung DS, Kim CK. Whole-genome resequencing and transcriptomic analysis of genes regulating anthocyanin biosynthesis in black rice plants. 3 Biotech 2018; 8:115. [PMID: 29430376 PMCID: PMC5801106 DOI: 10.1007/s13205-018-1140-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/21/2017] [Accepted: 01/29/2018] [Indexed: 12/11/2022] Open
Abstract
Anthocyanins are involved in many diverse functions in rice, but their benefits have yet to be clearly demonstrated. Our objective in this study was to identify anthocyanin-related genes in black rice plants. We identified anthocyanin-related genes in black rice plants using a combination of whole-genome resequencing, RNA-sequencing (RNA-seq), microarray experiments, and reverse-transcriptase polymerase chain reaction (RT-PCR). Using multi-layer screening from 30 rice accessions, we identified 172,922 single-nucleotide polymorphisms (SNPs) and 1276 differentially expressed genes that appear to be related to anthocyanin biosynthesis. We identified 18 putative genes from 172,922 SNPs using intensive selective sweeps. The 18 candidate genes identified from SNPs were not significantly correlated with the RNA-seq expression pattern or other well-known anthocyanin biosynthesis/metabolism genes. We also identified nine putative genes from 1276 differentially expressed genes using RNA-seq transcriptome analysis. In addition, we identified four phylogenetic groups from these nine candidate genes and 51 pathway-network genes. Finally, we verified nine anthocyanin-related genes using a newly designed microarray and semi-quantitative RT-PCR. We suggest that these nine identified genes appear to be related to the regulation of anthocyanin biosynthesis and/or metabolism.
Affiliation(s)
- Jae-Hyeon Oh
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, 54874 Korea
- Ye-Ji Lee
  - Department of Environmental Resources, Sangmyung University, Cheonan, 31066 Korea
- Eun-Ju Byeon
  - Department of Crop Science and Biotechnology, Chonbuk National University, Jeonju, 54896 Korea
- Byeong-Chul Kang
  - Codes Division, Insilicogen Inc., Suwon, 16954 Gyeonggi-do Korea
- Dong-Soo Kyeoung
  - Codes Division, Insilicogen Inc., Suwon, 16954 Gyeonggi-do Korea
- Chang-Kug Kim
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, 54874 Korea
|
18
|
Melville J, Haines ML, Boysen K, Hodkinson L, Kilian A, Smith Date KL, Potvin DA, Parris KM. Identifying hybridization and admixture using SNPs: application of the DArTseq platform in phylogeographic research on vertebrates. ROYAL SOCIETY OPEN SCIENCE 2017; 4:161061. [PMID: 28791133 PMCID: PMC5541528 DOI: 10.1098/rsos.161061] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 12/19/2016] [Accepted: 06/14/2017] [Indexed: 05/04/2023]
Abstract
Next-generation sequencing (NGS) approaches are increasingly being used to generate multi-locus data for phylogeographic and evolutionary genetics research. We detail the applicability of a restriction enzyme-mediated genome complexity reduction approach with subsequent NGS (DArTseq) in vertebrate study systems at different evolutionary and geographical scales. We present two case studies using SNP data from the DArTseq molecular marker platform. First, we used DArTseq in a large phylogeographic study of the agamid lizard Ctenophorus caudicinctus, including 91 individuals and spanning the geographical range of this species across arid Australia. A low-density DArTseq assay resulted in 28 960 SNPs, with low density referring to a comparably reduced set of identified and sequenced markers as a cost-effective approach. Second, we applied this approach to an evolutionary genetics study of a classic frog hybrid zone (Litoria ewingii-Litoria paraewingi) across 93 individuals, which resulted in 48 117 and 67 060 SNPs for a low- and high-density assay, respectively. We provide a docker-based workflow to facilitate data preparation and analysis, then analyse SNP data using multiple methods including Bayesian model-based clustering and conditional likelihood approaches. Based on comparison of results from the DArTseq platform and traditional molecular approaches, we conclude that DArTseq can be used successfully in vertebrates and will be of particular interest to researchers working at the interface between population genetics and phylogenetics, exploring species boundaries, gene exchange and hybridization.
Affiliation(s)
- Jane Melville
  - Department of Sciences, Museum Victoria, Carlton, Victoria 3052, Australia
  - Author for correspondence
- Margaret L. Haines
  - Department of Sciences, Museum Victoria, Carlton, Victoria 3052, Australia
- Katja Boysen
  - Department of Sciences, Museum Victoria, Carlton, Victoria 3052, Australia
- Luke Hodkinson
  - Department of Sciences, Museum Victoria, Carlton, Victoria 3052, Australia
- Andrzej Kilian
  - Diversity Arrays Technology, University of Canberra, Bruce, Australian Capital Territory 2617, Australia
- Kirsten M. Parris
  - School of Ecosystem and Forest Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
|
19
|
McKinney GJ, Waples RK, Seeb LW, Seeb JE. Paralogs are revealed by proportion of heterozygotes and deviations in read ratios in genotyping-by-sequencing data from natural populations. Mol Ecol Resour 2016; 17:656-669. [DOI: 10.1111/1755-0998.12613] [Citation(s) in RCA: 127] [Impact Index Per Article: 15.9] [Received: 05/11/2016] [Revised: 10/06/2016] [Accepted: 10/10/2016] [Indexed: 11/28/2022]
Affiliation(s)
- Garrett J. McKinney
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
- Ryan K. Waples
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
- Lisa W. Seeb
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
- James E. Seeb
  - School of Aquatic and Fishery Sciences; University of Washington; 1122 NE Boat Street, Box 355020 Seattle WA 98195-5020 USA
|
20
|
Chen N, Cosgrove EJ, Bowman R, Fitzpatrick JW, Clark AG. Genomic Consequences of Population Decline in the Endangered Florida Scrub-Jay. Curr Biol 2016; 26:2974-2979. [PMID: 27746026 DOI: 10.1016/j.cub.2016.08.062] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 05/20/2016] [Revised: 07/28/2016] [Accepted: 08/24/2016] [Indexed: 11/15/2022]
Abstract
Understanding the population genetic consequences of declining population size is important for conserving the many species worldwide facing severe decline [1]. Thorough empirical studies on the impacts of population reduction at a genome-wide scale in the wild are scarce because they demand huge field and laboratory investments [1, 2]. Previous studies have demonstrated the importance of gene flow in introducing genetic variation to small populations [3], but few have documented both genetic and fitness consequences of decreased immigration through time in a natural population [4-6]. Here we assess temporal variation in gene flow, inbreeding, and fitness using longitudinal genomic, demographic, and phenotypic data from a long-studied population of federally Threatened Florida scrub-jays (Aphelocoma coerulescens). We exhaustively sampled and genotyped the study population over two decades, providing one of the most detailed longitudinal investigations of genetics in a wild animal population to date. Immigrants were less heterozygous than residents but still introduced genetic variation into our study population. Owing to regional population declines, immigration into the study population declined from 1995-2013, resulting in increased levels of inbreeding and reduced fitness via inbreeding depression, even as the population remained demographically stable. Our results show that, contrary to conventional wisdom, small peripheral populations that already have undergone a genetic bottleneck may play a vital role in preserving genetic diversity of larger and seemingly stable populations. These findings underscore the importance of investing in the persistence of small populations and maintaining population connectivity in conservation of fragmented species.
Affiliation(s)
- Nancy Chen
  - Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA; Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, USA
- Elissa J Cosgrove
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Reed Bowman
  - Archbold Biological Station, Venus, FL 33960, USA
- John W Fitzpatrick
  - Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, USA; Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
- Andrew G Clark
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA; Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
|
21
|
Galla SJ, Buckley TR, Elshire R, Hale ML, Knapp M, McCallum J, Moraga R, Santure AW, Wilcox P, Steeves TE. Building strong relationships between conservation genetics and primary industry leads to mutually beneficial genomic advances. Mol Ecol 2016; 25:5267-5281. [PMID: 27641156 DOI: 10.1111/mec.13837] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 04/22/2016] [Revised: 08/23/2016] [Accepted: 08/24/2016] [Indexed: 02/06/2023]
Abstract
Several reviews in the past decade have heralded the benefits of embracing high-throughput sequencing technologies to inform conservation policy and the management of threatened species, but few have offered practical advice on how to expedite the transition from conservation genetics to conservation genomics. Here, we argue that an effective and efficient way to navigate this transition is to capitalize on emerging synergies between conservation genetics and primary industry (e.g., agriculture, fisheries, forestry and horticulture). Here, we demonstrate how building strong relationships between conservation geneticists and primary industry scientists is leading to mutually-beneficial outcomes for both disciplines. Based on our collective experience as collaborative New Zealand-based scientists, we also provide insight for forging these cross-sector relationships.
Affiliation(s)
- Stephanie J Galla
  - School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch, 8140, New Zealand
- Thomas R Buckley
  - Landcare Research, Private Bag 92170, Auckland Mail Centre, Auckland, 1142, New Zealand
  - School of Biological Sciences, University of Auckland, Auckland, 1010, New Zealand
- Rob Elshire
  - The Elshire Group, Ltd., 52 Victoria Avenue, Palmerston North, 4410, New Zealand
- Marie L Hale
  - School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch, 8140, New Zealand
- Michael Knapp
  - Department of Anatomy, University of Otago, P.O. Box 913, Dunedin, 9054, New Zealand
- John McCallum
  - Breeding and Genomics, New Zealand Institute for Plant and Food Research, Private Bag 4704, Christchurch, 8140, New Zealand
- Roger Moraga
  - AgResearch, Ruakura Research Centre, Bisley Road, Private Bag 3115, Hamilton, 3240, New Zealand
- Anna W Santure
  - School of Biological Sciences, University of Auckland, Auckland, 1010, New Zealand
- Phillip Wilcox
  - Department of Mathematics and Statistics, University of Otago, P.O. Box 56, 710 Cumberland Street, Dunedin, 9054, New Zealand
- Tammy E Steeves
  - School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch, 8140, New Zealand
|
22
|
Kaiser SA, Taylor SA, Chen N, Sillett TS, Bondra ER, Webster MS. A comparative assessment of SNP and microsatellite markers for assigning parentage in a socially monogamous bird. Mol Ecol Resour 2016; 17:183-193. [DOI: 10.1111/1755-0998.12589] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 05/20/2016] [Revised: 07/14/2016] [Accepted: 07/19/2016] [Indexed: 12/21/2022]
Affiliation(s)
- Sara A. Kaiser
  - Macaulay Library Cornell Lab of Ornithology 159 Sapsucker Woods Rd Ithaca NY 14850 USA
  - Migratory Bird Center Center for Conservation Genomics Smithsonian Conservation Biology Institute National Zoological Park MRC 5503 Washington DC 20013 USA
- Scott A. Taylor
  - Fuller Evolutionary Biology Program Cornell Lab of Ornithology 159 Sapsucker Woods Rd Ithaca NY 14850 USA
  - Department of Ecology and Evolutionary Biology University of Colorado at Boulder 1900 Pleasant Street 334 UCB Boulder CO 80309 USA
- Nancy Chen
  - Fuller Evolutionary Biology Program Cornell Lab of Ornithology 159 Sapsucker Woods Rd Ithaca NY 14850 USA
  - Department of Ecology and Evolutionary Biology Cornell University E145 Corson Hall 215 Tower Road Ithaca NY 14853 USA
- T. Scott Sillett
  - Migratory Bird Center Center for Conservation Genomics Smithsonian Conservation Biology Institute National Zoological Park MRC 5503 Washington DC 20013 USA
- Eliana R. Bondra
  - Department of Ecology and Evolutionary Biology Cornell University E145 Corson Hall 215 Tower Road Ithaca NY 14853 USA
- Michael S. Webster
  - Macaulay Library Cornell Lab of Ornithology 159 Sapsucker Woods Rd Ithaca NY 14850 USA
|
23
|
Drury C, Dale KE, Panlilio JM, Miller SV, Lirman D, Larson EA, Bartels E, Crawford DL, Oleksiak MF. Genomic variation among populations of threatened coral: Acropora cervicornis. BMC Genomics 2016; 17:286. [PMID: 27076191 PMCID: PMC4831158 DOI: 10.1186/s12864-016-2583-8] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Indexed: 01/01/2023] Open
Abstract
Background Acropora cervicornis, a threatened, keystone reef-building coral, has undergone severe declines (>90 %) throughout the Caribbean. These declines could reduce genetic variation and thus hamper the species’ ability to adapt. Active restoration strategies are a common conservation approach to mitigate species' declines and require genetic data on surviving populations to efficiently respond to declines while maintaining the genetic diversity needed to adapt to changing conditions. To evaluate active restoration strategies for the staghorn coral, the genetic diversity of A. cervicornis within and among populations was assessed in 77 individuals collected from 68 locations along the Florida Reef Tract (FRT) and in the Dominican Republic. Results Genotyping by Sequencing (GBS) identified 4,764 single nucleotide polymorphisms (SNPs). Pairwise nucleotide differences (π) within a population are large (~37 %) and similar to π across all individuals. This high level of genetic diversity along the FRT is similar to the diversity within a small, isolated reef. Much of the genetic diversity (>90 %) exists within a population, yet GBS analysis shows significant variation along the FRT, including 300 SNPs with significant FST values and significant divergence relative to distance. There are also significant differences in SNP allele frequencies over small spatial scales, exemplified by the large FST values among corals collected within Miami-Dade county. Conclusions Large standing diversity was found within each population even after recent declines in abundance, including significant, potentially adaptive divergence over short distances. The data here inform conservation and management actions by uncovering population structure and high levels of diversity maintained within coral collections among sites previously shown to have little genetic divergence. More broadly, this approach demonstrates the power of GBS to resolve differences among individuals and identify subtle genetic structure, informing conservation goals with evolutionary implications.
Affiliation(s)
- C Drury
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
- K E Dale
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
- J M Panlilio
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
- S V Miller
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
- D Lirman
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
- E A Larson
  - Nova Southeastern University Oceanographic Center, 8000 N Ocean Drive, Dania Beach, FL, 33004, USA
- E Bartels
  - Center for Coral Reef Research, Mote Marine Laboratory, 24244 Overseas Highway, Summerland Key, FL, 33042, USA
- D L Crawford
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
- M F Oleksiak
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL, 33149, USA
|
24
|
Fountain ED, Pauli JN, Reid BN, Palsbøll PJ, Peery MZ. Finding the right coverage: the impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates. Mol Ecol Resour 2016; 16:966-78. [DOI: 10.1111/1755-0998.12519] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 03/23/2015] [Revised: 02/10/2016] [Accepted: 02/11/2016] [Indexed: 12/13/2022]
Affiliation(s)
- Emily D. Fountain
  - Department of Forest and Wildlife Ecology University of Wisconsin‐Madison Madison WI 53706 USA
- Jonathan N. Pauli
  - Department of Forest and Wildlife Ecology University of Wisconsin‐Madison Madison WI 53706 USA
- Brendan N. Reid
  - Department of Forest and Wildlife Ecology University of Wisconsin‐Madison Madison WI 53706 USA
- Per J. Palsbøll
  - Marine Evolution and Conservation Groningen Institute of Evolutionary Life Sciences University of Groningen Groningen 9747 AG The Netherlands
- M. Zachariah Peery
  - Department of Forest and Wildlife Ecology University of Wisconsin‐Madison Madison WI 53706 USA
|
25
|
Shojo H, Tanaka M, Takahashi R, Kakuda T, Adachi N. A Unique Primer with an Inosine Chain at the 5'-Terminus Improves the Reliability of SNP Analysis Using the PCR-Amplified Product Length Polymorphism Method. PLoS One 2015; 10:e0136995. [PMID: 26381262 PMCID: PMC4575067 DOI: 10.1371/journal.pone.0136995] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/12/2015] [Accepted: 08/11/2015] [Indexed: 12/30/2022] Open
Abstract
Polymerase chain reaction-amplified product length polymorphism (PCR-APLP) is one of the most convenient and reliable methods for single nucleotide polymorphism (SNP) analysis. This method is based on PCR, but uses allele-specific primers containing SNP sites at the 3′-terminus of each primer. To use this method, at least two allele-specific primers and one “counter-primer”, which serves as a common forward or reverse primer of the allele-specific primers, are required. The allele-specific primers have SNP sites at the 3′-terminus, and another primer should have a few non-complementary flaps at the 5′-terminus to detect SNPs by determining the difference of amplicon length by PCR and subsequent electrophoresis. A major disadvantage of the addition of a non-complementary flap is the non-specific annealing of the primer with non-complementary flaps. However, a design principle for avoiding this undesired annealing has not been fully established; therefore, it is often difficult to design effective APLP primers. Here, we report allele-specific primers with an inosine chain at the 5′-terminus for PCR-APLP analysis. This unique design improves the competitiveness of allele-specific primers and the reliability of SNP analysis when using the PCR-APLP method.
Affiliation(s)
- Hideki Shojo
  - Department of Legal Medicine, Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 1110 Shimokato, Yamanashi, 409–3898, Japan
- Mayumi Tanaka
  - Department of Legal Medicine, Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 1110 Shimokato, Yamanashi, 409–3898, Japan
- Ryohei Takahashi
  - Department of Legal Medicine, Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 1110 Shimokato, Yamanashi, 409–3898, Japan
- Tsuneo Kakuda
  - Department of Legal Medicine, Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 1110 Shimokato, Yamanashi, 409–3898, Japan
- Noboru Adachi
  - Department of Legal Medicine, Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 1110 Shimokato, Yamanashi, 409–3898, Japan
|