1
Hemstrom W, Grummer JA, Luikart G, Christie MR. Next-generation data filtering in the genomics era. Nat Rev Genet 2024:10.1038/s41576-024-00738-6. [PMID: 38877133 DOI: 10.1038/s41576-024-00738-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/25/2024] [Indexed: 06/16/2024]
Abstract
Genomic data are ubiquitous across disciplines, from agriculture to biodiversity, ecology, evolution and human health. However, these datasets often contain noise or errors and are missing information that can affect the accuracy and reliability of subsequent computational analyses and conclusions. A key step in genomic data analysis is filtering - removing sequencing bases, reads, genetic variants and/or individuals from a dataset - to improve data quality for downstream analyses. Researchers are confronted with a multitude of choices when filtering genomic data; they must choose which filters to apply and select appropriate thresholds. To help usher in the next generation of genomic data filtering, we review and suggest best practices to improve the implementation, reproducibility and reporting standards for filter types and thresholds commonly applied to genomic datasets. We focus mainly on filters for minor allele frequency, missing data per individual or per locus, linkage disequilibrium and Hardy-Weinberg deviations. Using simulated and empirical datasets, we illustrate the large effects of different filtering thresholds on common population genetics statistics, such as Tajima's D value, population differentiation (FST), nucleotide diversity (π) and effective population size (Ne).
Affiliation(s)
- William Hemstrom
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Jared A Grummer
  - Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Gordon Luikart
  - Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Mark R Christie
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, USA
2
Friis G, Smith EG, Lovelock CE, Ortega A, Marshell A, Duarte CM, Burt JA. Rapid diversification of grey mangroves (Avicennia marina) driven by geographic isolation and extreme environmental conditions in the Arabian Peninsula. Mol Ecol 2024; 33:e17260. [PMID: 38197286 DOI: 10.1111/mec.17260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 11/13/2023] [Accepted: 12/11/2023] [Indexed: 01/11/2024]
Abstract
Biological systems occurring in ecologically heterogeneous and spatially discontinuous habitats provide an ideal opportunity to investigate the relative roles of neutral and selective factors in driving lineage diversification. The grey mangroves (Avicennia marina) of Arabia occur at the northern edge of the species' range and are subject to variable, often extreme, environmental conditions, as well as historic large fluctuations in habitat availability and connectivity resulting from Quaternary glacial cycles. Here, we analyse fully sequenced genomes sampled from 19 locations across the Red Sea, the Arabian Sea and the Persian/Arabian Gulf (PAG) to reconstruct the evolutionary history of the species in the region and to identify adaptive mechanisms of lineage diversification. Population structure and phylogenetic analyses revealed marked genetic structure correlating with geographic distance and highly supported clades among and within the seas surrounding the Arabian Peninsula. Demographic modelling showed times of divergence consistent with recent periods of geographic isolation and low marine connectivity during glaciations, suggesting the presence of (cryptic) glacial refugia in the Red Sea and the PAG. Significant migration was detected within the Red Sea and the PAG, and across the Strait of Hormuz to the Arabian Sea, suggesting gene flow upon secondary contact among populations. Genetic-environment association analyses revealed high levels of adaptive divergence and detected signs of multi-loci local adaptation driven by temperature extremes and hypersalinity. These results support a process of rapid diversification resulting from the combined effects of historical factors and ecological selection and reveal mangrove peripheral environments as relevant drivers of lineage diversity.
Affiliation(s)
- Guillermo Friis
  - Center for Genomics and Systems Biology (CGSB) and Mubadala ACCESS Center, New York University - Abu Dhabi, Abu Dhabi, United Arab Emirates
- Edward G Smith
  - Department of Biological Sciences, University of North Carolina at Charlotte, Charlotte, North Carolina, USA
- Catherine E Lovelock
  - School of Environment, The University of Queensland, St Lucia, Queensland, Australia
- Alejandra Ortega
  - Red Sea Research Center (RSRC) and Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Alyssa Marshell
  - Department of Marine Science and Fisheries, College of Agricultural and Marine Sciences, Sultan Qaboos University, Muscat, Oman
  - Institute for Marine and Antarctic Studies, University of Tasmania, Hobart, Tasmania, Australia
- Carlos M Duarte
  - Red Sea Research Center (RSRC) and Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- John A Burt
  - Center for Genomics and Systems Biology (CGSB) and Mubadala ACCESS Center, New York University - Abu Dhabi, Abu Dhabi, United Arab Emirates
3
Schiebelhut LM, Guillaume AS, Kuhn A, Schweizer RM, Armstrong EE, Beaumont MA, Byrne M, Cosart T, Hand BK, Howard L, Mussmann SM, Narum SR, Rasteiro R, Rivera-Colón AG, Saarman N, Sethuraman A, Taylor HR, Thomas GWC, Wellenreuther M, Luikart G. Genomics and conservation: Guidance from training to analyses and applications. Mol Ecol Resour 2024; 24:e13893. [PMID: 37966259 DOI: 10.1111/1755-0998.13893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/16/2023]
Abstract
Environmental change is intensifying the biodiversity crisis and threatening species across the tree of life. Conservation genomics can help inform conservation actions and slow biodiversity loss. However, more training, appropriate use of novel genomic methods and communication with managers are needed. Here, we review practical guidance to improve applied conservation genomics. We share insights aimed at ensuring effectiveness of conservation actions around three themes: (1) improving pedagogy and training in conservation genomics including for online global audiences, (2) conducting rigorous population genomic analyses properly considering theory, marker types and data interpretation and (3) facilitating communication and collaboration between managers and researchers. We aim to update students and professionals and expand their conservation toolkit with genomic principles and recent approaches for conserving and managing biodiversity. The biodiversity crisis is a global problem and, as such, requires international involvement, training, collaboration and frequent reviews of the literature and workshops as we do here.
Affiliation(s)
- Lauren M Schiebelhut
  - Life and Environmental Sciences, University of California, Merced, California, USA
- Annie S Guillaume
  - Geospatial Molecular Epidemiology group (GEOME), Laboratory for Biological Geochemistry (LGB), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Arianna Kuhn
  - Department of Biological Sciences, University of Lethbridge, Lethbridge, Alberta, Canada
  - Virginia Museum of Natural History, Martinsville, Virginia, USA
- Rena M Schweizer
  - Division of Biological Sciences, University of Montana, Missoula, Montana, USA
- Mark A Beaumont
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Margaret Byrne
  - Department of Biodiversity, Conservation and Attractions, Biodiversity and Conservation Science, Perth, Western Australia, Australia
- Ted Cosart
  - Flathead Lake Biology Station, University of Montana, Missoula, Montana, USA
- Brian K Hand
  - Flathead Lake Biological Station, University of Montana, Polson, Montana, USA
- Leif Howard
  - Flathead Lake Biology Station, University of Montana, Missoula, Montana, USA
- Steven M Mussmann
  - Southwestern Native Aquatic Resources and Recovery Center, U.S. Fish & Wildlife Service, Dexter, New Mexico, USA
- Shawn R Narum
  - Hagerman Genetics Lab, University of Idaho, Hagerman, Idaho, USA
- Rita Rasteiro
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
- Angel G Rivera-Colón
  - Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
- Norah Saarman
  - Department of Biology and Ecology Center, Utah State University, Logan, Utah, USA
- Arun Sethuraman
  - Department of Biology, San Diego State University, San Diego, California, USA
- Helen R Taylor
  - Royal Zoological Society of Scotland, Edinburgh, Scotland
- Gregg W C Thomas
  - Informatics Group, Harvard University, Cambridge, Massachusetts, USA
- Maren Wellenreuther
  - Plant and Food Research, Nelson, New Zealand
  - University of Auckland, Auckland, New Zealand
- Gordon Luikart
  - Division of Biological Sciences, University of Montana, Missoula, Montana, USA
  - Flathead Lake Biology Station, University of Montana, Missoula, Montana, USA
4
Hopper KR. Reduced-representation libraries in insect genetics. CURRENT OPINION IN INSECT SCIENCE 2023; 59:101084. [PMID: 37442341 DOI: 10.1016/j.cois.2023.101084] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2022] [Revised: 05/04/2023] [Accepted: 07/06/2023] [Indexed: 07/15/2023]
Abstract
Genotyping-by-sequencing of reduced-representation libraries has ushered in an era where genome-wide data can be gotten for any species. Here, I review research on this topic during the last two years, report meta-analysis of the results, and discuss analysis methods and issues. Scanning the literature from 2021 to 2022 identified 21 papers, the majority of which were on population differences, including local adaptation and migration, but several papers were on genetic maps and their use in assembly scaffolding or analysis of quantitative trait loci, on the origin of incursions of pest insects, or on infection rates of a pathogen in a disease vector. The research reviewed includes 33 species from 25 families and 11 orders. Meta-analysis showed that less than 16%, and most often, less than 1% of the genome was implicated in local adaptation and that the number of adaptive loci correlated with genetic divergence among populations.
Affiliation(s)
- Keith R Hopper
  - Beneficial Insect Introductions Research Unit, ARS, USDA, Newark, DE, United States
5
Capblancq T, Lachmuth S, Fitzpatrick MC, Keller SR. From common gardens to candidate genes: exploring local adaptation to climate in red spruce. THE NEW PHYTOLOGIST 2023; 237:1590-1605. [PMID: 36068997 PMCID: PMC10092705 DOI: 10.1111/nph.18465] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/07/2022] [Accepted: 08/09/2022] [Indexed: 05/12/2023]
Abstract
Local adaptation to climate is common in plant species and has been studied in a range of contexts, from improving crop yields to predicting population maladaptation to future conditions. The genomic era has brought new tools to study this process, which was historically explored through common garden experiments. In this study, we combine genomic methods and common gardens to investigate local adaptation in red spruce and identify environmental gradients and loci involved in climate adaptation. We first use climate transfer functions to estimate the impact of climate change on seedling performance in three common gardens. We then explore the use of multivariate gene-environment association methods to identify genes underlying climate adaptation, with particular attention to the implications of conducting genome scans with and without correction for neutral population structure. This integrative approach uncovered phenotypic evidence of local adaptation to climate and identified a set of putatively adaptive genes, some of which are involved in three main adaptive pathways found in other temperate and boreal coniferous species: drought tolerance, cold hardiness, and phenology. These putatively adaptive genes segregated into two 'modules' associated with different environmental gradients. This study nicely exemplifies the multivariate dimension of adaptation to climate in trees.
Affiliation(s)
- Thibaut Capblancq
  - Department of Plant Biology, University of Vermont, Burlington, VT 05405, USA
- Susanne Lachmuth
  - Appalachian Laboratory, University of Maryland Center for Environmental Science, Frostburg, MD 21532, USA
- Matthew C. Fitzpatrick
  - Appalachian Laboratory, University of Maryland Center for Environmental Science, Frostburg, MD 21532, USA
- Stephen R. Keller
  - Department of Plant Biology, University of Vermont, Burlington, VT 05405, USA
6
Leaf Economic and Hydraulic Traits Signal Disparate Climate Adaptation Patterns in Two Co-Occurring Woodland Eucalypts. PLANTS 2022; 11:plants11141846. [PMID: 35890479 PMCID: PMC9320154 DOI: 10.3390/plants11141846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/12/2022] [Revised: 07/06/2022] [Accepted: 07/09/2022] [Indexed: 11/23/2022]
Abstract
With climate change impacting trees worldwide, enhancing adaptation capacity has become an important goal of provenance translocation strategies for forestry, ecological renovation, and biodiversity conservation. Given that not every species can be studied in detail, it is important to understand the extent to which climate adaptation patterns can be generalised across species, in terms of the selective agents and traits involved. We here compare patterns of genetic-based population (co)variation in leaf economic and hydraulic traits, climate–trait associations, and genomic differentiation of two widespread tree species (Eucalyptus pauciflora and E. ovata). We studied 2-year-old trees growing in a common-garden trial established with progeny from populations of both species, pair-sampled from 22 localities across their overlapping native distribution in Tasmania, Australia. Despite originating from the same climatic gradients, the species differed in their levels of population variance and trait covariance, patterns of population variation within each species were uncorrelated, and the species had different climate–trait associations. Further, the pattern of genomic differentiation among populations was uncorrelated between species, and population differentiation in leaf traits was mostly uncorrelated with genomic differentiation. We discuss hypotheses to explain this decoupling of patterns and propose that the choice of seed provenances for climate-based plantings needs to account for multiple dimensions of climate change unless species-specific information is available.
7
Pearman WS, Urban L, Alexander A. Commonly used Hardy-Weinberg equilibrium filtering schemes impact population structure inferences using RADseq data. Mol Ecol Resour 2022; 22:2599-2613. [PMID: 35593534 PMCID: PMC9541430 DOI: 10.1111/1755-0998.13646] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/16/2021] [Accepted: 05/13/2022] [Indexed: 11/29/2022]
Abstract
Reduced representation sequencing (RRS) is a widely used method to assay the diversity of genetic loci across the genome of an organism. The dominant class of RRS approaches assay loci associated with restriction sites within the genome (restriction site associated DNA sequencing, or RADseq). RADseq is frequently applied to non‐model organisms since it enables population genetic studies without relying on well‐characterized reference genomes. However, RADseq requires the use of many bioinformatic filters to ensure the quality of genotyping calls. These filters can have direct impacts on population genetic inference, and therefore require careful consideration. One widely used filtering approach is the removal of loci that do not conform to expectations of Hardy–Weinberg equilibrium (HWE). Despite being widely used, we show that this filtering approach is rarely described in sufficient detail to enable replication. Furthermore, through analyses of in silico and empirical data sets we show that some of the most widely used HWE filtering approaches dramatically impact inference of population structure. In particular, the removal of loci exhibiting departures from HWE after pooling across samples significantly reduces the degree of inferred population structure within a data set (despite this approach being widely used). Based on these results, we provide recommendations for best practice regarding the implementation of HWE filtering for RADseq data sets.
Affiliation(s)
- William S Pearman
  - Department of Marine Science, University of Otago, Dunedin, New Zealand
  - Department of Anatomy, University of Otago, Dunedin, New Zealand
- Lara Urban
  - Department of Anatomy, University of Otago, Dunedin, New Zealand
- Alana Alexander
  - Department of Anatomy, University of Otago, Dunedin, New Zealand
8
Filipe JC, Rymer PD, Byrne M, Hardy G, Mazanec R, Ahrens CW. Signatures of natural selection in a foundation tree along Mediterranean climatic gradients. Mol Ecol 2022; 31:1735-1752. [PMID: 35038378 PMCID: PMC9305101 DOI: 10.1111/mec.16351] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/26/2021] [Revised: 01/04/2022] [Accepted: 01/10/2022] [Indexed: 11/30/2022]
Abstract
Temperature and precipitation regimes are rapidly changing, resulting in forest dieback and extinction events, particularly in Mediterranean‐type climates (MTC). Forest management that enhances forests’ resilience is urgently required; however, adaptation to climates in heterogeneous landscapes with multiple selection pressures is complex. For widespread trees in MTC we hypothesized that: patterns of local adaptation are associated with climate; precipitation is a stronger factor of adaptation than temperature; functionally related genes show similar signatures of adaptation; and adaptive variants are independently sorting across the landscape. We sampled 28 populations across the geographic distribution of Eucalyptus marginata (jarrah), in South‐west Western Australia, and obtained 13,534 independent single nucleotide polymorphic (SNP) markers across the genome. Three genotype‐association analyses that employ different ways of correcting population structure were used to identify putatively adapted SNPs associated with independent climate variables. While overall levels of population differentiation were low (FST = 0.04), environmental association analyses found a total of 2336 unique SNPs associated with temperature and precipitation variables, with 1440 SNPs annotated to genic regions. Considerable allelic turnover was identified for SNPs associated with temperature seasonality and mean precipitation of the warmest quarter, suggesting that both temperature and precipitation are important factors in adaptation. SNPs with similar gene functions had analogous allelic turnover along climate gradients, while SNPs among temperature and precipitation variables had uncorrelated patterns of adaptation. These contrasting patterns provide evidence that there may be standing genomic variation adapted to current climate gradients, providing the basis for adaptive management strategies to bolster forest resilience in the future.
Affiliation(s)
- J C Filipe
  - Centre for Terrestrial Ecosystem Science and Sustainability, Harry Butler Institute, Murdoch University
- P D Rymer
  - Hawkesbury Institute for the Environment, Western Sydney University
- M Byrne
  - Biodiversity and Conservation Science, Department of Biodiversity, Conservation and Attractions
- G Hardy
  - Centre for Terrestrial Ecosystem Science and Sustainability, Harry Butler Institute, Murdoch University
- R Mazanec
  - Biodiversity and Conservation Science, Department of Biodiversity, Conservation and Attractions
- C W Ahrens
  - Hawkesbury Institute for the Environment, Western Sydney University
9
Narum S, News JK, Fountain-Jones N, Hooper Junior R, Ortiz-Barrientos D, O'Boyle B, Sibbett B. Editorial 2022. Mol Ecol Resour 2021; 22:1-8. [PMID: 34919782 DOI: 10.1111/1755-0998.13572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
10
Foster Y, Dutoit L, Grosser S, Dussex N, Foster BJ, Dodds KG, Brauning R, Van Stijn T, Robertson F, McEwan JC, Jacobs JME, Robertson BC. Genomic signatures of inbreeding in a critically endangered parrot, the kākāpō. G3 (BETHESDA, MD.) 2021; 11:jkab307. [PMID: 34542587 PMCID: PMC8527487 DOI: 10.1093/g3journal/jkab307] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/20/2021] [Accepted: 08/23/2021] [Indexed: 02/06/2023]
Abstract
Events of inbreeding are inevitable in critically endangered species. Reduced population sizes and unique life-history traits can increase the severity of inbreeding, leading to declines in fitness and increased risk of extinction. Here, we investigate levels of inbreeding in a critically endangered flightless parrot, the kākāpō (Strigops habroptilus), wherein a highly inbred island population and one individual from the mainland of New Zealand founded the entire extant population. Genotyping-by-sequencing (GBS), and a genotype calling approach using a chromosome-level genome assembly, identified a filtered set of 12,241 single-nucleotide polymorphisms (SNPs) among 161 kākāpō, which together encompass the total genetic potential of the extant population. Multiple molecular-based estimates of inbreeding were compared, including genome-wide estimates of heterozygosity (FH), the diagonal elements of a genomic-relatedness matrix (FGRM), and runs of homozygosity (RoH, FRoH). In addition, we compared levels of inbreeding in chicks from a recent breeding season to examine if inbreeding is associated with offspring survival. The density of SNPs generated with GBS was sufficient to identify chromosomes that were largely homozygous with RoH distributed in similar patterns to other inbred species. Measures of inbreeding were largely correlated and differed significantly between descendants of the two founding populations. However, neither inbreeding nor ancestry was found to be associated with reduced survivorship in chicks, owing to unexpected mortality in chicks exhibiting low levels of inbreeding. Our study highlights important considerations for estimating inbreeding in critically endangered species, such as the impacts of small population sizes and admixture between diverse lineages.
Affiliation(s)
- Yasmin Foster
  - Department of Zoology, University of Otago, Dunedin 9054, New Zealand
- Ludovic Dutoit
  - Department of Zoology, University of Otago, Dunedin 9054, New Zealand
- Stefanie Grosser
  - Department of Zoology, University of Otago, Dunedin 9054, New Zealand
- Nicolas Dussex
  - Centre for Palaeogenetics, SE-106 91 Stockholm, Sweden
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05 Stockholm, Sweden
  - Department of Zoology, Stockholm University, SE-106 91 Stockholm, Sweden
- Brodie J Foster
  - Department of Zoology, University of Otago, Dunedin 9054, New Zealand
- Ken G Dodds
  - AgResearch Invermay Agricultural Centre, Mosgiel 9053, New Zealand
- Rudiger Brauning
  - AgResearch Invermay Agricultural Centre, Mosgiel 9053, New Zealand
- Tracey Van Stijn
  - AgResearch Invermay Agricultural Centre, Mosgiel 9053, New Zealand
- Fiona Robertson
  - Department of Zoology, University of Otago, Dunedin 9054, New Zealand
- John C McEwan
  - AgResearch Invermay Agricultural Centre, Mosgiel 9053, New Zealand
- Bruce C Robertson
  - Department of Zoology, University of Otago, Dunedin 9054, New Zealand