1
Nagi SC, Ashraf F, Miles A, Donnelly MJ. AnoPrimer: Primer Design in malaria vectors informed by range-wide genomic variation. Wellcome Open Res 2024; 9:255. [PMID: 39184128 PMCID: PMC11342028 DOI: 10.12688/wellcomeopenres.20998.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/16/2024] [Indexed: 08/27/2024] Open
Abstract
The major malaria mosquitoes, Anopheles gambiae s.l. and Anopheles funestus, are some of the most studied organisms in medical research and also some of the most genetically diverse. When designing polymerase chain reaction (PCR) or hybridisation-based molecular assays, reliable primer and probe design is crucial. However, single nucleotide polymorphisms (SNPs) in primer binding sites can prevent primer binding, leading to null alleles, or cause suboptimal binding, leading to preferential amplification of specific alleles. Given the extreme genetic diversity of Anopheles mosquitoes, researchers need to consider this genetic variation when designing primers and probes to avoid amplification problems. In this note, we present a Python package, AnoPrimer, which exploits the Ag1000G and Af1000 datasets and allows users to rapidly design primers in An. gambiae or An. funestus, whilst summarising genetic variation in the primer binding sites and visualising the position of primer pairs. AnoPrimer allows the design of both genomic DNA and cDNA primers and hybridisation probes. By coupling this Python package with Google Colaboratory, AnoPrimer is an open and accessible platform for primer and probe design, hosted in the cloud for free. AnoPrimer is available at https://github.com/sanjaynagi/AnoPrimer and we hope it will be a useful resource for the community to design probe and primer sets that can be reliably deployed across the An. gambiae and An. funestus species ranges.
Affiliation(s)
- Sanjay C. Nagi
- Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
- Faisal Ashraf
- Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
- Martin J. Donnelly
- Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
- Wellcome Sanger Institute, Hinxton, England, UK
2
History cooling events contributed to the endangered status of Pseudotsuga brevifolia endemic to limestone habitats. Glob Ecol Conserv 2023. [DOI: 10.1016/j.gecco.2023.e02414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2023] Open
3
Graffelman J, Weir BS. The transitivity of the Hardy-Weinberg law. Forensic Sci Int Genet 2022; 58:102680. [PMID: 35313226 PMCID: PMC10693928 DOI: 10.1016/j.fsigen.2022.102680] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/09/2021] [Revised: 02/12/2022] [Accepted: 02/20/2022] [Indexed: 11/27/2022]
Abstract
The Hardy-Weinberg law is shown to be transitive in the sense that a multi-allelic polymorphism that is in equilibrium will retain its equilibrium status if any allele together with its corresponding genotypes is deleted from the population. Similarly, the transitivity principle also applies if alleles are joined, which leads to the summation of allele frequencies and their corresponding genotype frequencies. These basic polymorphism properties are intuitive, but they have apparently not been formalized or investigated. This article provides a straightforward proof of the transitivity principle, and its usefulness in genetic data analysis is explored, using high-quality autosomal microsatellite databases from the US National Institute of Standards and Technology. We address the reduction of multi-allelic polymorphisms to variants with fewer alleles, two in the limit. Equilibrium test results obtained with the original and reduced polymorphisms are generally observed to be coherent, in particular when results obtained with length-based and sequence-based microsatellites are compared. We exploit the transitivity principle in order to identify disequilibrium-related alleles, and show its usefulness for detecting population substructure and genotyping problems that relate to null alleles and allele imbalance.
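The deletion and joining operations described in this abstract can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names and the three-allele frequencies are illustrative assumptions, and it works with exact expected frequencies rather than sampled genotype counts.

```python
from itertools import combinations_with_replacement


def hwe_genotype_freqs(allele_freqs):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium."""
    freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        freqs[(a, b)] = p * p if a == b else 2 * p * q
    return freqs


def allele_freqs_from_genotypes(genotype_freqs):
    """Recover allele frequencies by counting half of each genotype per allele."""
    out = {}
    for (a, b), f in genotype_freqs.items():
        out[a] = out.get(a, 0.0) + f / 2
        out[b] = out.get(b, 0.0) + f / 2
    return out


def delete_allele(genotype_freqs, allele):
    """Drop every genotype carrying `allele` and renormalise the remainder."""
    kept = {g: f for g, f in genotype_freqs.items() if allele not in g}
    total = sum(kept.values())
    return {g: f / total for g, f in kept.items()}


def join_alleles(genotype_freqs, a1, a2, merged):
    """Relabel alleles a1 and a2 as one allele and sum the genotype frequencies."""
    out = {}
    for g, f in genotype_freqs.items():
        g2 = tuple(sorted(merged if x in (a1, a2) else x for x in g))
        out[g2] = out.get(g2, 0.0) + f
    return out


def is_hwe(genotype_freqs, tol=1e-9):
    """True if the genotype frequencies match HWE for their own allele frequencies."""
    expected = hwe_genotype_freqs(allele_freqs_from_genotypes(genotype_freqs))
    return all(abs(genotype_freqs.get(g, 0.0) - e) < tol for g, e in expected.items())


# Hypothetical tri-allelic locus in equilibrium.
population = hwe_genotype_freqs({"A": 0.5, "B": 0.3, "C": 0.2})
after_deletion = delete_allele(population, "C")
after_joining = join_alleles(population, "B", "C", "BC")
print(is_hwe(after_deletion), is_hwe(after_joining))
```

With real data the same reduction would be applied to genotype counts and followed by an equilibrium test, since sampled counts only match the expected frequencies approximately.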
Affiliation(s)
- Jan Graffelman
- Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Carrer Jordi Girona, 1-3, 08034, Barcelona, Spain; Department of Biostatistics, University of Washington, University Tower, 15th Floor, 4333 Brooklyn Avenue, Seattle, WA 98105-9461, United States of America.
- Bruce S Weir
- Department of Biostatistics, University of Washington, University Tower, 15th Floor, 4333 Brooklyn Avenue, Seattle, WA 98105-9461, United States of America
4
Targeted genome-wide SNP genotyping in feral horses using non-invasive fecal swabs. Conserv Genet Resour 2022; 14:203-213. [PMID: 35673611 PMCID: PMC9162989 DOI: 10.1007/s12686-022-01259-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/01/2021] [Accepted: 02/24/2022] [Indexed: 11/22/2022]
Abstract
The development of high-throughput sequencing has prompted a transition in wildlife genetics from using microsatellites toward sets of single nucleotide polymorphisms (SNPs). However, genotyping large numbers of targeted SNPs using non-invasive samples remains challenging due to relatively large DNA input requirements. Recently, target enrichment has emerged as a promising approach requiring little template DNA. We assessed the efficacy of Tecan Genomics’ Allegro Targeted Genotyping (ATG) for generating genome-wide SNP data in feral horses using DNA isolated from fecal swabs. Total and host-specific DNA were quantified for 989 samples collected as part of a long-term individual-based study of feral horses on Sable Island, Nova Scotia, Canada, using dsDNA fluorescence and a host-specific qPCR assay, respectively. Forty-eight samples representing 44 individuals containing at least 10 ng of host DNA (ATG’s recommended minimum input) were genotyped using a custom multiplex panel targeting 279 SNPs. Genotyping accuracy and consistency were assessed by contrasting ATG genotypes with those obtained from the same individuals with SNP microarrays, and from multiple samples from the same horse, respectively. 62% of swabs yielded the minimum recommended amount of host DNA for ATG. Ignoring samples that failed to amplify, ATG recovered an average of 88.8% targeted sites per sample, while genotype concordance between ATG and SNP microarrays was 98.5%. The repeatability of genotypes from the same individual approached unity with an average of 99.9%. This study demonstrates the suitability of ATG for genome-wide, non-invasive targeted SNP genotyping, and will facilitate further ecological and conservation genetics research in equids and related species.
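The two summary metrics reported here, the proportion of targeted sites recovered per sample and the genotype concordance between platforms, can be illustrated with a short Python sketch. The function names, the genotype encoding (two-letter strings, `None` for a missing call) and the toy data are assumptions for illustration only and do not reproduce the study's pipeline.

```python
def normalise(genotype):
    """Treat genotypes as unordered allele pairs, so 'GA' and 'AG' compare equal."""
    return None if genotype is None else "".join(sorted(genotype))


def call_rate(genotypes):
    """Fraction of targeted sites with a non-missing genotype call."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def concordance(calls_a, calls_b):
    """Fraction of sites called on both platforms where the genotypes agree."""
    shared = [
        (normalise(a), normalise(b))
        for a, b in zip(calls_a, calls_b)
        if a is not None and b is not None
    ]
    if not shared:
        return float("nan")
    return sum(a == b for a, b in shared) / len(shared)


# Hypothetical calls for one individual at five targeted SNPs.
enrichment_calls = ["AA", "GA", None, "GG", "AG"]
microarray_calls = ["AA", "AG", "GG", "GG", "AA"]

print(call_rate(enrichment_calls))                      # 4 of 5 sites called
print(concordance(enrichment_calls, microarray_calls))  # 3 of 4 shared calls agree
```

Repeatability across multiple samples from the same individual could be computed with the same `concordance` function applied to pairs of replicate samples.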
5
Gershoni M, Shirak A, Raz R, Seroussi E. Comparing BeadChip and WGS Genotyping: Non-Technical Failed Calling Is Attributable to Additional Variation within the Probe Target Sequence. Genes (Basel) 2022; 13:genes13030485. [PMID: 35328039 PMCID: PMC8948885 DOI: 10.3390/genes13030485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/14/2022] [Revised: 03/03/2022] [Accepted: 03/08/2022] [Indexed: 01/11/2023] Open
Abstract
Microarray-based genomic selection is a central tool to increase the genetic gain of economically significant traits in dairy cattle. Yet, the effectivity of this tool is slightly limited, as estimates based on genotype data only partially explain the observed heritability. In the analysis of the genomes of 17 Israeli Holstein bulls, we compared genotyping accuracy between whole-genome sequencing (WGS) and microarray-based techniques. Using the standard GATK pipeline, the short-variant discovery within sequence reads mapped to the reference genome (ARS-UCD1.2) was compared to the genotypes from Illumina BovineSNP50 BeadChip and to an alternative method, which computationally mimics the hybridization procedure by mapping reads to 50 bp spanning the BeadChip source sequences. The number of mismatches between the BeadChip and WGS genotypes was low (0.2%). However, 17,197 (40% of the informative SNPs) had extra variation within 50 bp of the targeted SNP site, which might interfere with hybridization-based genotyping. Consequently, with respect to genotyping errors, BeadChip varied significantly and systematically from WGS genotyping, introducing null allele-like effects and Mendelian errors (<0.5%), whereas the GATK algorithm of local de novo assembly of haplotypes successfully resolved the genotypes in the extra-variable regions. These findings suggest that the microarray design should avoid polymorphic genomic regions that are prone to extra variation and that WGS data may be used to resolve erroneous genotyping, which may partially explain missing heritability.
6
Cortez T, Amaral RV, Sobral-Souza T, Andrade SCS. Genome-wide assessment elucidates connectivity and the evolutionary history of the highly dispersive marine invertebrate Littoraria flava (Littorinidae: Gastropoda). Biol J Linn Soc Lond 2021. [DOI: 10.1093/biolinnean/blab055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
An important goal of marine population genetics is to understand how spatial connectivity patterns are influenced by historical and evolutionary factors. In this study, we evaluate the demographic history and population structure of Littoraria flava, a highly dispersive marine gastropod in the Brazilian intertidal zone. To test the hypotheses that the species has (1) historically high levels of gene flow on a macrogeographical spatial scale and (2) a distribution in rocky shores that consists of subpopulations, we collected specimens along the Brazilian coastline and combined different sets of genetic markers (mitochondrial DNA, ITS-2 and single nucleotide polymorphisms) with niche-based modelling to predict its palaeodistribution. Low genetic structure was observed, as well as high gene flow over long distances. The demographic analyses suggest that L. flava has had periods of population bottlenecks followed by expansion. According to both palaeodistribution and coalescent simulations, these expansion events occurred during the Pleistocene interglacial cycles (21 kya) and the associated climatic changes were the probable drivers of the distribution of the species. This is the first phylogeographical study of a marine gastropod on the South American coast based on genomic markers associated with niche modelling.
Affiliation(s)
- Thainá Cortez
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, SP, Brazil
- Rafael V Amaral
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, SP, Brazil
- Thadeu Sobral-Souza
- Departamento de Botânica e Ecologia, Universidade Federal do Mato Grosso, Cuiabá, MT, Brazil
- Sónia C S Andrade
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, SP, Brazil
7
Howard NP, Troggio M, Durel CE, Muranty H, Denancé C, Bianco L, Tillman J, van de Weg E. Integration of Infinium and Axiom SNP array data in the outcrossing species Malus × domestica and causes for seemingly incompatible calls. BMC Genomics 2021; 22:246. [PMID: 33827434 PMCID: PMC8028180 DOI: 10.1186/s12864-021-07565-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/04/2020] [Accepted: 03/30/2021] [Indexed: 11/23/2022] Open
Abstract
Background Single nucleotide polymorphism (SNP) array technology has been increasingly used to generate large quantities of SNP data for use in genetic studies. As new arrays are developed to take advantage of new technology and of improved probe design using new genome sequence and panel data, a need to integrate data from different arrays and array platforms has arisen. This study was undertaken in view of our need for an integrated high-quality dataset of Illumina Infinium® 20 K and Affymetrix Axiom® 480 K SNP array data in apple (Malus × domestica). In this study, we qualify and quantify the compatibility of SNP calling, defined as SNP calls that are both accurate and concordant, across both arrays by two approaches. First, the concordance of SNP calls was evaluated using a set of 417 duplicate individuals genotyped on both arrays starting from a set of 10,295 robust SNPs on the Infinium array. Next, the accuracy of the SNP calls was evaluated on additional germplasm (n = 3141) from both arrays using Mendelian inconsistent and consistent errors across thousands of pedigree links. While performing this work, we took the opportunity to evaluate reasons for probe failure and observed discordant SNP calls. Results Concordance among the duplicate individuals averaged 97.1% across 10,295 SNPs. Of these SNPs, 35% had discordant call(s) that were further curated, leading to a final set of 8412 (81.7%) SNPs that were deemed compatible. Compatibility was highly influenced by the presence of alternate probe binding locations and secondary polymorphisms. The impact of the latter was highly influenced by their number and proximity to the 3′ end of the probe. Conclusions The Infinium and Axiom SNP array data were mostly compatible. However, data integration required intense data filtering and curation. This work resulted in a workflow and information that may be of use in other data integration efforts.
Such an in-depth analysis of array concordance and accuracy as ours has not been previously described in the literature and will be useful in future work on SNP array data integration and interpretation, and in probe/platform development. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07565-7.
Affiliation(s)
- Nicholas P Howard
- Institut für Biologie und Umweltwissenschaften, Carl von Ossietzky Univ., Oldenburg, Germany; Department of Horticultural Science, Univ. of Minnesota, St Paul, USA
- Charles-Eric Durel
- Université d'Angers, Institut Agro, INRAE, IRHS, SFR 4207 QuaSaV, Beaucouzé, France
- Hélène Muranty
- Université d'Angers, Institut Agro, INRAE, IRHS, SFR 4207 QuaSaV, Beaucouzé, France
- Caroline Denancé
- Université d'Angers, Institut Agro, INRAE, IRHS, SFR 4207 QuaSaV, Beaucouzé, France
- Luca Bianco
- Fondazione Edmund Mach, San Michele all'Adige, TN, Italy
- John Tillman
- Department of Horticultural Science, Univ. of Minnesota, St Paul, USA
- Eric van de Weg
- Department of Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands.
8
England AD, Kheravii SK, Musigwa S, Kumar A, Daneshmand A, Sharma NK, Gharib-Naseri K, Wu SB. Sexing chickens (Gallus gallus domesticus) with high-resolution melting analysis using feather crude DNA. Poult Sci 2020; 100:100924. [PMID: 33652540 PMCID: PMC7936197 DOI: 10.1016/j.psj.2020.12.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/21/2020] [Revised: 09/13/2020] [Accepted: 12/08/2020] [Indexed: 10/25/2022] Open
Abstract
Identification of sex in broiler chickens allows researchers to reduce the level of variation in an experiment caused by the sex effect. Broiler breeds commonly used in research are no longer feather sexable because of the change in their genetics. Other alternate sexing methods are costly and difficult to apply on a large scale. Therefore, a sexing method is required that is both cost effective and highly sensitive as well as having the ability to offer high throughput genotyping. In this study, high-resolution melting (HRM) analysis was used to detect DNA variations present in the gene chromodomain helicase DNA binding 1 protein (CHD1) on the Z and W chromosomes (CHD1Z and CHD1W, respectively) of chickens. In addition, a simplified DNA extraction protocol, which made use of the basal part of chicken feathers, was developed to speed up the sexing procedure. Three pairs of primers, that is, CHD1UNEHRM1F/R, CHD1UNEHRM2F/R, and CHD1UNEHRM3F/R, flanking the polymorphic regions between CHD1Z and CHD1W were used to differentiate male and female chickens via distinct melting curves, typical of homozygous or heterozygous genotypes. The assay was validated by the HRM-sexing of 1,318 broiler chicks and verified by examining the sex of the birds after dissection. This method allows for the sexing of birds within a couple of days, which makes it applicable for use on a large scale such as in nutritional experiments.
Affiliation(s)
- A D England
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- S K Kheravii
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- S Musigwa
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- A Kumar
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- A Daneshmand
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- N K Sharma
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- K Gharib-Naseri
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia
- S B Wu
- School of Environmental and Rural Science, University of New England, Armidale 2351, NSW, Australia.
9
Montanari S, Bianco L, Allen BJ, Martínez-García PJ, Bassil NV, Postman J, Knäbel M, Kitson B, Deng CH, Chagné D, Crepeau MW, Langley CH, Evans K, Dhingra A, Troggio M, Neale DB. Development of a highly efficient Axiom™ 70 K SNP array for Pyrus and evaluation for high-density mapping and germplasm characterization. BMC Genomics 2019; 20:331. [PMID: 31046664 PMCID: PMC6498479 DOI: 10.1186/s12864-019-5712-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 10/10/2018] [Accepted: 04/17/2019] [Indexed: 12/20/2022] Open
Abstract
Background Both a source of diversity and the development of genomic tools, such as reference genomes and molecular markers, are equally important to enable faster progress in plant breeding. Pear (Pyrus spp.) lags far behind other fruit and nut crops in terms of employment of available genetic resources for new cultivar development. To address this gap, we designed a high-density, high-efficiency and robust single nucleotide polymorphism (SNP) array for pear, with the main objectives of conducting genetic diversity and genome-wide association studies. Results By applying a two-step design process, which consisted of the construction of a first ‘draft’ array for the screening of a small subset of samples, we were able to identify the most robust and informative SNPs to include in the Applied Biosystems™ Axiom™ Pear 70 K Genotyping Array, currently the densest SNP array for pear. Preliminary evaluation of this 70 K array in 1416 diverse pear accessions from the USDA National Clonal Germplasm Repository (NCGR) in Corvallis, OR identified 66,616 SNPs (93% of all the tiled SNPs) as high quality and polymorphic (PolyHighResolution). We further used the Axiom Pear 70 K Genotyping Array to construct high-density linkage maps in a bi-parental population, and to make a direct comparison with available genotyping-by-sequencing (GBS) data, which suggested that the SNP array is a more robust method of screening for SNPs than restriction enzyme reduced representation sequence-based genotyping. Conclusions The Axiom Pear 70 K Genotyping Array, with its high efficiency in a widely diverse panel of Pyrus species and cultivars, represents a valuable resource for a multitude of molecular studies in pear. The characterization of the USDA-NCGR collection with this array will provide important information for pear geneticists and breeders, as well as for the optimization of conservation strategies for Pyrus. 
Electronic supplementary material The online version of this article (10.1186/s12864-019-5712-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sara Montanari
- Department of Plant Sciences, University of California, Davis, CA, USA.
- Luca Bianco
- Research and Innovation Centre, Fondazione Edmund Mach, San Michele all'Adige, Trento, Italy
- Brian J Allen
- Department of Plant Sciences, University of California, Davis, CA, USA
- Nahla V Bassil
- USDA Agricultural Research Service, National Clonal Germplasm Repository, Corvallis, OR, USA
- Joseph Postman
- USDA Agricultural Research Service, National Clonal Germplasm Repository, Corvallis, OR, USA
- Mareike Knäbel
- Palmerston North Research Centre, The New Zealand Institute for Plant & Food Research Limited (PFR), Palmerston North, New Zealand
- Biff Kitson
- Motueka Research Centre, The New Zealand Institute for Plant & Food Research Limited (PFR), Motueka, New Zealand
- Cecilia H Deng
- Auckland Research Centre, The New Zealand Institute for Plant & Food Research Limited (PFR), Auckland, New Zealand
- David Chagné
- Palmerston North Research Centre, The New Zealand Institute for Plant & Food Research Limited (PFR), Palmerston North, New Zealand
- Marc W Crepeau
- Department of Evolution and Ecology, University of California, Davis, CA, USA
- Charles H Langley
- Department of Evolution and Ecology, University of California, Davis, CA, USA
- Kate Evans
- Tree Fruit Research and Extension Center, Washington State University, Wenatchee, WA, USA
- Amit Dhingra
- Department of Horticulture, Washington State University, Pullman, WA, USA
- Michela Troggio
- Research and Innovation Centre, Fondazione Edmund Mach, San Michele all'Adige, Trento, Italy
- David B Neale
- Department of Plant Sciences, University of California, Davis, CA, USA
10
Al-Breiki RD, Kjeldsen SR, Afzal H, Al Hinai MS, Zenger KR, Jerry DR, Al-Abri MA, Delghandi M. Genome-wide SNP analyses reveal high gene flow and signatures of local adaptation among the scalloped spiny lobster (Panulirus homarus) along the Omani coastline. BMC Genomics 2018; 19:690. [PMID: 30231936 PMCID: PMC6146514 DOI: 10.1186/s12864-018-5044-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 04/10/2018] [Accepted: 08/27/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The scalloped spiny lobster (Panulirus homarus) is a popular seafood commodity worldwide and an important export item from Oman. Annual catches in commercial fisheries are in serious decline, which has resulted in calls for the development of an integrated stock management approach. In Oman, the scalloped spiny lobster is currently treated as a single management unit (MU) or stock and there is an absence of information on the genetic population structure of the species that can inform management decisions, particularly at a fine-scale level. This work is the first to identify genome-wide single nucleotide polymorphisms (SNPs) for P. homarus using Diversity Arrays Technology sequencing (DArT-seq) and to elucidate any stock structure in the species. RESULTS After stringent filtering, 7988 high utility SNPs were discovered and used to assess the genetic diversity, connectivity and structure of P. homarus populations from Al Ashkharah, Masirah Island, Duqm, Ras Madrakah, Haitam, Ashuwaymiyah, Mirbat and Dhalkut landing sites. Pairwise FST estimates revealed low differentiation among populations (pairwise FST range = -0.0008 to 0.0021). Analysis of genetic variation using putatively directional FST outliers (504 SNPs) revealed higher and significant pairwise differentiation (p < 0.01) for all locations, with Ashuwaymiyah being the most diverged population (Ashuwaymiyah pairwise FST range = 0.0288-0.0736). Analysis of population structure using Discriminant Analysis of Principal Components (DAPC) revealed a broad admixture among P. homarus; however, the Ashuwaymiyah stock appeared to be potentially under local adaptive pressures. Fine scale analysis using Netview R provided further support for the general admixture of P. homarus. CONCLUSIONS Findings here suggested that stocks of P. homarus along the Omani coastline are admixed. Yet, fishery managers need to treat the lobster stock from Ashuwaymiyah with caution as it might be subject to local adaptive pressures.
We emphasize the need for further study with a larger number of samples to confirm the genetic status of the Ashuwaymiyah stock. The approach utilised in this study has high transferability in conservation and management of other marine stocks with similar biological and ecological attributes.
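Slightly negative pairwise FST values such as the lower bound reported here typically arise because bias-corrected estimators subtract a sampling term. The abstract does not state which estimator was used, so the Python sketch below substitutes Hudson's estimator (in its ratio-of-averages form) purely as an illustration; the function name, allele frequencies and sample sizes are hypothetical.

```python
def hudson_fst(freqs_1, freqs_2, n_1, n_2):
    """Hudson's FST across SNPs, combined as a ratio of averages.

    freqs_1, freqs_2: per-SNP frequencies of the same allele in each population.
    n_1, n_2: number of sampled allele copies per population (assumed constant
    across SNPs to keep the sketch short).
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_1, freqs_2):
        # Squared frequency difference minus the within-population sampling terms.
        numerator += (
            (p1 - p2) ** 2
            - p1 * (1 - p1) / (n_1 - 1)
            - p2 * (1 - p2) / (n_2 - 1)
        )
        # Probability that two alleles drawn from different populations differ.
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("FST is undefined when no SNP varies between samples")
    return numerator / denominator


# Two samples with identical frequencies: the estimate falls slightly below zero.
undifferentiated = hudson_fst([0.5, 0.3], [0.5, 0.3], n_1=40, n_2=40)

# Strongly diverged frequencies give a large positive estimate.
diverged = hudson_fst([0.9], [0.1], n_1=40, n_2=40)

print(undifferentiated, diverged)
```

A negative estimate is conventionally read as no detectable differentiation rather than as a meaningful negative quantity.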
Affiliation(s)
- Rufaida Dhuhai Al-Breiki
- Centre of Excellence in Marine Biotechnology, Sultan Qaboos University, P.O. Box 50, Al-Khoud, 123 Muscat, Sultanate of Oman
- College of Agriculture and Marine Sciences, Department of Marine Sciences and Fisheries, Sultan Qaboos University, P.O. Box 34, Al-Khoud, 123 Muscat, Sultanate of Oman
- Shannon R. Kjeldsen
- Centre for Sustainable Tropical Fisheries and Aquaculture and College of Science and Engineering, James Cook University, Townsville, QLD 4810 Australia
- Hasifa Afzal
- Centre of Excellence in Marine Biotechnology, Sultan Qaboos University, P.O. Box 50, Al-Khoud, 123 Muscat, Sultanate of Oman
- Manal Saif Al Hinai
- Centre of Excellence in Marine Biotechnology, Sultan Qaboos University, P.O. Box 50, Al-Khoud, 123 Muscat, Sultanate of Oman
- Kyall R. Zenger
- Centre for Sustainable Tropical Fisheries and Aquaculture and College of Science and Engineering, James Cook University, Townsville, QLD 4810 Australia
- Dean R. Jerry
- Centre for Sustainable Tropical Fisheries and Aquaculture and College of Science and Engineering, James Cook University, Townsville, QLD 4810 Australia
- Mohammed Ali Al-Abri
- College of Agriculture and Marine Sciences, Department of Animal and Veterinary Sciences and Technology, Sultan Qaboos University, P.O. Box 34, Al-Khoud, 123 Muscat, Sultanate of Oman
- Madjid Delghandi
- Centre of Excellence in Marine Biotechnology, Sultan Qaboos University, P.O. Box 50, Al-Khoud, 123 Muscat, Sultanate of Oman
11
Bekal S, Domier LL, Gonfa B, Lakhssassi N, Meksem K, Lambert KN. A SNARE-Like Protein and Biotin Are Implicated in Soybean Cyst Nematode Virulence. PLoS One 2015; 10:e0145601. [PMID: 26714307 PMCID: PMC4699853 DOI: 10.1371/journal.pone.0145601] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/27/2015] [Accepted: 12/07/2015] [Indexed: 11/24/2022] Open
Abstract
Phytoparasitic nematodes that are able to infect and reproduce on plants that are considered resistant are referred to as virulent. The mechanism(s) that virulent nematodes employ to evade or suppress host plant defenses are not well understood. Here we report the use of a genetic strategy (allelic imbalance analysis) to associate single nucleotide polymorphisms (SNPs) with nematode virulence genes in Heterodera glycines, the soybean cyst nematode (SCN). To accomplish this analysis, a custom SCN SNP array was developed and used to genotype SCN F3-derived populations grown on resistant and susceptible soybean plants. Three SNPs reproducibly showed allele imbalances between nematodes grown on resistant and susceptible plants. Two candidate SCN virulence genes that were tightly linked to the SNPs were identified. One SCN gene encoded biotin synthase (HgBioB), and the other encoded a bacterial-like protein containing a putative SNARE domain (HgSLP-1). The two genes mapped to two different linkage groups. HgBioB contained sequence polymorphisms between avirulent and virulent nematodes. However, the gene encoding HgSLP-1 had reduced copy number in virulent nematode populations and appears to produce multiple forms of the protein via intron retention and alternative splicing. We show that HgSLP-1 is an esophageal-gland protein that is secreted by the nematode during plant parasitism. Furthermore, in bacterial co-expression experiments, HgSLP-1 co-purified with the SCN resistance protein Rhg1 α-SNAP, suggesting that these two proteins physically interact. Collectively our data suggest that multiple SCN genes are involved in SCN virulence, and that HgSLP-1 may function as an avirulence protein and when absent it helps SCN evade host defenses.
Affiliation(s)
- Sadia Bekal
- Department of Plant, Soil and Agricultural Systems, 1205 Lincoln Dr. Southern Illinois University, Carbondale, IL, 62901, United States of America
- Leslie L. Domier
- Department of Crop Sciences, University of Illinois, 1102 South Goodwin Ave. Urbana, IL, 61801, United States of America
- Biruk Gonfa
- Department of Crop Sciences, University of Illinois, 1102 South Goodwin Ave. Urbana, IL, 61801, United States of America
- Naoufal Lakhssassi
- Department of Plant, Soil and Agricultural Systems, 1205 Lincoln Dr. Southern Illinois University, Carbondale, IL, 62901, United States of America
- Khalid Meksem
- Department of Plant, Soil and Agricultural Systems, 1205 Lincoln Dr. Southern Illinois University, Carbondale, IL, 62901, United States of America
- Kris N. Lambert
- Department of Crop Sciences, University of Illinois, 1102 South Goodwin Ave. Urbana, IL, 61801, United States of America
12
Özbek U, Feingold E, Weeks DE. Efficient Identification of Null-Allele Single Nucleotide Polymorphism Markers. Hum Hered 2015; 80:79-89. [PMID: 26613255 DOI: 10.1159/000441279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2015] [Accepted: 09/24/2015] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES At the beginning of a genome-wide association study, many markers are discarded because they fail to meet standard quality control criteria. Some of these markers are out of Hardy-Weinberg equilibrium (HWE) because they have 'null alleles' (which may be deletions or third alleles that do not hybridize to standard probes). It may be useful to identify null-allele markers so that they can be analyzed under different models or in order to explore regions of copy number variation. METHODS We present a model for the chip-based genotype data that are produced when a null-allele single nucleotide polymorphism (SNP) is genotyped under standard (2-allele) assumptions. We show that this model can be combined with the standard HWE model to develop classification procedures based on the supervised learning algorithms Support Vector Machines (SVM), Classification and Regression Trees (CART) or Random Forests for identifying null-allele SNPs. RESULTS We report a list of null-allele SNPs we identified on the Illumina 660W-Quad chip and provide suggestions for applying our CART model to other SNP sets. CONCLUSIONS Properly identified null-allele SNPs can be used to test for genotype-phenotype associations or to identify regions which may contain copy number variants.
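Why a null allele pushes a marker out of HWE under two-allele assumptions can be shown with a small calculation. The Python sketch below is a textbook-style illustration, not the chip-intensity model or the classifiers developed in the paper; it assumes that carriers of one null copy are called as homozygotes for their visible allele and that null homozygotes give no call, and all names and frequencies are hypothetical.

```python
def apparent_genotype_freqs(p_a, p_b, p_null):
    """Genotype call frequencies when a hidden null allele segregates under HWE."""
    if abs(p_a + p_b + p_null - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return {
        "AA": p_a ** 2 + 2 * p_a * p_null,  # true AA plus A/null called as AA
        "AB": 2 * p_a * p_b,                # heterozygotes are unaffected
        "BB": p_b ** 2 + 2 * p_b * p_null,  # true BB plus B/null called as BB
        "no_call": p_null ** 2,             # null homozygotes fail to genotype
    }


def heterozygote_deficit(freqs):
    """Inbreeding-like coefficient F = 1 - observed/expected heterozygosity."""
    called = freqs["AA"] + freqs["AB"] + freqs["BB"]
    aa, ab, bb = (freqs[k] / called for k in ("AA", "AB", "BB"))
    p = aa + ab / 2  # apparent frequency of allele A among called genotypes
    expected_het = 2 * p * (1 - p)
    return 1 - ab / expected_het


without_null = heterozygote_deficit(apparent_genotype_freqs(0.6, 0.4, 0.0))
with_null = heterozygote_deficit(apparent_genotype_freqs(0.5, 0.4, 0.1))

print(without_null, with_null)
```

A positive value in the second case reflects the apparent excess of homozygotes that causes such markers to fail HWE-based quality control.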
Affiliation(s)
- Umut Özbek
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, N.Y., USA
13
Lal MM, Southgate PC, Jerry DR, Zenger KR. Fishing for divergence in a sea of connectivity: The utility of ddRADseq genotyping in a marine invertebrate, the black-lip pearl oyster Pinctada margaritifera. Mar Genomics 2015; 25:57-68. [PMID: 26545807 DOI: 10.1016/j.margen.2015.10.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 09/30/2015] [Revised: 10/27/2015] [Accepted: 10/27/2015] [Indexed: 01/01/2023]
Abstract
Population genomic investigations on highly dispersive marine organisms typically require thousands of genome-wide SNP loci to resolve fine-scale population structure and detect signatures of selection. This information is important for species conservation efforts and stock management in both wild and captive populations, as well as genome mapping and genome wide association studies. Double digest Restriction site-Associated DNA Sequencing (ddRADseq) is a recent tool for delivering genome wide SNPs for non-model organisms. However, its application to marine invertebrate taxa has been limited, particularly given the complex and highly repetitive nature of many of these organisms' genomes. This study develops and evaluates an optimised ddRADseq technique together with associated analyses for generating genome-wide SNP data, and performs population genomic analyses to inform aquaculture and fishery management of a marine bivalve, the black-lip pearl oyster Pinctada margaritifera. A total of 5243 high-quality genome-wide SNP markers were detected, and used to assess population structure, genome diversity, detect Fst outliers and perform association testing in 156 individuals belonging to three wild and one hatchery produced populations from the Fiji Islands. Shallow but significant population structure was revealed among all wild populations (average pairwise Fst=0.046) when visualised with DAPC and an individual network analysis (NetView P), with clear evidence of a genetic bottleneck in the hatchery population (NeLD=6.1), compared to wild populations (NeLD>192.5). Fst outlier detection revealed 42-62 highly differentiated SNPs (p<0.02), while case-control association discovered up to 152 SNPs (p<0.001). Both analyses were able to successfully differentiate individuals between the orange and black tissue colour morphotypes characteristic of this species. 
BLAST searches revealed that five of these SNPs were associated with a melanin biosynthesis pathway, demonstrating their biological relevance. This study has produced highly informative SNP and population genomic data in P. margaritifera, and using the same approach promises to be of substantial value to a range of other non-model, broadcast-spawning or marine invertebrate taxa.
Affiliation(s)
- Monal M Lal
- Centre for Sustainable Tropical Fisheries and Aquaculture (CSTFA), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia; College of Marine and Environmental Sciences (CMES), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia.
- Paul C Southgate
- Centre for Sustainable Tropical Fisheries and Aquaculture (CSTFA), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia; College of Marine and Environmental Sciences (CMES), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia.
- Dean R Jerry
- Centre for Sustainable Tropical Fisheries and Aquaculture (CSTFA), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia; College of Marine and Environmental Sciences (CMES), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia.
- Kyall R Zenger
- Centre for Sustainable Tropical Fisheries and Aquaculture (CSTFA), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia; College of Marine and Environmental Sciences (CMES), James Cook University, Townsville Campus, Townsville, QLD 4811, Australia.
14
Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties. PLoS One 2015; 10:e0125628. [PMID: 25973611 PMCID: PMC4431842 DOI: 10.1371/journal.pone.0125628] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 11/19/2014] [Accepted: 03/16/2015] [Indexed: 11/19/2022] Open
Abstract
Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. 
The developed species-diagnostic SNP marker set will be useful for systematic estimation of admixture structure of citrus germplasm and for diverse genetic studies.
15
Sethi SA, Cook GM, Lemons P, Wenburg J. Guidelines for MSAT and SNP panels that lead to high-quality data for genetic mark–recapture studies. Can J Zool 2014. [DOI: 10.1139/cjz-2013-0302] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
Molecular markers with inadequate power to discriminate among individuals can lead to false recaptures (shadows), and inaccurate genotyping can lead to missed recaptures (ghosts), potentially biasing genetic mark–recapture estimates. We used simulations to examine the impact of microsatellite (MSAT) and single nucleotide polymorphism (SNP) marker-set size, allelic frequency, multitubes approaches, and sample matching protocols on shadow and ghost events in genetic mark–recapture studies, presenting guidance on the specifications for MSAT and SNP marker panels, and sample matching protocols necessary to produce high-quality data. Shadow events are controllable by increasing the number of markers or by selecting markers with high discriminatory power; reasonably sized marker sets (e.g., ≥9 MSATs or ≥32 SNPs) of moderate allelic diversity lead to low probabilities of shadow errors. Ghost events are more challenging to control and low allelic dropout or false-allele error rates produced high rates of erroneous mismatches in mark–recapture sampling. Fortunately, error-tolerant matching protocols, which use information from positively matching loci between comparisons of samples, and multitubes protocols to achieve consensus genotypes are effective at eliminating ghost events. We present a case study on Pacific walrus, Odobenus rosmarus divergens (Illiger, 1815), using simulation results to inform genetic marker choices.
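The dependence of shadow events on panel size and allele frequencies can be approximated analytically. The sketch below uses the standard probability-of-identity formula for unrelated individuals, assuming Hardy-Weinberg proportions and independent loci; it is not the simulation framework used in the study, and the function names and example frequencies are illustrative assumptions.

```python
from itertools import combinations


def locus_pid(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    freqs: allele frequencies at the locus (assumed to sum to 1).
    """
    homozygote_match = sum(p ** 4 for p in freqs)
    heterozygote_match = sum((2 * pi * pj) ** 2
                             for pi, pj in combinations(freqs, 2))
    return homozygote_match + heterozygote_match


def panel_pid(loci):
    """Probability of identity across a panel of independent loci."""
    result = 1.0
    for freqs in loci:
        result *= locus_pid(freqs)
    return result


# Hypothetical panels: 32 biallelic SNPs at frequency 0.5,
# and 9 microsatellites with five equally frequent alleles each.
snp_panel = [[0.5, 0.5]] * 32
msat_panel = [[0.2] * 5] * 9
snp_shadow = panel_pid(snp_panel)
msat_shadow = panel_pid(msat_panel)
```

Under these idealised assumptions both panels give very small match probabilities, which is consistent with the abstract's statement that reasonably sized marker sets make shadow errors unlikely. Relatedness among sampled individuals would raise these values.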
Affiliation(s)
- Suresh Andrew Sethi
- U.S. Fish and Wildlife Service, Biometrics, 1011 East Tudor Road MS 331, Anchorage, AK 99503, USA
- Geoffrey M. Cook
- U.S. Fish and Wildlife Service, Conservation Genetics Laboratory, 1011 East Tudor Road MS 331, Anchorage, AK 99503, USA
- Patrick Lemons
- U.S. Fish and Wildlife Service, Marine Mammals Management, 1011 East Tudor Road, Anchorage, AK 99503, USA
- John Wenburg
- U.S. Fish and Wildlife Service, Conservation Genetics Laboratory, 1011 East Tudor Road MS 331, Anchorage, AK 99503, USA
16
Montanari S, Saeed M, Knäbel M, Kim Y, Troggio M, Malnoy M, Velasco R, Fontana P, Won K, Durel CE, Perchepied L, Schaffer R, Wiedow C, Bus V, Brewer L, Gardiner SE, Crowhurst RN, Chagné D. Identification of Pyrus single nucleotide polymorphisms (SNPs) and evaluation for genetic mapping in European pear and interspecific Pyrus hybrids. PLoS One 2013; 8:e77022. [PMID: 24155917 PMCID: PMC3796552 DOI: 10.1371/journal.pone.0077022] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 06/19/2013] [Accepted: 08/26/2013] [Indexed: 11/18/2022] Open
Abstract
We have used new generation sequencing (NGS) technologies to identify single nucleotide polymorphism (SNP) markers from three European pear (Pyrus communis L.) cultivars and subsequently developed a subset of 1096 pear SNPs into high throughput markers by combining them with the set of 7692 apple SNPs on the IRSC apple Infinium® II 8K array. We then evaluated this apple and pear Infinium® II 9K SNP array for large-scale genotyping in pear across several species, using both pear and apple SNPs. The segregating populations employed for array validation included a segregating population of European pear ('Old Home'×'Louise Bon Jersey') and four interspecific breeding families derived from Asian (P. pyrifolia Nakai and P. bretschneideri Rehd.) and European pear pedigrees. In total, we mapped 857 polymorphic pear markers to construct the first SNP-based genetic maps for pear, comprising 78% of the total pear SNPs included in the array. In addition, 1031 SNP markers derived from apple (13% of the total apple SNPs included in the array) were polymorphic and were mapped in one or more of the pear populations. These results are the first to demonstrate SNP transferability across the genera Malus and Pyrus. Our construction of high density SNP-based and gene-based genetic maps in pear represents an important step towards the identification of chromosomal regions associated with a range of horticultural characters, such as pest and disease resistance, orchard yield and fruit quality.
Affiliation(s)
- Sara Montanari
- Istituto Agrario San Michele all'Adige Research and Innovation Centre, Foundation Edmund Mach, San Michele all'Adige, Trento, Italy ; The New Zealand Institute for Plant & Food Research Limited (Plant & Food Research), Palmerston North Research Centre, Palmerston North, New Zealand ; Institut National de la Recherche Agronomique (INRA), UMR1345 Institut de Recherche en Horticulture et Semences, SFR 4207 Quasav, Pres L'UNAM, F-49071 Beaucouzé, France ; Université d'Angers, UMR1345 Institut de Recherche en Horticulture et Semences, F-49045 Angers, France ; AgroCampus-Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, F-49045 Angers, France
17
Emanuelli F, Lorenzi S, Grzeskowiak L, Catalano V, Stefanini M, Troggio M, Myles S, Martinez-Zapater JM, Zyprian E, Moreira FM, Grando MS. Genetic diversity and population structure assessed by SSR and SNP markers in a large germplasm collection of grape. BMC Plant Biol 2013; 13:39. [PMID: 23497049 PMCID: PMC3610244 DOI: 10.1186/1471-2229-13-39] [Citation(s) in RCA: 153] [Impact Index Per Article: 13.9] [Received: 10/26/2012] [Accepted: 02/27/2013] [Indexed: 05/18/2023]
Abstract
BACKGROUND The economic importance of grapevine has driven significant efforts in genomics to accelerate the exploitation of Vitis resources for development of new cultivars. However, although a large number of clonally propagated accessions are maintained in grape germplasm collections worldwide, their use for crop improvement is limited by the scarcity of information on genetic diversity, population structure and proper phenotypic assessment. The identification of representative and manageable subset of accessions would facilitate access to the diversity available in large collections. A genome-wide germplasm characterization using molecular markers can offer reliable tools for adjusting the quality and representativeness of such core samples. RESULTS We investigated patterns of molecular diversity at 22 common microsatellite loci and 384 single nucleotide polymorphisms (SNPs) in 2273 accessions of domesticated grapevine V. vinifera ssp. sativa, its wild relative V. vinifera ssp. sylvestris, interspecific hybrid cultivars and rootstocks. Despite the large number of putative duplicates and extensive clonal relationships among the accessions, we observed high level of genetic variation. In the total germplasm collection the average genetic diversity, as quantified by the expected heterozygosity, was higher for SSR loci (0.81) than for SNPs (0.34). The analysis of the genetic structure in the grape germplasm collection revealed several levels of stratification. The primary division was between accessions of V. vinifera and non-vinifera, followed by the distinction between wild and domesticated grapevine. Intra-specific subgroups were detected within cultivated grapevine representing different eco-geographic groups. The comparison of a phenological core collection and genetic core collections showed that the latter retained more genetic diversity, while maintaining a similar phenotypic variability. 
CONCLUSIONS The comprehensive molecular characterization of our grape germplasm collection contributes to the knowledge about levels and distribution of genetic diversity in the existing resources of Vitis and provides insights into genetic subdivision within the European germplasm. Genotypic and phenotypic information compared in this study may efficiently guide further exploration of this diversity for facilitating its practical use.
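The gap in expected heterozygosity between SSR loci (0.81) and SNPs (0.34) follows partly from the definition of the statistic. A minimal sketch, assuming the usual formula He = 1 - sum of squared allele frequencies and using made-up allele frequencies rather than data from the study:

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity (gene diversity) at one locus.

    freqs: allele frequencies at the locus (assumed to sum to 1).
    """
    return 1.0 - sum(p * p for p in freqs)


# A biallelic SNP cannot exceed 0.5, reached at equal allele frequencies.
snp_max = expected_heterozygosity([0.5, 0.5])

# A hypothetical microsatellite with ten equally frequent alleles.
ssr_example = expected_heterozygosity([0.1] * 10)
```

Because a biallelic marker is capped at 0.5 while a multi-allelic microsatellite can approach 1, per-locus diversity values from the two marker types are not directly comparable.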
Affiliation(s)
- Francesco Emanuelli
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Silvia Lorenzi
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Lukasz Grzeskowiak
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Valentina Catalano
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Marco Stefanini
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Michela Troggio
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Sean Myles
- Department of Plant and Animal Sciences, Faculty of Agriculture, Dalhousie University, Truro, Nova Scotia, B2N 5E3, Canada
- José M Martinez-Zapater
- Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), C/ Madre de dios 51, Logroño, 26006, Spain
- Eva Zyprian
- JKI Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
- Flavia M Moreira
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
- Instituto Federal de Santa Catarina, Rua José Lino Kretzer 608 - Praia Comprida, São José, Santa Catarina, 88130-310, Brasil
- M Stella Grando
- Department of Genomics and Biology of Fruit Crops, IASMA Research and Innovation Centre, Fondazione Edmund Mach - Via E. Mach 1, San Michele all'Adige, TN, 38010, Italy
18
Ollitrault P, Terol J, Garcia-Lor A, Bérard A, Chauveau A, Froelicher Y, Belzile C, Morillon R, Navarro L, Brunel D, Talon M. SNP mining in C. clementina BAC end sequences; transferability in the Citrus genus (Rutaceae), phylogenetic inferences and perspectives for genetic mapping. BMC Genomics 2012; 13:13. [PMID: 22233093 PMCID: PMC3320530 DOI: 10.1186/1471-2164-13-13] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 09/16/2011] [Accepted: 01/10/2012] [Indexed: 01/18/2024] Open
Abstract
Background With the increasing availability of EST databases and whole genome sequences, SNPs have become the most abundant and powerful polymorphic markers. However, SNP chip data generally suffers from ascertainment biases caused by the SNP discovery and selection process in which a small number of individuals are used as discovery panels. The ongoing International Citrus Genome Consortium sequencing project of the highly heterozygous Clementine and sweet orange genomes will soon result in the release of several hundred thousand SNPs. The primary goals of this study were: (i) to estimate the transferability within the genus Citrus of SNPs discovered from Clementine BACend sequencing (BES), (ii) to estimate bias associated with the very narrow discovery panel, and (iii) to evaluate the usefulness of the Clementine-derived SNP markers for diversity analysis and comparative mapping studies between the different cultivated Citrus species. Results Fifty-four accessions covering the main Citrus species and 52 interspecific hybrids between pummelo and Clementine were genotyped on a GoldenGate array platform using 1,457 SNPs mined from Clementine BES and 37 SNPs identified between and within C. maxima, C. medica, C. reticulata and C. micrantha. Consistent results were obtained from 622 SNP loci. Of these markers, 116 displayed incomplete transferability primarily in C. medica, C. maxima and wild Citrus species. The two primary biases associated with the SNP mining in Clementine were an overestimation of the C. reticulata diversity and an underestimation of the interspecific differentiation. However, the genetic stratification of the gene pool was high, with very frequent significant linkage disequilibrium. Furthermore, the shared intraspecific polymorphism and accession heterozygosity were generally enough to perform interspecific comparative genetic mapping. Conclusions A set of 622 SNP markers providing consistent results was selected. 
Of the markers mined from Clementine, 80.5% were successfully transferred to the whole Citrus gene pool. Despite the ascertainment biases in relation to the Clementine origin, the SNP data confirm the important stratification of the gene pools around C. maxima, C. medica and C. reticulata as well as previous hypotheses on the origin of secondary species. The implemented SNP marker set will be very useful for comparative genetic mapping in Citrus and genetic association in C. reticulata.
Affiliation(s)
- Patrick Ollitrault
- CIRAD, UMR AGAP, Avenue Agropolis, TA A-108/02, 34398 Montpellier, Cedex 5, France.
19
Turner S, Armstrong LL, Bradford Y, Carlson CS, Crawford DC, Crenshaw AT, de Andrade M, Doheny KF, Haines JL, Hayes G, Jarvik G, Jiang L, Kullo IJ, Li R, Ling H, Manolio TA, Matsumoto M, McCarty CA, McDavid AN, Mirel DB, Paschall JE, Pugh EW, Rasmussen LV, Wilke RA, Zuvich RL, Ritchie MD. Quality control procedures for genome-wide association studies. Curr Protoc Hum Genet 2011; Chapter 1:Unit1.19. [PMID: 21234875 DOI: 10.1002/0471142905.hg0119s68] [Citation(s) in RCA: 201] [Impact Index Per Article: 15.5] [Indexed: 12/30/2022]
Abstract
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the electronic MEdical Records and Genomics (eMERGE) network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.
Affiliation(s)
- Stephen Turner
- Center for Human Genetics Research, Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, USA
20
Abstract
High-throughput genotyping technologies have become popular in studies that aim to reveal the genetics behind polygenic traits such as complex disease and the diverse response to some drug treatments. These technologies utilize bioinformatics tools to define strategies, analyze data, and estimate the final associations between certain genetic markers and traits. The strategy followed for an association study depends on its efficiency and cost. The efficiency is based on the assumed characteristics of the polymorphisms' allele frequencies and linkage disequilibrium for putative causal alleles. Statistically significant markers (single mutations or haplotypes) that cause a human disorder should be validated and their biological function elucidated. The aim of this chapter is to present a subset of bioinformatics tools for haplotype inference, tag SNP selection, and genome-wide association studies using a high-throughput generated SNP data set.
Affiliation(s)
- Ana M Aransay
- Functional Genomics Unit, Parque Technológico de Bizkaia, Derio, Spain
21
Laakso M, Karinen S, Lehtonen R, Hautaniemi S. Computational identification of cancer susceptibility loci. Methods Mol Biol 2010; 653:87-103. [PMID: 20721739 DOI: 10.1007/978-1-60761-759-4_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2023]
Abstract
The identification of novel cancer susceptibility syndromes and genes from very limited numbers of study individuals has become feasible through the use of high-throughput genotype microarrays. With such an approach, highly sensitive genome-wide computational methods are needed to identify the regions of interest. We have developed novel methods to identify and compare homozygous and compound heterozygous regions between cases and controls, to facilitate the identification of recessively inherited cancer susceptibility loci. As our approach is optimized for sensitivity, it creates many hits that may be unrelated to the phenotype of interest. We compensate for this compromised specificity by the automated use of additional sources of biological information along with a ranking function to focus on the most relevant regions. The methods are demonstrated here by comparing colorectal cancer patients to controls.
Affiliation(s)
- Marko Laakso
- Computational Systems Biology Laboratory, Institute of Biomedicine and Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
22
Wilding CS, Weetman D, Steen K, Donnelly MJ. High, clustered, nucleotide diversity in the genome of Anopheles gambiae revealed through pooled-template sequencing: implications for high-throughput genotyping protocols. BMC Genomics 2009; 10:320. [PMID: 19607710 PMCID: PMC2723138 DOI: 10.1186/1471-2164-10-320] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 12/23/2008] [Accepted: 07/16/2009] [Indexed: 02/04/2023] Open
Abstract
Background Association mapping approaches are dependent upon discovery and validation of single nucleotide polymorphisms (SNPs). To further association studies in Anopheles gambiae we conducted a major resequencing programme, primarily targeting regions within or close to candidate genes for insecticide resistance. Results Using two pools of mosquito template DNA we sequenced over 300 kbp across 660 distinct amplicons of the An. gambiae genome. Comparison of SNPs identified from pooled templates with those from individual sequences revealed a very low false positive rate. False negative rates were much higher and mostly resulted from SNPs with a low minor allele frequency. Pooled-template sequencing also provided good estimates of SNP allele frequencies. Allele frequency estimation success, along with false positive and negative call rates, improved significantly when using a qualitative measure of SNP call quality. We identified a total of 7062 polymorphic features comprising 6995 SNPs and 67 indels, with, on average, a SNP every 34 bp; a high rate of polymorphism that is comparable to other studies of mosquitoes. SNPs were significantly more frequent in members of the cytochrome p450 mono-oxygenases and carboxy/cholinesterase gene-families than in glutathione-S-transferases, other detoxification genes, and control genomic regions. Polymorphic sites showed a significantly clustered distribution, but the degree of SNP clustering (independent of SNP frequency) did not vary among gene families, suggesting that clustering of polymorphisms is a general property of the An. gambiae genome. Conclusion The high frequency and clustering of SNPs has important ramifications for the design of high-throughput genotyping assays based on allele specific primer extension or probe hybridisation. We illustrate these issues in the context of the design of Illumina GoldenGate assays.
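The practical consequence of one SNP every 34 bp for primer or probe design can be gauged with a rough calculation. The sketch below assumes SNPs fall independently and uniformly along the sequence, which the abstract itself contradicts by reporting significant clustering, so it should be read only as a baseline. The function name and the 20 bp example length are illustrative assumptions.

```python
def prob_site_snp_free(length_bp, snp_spacing_bp=34.0):
    """Probability that a binding site of length_bp contains no SNP.

    Assumes each base is polymorphic independently with probability
    1 / snp_spacing_bp (a simplification; real SNPs are clustered).
    """
    per_base = 1.0 / snp_spacing_bp
    return (1.0 - per_base) ** length_bp


# Chance that a hypothetical 20 bp primer site is free of known SNPs.
primer_ok = prob_site_snp_free(20)

# Chance that both primers of a pair are SNP-free under the same assumption.
pair_ok = primer_ok ** 2
```

Even under this simplified model, a substantial share of candidate binding sites would overlap at least one polymorphism, which illustrates why variation-aware assay design matters in this species.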
Affiliation(s)
- Craig S Wilding
- Vector Group, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK.
23
Mefford HC, Cooper GM, Zerr T, Smith JD, Baker C, Shafer N, Thorland EC, Skinner C, Schwartz CE, Nickerson DA, Eichler EE. A method for rapid, targeted CNV genotyping identifies rare variants associated with neurocognitive disease. Genome Res 2009; 19:1579-85. [PMID: 19506092 DOI: 10.1101/gr.094987.109] [Citation(s) in RCA: 109] [Impact Index Per Article: 7.3] [Indexed: 01/05/2023]
Abstract
Copy-number variants (CNVs) are substantial contributors to human disease. A central challenge in CNV-disease association studies is to characterize the pathogenicity of rare and possibly incompletely penetrant events, which requires the accurate detection of rare CNVs in large numbers of individuals. Cost and throughput issues limit our ability to perform these studies. We have adapted the Illumina BeadXpress SNP genotyping assay and developed an algorithm, SNP-Conditional OUTlier detection (SCOUT), to rapidly and accurately detect both rare and common CNVs in large cohorts. This approach is customizable, cost effective, highly parallelized, and largely automated. We applied this method to screen 69 loci in 1105 children with unexplained intellectual disability, identifying pathogenic variants in 3.1% of these individuals and potentially pathogenic variants in an additional 2.3%. We identified seven individuals (0.7%) with a deletion of 16p11.2, which has been previously associated with autism. Our results widen the phenotypic spectrum of these deletions to include intellectual disability without autism. We also detected 1.65-3.4 Mbp duplications at 16p13.11 in 1.1% of affected individuals and 350 kbp deletions at 15q11.2, near the Prader-Willi/Angelman syndrome critical region, in 0.8% of affected individuals. Compared to published CNVs in controls they are significantly (P = 4.7 × 10⁻⁵ and 0.003, respectively) enriched in these children, supporting previously published hypotheses that they are neurocognitive disease risk factors. More generally, this approach offers a previously unavailable balance between customization, cost, and throughput for analysis of CNVs and should prove valuable for targeted CNV detection in both research and diagnostic settings.
Affiliation(s)
- Heather C Mefford
- Department of Pediatrics, University of Washington, Seattle, Washington 98195, USA
24
Correcting estimators of theta and Tajima's D for ascertainment biases caused by the single-nucleotide polymorphism discovery process. Genetics 2008; 181:701-10. [PMID: 19087964 DOI: 10.1534/genetics.108.094060] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022] Open
Abstract
Most single-nucleotide polymorphism (SNP) data suffer from an ascertainment bias caused by the process of SNP discovery followed by SNP genotyping. The final genotyped data are biased toward an excess of common alleles compared to directly sequenced data, making standard genetic methods of analysis inapplicable to this type of data. We here derive corrected estimators of the fundamental population genetic parameter θ = 4N_eμ (N_e, effective population size; μ, mutation rate) on the basis of the average number of pairwise differences and on the basis of the number of segregating sites. We also derive the variances and covariances of these estimators and provide a corrected version of Tajima's D statistic. We reanalyze a human genomewide SNP data set and find substantial differences in the results with or without ascertainment bias correction.
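For orientation, the two uncorrected estimators named in the abstract and the classical Tajima's D can be sketched as follows. This is the standard textbook form (Watterson's estimator, mean pairwise differences, Tajima's 1989 variance constants), not the ascertainment-corrected versions derived in the paper. The function name and input layout are assumptions made for illustration.

```python
import math


def tajimas_d(allele_counts, n):
    """Return (pi, theta_w, D) from per-site allele counts.

    allele_counts: for each segregating site, the number of sampled
        chromosomes carrying one of the two alleles (1 .. n-1).
    n: number of sampled chromosomes (n >= 4 so the variance is positive).
    """
    s = len(allele_counts)
    if n < 4 or s == 0:
        raise ValueError("need n >= 4 and at least one segregating site")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    # Mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)
    # Watterson's estimator from the number of segregating sites.
    theta_w = s / a1

    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_w, d


# Toy example: four chromosomes, three segregating sites.
pi, theta_w, d = tajimas_d([1, 2, 1], n=4)
```

An excess of common alleles, as produced by SNP ascertainment, inflates the pairwise-difference estimate relative to Watterson's estimate and pushes D upward, which is the bias the paper sets out to correct.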
25
Franke L, de Kovel CG, Aulchenko YS, Trynka G, Zhernakova A, Hunt KA, Blauw HM, van den Berg LH, Ophoff R, Deloukas P, van Heel DA, Wijmenga C. Detection, imputation, and association analysis of small deletions and null alleles on oligonucleotide arrays. Am J Hum Genet 2008; 82:1316-33. [PMID: 18519066 DOI: 10.1016/j.ajhg.2008.05.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 01/27/2008] [Revised: 03/21/2008] [Accepted: 05/13/2008] [Indexed: 12/14/2022] Open
Abstract
Copy-number variation (CNV) is a major contributor to human genetic variation. Recently, CNV associations with human disease have been reported. Many genome-wide association (GWA) studies in complex diseases have been performed with sets of biallelic single-nucleotide polymorphisms (SNPs), but the available CNV methods are still limited. We present a new method (TriTyper) that can infer genotypes in case-control data sets for deletion CNVs, or for SNPs with an extra, untyped allele, at a high-resolution, single-SNP level. By accounting for linkage disequilibrium (LD), as well as intensity data, calling accuracy is improved. Analysis of 3102 unrelated individuals of European descent, genotyped with Illumina Infinium BeadChips, resulted in the identification of 1880 SNPs with a common untyped allele, and these SNPs are in strong LD with neighboring biallelic SNPs. Simulations indicate that our method has superior power to detect associations compared to biallelic SNPs that are in LD with these SNPs, yet without increasing type I errors, as shown in a GWA analysis in celiac disease. Genotypes for 1204 triallelic SNPs could be fully imputed, with only biallelic-genotype calls, permitting association analysis of these SNPs in many published data sets. We estimate that 682 of the 1655 unique loci reflect deletions; this is on average 99 deletions per individual, four times greater than those detected by other methods. Whereas the identified loci are strongly enriched for known deletions, 61% have not been reported before. Genes overlapping with these loci more often have paralogs (p = 0.006) and biologically interact with fewer genes than expected (p = 0.004).
|
26
|
Hüebner C, Petermann I, Browning BL, Shelling AN, Ferguson LR. Triallelic single nucleotide polymorphisms and genotyping error in genetic epidemiology studies: MDR1 (ABCB1) G2677/T/A as an example. Cancer Epidemiol Biomarkers Prev 2007; 16:1185-92. [PMID: 17548683 DOI: 10.1158/1055-9965.epi-06-0759] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 11/16/2022]
Abstract
Accurate measurement of allele frequencies between population groups with differing sensitivities to disease is fundamental to genetic epidemiology. Genotyping errors can markedly influence the biological conclusions of a study. This issue may be especially important now that there is increasing recognition of triallelic single nucleotide polymorphisms (SNPs) in the genome and their possible role in diseases like inflammatory bowel disease. For example, the MDR1 (ABCB1) SNP G2677/T/A was, like many other triallelic SNPs, originally described as diallelic. Here, we report a comprehensive analysis of estimated allele frequencies of this SNP in a set of 73 human DNA samples, comparing six commonly used genotyping methods (Applied Biosystems Taqman, Roche LightCycler melting analysis, allelic discrimination PCR, DNA sequencing, Sequenom, and RFLP) from the angle of their error potential. Only Sequenom and DNA sequencing would have provided accurate measurements had we not had prior knowledge of the triallelic nature of this SNP. The other tested methods (with the exception of LightCycler) failed to show any indication of the presence of the rare third A allele in a diallelic assay. Although most of the errors were due to the inability to detect the third allele, all methods except Sequenom and sequencing produced errors for the detection of the two common alleles G and T (LightCycler, 6 errors; PCR, 4 errors; RFLP, 2 errors; Taqman, 1 error). There is considerable variability in the reported frequencies of the different alleles of the MDR1 G2677/T/A SNP, and the role of this SNP in the etiology of inflammatory bowel disease has been controversial. Our data emphasize the importance of choosing the appropriate method for SNP detection and lead us to suggest that part of the previously reported variation may reflect artifacts associated with the different genotyping methodologies used. The failure to recognize the triallelic nature of a SNP may lead to underestimations of real genetic associations.
Affiliation(s)
- Claudia Hüebner
- Discipline of Nutrition, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
|
27
|
Kosta K, Sabroe I, Goke J, Nibbs RJ, Tsanakas J, Whyte MK, Teare MD. A Bayesian approach to copy-number-polymorphism analysis in nuclear pedigrees. Am J Hum Genet 2007; 81:808-12. [PMID: 17847005 PMCID: PMC2227930 DOI: 10.1086/520096] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 03/06/2007] [Accepted: 05/09/2007] [Indexed: 02/03/2023]
Abstract
Segmental copy-number polymorphisms (CNPs) represent a significant component of human genetic variation and are likely to contribute to disease susceptibility. These potentially multiallelic and highly polymorphic systems present new challenges to family-based genetic-analysis tools that commonly assume codominant markers and allow for no genotyping error. The copy-number quantitation (CNP phenotype) represents the total number of segmental copies present in an individual and provides a means to infer, rather than to observe, the underlying allele segregation. We present an integrated approach to meet these challenges, in the form of a graphical model in which we infer the underlying CNP phenotype from the (single or replicate) quantitative measure within the analysis while assuming an allele-based system segregating through the pedigree. This approach can be readily applied to the study of any form of genetic measure, and the construction permits extension to a wide variety of hypothesis tests. We have implemented the basic model for use with nuclear families, and we illustrate its application through an analysis of the CNP located in gene CCL3L1 in 201 families with asthma.
Affiliation(s)
- Konstantina Kosta
- School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield, UK
|
28
|
Reiner AP, Carlson CS, Ziv E, Iribarren C, Jaquish CE, Nickerson DA. Genetic ancestry, population sub-structure, and cardiovascular disease-related traits among African-American participants in the CARDIA Study. Hum Genet 2007; 121:565-75. [PMID: 17356887 DOI: 10.1007/s00439-007-0350-2] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Received: 01/12/2007] [Accepted: 02/26/2007] [Indexed: 12/27/2022]
Abstract
African-American populations are genetically admixed. Studies performed among unrelated individuals from ethnically admixed populations may be vulnerable to confounding by population stratification, but they also offer an opportunity for efficiently mapping complex traits through admixture linkage disequilibrium. By typing 42 ancestry-informative markers and estimating genetic ancestry, we assessed genetic admixture and heterogeneity among African-American participants in the Coronary Artery Risk Development in Young Adults (CARDIA) cohort. We also assessed associations between individual genetic ancestry and several quantitative and binary traits related to cardiovascular risk. We found evidence of population sub-structure and excess inter-marker linkage disequilibrium, consistent with recent admixture. The estimated group admixture proportions were 78.1% African and 22.9% European, but differed according to geographic region. In multiple regression models, African ancestry was significantly associated with decreased total cholesterol, decreased LDL-cholesterol, and decreased triglycerides, and also with increased risk of insulin resistance. These observed associations between African ancestry and several lipid traits are consistent with the general tendency of individuals of African descent to have healthier lipid profiles compared to European-Americans. There was no association between genetic ancestry and hypertension, BMI, waist circumference, CRP level, or coronary artery calcification. These results demonstrate the potential for confounding of genetic associations with some cardiovascular disease-related traits in large studies involving US African-Americans.
Affiliation(s)
- Alexander P Reiner
- Department of Epidemiology, University of Washington, Box 357236, Seattle, WA 98195, USA.
|