1
Howard NP, Troggio M, Durel CE, Muranty H, Denancé C, Bianco L, Tillman J, van de Weg E. Integration of Infinium and Axiom SNP array data in the outcrossing species Malus × domestica and causes for seemingly incompatible calls. BMC Genomics 2021; 22:246. [PMID: 33827434 PMCID: PMC8028180 DOI: 10.1186/s12864-021-07565-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/04/2020] [Accepted: 03/30/2021] [Indexed: 11/23/2022] Open
Abstract
Background: Single nucleotide polymorphism (SNP) array technology has been increasingly used to generate large quantities of SNP data for use in genetic studies. As new arrays are developed to take advantage of new technology and of improved probe design using new genome sequence and panel data, a need to integrate data from different arrays and array platforms has arisen. This study was undertaken in view of our need for an integrated high-quality dataset of Illumina Infinium® 20 K and Affymetrix Axiom® 480 K SNP array data in apple (Malus × domestica). In this study, we qualify and quantify the compatibility of SNP calling, defined as SNP calls that are both accurate and concordant, across both arrays by two approaches. First, the concordance of SNP calls was evaluated using a set of 417 duplicate individuals genotyped on both arrays, starting from a set of 10,295 robust SNPs on the Infinium array. Next, the accuracy of the SNP calls was evaluated on additional germplasm (n = 3141) from both arrays using Mendelian inconsistent and consistent errors across thousands of pedigree links. While performing this work, we took the opportunity to evaluate reasons for probe failure and observed discordant SNP calls.
Results: Concordance among the duplicate individuals averaged 97.1% across 10,295 SNPs. Of these SNPs, 35% had discordant call(s) that were further curated, leading to a final set of 8412 (81.7%) SNPs that were deemed compatible. Compatibility was highly influenced by the presence of alternate probe binding locations and secondary polymorphisms. The impact of the latter was highly influenced by their number and proximity to the 3′ end of the probe.
Conclusions: The Infinium and Axiom SNP array data were mostly compatible. However, data integration required intense data filtering and curation. This work resulted in a workflow and information that may be of use in other data integration efforts. Such an in-depth analysis of array concordance and accuracy as ours has not been previously described in the literature and will be useful in future work on SNP array data integration and interpretation, and in probe/platform development.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07565-7.
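As an illustration of the concordance measure described in this abstract, the following is a minimal Python sketch of how per-SNP concordance between two arrays could be computed over duplicate individuals. It is not the authors' pipeline: the function name `snp_concordance`, the genotype encoding ("AA", "AB", "BB", with `None` for a missing call), and the sample data are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical encoding: "AA", "AB" or "BB"; None marks a missing call.
Call = Optional[str]


def snp_concordance(calls_a: Dict[str, Call],
                    calls_b: Dict[str, Call]) -> Optional[float]:
    """Fraction of duplicate individuals with identical calls on both arrays.

    Individuals with a missing call on either array are skipped.
    Returns None when no individual has a call on both arrays.
    """
    shared = [
        ind for ind in calls_a
        if calls_a[ind] is not None and calls_b.get(ind) is not None
    ]
    if not shared:
        return None
    matches = sum(calls_a[ind] == calls_b[ind] for ind in shared)
    return matches / len(shared)


# Made-up calls for one SNP across four duplicate individuals.
infinium = {"ind1": "AA", "ind2": "AB", "ind3": "BB", "ind4": None}
axiom = {"ind1": "AA", "ind2": "AB", "ind3": "AB", "ind4": "BB"}

concordance = snp_concordance(infinium, axiom)
print(f"concordance: {concordance:.3f}")
```

In this toy data, three individuals have calls on both arrays and two of them agree. Averaging such per-SNP values over all SNPs would give an overall figure of the kind the abstract reports, although the exact averaging used in the study is not stated here.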
Affiliation(s)
- Nicholas P Howard
  - Institut für Biologie und Umweltwissenschaften, Carl von Ossietzky Univ., Oldenburg, Germany
  - Department of Horticultural Science, Univ. of Minnesota, St Paul, USA
- Charles-Eric Durel
  - Université d'Angers, Institut Agro, INRAE, IRHS, SFR 4207 QuaSaV, Beaucouzé, France
- Hélène Muranty
  - Université d'Angers, Institut Agro, INRAE, IRHS, SFR 4207 QuaSaV, Beaucouzé, France
- Caroline Denancé
  - Université d'Angers, Institut Agro, INRAE, IRHS, SFR 4207 QuaSaV, Beaucouzé, France
- Luca Bianco
  - Fondazione Edmund Mach, San Michele all'Adige, TN, Italy
- John Tillman
  - Department of Horticultural Science, Univ. of Minnesota, St Paul, USA
- Eric van de Weg
  - Department of Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
2
High-quality, genome-wide SNP genotypic data for pedigreed germplasm of the diploid outbreeding species apple, peach, and sweet cherry through a common workflow. PLoS One 2019; 14:e0210928. [PMID: 31246947 PMCID: PMC6597046 DOI: 10.1371/journal.pone.0210928] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 01/04/2019] [Accepted: 04/19/2019] [Indexed: 12/14/2022] Open
Abstract
High-quality genotypic data is a requirement for many genetic analyses. For any crop, errors in genotype calls, phasing of markers, linkage maps, pedigree records, and unnoticed variation in ploidy levels can lead to spurious marker-locus-trait associations and incorrect origin assignment of alleles to individuals. High-throughput genotyping requires automated scoring, as manual inspection of thousands of scored loci is too time-consuming. However, automated SNP scoring can result in errors that should be corrected to ensure recorded genotypic data are accurate and thereby ensure confidence in downstream genetic analyses. To enable quick identification of errors in a large genotypic data set, we have developed a comprehensive workflow. This multiple-step workflow is based on inheritance principles and on removal of markers and individuals that do not follow these principles, as demonstrated here for apple, peach, and sweet cherry. Genotypic data was obtained on pedigreed germplasm using 6-9K SNP arrays for each crop and a subset of well-performing SNPs was created using ASSIsT. Use of correct (and corrected) pedigree records readily identified violations of simple inheritance principles in the genotypic data, streamlined with FlexQTL software. Retained SNPs were grouped into haploblocks to increase the information content of single alleles and reduce computational power needed in downstream genetic analyses. Haploblock borders were defined by recombination locations detected in ancestral generations of cultivars and selections. Another round of inheritance-checking was conducted, for haploblock alleles (i.e., haplotypes). High-quality genotypic data sets were created using this workflow for pedigreed collections representing the U.S. breeding germplasm of apple, peach, and sweet cherry evaluated within the RosBREED project. These data sets contain 3855, 4005, and 1617 SNPs spread over 932, 103, and 196 haploblocks in apple, peach, and sweet cherry, respectively. 
The highly curated phased SNP and haplotype data sets, as well as the raw iScan data, of germplasm in the apple, peach, and sweet cherry Crop Reference Sets are available through the Genome Database for Rosaceae.
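The inheritance-based checking described in this abstract rests on a simple rule for diploids: a child's genotype must be obtainable by taking one allele from each parent. The Python sketch below illustrates that rule for a single bi-allelic SNP in a parent-parent-child trio. It is only an illustration under assumed conventions (two-character genotype strings such as "AB"; the hypothetical function name `is_mendelian_consistent`) and is not the procedure implemented in FlexQTL or ASSIsT.

```python
from itertools import product


def is_mendelian_consistent(child: str, parent1: str, parent2: str) -> bool:
    """Return True if the child genotype can arise from the two parents.

    Genotypes are two-character strings such as "AA", "AB" or "BB"
    (an assumed encoding). Allele order within a genotype is ignored.
    """
    # Every combination of one allele from parent1 and one from parent2.
    possible = {"".join(sorted(a + b)) for a, b in product(parent1, parent2)}
    return "".join(sorted(child)) in possible


# AA x BB can only give AB offspring.
print(is_mendelian_consistent("AB", "AA", "BB"))
# AA x AB cannot give a BB child, so this trio is flagged as inconsistent.
print(is_mendelian_consistent("BB", "AA", "AB"))
```

A check like this only detects Mendelian-inconsistent errors. A wrong call that still fits the parental genotypes passes unnoticed, which is one reason a workflow might add a second round of checking at the haplotype level.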
3
Peace CP, Bianco L, Troggio M, van de Weg E, Howard NP, Cornille A, Durel CE, Myles S, Migicovsky Z, Schaffer RJ, Costes E, Fazio G, Yamane H, van Nocker S, Gottschalk C, Costa F, Chagné D, Zhang X, Patocchi A, Gardiner SE, Hardner C, Kumar S, Laurens F, Bucher E, Main D, Jung S, Vanderzande S. Apple whole genome sequences: recent advances and new prospects. Horticulture Research 2019; 6:59. [PMID: 30962944 PMCID: PMC6450873 DOI: 10.1038/s41438-019-0141-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/19/2019] [Revised: 03/15/2019] [Accepted: 03/15/2019] [Indexed: 05/19/2023]
Abstract
In 2010, a major scientific milestone was achieved for tree fruit crops: publication of the first draft whole genome sequence (WGS) for apple (Malus domestica). This WGS, v1.0, was valuable as the initial reference for sequence information, fine mapping, gene discovery, variant discovery, and tool development. A new, high quality apple WGS, GDDH13 v1.1, was released in 2017 and now serves as the reference genome for apple. Over the past decade, these apple WGSs have had an enormous impact on our understanding of apple biological functioning, trait physiology and inheritance, leading to practical applications for improving this highly valued crop. Causal gene identities for phenotypes of fundamental and practical interest can today be discovered much more rapidly. Genome-wide polymorphisms at high genetic resolution are screened efficiently over hundreds to thousands of individuals with new insights into genetic relationships and pedigrees. High-density genetic maps are constructed efficiently and quantitative trait loci for valuable traits are readily associated with positional candidate genes and/or converted into diagnostic tests for breeders. We understand the species, geographical, and genomic origins of domesticated apple more precisely, as well as its relationship to wild relatives. The WGS has turbo-charged application of these classical research steps to crop improvement and drives innovative methods to achieve more durable, environmentally sound, productive, and consumer-desirable apple production. This review includes examples of basic and practical breakthroughs and challenges in using the apple WGSs. Recommendations for "what's next" focus on necessary upgrades to the genome sequence data pool, as well as for use of the data, to reach new frontiers in genomics-based scientific understanding of apple.
Affiliation(s)
- Cameron P. Peace
  - Department of Horticulture, Washington State University, Pullman, WA 99164 USA
- Luca Bianco
  - Computational Biology, Fondazione Edmund Mach, San Michele all’Adige, TN 38010 Italy
- Michela Troggio
  - Department of Genomics and Biology of Fruit Crops, Fondazione Edmund Mach, San Michele all’Adige, TN 38010 Italy
- Eric van de Weg
  - Plant Breeding, Wageningen University and Research, Wageningen, 6708PB The Netherlands
- Nicholas P. Howard
  - Department of Horticultural Science, University of Minnesota, St. Paul, MN 55108 USA
  - Institut für Biologie und Umweltwissenschaften, Carl von Ossietzky Universität, 26129 Oldenburg, Germany
- Amandine Cornille
  - GQE – Le Moulon, Institut National de la Recherche Agronomique, University of Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
- Charles-Eric Durel
  - Institut National de la Recherche Agronomique, Institut de Recherche en Horticulture et Semences, UMR 1345, 49071 Beaucouzé, France
- Sean Myles
  - Department of Plant, Food and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, NS B2N 5E3 Canada
- Zoë Migicovsky
  - Department of Plant, Food and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, NS B2N 5E3 Canada
- Robert J. Schaffer
  - The New Zealand Institute for Plant and Food Research Ltd, Motueka, 7198 New Zealand
  - School of Biological Sciences, University of Auckland, Auckland, 1142 New Zealand
- Evelyne Costes
  - AGAP, INRA, CIRAD, Montpellier SupAgro, University of Montpellier, Montpellier, France
- Gennaro Fazio
  - Plant Genetic Resources Unit, USDA ARS, Geneva, NY 14456 USA
- Hisayo Yamane
  - Laboratory of Pomology, Graduate School of Agriculture, Kyoto University, Kyoto, 606-8502 Japan
- Steve van Nocker
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824 USA
- Chris Gottschalk
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824 USA
- Fabrizio Costa
  - Department of Genomics and Biology of Fruit Crops, Fondazione Edmund Mach, San Michele all’Adige, TN 38010 Italy
- David Chagné
  - The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Palmerston North Research Centre, Palmerston North, 4474 New Zealand
- Xinzhong Zhang
  - College of Horticulture, China Agricultural University, 100193 Beijing, China
- Susan E. Gardiner
  - The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Palmerston North Research Centre, Palmerston North, 4474 New Zealand
- Craig Hardner
  - Queensland Alliance of Agriculture and Food Innovation, University of Queensland, St Lucia, 4072 Australia
- Satish Kumar
  - New Cultivar Innovation, Plant and Food Research, Havelock North, 4130 New Zealand
- Francois Laurens
  - Institut National de la Recherche Agronomique, Institut de Recherche en Horticulture et Semences, UMR 1345, 49071 Beaucouzé, France
- Etienne Bucher
  - Institut National de la Recherche Agronomique, Institut de Recherche en Horticulture et Semences, UMR 1345, 49071 Beaucouzé, France
  - Agroscope, 1260 Changins, Switzerland
- Dorrie Main
  - Department of Horticulture, Washington State University, Pullman, WA 99164 USA
- Sook Jung
  - Department of Horticulture, Washington State University, Pullman, WA 99164 USA
- Stijn Vanderzande
  - Department of Horticulture, Washington State University, Pullman, WA 99164 USA
4
Azaiez A, Pavy N, Gérardi S, Laroche J, Boyle B, Gagnon F, Mottet MJ, Beaulieu J, Bousquet J. A catalog of annotated high-confidence SNPs from exome capture and sequencing reveals highly polymorphic genes in Norway spruce (Picea abies). BMC Genomics 2018; 19:942. [PMID: 30558528 PMCID: PMC6296092 DOI: 10.1186/s12864-018-5247-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/20/2018] [Accepted: 11/14/2018] [Indexed: 02/06/2023] Open
Abstract
Background: Norway spruce [Picea abies (L.) Karst.] is ecologically and economically one of the most important conifers worldwide. Our main goal was to develop a large catalog of annotated high-confidence gene SNPs that should sustain the development of genomic tools for the conservation of natural and domesticated genetic diversity resources, and hasten tree breeding efforts in this species.
Results: Targeted sequencing was achieved by capturing the P. abies exome with probes previously designed from the sequenced transcriptome of white spruce (Picea glauca (Moench) Voss). Capture efficiency was high (74.5%) given a high level of exome conservation between the two species. Using stringent criteria, we delimited a set of 61,771 high-confidence SNPs across 13,543 genes. To validate SNPs, a high-throughput genotyping array was developed for a subset of 5571 predicted SNPs representing as many different gene loci, and was used to genotype over 1000 trees. The estimated true positive rate of the resource was 84.2%, which was comparable with the genotyping success rate obtained for P. abies control SNPs recycled from previous genotyping efforts. We also analyzed SNP abundance across various gene functional categories. Several GO terms and gene families involved in stress response were found over-represented in highly polymorphic genes.
Conclusion: The annotated high-confidence SNP catalog developed herein represents a valuable genomic resource, being representative of over 13 K genes distributed across the P. abies genome. This resource should serve a variety of population genomics and breeding applications in Norway spruce.
Affiliation(s)
- Aïda Azaiez
  - Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, Québec G1V 0A6 Canada
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- Nathalie Pavy
  - Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, Québec G1V 0A6 Canada
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- Sébastien Gérardi
  - Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, Québec G1V 0A6 Canada
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- Jérôme Laroche
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- Brian Boyle
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- France Gagnon
  - Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, Québec G1V 0A6 Canada
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- Marie-Josée Mottet
  - Direction de la recherche forestière, Ministère des Forêts, de la Faune et des Parcs du Québec, 2700 Einstein, Québec, Québec G1P 3W8 Canada
- Jean Beaulieu
  - Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, Québec G1V 0A6 Canada
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
- Jean Bousquet
  - Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, Québec G1V 0A6 Canada
  - Institute of Integrative Biology and Systems, Université Laval, Québec, Québec G1V 0A6 Canada
5
Howard NP, van de Weg E, Bedford DS, Peace CP, Vanderzande S, Clark MD, Teh SL, Cai L, Luby JJ. Elucidation of the 'Honeycrisp' pedigree through haplotype analysis with a multi-family integrated SNP linkage map and a large apple (Malus × domestica) pedigree-connected SNP data set. Horticulture Research 2017; 4:17003. [PMID: 28243452 PMCID: PMC5321071 DOI: 10.1038/hortres.2017.3] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 01/16/2017] [Revised: 01/27/2017] [Accepted: 01/30/2017] [Indexed: 05/18/2023]
Abstract
The apple (Malus×domestica) cultivar Honeycrisp has become important economically and as a breeding parent. An earlier study with SSR markers indicated the original recorded pedigree of 'Honeycrisp' was incorrect and 'Keepsake' was identified as one putative parent, the other being unknown. The objective of this study was to verify 'Keepsake' as a parent and identify and genetically describe the unknown parent and its grandparents. A multi-family based dense and high-quality integrated SNP map was created using the apple 8 K Illumina Infinium SNP array. This map was used alongside a large pedigree-connected data set from the RosBREED project to build extended SNP haplotypes and to identify pedigree relationships. 'Keepsake' was verified as one parent of 'Honeycrisp' and 'Duchess of Oldenburg' and 'Golden Delicious' were identified as grandparents through the unknown parent. Following this finding, siblings of 'Honeycrisp' were identified using the SNP data. Breeding records from several of these siblings suggested that the previously unreported parent is a University of Minnesota selection, MN1627. This selection is no longer available, but now is genetically described through imputed SNP haplotypes. We also present the mosaic grandparental composition of 'Honeycrisp' for each of its 17 chromosome pairs. This new pedigree and genetic information will be useful in future pedigree-based genetic studies to connect 'Honeycrisp' with other cultivars used widely in apple breeding programs. The created SNP linkage map will benefit future research using the data from the Illumina apple 8 and 20 K and Affymetrix 480 K SNP arrays.
Affiliation(s)
- Nicholas P Howard
  - Department of Horticultural Science, University of Minnesota, St Paul, MN 55104, USA
- Eric van de Weg
  - Department of Plant Breeding, Wageningen University and Research, Wageningen 6700AJ, The Netherlands
- David S Bedford
  - Department of Horticultural Science, University of Minnesota, St Paul, MN 55104, USA
- Cameron P Peace
  - Department of Horticulture and Landscape Architecture, Washington State University, Pullman, WA 99164, USA
- Stijn Vanderzande
  - Department of Horticulture and Landscape Architecture, Washington State University, Pullman, WA 99164, USA
- Matthew D Clark
  - Department of Horticultural Science, University of Minnesota, St Paul, MN 55104, USA
- Soon Li Teh
  - Department of Horticultural Science, University of Minnesota, St Paul, MN 55104, USA
- Lichun Cai
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA
- James J Luby
  - Department of Horticultural Science, University of Minnesota, St Paul, MN 55104, USA
6
Harrison N, Harrison RJ, Barber-Perez N, Cascant-Lopez E, Cobo-Medina M, Lipska M, Conde-Ruíz R, Brain P, Gregory PJ, Fernández-Fernández F. A new three-locus model for rootstock-induced dwarfing in apple revealed by genetic mapping of root bark percentage. Journal of Experimental Botany 2016; 67:1871-81. [PMID: 26826217 PMCID: PMC4783367 DOI: 10.1093/jxb/erw001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 05/07/2023]
Abstract
Rootstock-induced dwarfing of apple scions revolutionized global apple production during the twentieth century, leading to the development of modern intensive orchards. A high root bark percentage (the percentage of the whole root area constituted by root cortex) has previously been associated with rootstock-induced dwarfing in apple. In this study, the root bark percentage was measured in a full-sib family of ungrafted apple rootstocks and found to be under the control of three loci. Two quantitative trait loci (QTLs) for root bark percentage were found to co-localize to the same genomic regions on chromosome 5 and chromosome 11 previously identified as controlling dwarfing, Dw1 and Dw2, respectively. A third QTL was identified on chromosome 13 in a region that has not been previously associated with dwarfing. The development of closely linked sequence-tagged site markers improved the resolution of allelic classes, thereby allowing the detection of dominance and epistatic interactions between loci, with high root bark percentage only occurring in specific allelic combinations. In addition, we report a significant negative correlation between root bark percentage and stem diameter (an indicator of tree vigour), measured on a clonally propagated grafted subset of the mapping population. The demonstrated link between root bark percentage and rootstock-induced dwarfing of the scion leads us to propose a three-locus model that is able to explain levels of dwarfing from the dwarf 'M.27' to the semi-invigorating rootstock 'M.116'. Moreover, we suggest that the QTL on chromosome 13 (Rb3) might be analogous to a third dwarfing QTL, Dw3, which has not previously been identified.
Affiliation(s)
- Nicola Harrison
  - East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK
  - Centre for Food Security, School of Agriculture, Policy and Development, University of Reading, Whiteknights, PO Box 237, Reading RG6 6AR, UK
- Richard J Harrison
  - East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK
  - Centre for Food Security, School of Agriculture, Policy and Development, University of Reading, Whiteknights, PO Box 237, Reading RG6 6AR, UK
- Emma Cascant-Lopez
  - East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK
  - Centre for Food Security, School of Agriculture, Policy and Development, University of Reading, Whiteknights, PO Box 237, Reading RG6 6AR, UK
- Marzena Lipska
  - East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK
- Philip Brain
  - East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK
- Peter J Gregory
  - East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK
  - Centre for Food Security, School of Agriculture, Policy and Development, University of Reading, Whiteknights, PO Box 237, Reading RG6 6AR, UK
7
Sánchez-Sevilla JF, Horvath A, Botella MA, Gaston A, Folta K, Kilian A, Denoyes B, Amaya I. Diversity Arrays Technology (DArT) Marker Platforms for Diversity Analysis and Linkage Mapping in a Complex Crop, the Octoploid Cultivated Strawberry (Fragaria × ananassa). PLoS One 2015; 10:e0144960. [PMID: 26675207 PMCID: PMC4682937 DOI: 10.1371/journal.pone.0144960] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Received: 08/09/2015] [Accepted: 11/25/2015] [Indexed: 12/21/2022] Open
Abstract
Cultivated strawberry (Fragaria × ananassa) is a genetically complex allo-octoploid crop with 28 pairs of chromosomes (2n = 8x = 56) for which a genome sequence is not yet available. The diploid Fragaria vesca is considered the donor species of one of the octoploid sub-genomes and its available genome sequence can be used as a reference for genomic studies. A wide number of strawberry cultivars are stored in ex situ germplasm collections world-wide but a number of previous studies have addressed the genetic diversity present within a limited number of these collections. Here, we report the development and application of two platforms based on the implementation of Diversity Array Technology (DArT) markers for high-throughput genotyping in strawberry. The first DArT microarray was used to evaluate the genetic diversity of 62 strawberry cultivars that represent a wide range of variation based on phenotype, geographical and temporal origin and pedigrees. A total of 603 DArT markers were used to evaluate the diversity and structure of the population and their cluster analyses revealed that these markers were highly efficient in classifying the accessions in groups based on historical, geographical and pedigree-based cues. The second DArTseq platform took benefit of the complexity reduction method optimized for strawberry and the development of next generation sequencing technologies. The strawberry DArTseq was used to generate a total of 9,386 SNP markers in the previously developed ‘232’ × ‘1392’ mapping population, of which, 4,242 high quality markers were further selected to saturate this map after several filtering steps. The high-throughput platforms here developed for genotyping strawberry will facilitate genome-wide characterizations of large accessions sets and complement other available options.
Affiliation(s)
- José F. Sánchez-Sevilla
  - Instituto Andaluz de Investigación y Formación Agraria y Pesquera (IFAPA) Centro de Churriana, Cortijo de la Cruz, 29140, Málaga, Spain
- Aniko Horvath
  - INRA, UMR 1332 BFP, F-33140 Villenave d’Ornon, France
  - Université de Bordeaux, UMR 1332 NFP, F-33140, Villenave d’Ornon, France
- Miguel A. Botella
  - Instituto de Hortofruticultura Subtropical y Mediterránea (IHSM-UMA-CSIC), Departamento de Biología Molecular y Bioquímica, Universidad de Málaga, 29071, Málaga, Spain
- Amèlia Gaston
  - INRA, UMR 1332 BFP, F-33140 Villenave d’Ornon, France
  - Université de Bordeaux, UMR 1332 NFP, F-33140, Villenave d’Ornon, France
- Kevin Folta
  - University of Florida, Horticultural Sciences Department, Gainesville, Florida, 32611, United States of America
- Andrzej Kilian
  - Diversity Arrays Technology Pty Ltd, Building 3, University of Canberra, Bruce, ACT 2617, Australia
- Beatrice Denoyes
  - INRA, UMR 1332 BFP, F-33140 Villenave d’Ornon, France
  - Université de Bordeaux, UMR 1332 NFP, F-33140, Villenave d’Ornon, France
- Iraida Amaya
  - Instituto Andaluz de Investigación y Formación Agraria y Pesquera (IFAPA) Centro de Churriana, Cortijo de la Cruz, 29140, Málaga, Spain
  - University of Florida, Horticultural Sciences Department, Gainesville, Florida, 32611, United States of America
8
Di Guardo M, Micheletti D, Bianco L, Koehorst-van Putten HJJ, Longhi S, Costa F, Aranzana MJ, Velasco R, Arús P, Troggio M, van de Weg EW. ASSIsT: an automatic SNP scoring tool for in- and outbreeding species. Bioinformatics 2015; 31:3873-4. [PMID: 26249809 PMCID: PMC4653386 DOI: 10.1093/bioinformatics/btv446] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 04/22/2015] [Accepted: 07/25/2015] [Indexed: 11/13/2022] Open
Abstract
ASSIsT (Automatic SNP ScorIng Tool) is a user-friendly customized pipeline for efficient calling and filtering of SNPs from Illumina Infinium arrays, specifically devised for custom genotyping arrays. Illumina has developed an integrated software for SNP data visualization and inspection called GenomeStudio (GS). ASSIsT builds on GS-derived data and identifies those markers that follow a bi-allelic genetic model and show reliable genotype calls. Moreover, ASSIsT re-edits SNP calls with null alleles or additional SNPs in the probe annealing site. ASSIsT can be employed in the analysis of different population types such as full-sib families and mating schemes used in the plant kingdom (backcross, F1, F2), and unrelated individuals. The final result can be directly exported in the format required by the most common software for genetic mapping and marker-trait association analysis. ASSIsT is developed in Python and runs in Windows and Linux.
Availability and implementation: The software, example data sets and tutorials are freely available at http://compbiotoolbox.fmach.it/assist/.
Contact: eric.vandeweg@wur.nl.
Affiliation(s)
- Mario Di Guardo
  - Wageningen UR Plant Breeding, 6700 AA Wageningen, The Netherlands
  - Research and Innovation Centre, Fondazione Edmund Mach, Trento, Italy
  - Graduate School Experimental Plant Sciences, Wageningen University, 6700 AJ Wageningen, The Netherlands
- Diego Micheletti
  - Research and Innovation Centre, Fondazione Edmund Mach, Trento, Italy
  - Wageningen UR Plant Breeding, 6700 AA Wageningen, The Netherlands
- Luca Bianco
  - Research and Innovation Centre, Fondazione Edmund Mach, Trento, Italy
- Sara Longhi
  - Wageningen UR Plant Breeding, 6700 AA Wageningen, The Netherlands
- Fabrizio Costa
  - Research and Innovation Centre, Fondazione Edmund Mach, Trento, Italy
- Maria J Aranzana
  - IRTA, Centre de Recerca en Agrigenómica CSIC-IRTA-UAB, Beellaterra (Cerdanyola del Vallés), 08193 Barcelona, Spain
- Riccardo Velasco
  - Research and Innovation Centre, Fondazione Edmund Mach, Trento, Italy
- Pere Arús
  - IRTA, Centre de Recerca en Agrigenómica CSIC-IRTA-UAB, Beellaterra (Cerdanyola del Vallés), 08193 Barcelona, Spain
- Michela Troggio
  - Research and Innovation Centre, Fondazione Edmund Mach, Trento, Italy
9
Bassil NV, Davis TM, Zhang H, Ficklin S, Mittmann M, Webster T, Mahoney L, Wood D, Alperin ES, Rosyara UR, Koehorst-van Putten H, Monfort A, Sargent DJ, Amaya I, Denoyes B, Bianco L, van Dijk T, Pirani A, Iezzoni A, Main D, Peace C, Yang Y, Whitaker V, Verma S, Bellon L, Brew F, Herrera R, van de Weg E. Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria × ananassa. BMC Genomics 2015; 16:155. [PMID: 25886969 PMCID: PMC4374422 DOI: 10.1186/s12864-015-1310-1] [Citation(s) in RCA: 104] [Impact Index Per Article: 11.6] [Received: 07/23/2014] [Accepted: 02/02/2015] [Indexed: 01/23/2023] Open
Abstract
Background: A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array.
Results: About 36 million sequence variants were identified in a 19 member, octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV) comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM.
Conclusions: The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high value crop.
Affiliation(s)
- David Wood
  - University of New Hampshire, Durham, NH, USA.
- Amparo Monfort
  - IRTA-Center for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Barcelona, Spain.
- Daniel J Sargent
  - Fondazione Edmund Mach, Research and Innovation Centre, San Michele all'Adige, 38010, TN, Italy.
- Luca Bianco
  - Fondazione Edmund Mach, Research and Innovation Centre, San Michele all'Adige, 38010, TN, Italy.
- Thijs van Dijk
  - Wageningen-UR Plant Breeding, Wageningen, The Netherlands.
- Amy Iezzoni
  - Michigan State University, East Lansing, MI, USA.
- Dorrie Main
  - Washington State University, Pullman, WA, USA.
- Yilong Yang
  - University of New Hampshire, Durham, NH, USA.
- Fiona Brew
  - Affymetrix UK Ltd, Wooburn Green, High Wycombe, UK.
- Raul Herrera
  - Instituto Ciencias Biologicas, Universidad de Talca, Talca, Chile.
|
11
|
Bianco L, Cestaro A, Sargent DJ, Banchi E, Derdak S, Di Guardo M, Salvi S, Jansen J, Viola R, Gut I, Laurens F, Chagné D, Velasco R, van de Weg E, Troggio M. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh). PLoS One 2014; 9:e110377. [PMID: 25303088 PMCID: PMC4193858 DOI: 10.1371/journal.pone.0110377] [Citation(s) in RCA: 108] [Impact Index Per Article: 10.8] [Received: 05/09/2014] [Accepted: 09/12/2014] [Indexed: 01/08/2023]
Abstract
High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.
Collapse
Affiliation(s)
- Luca Bianco
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Alessandro Cestaro
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Daniel James Sargent
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Elisa Banchi
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Sophia Derdak
  - CNAG – Centro Nacional de Análisis Genómico, Parc Científic de Barcelona, Barcelona, Spain
- Mario Di Guardo
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
  - Wageningen UR Plant Breeding, Wageningen University and Research Centre, Wageningen, The Netherlands
- Johannes Jansen
  - Biometris, Wageningen University and Research Centre, Wageningen, The Netherlands
- Roberto Viola
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Ivo Gut
  - CNAG – Centro Nacional de Análisis Genómico, Parc Científic de Barcelona, Barcelona, Spain
- Francois Laurens
  - INRA, UMR1345 Institut de Recherche en Horticulture and Semences, Beaucouzé, France
- David Chagné
  - Plant & Food Research, Palmerston North Research Centre, Palmerston North, New Zealand
- Riccardo Velasco
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Eric van de Weg
  - Wageningen UR Plant Breeding, Wageningen University and Research Centre, Wageningen, The Netherlands
- Michela Troggio
  - Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
|