1
Mshiywa FM, Edwards S, Bradley G. Rhodophyta DNA Barcoding: Ribulose-1, 5-Bisphosphate Carboxylase Gene and Novel Universal Primers. Int J Mol Sci 2023; 25:58. [PMID: 38203228 PMCID: PMC10871077 DOI: 10.3390/ijms25010058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 12/12/2023] [Accepted: 12/13/2023] [Indexed: 01/12/2024]
Abstract
Red algae (Rhodophyta) are a heterogeneous group of marine algal species that have served as a source of high-value molecules, including antioxidants and scaffolds, for novel drug development. However, it is challenging to identify Rhodophytes through morphological features alone, and in most instances, that has been the prevailing approach to identification. Consequently, this study undertook the identification of red algae species in Kenton-on-Sea, South Africa, as a baseline for future research on red algae biodiversity and conservation. The identification was achieved by designing, analysing, and using a set of universal primers through DNA barcoding of the rbcL gene. The PCR products of the rbcL gene were sequenced, and 96% of the amplicons were successfully sequenced from this set and matched with sequences on BOLD, which led to these species being molecularly described. Amongst these species are medicinally essential species, such as Laurencia natalensis and Hypnea spinella, and potential cryptic species. This calls for further investigation into the biodiversity of the studied region. Meanwhile, the availability of these primers will ease the identification process of red algae species from other coastal regions.
Affiliation(s)
- Faith Masilive Mshiywa
  - Department of Biochemistry and Microbiology, University of Fort Hare, Alice 5700, South Africa
- Shelley Edwards
  - Department of Zoology & Entomology, Rhodes University, Makhanda 6139, South Africa
- Graeme Bradley
  - Department of Biochemistry and Microbiology, University of Fort Hare, Alice 5700, South Africa
2
High-Throughput Genotyping Technologies in Plant Taxonomy. Methods Mol Biol 2021; 2222:149-166. [PMID: 33301093 DOI: 10.1007/978-1-0716-0997-2_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Molecular markers provide researchers with a powerful tool for variation analysis between plant genomes. They are heritable and widely distributed across the genome and for this reason have many applications in plant taxonomy and genotyping. Over the last decade, molecular marker technology has developed rapidly and is now a crucial component for genetic linkage analysis, trait mapping, diversity analysis, and association studies. This chapter focuses on molecular marker discovery, its application, and future perspectives for plant genotyping through pangenome assemblies. Included are descriptions of automated methods for genome and sequence distance estimation, genome contaminant analysis in sequence reads, genome structural variation, and SNP discovery methods.
3
Salgotra RK, Stewart CN. Functional Markers for Precision Plant Breeding. Int J Mol Sci 2020; 21:E4792. [PMID: 32640763 PMCID: PMC7370099 DOI: 10.3390/ijms21134792] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 05/04/2020] [Revised: 06/19/2020] [Accepted: 07/02/2020] [Indexed: 01/24/2023]
Abstract
Advances in molecular biology including genomics, high-throughput sequencing, and genome editing enable increasingly faster and more precise cultivar development. Identifying genes and functional markers (FMs) that are highly associated with plant phenotypic variation is a grand challenge. Functional genomics approaches such as transcriptomics, targeting induced local lesions in genomes (TILLING), homologous recombinant (HR), association mapping, and allele mining are all strategies to identify FMs for breeding goals, such as agronomic traits and biotic and abiotic stress resistance. The advantage of FMs over other markers used in plant breeding is the close genomic association of an FM with a phenotype. Thereby, FMs may facilitate the direct selection of genes associated with phenotypic traits, which serves to increase selection efficiencies to develop varieties. Herein, we review the latest methods in FM development and how FMs are being used in precision breeding for agronomic and quality traits as well as in breeding for biotic and abiotic stress resistance using marker assisted selection (MAS) methods. In summary, this article describes the use of FMs in breeding for development of elite crop cultivars to enhance global food security goals.
Affiliation(s)
- Romesh K. Salgotra
  - School of Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Jammu, Chatha, Jammu 190008, India
- C. Neal Stewart
  - Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA
4
Samanfar B, Cober ER, Charette M, Tan LH, Bekele WA, Morrison MJ, Kilian A, Belzile F, Molnar SJ. Genetic Analysis of High Protein Content in 'AC Proteus' Related Soybean Populations Using SSR, SNP, DArT and DArTseq Markers. Sci Rep 2019; 9:19657. [PMID: 31873115 PMCID: PMC6928212 DOI: 10.1038/s41598-019-55862-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/25/2019] [Accepted: 12/02/2019] [Indexed: 11/10/2022]
Abstract
Key message: Several AC Proteus derived genomic regions (QTLs, SNPs) have been identified which may prove useful for further development of high yielding high protein cultivars and allele-specific marker developments. High seed protein content is a trait which is typically difficult to introgress into soybean without an accompanying reduction in seed yield. In a previous study, 'AC Proteus' was used as a high protein source and was found to produce populations that did not exhibit the typical association between high protein and low yield. Five high x low protein RIL populations and a high x high protein RIL population were evaluated by either quantitative trait locus (QTL) analysis or bulk segregant analyses (BSA) following phenotyping in the field. QTL analysis in one population using SSR, DArT and DArTseq markers found two QTLs for seed protein content on chromosomes 15 and 20. The BSA analyses suggested multiple genomic regions are involved with high protein content across the five populations, including the two previously mentioned QTLs. In an alternative approach to identify high protein genes, pedigree analysis identified SNPs for which the allele associated with high protein was retained in seven high protein descendants of AC Proteus on chromosomes 2, 17 and 18. Aside from the two identified QTLs (five genomic regions in total considering the two with highly elevated test statistic, but below the statistical threshold and the one with epistatic interactions) which were some distance from Meta-QTL regions and which were also supported by our BSA analysis within five populations. These high protein regions may prove useful for further development of high yielding high protein cultivars.
Affiliation(s)
- Bahram Samanfar
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
  - Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, ON, Canada
- Elroy R Cober
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Martin Charette
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Le Hoa Tan
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Wubishet A Bekele
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Malcolm J Morrison
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Andrzej Kilian
  - Diversity Arrays Technology Pty Ltd, University of Canberra, Monana St., Canberra ACT, Australia
- François Belzile
  - Département de Phytologie and Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec City, QC, Canada
- Stephen J Molnar
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
5
Mason AS, Higgins EE, Snowdon RJ, Batley J, Stein A, Werner C, Parkin IAP. A user guide to the Brassica 60K Illumina Infinium™ SNP genotyping array. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2017; 130:621-633. [PMID: 28220206 DOI: 10.1007/s00122-016-2849-1] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 05/11/2016] [Accepted: 09/14/2016] [Indexed: 06/06/2023]
Abstract
The Brassica napus 60K Illumina Infinium™ SNP array has had huge international uptake in the rapeseed community due to the revolutionary speed of acquisition and ease of analysis of this high-throughput genotyping data, particularly when coupled with the newly available reference genome sequence. However, further utilization of this valuable resource can be optimized by better understanding the promises and pitfalls of SNP arrays. We outline how best to analyze Brassica SNP marker array data for diverse applications, including linkage and association mapping, genetic diversity and genomic introgression studies. We present data on which SNPs are locus-specific in winter, semi-winter and spring B. napus germplasm pools, rather than amplifying both an A-genome and a C-genome locus or multiple loci. Common issues that arise when analyzing array data will be discussed, particularly those unique to SNP markers and how to deal with these for practical applications in Brassica breeding applications.
Affiliation(s)
- Annaliese S Mason
  - Department of Plant Breeding, IFZ for Biosystems, Land Use and Nutrition, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Erin E Higgins
  - Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, S7N0X2, Canada
- Rod J Snowdon
  - Department of Plant Breeding, IFZ for Biosystems, Land Use and Nutrition, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Jacqueline Batley
  - School of Agriculture and Food Sciences and Centre for Integrative Legume Research, The University of Queensland, Brisbane, 4072, Australia
  - School of Plant Biology and The UWA Institute of Agriculture, The University of Western Australia, 35 Stirling Highway, Crawley, 6009, Perth, Australia
- Anna Stein
  - Department of Plant Breeding, IFZ for Biosystems, Land Use and Nutrition, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Christian Werner
  - Department of Plant Breeding, IFZ for Biosystems, Land Use and Nutrition, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Isobel A P Parkin
  - Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, S7N0X2, Canada
6
Yu Y, Liu J, Li F, Zhang X, Zhang C, Xiang J. Gene set based association analyses for the WSSV resistance of Pacific white shrimp Litopenaeus vannamei. Sci Rep 2017; 7:40549. [PMID: 28094323 PMCID: PMC5240139 DOI: 10.1038/srep40549] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 07/04/2016] [Accepted: 12/07/2016] [Indexed: 11/09/2022]
Abstract
White Spot Syndrome Virus (WSSV) is regarded as a virus with the strongest pathogenicity to shrimp. For the threshold trait such as disease resistance, marker assisted selection (MAS) was considered to be a more effective approach. In the present study, association analyses of single nucleotide polymorphisms (SNPs) located in a set of immune related genes were conducted to identify markers associated with WSSV resistance. SNPs were detected by bioinformatics analysis on RNA sequencing data generated by Illumina sequencing platform and Roche 454 sequencing technology. A total of 681 SNPs located in the exons of immune related genes were selected as candidate SNPs. Among these SNPs, 77 loci were genotyped in WSSV susceptible group and resistant group. Association analysis was performed based on logistic regression method under an additive and dominance model in GenABEL package. As a result, five SNPs showed associations with WSSV resistance at a significance level of 0.05. Besides, SNP-SNP interaction analysis was conducted. The combination of SNP loci in TRAF6, Cu/Zn SOD and nLvALF2 exhibited a significant effect on the WSSV resistance of shrimp. Gene expression analysis revealed that these SNPs might influence the expression of these immune-related genes. This study provides a useful method for performing MAS in shrimp.
Affiliation(s)
- Yang Yu
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Jingwen Liu
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Fuhua Li
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
  - Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
- Xiaojun Zhang
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Chengsong Zhang
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Jianhai Xiang
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
  - Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
7
Abstract
An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/ .
8
Li J, Wang L, Zhan Q, Liu Y, Yang X. Transcriptome Characterization and Functional Marker Development in Sorghum Sudanense. PLoS One 2016; 11:e0154947. [PMID: 27152648 PMCID: PMC4859472 DOI: 10.1371/journal.pone.0154947] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/23/2015] [Accepted: 04/21/2016] [Indexed: 11/26/2022]
Abstract
Sudangrass, Sorghum sudanense, is an important forage in warm regions, but little is known about its genome. In this study, the transcriptomes of sudangrass S722 and sorghum Tx623B were sequenced by Illumina sequencing. More than 4 Gb were sequenced for each library. For Tx623B and S722, 88.79% and 83.88% of reads, respectively, were matched to the Sorghum bicolor genome. A total of 2,397 differentially expressed genes (DEGs) were detected by RNA-Seq between the two libraries, including 849 up-regulated genes and 1,548 down-regulated genes. These DEGs could be divided into three groups by annotation analysis. A total of 44,495 single nucleotide polymorphisms (SNPs) were discovered by aligning S722 reads to the sorghum reference genome. Of these SNPs, 61.37% were transitions, and this value did not differ much between different chromosomes. In addition, 16,928 insertion and deletion (indel) loci were identified between the two genomes. A total of 5,344 indel markers were designed, 15 of which were selected to construct the genetic map derived from the cross of Tx623A and Sa. It was indicated that the indel markers were useful and versatile between sorghum and sudangrass. Comparison of synonymous base substitutions (Ks) and non-synonymous base substitutions (Ka) between the two libraries showed that 95% of orthologous pairs exhibited Ka/Ks<1.0, indicating that these genes were influenced by purifying selection. The results from this study provide important information for molecular genetic research and a rich resource for marker development in sudangrass and other Sorghum species.
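As an illustration of the transition statistic quoted in this abstract, the following minimal Python sketch classifies SNPs as transitions (purine to purine or pyrimidine to pyrimidine) or transversions and computes the transition fraction. The function names and the example SNP list are hypothetical and are not taken from the cited study, which does not describe its tooling at this level of detail.

```python
# Minimal sketch: classify SNPs as transitions or transversions.
# The names and example data below are illustrative assumptions only.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """Return True if ref->alt stays within purines or within pyrimidines."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError(f"unsupported base in SNP {ref}>{alt}")
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(snps) -> float:
    """Fraction of (ref, alt) pairs that are transitions."""
    snps = list(snps)
    if not snps:
        raise ValueError("no SNPs supplied")
    n_transitions = sum(is_transition(ref, alt) for ref, alt in snps)
    return n_transitions / len(snps)


# Hypothetical example: three transitions and two transversions.
example_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T"), ("T", "C")]
print(f"transition fraction: {transition_fraction(example_snps):.2%}")
```

Applied to a full SNP call set, the same calculation would give the kind of per-genome or per-chromosome transition percentage the abstract reports.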
Affiliation(s)
- Jieqin Li
  - College of Agriculture, Anhui Science and Technology University, Fengyang, China
- Lihua Wang
  - College of Agriculture, Anhui Science and Technology University, Fengyang, China
- Qiuwen Zhan
  - College of Agriculture, Anhui Science and Technology University, Fengyang, China
- Yanlong Liu
  - College of Agriculture, Anhui Science and Technology University, Fengyang, China
- Xiaocui Yang
  - College of Agriculture, Anhui Science and Technology University, Fengyang, China
9
Doddamani D, Khan AW, Katta MAVSK, Agarwal G, Thudi M, Ruperao P, Edwards D, Varshney RK. CicArVarDB: SNP and InDel database for advancing genetics research and breeding applications in chickpea. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav078. [PMID: 26289427 PMCID: PMC4541373 DOI: 10.1093/database/bav078] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 11/03/2014] [Accepted: 07/22/2015] [Indexed: 11/12/2022]
Abstract
Molecular markers are valuable tools for breeders to help accelerate crop improvement. High throughput sequencing technologies facilitate the discovery of large-scale variations such as single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs). Sequencing of the chickpea genome along with re-sequencing of several chickpea lines has enabled the discovery of 4.4 million variations including SNPs and InDels. Here we report a repository of 1.9 million variations (SNPs and InDels) anchored on eight pseudomolecules in a custom database, referred to as CicArVarDB, that can be accessed at http://cicarvardb.icrisat.org/. It includes an easy interface for users to select variations around specific regions associated with quantitative trait loci, with embedded webBLAST search and JBrowse visualisation. We hope that this database will be immensely useful for the chickpea research community for both advancing genetics research as well as breeding applications for crop improvement. Database URL: http://cicarvardb.icrisat.org.
Affiliation(s)
- Dadakhalandar Doddamani
  - Research Program Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502 324, Telangana State, India
- Aamir W Khan
  - Research Program Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502 324, Telangana State, India
- Mohan A V S K Katta
  - Research Program Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502 324, Telangana State, India
- Gaurav Agarwal
  - Research Program Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502 324, Telangana State, India
- Mahendar Thudi
  - Research Program Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502 324, Telangana State, India
- Pradeep Ruperao
  - School of Agriculture and Food Sciences, University of Queensland, St Lucia, Queensland, Australia 4072
  - School of Plant Biology, The University of Western Australia, Perth, Western Australia, Australia 6009
- David Edwards
  - School of Plant Biology, The University of Western Australia, Perth, Western Australia, Australia 6009
  - Institute of Agriculture, The University of Western Australia, Perth, Western Australia, Australia 6009
- Rajeev K Varshney
  - Research Program Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502 324, Telangana State, India
  - School of Plant Biology, The University of Western Australia, Perth, Western Australia, Australia 6009
10
Ruperao P, Edwards D. Bioinformatics: identification of markers from next-generation sequence data. Methods Mol Biol 2015; 1245:29-47. [PMID: 25373747 DOI: 10.1007/978-1-4939-1966-6_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/03/2023]
Abstract
With the advent of sequencing technology, next-generation sequencing (NGS) technology has dramatically revolutionized plant genomics. NGS technology combined with new software tools enables the discovery, validation, and assessment of genetic markers on a large scale. Among different markers systems, simple sequence repeats (SSRs) and Single nucleotide polymorphisms (SNPs) are the markers of choice for genetics and plant breeding. SSR markers have been a choice for large-scale characterization of germplasm collections, construction of genetic maps, and QTL identification. Similarly, SNPs are the most abundant genetic variations with higher frequencies throughout the genome of plant species. This chapter discusses various tools available for genome assembly and widely focuses on SSR and SNP marker discovery.
Affiliation(s)
- Pradeep Ruperao
  - School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
11
Abstract
The detection and analysis of genetic variation plays an important role in plant breeding and this role is increasing with the continued development of genome sequencing technologies. Molecular genetic markers are important tools to characterize genetic variation and assist with genomic breeding. Processing and storing the growing abundance of molecular marker data being produced requires the development of specific bioinformatics tools and advanced databases. Molecular marker databases range from species specific through to organism wide and often host a variety of additional related genetic, genomic, or phenotypic information. In this chapter, we will present some of the features of plant molecular genetic marker databases, highlight the various types of marker resources, and predict the potential future direction of crop marker databases.
12
Lai K, Lorenc MT, Lee HC, Berkman PJ, Bayer PE, Visendi P, Ruperao P, Fitzgerald TL, Zander M, Chan CKK, Manoli S, Stiller J, Batley J, Edwards D. Identification and characterization of more than 4 million intervarietal SNPs across the group 7 chromosomes of bread wheat. PLANT BIOTECHNOLOGY JOURNAL 2015; 13:97-104. [PMID: 25147022 DOI: 10.1111/pbi.12240] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 01/05/2014] [Revised: 07/07/2014] [Accepted: 07/13/2014] [Indexed: 05/19/2023]
Abstract
Despite being a major international crop, our understanding of the wheat genome is relatively poor due to its large size and complexity. To gain a greater understanding of wheat genome diversity, we have identified single nucleotide polymorphisms between 16 Australian bread wheat varieties. Whole-genome shotgun Illumina paired read sequence data were mapped to the draft assemblies of chromosomes 7A, 7B and 7D to identify more than 4 million intervarietal SNPs. SNP density varied between the three genomes, with much greater density observed on the A and B genomes than the D genome. This variation may be a result of substantial gene flow from the tetraploid Triticum turgidum, which possesses A and B genomes, during early co-cultivation of tetraploid and hexaploid wheat. In addition, we examined SNP density variation along the chromosome syntenic builds and identified genes in low-density regions which may have been selected during domestication and breeding. This study highlights the impact of evolution and breeding on the bread wheat genome and provides a substantial resource for trait association and crop improvement. All SNP data are publicly available on a generic genome browser GBrowse at www.wheatgenome.info.
Affiliation(s)
- Kaitao Lai
  - School of Agriculture and Food Sciences, University of Queensland, Brisbane, Qld, Australia
  - Australian Centre for Plant Functional Genomics, University of Queensland, Brisbane, Qld, Australia
13
Liu J, Yu Y, Li F, Zhang X, Xiang J. A new anti-lipopolysaccharide factor (ALF) gene with its SNP polymorphisms related to WSSV-resistance of Litopenaeus vannamei. FISH & SHELLFISH IMMUNOLOGY 2014; 39:24-33. [PMID: 24769128 DOI: 10.1016/j.fsi.2014.04.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 01/17/2014] [Revised: 03/10/2014] [Accepted: 04/14/2014] [Indexed: 06/03/2023]
Abstract
Anti-lipopolysaccharide factors (ALFs) of crustaceans play an important role against bacterial or viral infection. In this study, the cDNA sequence and genomic sequence of one new isoform of ALF designated as nLvALF1 were reported. The open reading frame (ORF) of nLvALF1 consisted of 369 bp encoding 123 amino acids and the genomic structure of nLvALF1 comprised four introns and three exons. The predicted pI of the deduced protein was 8.82 and the molecular weight (MW) was 13.72 kDa. The deduced amino acid sequence of nLvALF1 contained a typical functional domain of ALF: LPS-binding domain. Phylogenetic analysis showed that nLvALF1 had the closest relationship with FcALF1 from Fenneropenaeus chinensis. The nLvALF1 was specifically expressed in lymphoid organ (Oka) of shrimp. Its transcriptional level was significantly up-regulated after white spot syndrome virus (WSSV) challenge, suggesting that nLvALF1 might participate in defense against WSSV in Litopenaeus vannamei. In order to search for potential genetic markers associated with WSSV-resistance, we scanned the polymorphisms of the 397 bp genomic fragment where the LPS-binding domain encoding sequence is located, and 18 SNPs were found. The distribution frequency of these SNPs was analyzed in WSSV susceptible shrimp and resistant shrimp separately. Significant differences existed in allelic frequencies at loci g.1361-T > C, g.1370-T > C, g.1419-T > A between the WSSV-resistant group and the WSSV-susceptible group (P < 0.05). The specific haplotype CT consisting of g.1415-C > A and g.1419-T > A was associated with susceptibility to WSSV (P < 0.05). These findings provide theoretical support for selection of WSSV-resistant varieties of L. vannamei.
Affiliation(s)
- Jingwen Liu
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Yu
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Fuhua Li
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Xiaojun Zhang
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Jianhai Xiang
  - Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
14
Abstract
Molecular genetic markers represent one of the most powerful tools for the analysis of variation between plant genomes. Molecular marker technology has developed rapidly over the last decade, with the introduction of new DNA sequencing methods and the development of high-throughput genotyping methods. Single nucleotide polymorphisms (SNPs) now dominate applications in modern plant genetic analysis. The reducing cost of DNA sequencing and increasing availability of large sequence data sets permit the mining of this data for large numbers of SNPs. These may then be used in applications such as genetic linkage analysis and trait mapping, diversity analysis, association studies, and marker-assisted selection. Here we describe automated methods for the discovery of SNP molecular markers and new technologies for high-throughput, low-cost molecular marker genotyping. Examples include SNP discovery using autoSNPdb and wheatgenome.info as well as SNP genotyping using Illumina's GoldenGate™ and Infinium™ methods.
15
Wei L, Xiao M, Hayward A, Fu D. Applications and challenges of next-generation sequencing in Brassica species. PLANTA 2013; 238:1005-24. [PMID: 24062086 DOI: 10.1007/s00425-013-1961-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 09/07/2013] [Accepted: 09/12/2013] [Indexed: 05/09/2023]
Abstract
Next-generation sequencing (NGS) produces numerous (often millions) short DNA sequence reads, typically varying between 25 and 400 bp in length, at a relatively low cost and in a short time. This revolutionary technology is being increasingly applied in whole-genome, transcriptome, epigenome and small RNA sequencing, molecular marker and gene discovery, comparative and evolutionary genomics, and association studies. The Brassica genus comprises some of the most agro-economically important crops, providing abundant vegetables, condiments, fodder, oil and medicinal products. Many Brassica species have undergone the process of polyploidization, which makes their genomes exceptionally complex and can create difficulties in genomics research. NGS injects new vigor into Brassica research, yet also faces specific challenges in the analysis of complex crop genomes and traits. In this article, we review the advantages and limitations of different NGS technologies and their applications and challenges, using Brassica as an advanced model system for agronomically important, polyploid crops. Specifically, we focus on the use of NGS for genome resequencing, transcriptome sequencing, development of single-nucleotide polymorphism markers, and identification of novel microRNAs and their targets. We present trends and advances in NGS technology in relation to Brassica crop improvement, with wide application for sophisticated genomics research into agronomically important polyploid crops.
Affiliation(s)
- Lijuan Wei
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang, 330045, China
- Chongqing Engineering Research Center for Rapeseed, College of Agronomy and Biotechnology, Southwest University, Chongqing, 400716, China
- Meili Xiao
- Chongqing Engineering Research Center for Rapeseed, College of Agronomy and Biotechnology, Southwest University, Chongqing, 400716, China
- Alice Hayward
- Centre for Integrative Legume Research, School of Agriculture and Food Sciences, The University of Queensland, St Lucia, 4072, Australia
- Donghui Fu
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang, 330045, China.
|
16
|
Next generation characterisation of cereal genomes for marker discovery. Biology 2013; 2:1357-77. [PMID: 24833229 PMCID: PMC4009793 DOI: 10.3390/biology2041357] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/16/2013] [Revised: 10/29/2013] [Accepted: 11/08/2013] [Indexed: 12/30/2022]
Abstract
Cereal crops form the bulk of the world’s food sources, and thus their importance cannot be overstated. Crop breeding programs increasingly rely on high-resolution molecular genetic markers to accelerate the breeding process. The development of these markers is hampered by the complexity of some of the major cereal crop genomes, as well as the time and cost required. In this review, we address current and future methods available for the characterisation of cereal genomes, with an emphasis on faster and more cost-effective approaches for genome sequencing and the development of markers for trait association and marker-assisted selection (MAS) in crop breeding programs.
|
17
|
Berkman PJ, Visendi P, Lee HC, Stiller J, Manoli S, Lorenc MT, Lai K, Batley J, Fleury D, Simková H, Kubaláková M, Weining S, Doležel J, Edwards D. Dispersion and domestication shaped the genome of bread wheat. Plant Biotechnol J 2013; 11:564-71. [PMID: 23346876 DOI: 10.1111/pbi.12044] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 10/10/2012] [Revised: 12/10/2012] [Accepted: 12/11/2012] [Indexed: 05/20/2023]
Abstract
Despite the international significance of wheat, its large and complex genome hinders genome sequencing efforts. To assess the impact of selection on this genome, we have assembled genomic regions representing genes for chromosomes 7A, 7B and 7D. We demonstrate that the dispersion of wheat to new environments has shaped the modern wheat genome. Most genes are conserved between the three homoeologous chromosomes. We found differential gene loss that supports current theories on the evolution of wheat, with greater loss observed in the A and B genomes compared with the D. Analysis of intervarietal polymorphisms identified fewer polymorphisms in the D genome, supporting the hypothesis of early gene flow between the tetraploid and hexaploid. The enrichment for genes on the D genome that confer environmental adaptation may be associated with dispersion following wheat domestication. Our results demonstrate the value of applying next-generation sequencing technologies to assemble gene-rich regions of complex genomes and investigate polyploid genome evolution. We anticipate the genome-wide application of this reduced-complexity syntenic assembly approach will accelerate crop improvement efforts not only in wheat, but also in other polyploid crops of significance.
Affiliation(s)
- Paul J Berkman
- School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
|
18
|
Duran C, Singhania R, Raman H, Batley J, Edwards D. Predicting polymorphic EST-SSRs in silico. Mol Ecol Resour 2013; 13:538-45. [PMID: 23398650 DOI: 10.1111/1755-0998.12078] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 10/03/2012] [Revised: 12/16/2012] [Accepted: 12/22/2012] [Indexed: 11/26/2022]
Abstract
The public availability of large quantities of gene sequence data provides a valuable resource for the mining of Simple Sequence Repeat (SSR) molecular genetic markers for genetic analysis. These markers are inexpensive, require minimal labour to produce and can frequently be associated with functionally annotated genes. This study presents the characterization of barley EST-SSRs and the identification of putative polymorphic SSRs from EST data. Polymorphic SSRs are distinguished from monomorphic SSRs by the representation of varying motif lengths within an alignment of sequence reads. Two measures of confidence are calculated: redundancy of a polymorphism and co-segregation with accessions. The utility of this method is demonstrated through the discovery of 597 candidate polymorphic SSRs, from a total of 452 642 consensus expressed sequences. PCR amplification primers were designed for the identified SSRs. Ten primer pairs were validated for polymorphism in barley and for transferability across species. Analysis of the polymorphisms in relation to SSR motif, length, position and annotation is discussed.
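The idea of separating polymorphic from monomorphic SSRs by repeat-length variation among aligned reads can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `repeat_count` and `classify_ssr`, the thresholds `min_repeats` and `min_redundancy`, and the toy reads are all hypothetical, and only the redundancy measure is modelled (co-segregation with accessions is left out).

```python
import re
from collections import Counter


def repeat_count(read: str, motif: str) -> int:
    """Return the number of motif units in the longest uninterrupted run."""
    runs = re.findall(f"(?:{re.escape(motif)})+", read)
    return max((len(run) // len(motif) for run in runs), default=0)


def classify_ssr(reads, motif, min_repeats=3, min_redundancy=2):
    """Call an SSR polymorphic when two or more repeat lengths are each
    supported by at least `min_redundancy` reads (assumed thresholds)."""
    counts = Counter(
        n for n in (repeat_count(read, motif) for read in reads)
        if n >= min_repeats
    )
    # Discard repeat lengths seen too rarely; they may be sequencing errors.
    supported = {n: c for n, c in counts.items() if c >= min_redundancy}
    status = "polymorphic" if len(supported) > 1 else "monomorphic"
    return status, supported


# Toy reads covering one hypothetical (AG)n locus.
reads = [
    "GGCT" + "AG" * 5 + "TTC",
    "GGCT" + "AG" * 5 + "TTC",
    "GGCT" + "AG" * 7 + "TTC",
    "GGCT" + "AG" * 7 + "TTC",
    "GGCT" + "AG" * 6 + "TTC",  # singleton length, filtered by redundancy
]
status, supported = classify_ssr(reads, "AG")
print(status, supported)
```

Requiring each repeat length to be seen in several reads is one simple way to express redundancy; a real pipeline would also need to handle imperfect repeats and alignment ambiguity.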
Affiliation(s)
- Chris Duran
- Melbourne eResearch Group, University of Melbourne, Parkville, Vic, 3010, Australia
|
19
|
Edwards D, Batley J, Snowdon RJ. Accessing complex crop genomes with next-generation sequencing. Theor Appl Genet 2013; 126:1-11. [PMID: 22948437 DOI: 10.1007/s00122-012-1964-x] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 01/31/2012] [Accepted: 08/08/2012] [Indexed: 05/02/2023]
Abstract
Many important crop species have genomes originating from ancestral or recent polyploidisation events. Multiple homoeologous gene copies, chromosomal rearrangements and amplification of repetitive DNA within large and complex crop genomes can considerably complicate genome analysis and gene discovery by conventional, forward genetics approaches. On the other hand, ongoing technological advances in molecular genetics and genomics today offer unprecedented opportunities to analyse and access even more recalcitrant genomes. In this review, we describe next-generation sequencing and data analysis techniques that vastly improve our ability to dissect and mine genomes for causal genes underlying key traits and allelic variation of interest to breeders. We focus primarily on wheat and oilseed rape, two leading examples of major polyploid crop genomes whose size or complexity present different, significant challenges. In both cases, the latest DNA sequencing technologies, applied using quite different approaches, have enabled considerable progress towards unravelling the respective genomes. Our ability to discover the extent and distribution of genetic diversity in crop gene pools, and its relationship to yield and quality-related traits, is swiftly gathering momentum as DNA sequencing and the bioinformatic tools to deal with growing quantities of genomic data continue to develop. In the coming decade, genomic and transcriptomic sequencing, discovery and high-throughput screening of single nucleotide polymorphisms, presence-absence variations and other structural chromosomal variants in diverse germplasm collections will give detailed insight into the origins, domestication and available trait-relevant variation of polyploid crops, in the process facilitating novel approaches and possibilities for genomics-assisted breeding.
Affiliation(s)
- David Edwards
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD 4072, Australia
|
20
|
Vidal RO, do Nascimento LC, Mondego JMC, Pereira GAG, Carazzolle MF. Identification of SNPs in RNA-seq data of two cultivars of Glycine max (soybean) differing in drought resistance. Genet Mol Biol 2012; 35:331-4. [PMID: 22802718 PMCID: PMC3392885 DOI: 10.1590/s1415-47572012000200014] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022] Open
Abstract
The legume Glycine max (soybean) plays an important economic role in the international commodities market, with a world production of almost 260 million tons for the 2009/2010 harvest. The increase in drought events in the last decade has caused production losses in recent harvests. This fact compels us to understand the drought tolerance mechanisms in soybean, taking into account its variability among commercial and developing cultivars. In order to identify single nucleotide polymorphisms (SNPs) in genes up-regulated during drought stress, we evaluated suppression subtractive libraries (SSH) from two contrasting cultivars upon water deprivation: sensitive (BR 16) and tolerant (Embrapa 48). A total of 2,222 soybean genes were up-regulated in both cultivars. Our method identified more than 6,000 SNPs in tolerant and sensitive Brazilian cultivars in those drought stress related genes. Among these SNPs, 165 (in 127 genes) are positioned at soybean chromosome ends, including transcription factors (MYB, WRKY) related to tolerance to abiotic stress.
Affiliation(s)
- Ramon Oliveira Vidal
- Laboratório de Genômica e Expressão, Universidade Estadual de Campinas, Campinas, SP, Brazil
|
21
|
Lorenc MT, Hayashi S, Stiller J, Lee H, Manoli S, Ruperao P, Visendi P, Berkman PJ, Lai K, Batley J, Edwards D. Discovery of Single Nucleotide Polymorphisms in Complex Genomes Using SGSautoSNP. Biology 2012; 1:370-82. [PMID: 24832230 PMCID: PMC4009776 DOI: 10.3390/biology1020370] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 07/12/2012] [Revised: 08/09/2012] [Accepted: 08/10/2012] [Indexed: 01/01/2023]
Abstract
Single nucleotide polymorphisms (SNPs) are becoming the dominant form of molecular marker for genetic and genomic analysis. The advances in second generation DNA sequencing provide opportunities to identify very large numbers of SNPs in a range of species. However, SNP identification remains a challenge for large and polyploid genomes due to their size and complexity. We have developed a pipeline for the robust identification of SNPs in large and complex genomes using Illumina second generation DNA sequence data and demonstrated this by the discovery of SNPs in the hexaploid wheat genome. We have developed a SNP discovery pipeline called SGSautoSNP (Second-Generation Sequencing AutoSNP) and applied this to discover more than 800,000 SNPs between four hexaploid wheat cultivars across chromosomes 7A, 7B and 7D. All SNPs are presented for download and viewing within a public GBrowse database. Validation suggests that more than 93% of the SNPs represent polymorphisms between wheat cultivars and hence are valuable for detailed diversity analysis, marker assisted selection and genotyping by sequencing. The pipeline produces output in GFF3, VCF, Flapjack or Illumina Infinium design format for further genotyping of diverse populations. As well as providing an unprecedented resource for wheat diversity analysis, the method establishes a foundation for high resolution SNP discovery in other large and complex genomes.
Affiliation(s)
- Michał T Lorenc
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Satomi Hayashi
- Centre for Integrative Legume Research, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Jiri Stiller
- CSIRO Plant Industry, Brisbane, QLD 4072, Australia.
- Hong Lee
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Sahana Manoli
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Pradeep Ruperao
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Paul Visendi
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Kaitao Lai
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- Jacqueline Batley
- Centre for Integrative Legume Research, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
- David Edwards
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Science, University of Queensland, Brisbane, QLD 4072, Australia.
|
22
|
Jhanwar S, Priya P, Garg R, Parida SK, Tyagi AK, Jain M. Transcriptome sequencing of wild chickpea as a rich resource for marker development. Plant Biotechnol J 2012; 10:690-702. [PMID: 22672127 DOI: 10.1111/j.1467-7652.2012.00712.x] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Indexed: 05/06/2023]
Abstract
The transcriptome of cultivated chickpea (Cicer arietinum L.), an important crop legume, has recently been sequenced. Here, we report sequencing of the transcriptome of wild chickpea, C. reticulatum (PI489777), the progenitor of cultivated chickpea, by GS-FLX 454 technology. The optimized assembly of C. reticulatum transcriptome generated 37 265 transcripts in total with an average length of 946 bp. A total of 4072 simple sequence repeats (SSRs) could be identified in these transcript sequences, of which at least 561 SSRs were polymorphic between C. arietinum and C. reticulatum. In addition, a total of 36 446 single-nucleotide polymorphisms (SNPs) were identified after optimization of probability score, quality score, read depth and consensus base ratio. Several of these SSRs and SNPs could be associated with tissue-specific and transcription factor encoding transcripts. A high proportion (92-94%) of polymorphic SSRs and SNPs identified between the two chickpea species were validated successfully. Further, the estimation of synonymous substitution rates of orthologous transcript pairs suggested that the speciation event for divergence of C. arietinum and C. reticulatum may have happened approximately 0.53 million years ago. The results of our study provide a rich resource for exploiting genetic variations in chickpea for breeding programmes.
Affiliation(s)
- Shalu Jhanwar
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
|
23
|
Lai K, Duran C, Berkman PJ, Lorenc MT, Stiller J, Manoli S, Hayden MJ, Forrest KL, Fleury D, Baumann U, Zander M, Mason AS, Batley J, Edwards D. Single nucleotide polymorphism discovery from wheat next-generation sequence data. Plant Biotechnol J 2012; 10:743-9. [PMID: 22748104 DOI: 10.1111/j.1467-7652.2012.00718.x] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 05/08/2023]
Abstract
Single nucleotide polymorphisms (SNPs) are the most abundant type of molecular genetic marker and can be used for producing high-resolution genetic maps, marker-trait association studies and marker-assisted breeding. Large polyploid genomes such as wheat present a challenge for SNP discovery because of the potential presence of multiple homoeologs for each gene. AutoSNPdb has been successfully applied to identify SNPs from Sanger sequence data for several species, including barley, rice and Brassica, but the volume of data required to accurately call SNPs in the complex genome of wheat has prevented its application to this important crop. DNA sequencing technology has been revolutionized by the introduction of next-generation sequencing, and it is now possible to generate several million sequence reads in a timely and cost-effective manner. We have produced wheat transcriptome sequence data using 454 sequencing technology and applied this for SNP discovery using a modified autoSNPdb method, which integrates SNP and gene annotation information with a graphical viewer. A total of 4,694,141 sequence reads from three bread wheat varieties were assembled to identify a total of 38 928 candidate SNPs. Each SNP is within an assembly complete with annotation, enabling the selection of polymorphism within genes of interest.
Affiliation(s)
- Kaitao Lai
- School of Agriculture and Food Science, University of Queensland, Brisbane, QLD, Australia
|
24
|
Hayward A, Mason AS, Dalton-Morgan J, Zander M, Edwards D, Batley J. SNP discovery and applications in Brassica napus. J Plant Biotechnol 2012. [DOI: 10.5010/jpb.2012.39.1.049] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 11/16/2022]
|
25
|
|
26
|
Lai K, Berkman PJ, Lorenc MT, Duran C, Smits L, Manoli S, Stiller J, Edwards D. WheatGenome.info: an integrated database and portal for wheat genome information. Plant Cell Physiol 2012; 53:e2. [PMID: 22009731 DOI: 10.1093/pcp/pcr141] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 05/31/2023]
Abstract
Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
Affiliation(s)
- Kaitao Lai
- School of Agriculture and Food Sciences and Australian Centre for Plant Functional Genomics, University of Queensland, Brisbane, QLD 4072, Australia
|
27
|
Azam S, Thakur V, Ruperao P, Shah T, Balaji J, Amindala B, Farmer AD, Studholme DJ, May GD, Edwards D, Jones JDG, Varshney RK. Coverage-based consensus calling (CbCC) of short sequence reads and comparison of CbCC results to identify SNPs in chickpea (Cicer arietinum; Fabaceae), a crop species without a reference genome. Am J Bot 2012; 99:186-192. [PMID: 22301893 DOI: 10.3732/ajb.1100419] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 05/31/2023]
Abstract
PREMISE OF THE STUDY: Next-generation sequencing (NGS) technologies are frequently used for resequencing and mining of single nucleotide polymorphisms (SNPs) by comparison to a reference genome. In crop species such as chickpea (Cicer arietinum) that lack a reference genome sequence, NGS-based SNP discovery is a challenge. Therefore, unlike probability-based statistical approaches for consensus calling and by comparison with a reference sequence, a coverage-based consensus calling (CbCC) approach was applied and two genotypes were compared for SNP identification. METHODS: A CbCC approach is used in this study with four commonly used short read alignment tools (Maq, Bowtie, Novoalign, and SOAP2) and 15.7 and 22.1 million Illumina reads for chickpea genotypes ICC4958 and ICC1882, together with the chickpea transcriptome assembly (CaTA). KEY RESULTS: A nonredundant set of 4543 SNPs was identified between two chickpea genotypes. Experimental validation of 224 randomly selected SNPs showed superiority of Maq among individual tools, as 50.0% of SNPs predicted by Maq were true SNPs. For combinations of two tools, greatest accuracy (55.7%) was reported for Maq and Bowtie, with a combination of Bowtie, Maq, and Novoalign identifying 61.5% true SNPs. SNP prediction accuracy generally increased with increasing read depth. CONCLUSIONS: This study provides a benchmark comparison of tools as well as read depths for four commonly used tools for NGS SNP discovery in a crop species without a reference genome sequence. In addition, a large number of SNPs have been identified in chickpea that would be useful for molecular breeding.
Affiliation(s)
- Sarwar Azam
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics, Patancheru 502324, Andhra Pradesh, India
|
28
|
Berkman PJ, Lai K, Lorenc MT, Edwards D. Next-generation sequencing applications for wheat crop improvement. Am J Bot 2012; 99:365-71. [PMID: 22268223 DOI: 10.3732/ajb.1100309] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 05/10/2023]
Abstract
Bread wheat (Triticum aestivum; Poaceae) is a crop plant of great importance. It provides nearly 20% of the world's daily food supply measured by calorie intake, similar to that provided by rice. The yield of wheat has doubled over the last 40 years due to a combination of advanced agronomic practice and improved germplasm through selective breeding. More recently, yield growth has been less dramatic, and a significant improvement in wheat production will be required if demand from the growing human population is to be met. Next-generation sequencing (NGS) technologies are revolutionizing biology and can be applied to address critical issues in plant biology. Technologies can produce draft sequences of genomes with a significant reduction to the cost and timeframe of traditional technologies. In addition, NGS technologies can be used to assess gene structure and expression, and importantly, to identify heritable genome variation underlying important agronomic traits. This review provides an overview of the wheat genome and NGS technologies, details some of the problems in applying NGS technology to wheat, and describes how NGS technologies are starting to impact wheat crop improvement.
Affiliation(s)
- Paul J Berkman
- University of Queensland, School of Agriculture and Food Sciences and Australian Centre for Plant Functional Genomics, Brisbane, QLD 4072, Australia
|
29
|
Lee HC, Lai K, Lorenc MT, Imelfort M, Duran C, Edwards D. Bioinformatics tools and databases for analysis of next-generation sequence data. Brief Funct Genomics 2011; 11:12-24. [DOI: 10.1093/bfgp/elr037] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Indexed: 01/24/2023] Open
|
30
|
Abbadi A, Leckband G. Rapeseed breeding for oil content, quality, and sustainability. Eur J Lipid Sci Technol 2011. [DOI: 10.1002/ejlt.201100063] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 11/11/2022]
|
31
|
Duran C, Eales D, Marshall D, Imelfort M, Stiller J, Berkman PJ, Clark T, McKenzie M, Appleby N, Batley J, Basford K, Edwards D. Future tools for association mapping in crop plants. Genome 2011; 53:1017-23. [PMID: 21076517 DOI: 10.1139/g10-057] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 01/29/2023]
Abstract
Association mapping currently relies on the identification of genetic markers. Several technologies have been adopted for genetic marker analysis, with single nucleotide polymorphisms (SNPs) being the most popular where a reasonable quantity of genome sequence data are available. We describe several tools we have developed for the discovery, annotation, and visualization of molecular markers for association mapping. These include autoSNPdb for SNP discovery from assembled sequence data; TAGdb for the identification of gene specific paired read Illumina GAII data; CMap3D for the comparison of mapped genetic and physical markers; and BAC and Gene Annotator for the online annotation of genes and genomic sequences.
Affiliation(s)
- Chris Duran
- University of Queensland, Australian Centre for Plant Functional Genomics, School of Land, Crop and Food Sciences, Brisbane, Australia
|
32
|
Dereeper A, Nicolas S, Le Cunff L, Bacilieri R, Doligez A, Peros JP, Ruiz M, This P. SNiPlay: a web-based tool for detection, management and analysis of SNPs. Application to grapevine diversity projects. BMC Bioinformatics 2011; 12:134. [PMID: 21545712 PMCID: PMC3102043 DOI: 10.1186/1471-2105-12-134] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 11/19/2010] [Accepted: 05/05/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: High-throughput re-sequencing, new genotyping technologies and the availability of reference genomes allow the extensive characterization of Single Nucleotide Polymorphisms (SNPs) and insertion/deletion events (indels) in many plant species. The rapidly increasing amount of re-sequencing and genotyping data generated by large-scale genetic diversity projects requires the development of integrated bioinformatics tools able to efficiently manage, analyze, and combine these genetic data with genome structure and external data. RESULTS: In this context, we developed SNiPlay, a flexible, user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates: 1) a pipeline, freely accessible through the internet, combining existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. From standard sequence alignments, genotyping data or Sanger sequencing traces given as input, SNiPlay detects SNPs and indel events and outputs submission files for the design of Illumina's SNP chips. Subsequently, it sends sequences and genotyping data into a series of modules in charge of various processes: physical mapping to a reference genome, annotation (genomic position, intron/exon location, synonymous/non-synonymous substitutions), SNP frequency determination in user-defined groups, haplotype reconstruction and network, linkage disequilibrium evaluation, and diversity analysis (Pi, Watterson's Theta, Tajima's D). Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices. 2) a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects. It allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency), to compare SNP patterns between populations, and to export genotyping data or sequences in various formats. CONCLUSIONS: Our experiments on grapevine genetic projects showed that SNiPlay allows geneticists to rapidly obtain advanced results in several key research areas of plant genetic diversity. Both the management and treatment of large amounts of SNP data are rendered considerably easier for end-users through automation and integration. Current developments are taking into account new advances in high-throughput technologies. SNiPlay is available at: http://sniplay.cirad.fr/.
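Two of the diversity indices named in this abstract, nucleotide diversity (Pi) and Watterson's Theta, follow standard textbook definitions and can be sketched in a few lines of Python. This is an illustration only and not SNiPlay's code: the function names and the toy alignment are hypothetical, and the sketch assumes equal-length, gap-free sequences with no missing data.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that contain more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Pi: mean number of pairwise differences, expressed per site."""
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in pairs
    )
    return total_diffs / len(pairs) / len(seqs[0])


def watterson_theta(seqs):
    """Theta_W per site: S divided by the harmonic number a1 = sum(1/i), i < n."""
    n = len(seqs)
    a1 = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])


# Toy alignment of four hypothetical haplotypes, ten sites each.
alignment = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTACGTAC",
]
print(segregating_sites(alignment))
print(nucleotide_diversity(alignment))
print(watterson_theta(alignment))
```

Tajima's D compares these two estimates after scaling by a variance term, which is omitted here to keep the sketch short.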
Affiliation(s)
- Alexis Dereeper
- Diversity, Genetics and Genomics of grapevine, UMR DIAPC, INRA, Montpellier, France.
|
33
|
Abstract
Genotyping technology now allows the rapid and affordable generation of million-SNP profiles for humans, leading to considerable activity in association mapping. Similar activity is anticipated for many plant species, including Brassica. These plant association mapping activities will require the same care in quality control and quality assurance as for humans. The subsequent analyses may draw upon the same body of theory that is described here in the language of quantitative genetics.
Affiliation(s)
- Bruce S Weir
- Department of Biostatistics, University of Washington, Box 357232, Seattle, WA 98195-7232, USA.
|
34
|
Vidal RO, Mondego JMC, Pot D, Ambrósio AB, Andrade AC, Pereira LFP, Colombo CA, Vieira LGE, Carazzolle MF, Pereira GAG. A high-throughput data mining of single nucleotide polymorphisms in Coffea species expressed sequence tags suggests differential homeologous gene expression in the allotetraploid Coffea arabica. Plant Physiol 2010; 154:1053-66. [PMID: 20864545 PMCID: PMC2971587 DOI: 10.1104/pp.110.162438] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 05/09/2023]
Abstract
Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed.
|
35
|
Verhoeven KJF, Casella G, McIntyre LM. Epistasis: obstacle or advantage for mapping complex traits? PLoS One 2010; 5:e12264. [PMID: 20865037 PMCID: PMC2928725 DOI: 10.1371/journal.pone.0012264] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 04/15/2009] [Accepted: 04/19/2010] [Indexed: 01/22/2023] Open
Abstract
Identification of genetic loci in complex traits has focused largely on one-dimensional genome scans to search for associations between single markers and the phenotype. There is mounting evidence that locus interactions, or epistasis, are a crucial component of the genetic architecture of biologically relevant traits. However, epistasis is often viewed as a nuisance factor that reduces power for locus detection. Counter to expectations, recent work shows that fitting full models, instead of testing marker main effect and interaction components separately, in exhaustive multi-locus genome scans can have higher power to detect loci when epistasis is present than single-locus scans, an improvement that comes despite a much larger multiple testing alpha-adjustment in such searches. We demonstrate, both theoretically and via simulation, that the expected power to detect loci when fitting full models is often larger when these loci act epistatically than when they act additively. Additionally, we show that the power for single-locus detection may be improved in cases of epistasis compared to the additive model. Our exploration of a two-step model selection procedure shows that identifying the true model is difficult. However, this difficulty is certainly not exacerbated by the presence of epistasis; on the contrary, in some cases the presence of epistasis can aid in model selection. The impact of allele frequencies on both power and model selection is dramatic.
Affiliation(s)
- Koen J. F. Verhoeven
  - Netherlands Institute of Ecology (NIOO-KNAW), Department of Terrestrial Ecology, Heteren, The Netherlands
- George Casella
  - Department of Statistics and Genetics Institute, University of Florida, Gainesville, Florida, United States of America
- Lauren M. McIntyre
  - Genetics Institute, Department of Molecular Genetics and Microbiology and Department of Statistics, University of Florida, Gainesville, Florida, United States of America
|
36
|
Edwards D, Batley J. Plant genome sequencing: applications for crop improvement. PLANT BIOTECHNOLOGY JOURNAL 2010; 8:2-9. [PMID: 19906089 DOI: 10.1111/j.1467-7652.2009.00459.x] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.0] [Indexed: 05/21/2023]
Abstract
DNA sequencing technology is undergoing a revolution with the commercialization of second-generation technologies capable of sequencing thousands of millions of nucleotide bases in each run. The data explosion resulting from this technology is likely to continue to increase with the further development of second-generation sequencing and the introduction of third-generation single-molecule sequencing methods over the coming years. The question is no longer whether we can sequence crop genomes, which are often large and complex, but how soon we can sequence them. Even cereal genomes such as wheat and barley, which were once considered intractable, are coming under the spotlight of the new sequencing technologies, and an array of new projects and approaches is being established. The increasing availability of DNA sequence information enables the discovery of genes and molecular markers associated with diverse agronomic traits, creating new opportunities for crop improvement. However, the challenge remains to convert this mass of data into knowledge that can be applied in crop breeding programs.
Affiliation(s)
- David Edwards
  - Australian Centre for Plant Functional Genomics and School of Land Crop and Food Sciences, University of Queensland, Brisbane, Australia.
|
37
|
McCouch SR, Zhao K, Wright M, Tung CW, Ebana K, Thomson M, Reynolds A, Wang D, DeClerck G, Ali ML, McClung A, Eizenga G, Bustamante C. Development of genome-wide SNP assays for rice. BREEDING SCIENCE 2010; 60:524-535. [DOI: 10.1270/jsbbs.60.524] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.7] [Indexed: 05/26/2023]
Affiliation(s)
- Keyan Zhao
  - Department of Biological Statistics and Computational Biology, Cornell University
  - Department of Genetics, Stanford University
- Mark Wright
  - Department of Plant Breeding and Genetics, Cornell University
  - Department of Biological Statistics and Computational Biology, Cornell University
- Chih-Wei Tung
  - Department of Plant Breeding and Genetics, Cornell University
- Andy Reynolds
  - Department of Biological Statistics and Computational Biology, Cornell University
- Diane Wang
  - Department of Plant Breeding and Genetics, Cornell University
- Md. Liakat Ali
  - Rice Research and Extension Center, University of Arkansas
- Anna McClung
  - USDA ARS, Dale Bumpers National Rice Research Center
- Carlos Bustamante
  - Department of Biological Statistics and Computational Biology, Cornell University
  - Department of Genetics, Stanford University
|
38
|
Kim C, Yoon U, Lee G, Park S, Seol YJ, Lee H, Hahn J. An integrated database to enhance the identification of SNP markers for rice. Bioinformation 2009; 4:269-70. [PMID: 20975922 PMCID: PMC2951715 DOI: 10.6026/97320630004269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2009] [Accepted: 12/15/2009] [Indexed: 11/23/2022] Open
Abstract
The National Academy of Agricultural Science (NAAS) has developed a web-based marker database to provide information about SNP markers in rice. The database consists of three major functional categories: map viewing, marker searching and gene annotation. It provides information on 12,829 SNP markers, including gene location information on the 12 rice chromosomes. The annotation of each SNP marker provides information such as marker name, EST number, gene definition and general marker information. Users are assisted in tracing any new structures of the chromosomes and gene positional functions using specific SNP markers. Availability: The database is available for free at http://nabic.niab.go.kr/SNP/
Affiliation(s)
- Changkug Kim
  - Genomics Division, National Academy of Agricultural Science (NAAS), Suwon 441-707, Korea
|
39
|
Duran C, Appleby N, Vardy M, Imelfort M, Edwards D, Batley J. Single nucleotide polymorphism discovery in barley using autoSNPdb. PLANT BIOTECHNOLOGY JOURNAL 2009; 7:326-33. [PMID: 19386041 DOI: 10.1111/j.1467-7652.2009.00407.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 05/11/2023]
Abstract
Molecular markers are used to provide the link between genotype and phenotype, for the production of molecular genetic maps and to assess genetic diversity within and between related species. Single nucleotide polymorphisms (SNPs) are the most abundant molecular genetic marker. SNPs can be identified in silico, but care must be taken to ensure that the identified SNPs reflect true genetic variation and are not a result of errors associated with DNA sequencing. The SNP detection method autoSNP has been developed to identify SNPs from sequence data for any species. Confidence in the predicted SNPs is based on sequence redundancy, and haplotype co-segregation scores are calculated for a further independent measure of confidence. We have extended the autoSNP method to produce autoSNPdb, which integrates SNP and gene annotation information with a graphical viewer. We have applied this software to public barley expressed sequences, and the resulting database is available over the Internet. SNPs can be viewed and searched by sequence, functional annotation or predicted synteny with a reference genome, in this case rice. The correlation between SNPs and barley cultivar, expressed tissue type and development stage has been collated for ease of exploration. An average of one SNP per 240 bp was identified, with SNPs more prevalent in the 5' regions and simple sequence repeat (SSR) flanking sequences. Overall, autoSNPdb can provide a wealth of genetic polymorphism information for any species for which sequence data are available.
Affiliation(s)
- Chris Duran
  - Australian Centre for Plant Functional Genomics, School of Land, Crop and Food Sciences, Institute for Molecular Bioscience, University of Queensland, Brisbane, Qld 4072, Australia
|
40
|
Imelfort M, Duran C, Batley J, Edwards D. Discovering genetic polymorphisms in next-generation sequencing data. PLANT BIOTECHNOLOGY JOURNAL 2009; 7:312-317. [PMID: 19386039 DOI: 10.1111/j.1467-7652.2009.00406.x] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 05/27/2023]
Abstract
The ongoing revolution in DNA sequencing technology now enables the reading of thousands of millions of nucleotide bases in a single instrument run. However, this data quantity is often compromised by poor confidence in the read quality. The identification of genetic polymorphisms from this data is therefore problematic and, combined with the vast quantity of data, poses a major bioinformatics challenge. However, once these difficulties have been addressed, next-generation sequencing will offer a means to identify and characterize the wealth of genetic polymorphisms underlying the vast phenotypic variation in biological systems. We describe the recent advances in next-generation sequencing technology, together with preliminary approaches that can be applied for single nucleotide polymorphism discovery in plant species.
Affiliation(s)
- Michael Imelfort
  - Australian Centre for Plant Functional Genomics, School of Land, Crop and Food Sciences, University of Queensland, Brisbane, QLD 4072, Australia
|
41
|
|