1. Fang L, Liu T, Li M, Dong X, Han Y, Xu C, Li S, Zhang J, He X, Zhou Q, Luo D, Liu Z. MODMS: a multi-omics database for facilitating biological studies on alfalfa (Medicago sativa L.). HORTICULTURE RESEARCH 2024; 11:uhad245. [PMID: 38239810 PMCID: PMC10794946 DOI: 10.1093/hr/uhad245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 11/13/2023] [Indexed: 01/22/2024]
Abstract
Alfalfa (Medicago sativa L.) is a globally important forage crop. It also serves as a vegetable and medicinal herb because of its excellent nutritional quality and significant economic value. Multi-omics data on alfalfa continue to accumulate owing to recent advances in high-throughput techniques, and integrating this information holds great potential for expediting genetic research and facilitating advances in alfalfa agronomic traits. Therefore, we developed a comprehensive database named MODMS (multi-omics database of M. sativa) that incorporates multiple reference genomes, annotations, comparative genomics, transcriptomes, high-quality genomic variants, proteomics, and metabolomics. This report describes our continuously evolving database, which provides researchers with several convenient tools and extensive omics data resources, facilitating the expansion of alfalfa research. Further details regarding the MODMS database are available at https://modms.lzu.edu.cn/.
Affiliation(s)
- Longfa Fang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Tao Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Mingyu Li
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- XueMing Dong
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Yuling Han
  - Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Congzhuo Xu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Siqi Li
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Jia Zhang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Xiaojuan He
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Qiang Zhou
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Dong Luo
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
- Zhipeng Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
2. Zhao YX, Gao GX, Zhou Y, Guo CX, Li B, El-Ashram S, Li ZL. Genome-wide association studies uncover genes associated with litter traits in the pig. Animal 2022; 16:100672. [PMID: 36410176 DOI: 10.1016/j.animal.2022.100672] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/28/2022] [Revised: 10/17/2022] [Accepted: 10/18/2022] [Indexed: 12/24/2022]
Abstract
Litter traits are critical economic variables in the pig industry because they serve as production indicators of sow fertility. In this study, a genome-wide association study on litter traits, including total number born (TNB), number born alive (NBA), litter birth weight (LBW), average birth weight (ABW), and piglet uniformity (PU), was carried out on two pig breeds (Yorkshire and Landrace). A total of 3 637 pigs of both breeds were genotyped using the GeneSeek GGP Porcine 50K SNP BeadChip. A mixed linear model (MLM) and fixed and random model circulating probability unification (FarmCPU) were employed in the genome-wide association studies for litter traits using combined data from the two pig breeds and data from each breed separately. Additionally, the heritability of traits was estimated using three methods: pedigree-based best linear unbiased prediction (PBLUP), genomic best linear unbiased prediction (GBLUP), and single-step genomic best linear unbiased prediction (ssGBLUP). The estimates ranged from 0.065 to 0.1289, 0.0478 to 0.0938, 0.0793 to 0.0935, 0.1862 to 0.2163, and 0.0327 to 0.0419 for TNB, NBA, LBW, ABW, and PU, respectively. We also compared the genomic prediction accuracies and unbiasedness for litter traits of the three BLUP models. Our results indicated that the ssGBLUP method provided higher predictive accuracies and more reasonable unbiasedness than the PBLUP and GBLUP methods. Furthermore, based on their possible roles, eight candidate genes (INHBA, LEPR, HDHD2, CTNND2, RNF216, HMX1, PAPPA2, and NTN1) were identified as being linked with litter traits. These genes were found to be associated with pig metabolism and ovulation rate. Our results provide insights into the genetic architecture of litter traits in pigs, and the potential single nucleotide polymorphisms (SNPs) and candidate genes identified may improve profitability in the pig-breeding industry and help to improve litter traits.
Affiliation(s)
- Y X Zhao
  - School of Life Science and Engineering, Foshan University, Foshan, Guangdong 528000, China; Guangxi Yangxiang Agricultural and Animal Husbandry Co, Ltd, Guigang, Guangxi 537100, China
- G X Gao
  - School of Life Science and Engineering, Foshan University, Foshan, Guangdong 528000, China
- Y Zhou
  - Guangxi Yangxiang Agricultural and Animal Husbandry Co, Ltd, Guigang, Guangxi 537100, China
- C X Guo
  - Guangxi Yangxiang Agricultural and Animal Husbandry Co, Ltd, Guigang, Guangxi 537100, China
- B Li
  - Guangxi Yangxiang Agricultural and Animal Husbandry Co, Ltd, Guigang, Guangxi 537100, China
- S El-Ashram
  - Faculty of Science, Kafrelsheikh University, Kafr El-Sheikh 33516, Egypt
- Z L Li
  - School of Life Science and Engineering, Foshan University, Foshan, Guangdong 528000, China
3. Zhou X, Li X, Zhang X, Yin D, Wang J, Zhao Y. Construction of a high-density genetic map and localization of grazing-tolerant QTLs in Medicago falcata L. FRONTIERS IN PLANT SCIENCE 2022; 13:985603. [PMID: 36262664 PMCID: PMC9574245 DOI: 10.3389/fpls.2022.985603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 08/26/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND Using genomic DNA from 79 F1 plants resulting from a cross between parents with strong and weak grazing tolerance in Medicago falcata L., we generated an EcoRI restriction site-associated DNA (RAD) sequencing library. After sequencing and assembly, a high-density genetic map with high-quality SNP markers was constructed, with a total length of 1312.238 cM and an average density of 0.844 SNP/cM. METHODS The phenotypic traits of 79 F1 families were observed and the QTLs of 6 traits were analyzed by interval mapping. RESULTS Sixty-three QTLs were identified for seven traits, with LOD values from 3 to 6 and contribution rates from 15% to 30%. Among the 63 QTLs, 17 were for natural shoot height, 12 for rhizome length, 10 for shoot canopy diameter, 9 for basal plant diameter, 6 for stem number, 5 for absolute shoot height, and 4 for rhizome width. These QTLs were concentrated on LG2, LG4, LG5, LG7, and LG8. LG6 had only 6 QTLs. Based on the results of QTL mapping, comparison of reference genomes, and functional annotation, 10 candidate genes that may be related to grazing tolerance were screened. qRT-PCR analysis showed that two candidate genes (LOC11412291 and LOC11440209) may be the key genes related to grazing tolerance of M. falcata. CONCLUSION The trait-associated QTLs and candidate genes identified in this study will provide a solid foundation for future molecular breeding for enhanced grazing tolerance in M. falcata.
4. A Review of Unreduced Gametes and Neopolyploids in Alfalfa: How to Fill the Gap between Well-Established Meiotic Mutants and Next-Generation Genomic Resources. PLANTS 2021; 10:plants10050999. [PMID: 34067689 PMCID: PMC8156078 DOI: 10.3390/plants10050999] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/12/2021] [Revised: 05/03/2021] [Accepted: 05/12/2021] [Indexed: 01/11/2023]
Abstract
The gene flow mediated by unreduced gametes between diploid and tetraploid plants of the Medicago sativa-coerulea-falcata complex is pivotal for alfalfa breeding. Sexually tetraploidized hybrids could represent the best way to exploit progressive heterosis simultaneously derived from gene diversity, heterozygosity, and polyploidy. Moreover, unreduced gametes combined with parthenogenesis (i.e., apomixis) would enable the cloning of plants through seeds, providing a unique opportunity for the selection of superior genotypes with permanently fixed heterosis. This reproductive strategy has never been detected in the genus Medicago, but features of apomixis, such as restitutional apomeiosis and haploid parthenogenesis, have been reported. By means of an original case study, we demonstrated that sexually tetraploidized plants maintain apomeiosis, but this trait is developmentally independent from parthenogenesis. Alfalfa meiotic mutants producing unreduced egg cells revealed a null or very low capacity for parthenogenesis. The overall achievements reached so far are reviewed and discussed along with the efforts and strategies made for exploiting reproductive mutants that express apomictic elements in alfalfa breeding programs. Although several studies have investigated the cytological mechanisms responsible for 2n gamete formation and the inheritance of this trait, only a very small number of molecular markers and candidate genes putatively linked to unreduced gamete formation have been identified. Furthermore, this scenario has remained almost unchanged over the last two decades. Here, we propose a reverse genetics approach, by exploiting the genomic and transcriptomic resources available in alfalfa.
Through a comparison with 9 proteins belonging to Arabidopsis thaliana known for their involvement in 2n gamete production, we identified 47 orthologous genes and evaluated their expression in several tissues, paving the way for novel candidate gene characterization studies. An overall view on strategies suitable to fill the gap between well-established meiotic mutants and next-generation genomic resources is presented and discussed.
5. Li FD, Tong W, Xia EH, Wei CL. Optimized sequencing depth and de novo assembler for deeply reconstructing the transcriptome of the tea plant, an economically important plant species. BMC Bioinformatics 2019; 20:553. [PMID: 31694521 PMCID: PMC6836513 DOI: 10.1186/s12859-019-3166-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/23/2019] [Accepted: 10/21/2019] [Indexed: 11/10/2022]
Abstract
Background Tea is the oldest and among the world’s most popular non-alcoholic beverages, which has important economic, health and cultural values. Tea is commonly produced from the leaves of tea plants (Camellia sinensis), which belong to the genus Camellia of family Theaceae. In the last decade, many studies have generated the transcriptomes of tea plants at different developmental stages or under abiotic and/or biotic stresses to investigate the genetic basis of secondary metabolites that determine tea quality. However, these results exhibited large differences, particularly in the total number of reconstructed transcripts and the quality of the assembled transcriptomes. These differences largely result from limited knowledge regarding the optimized sequencing depth and assembler for transcriptome assembly of structurally complex plant species genomes. Results We employed different amounts of RNA-sequencing data, ranging from 4 to 84 Gb, to assemble the tea plant transcriptome using five well-known and representative transcript assemblers. Although the total number of assembled transcripts increased with increasing sequencing data, the proportion of unassembled transcripts became saturated as revealed by plant BUSCO datasets. Among the five representative assemblers, the Bridger package shows the best performance in both assembly completeness and accuracy as evaluated by the BUSCO datasets and genome alignment. In addition, we showed that Bridger and BinPacker harbored the shortest runtimes followed by SOAPdenovo and Trans-ABySS. Conclusions The present study compares the performance of five representative transcript assemblers and investigates the key factors that affect the assembly quality of the transcriptome of the tea plants. 
This study will be of significance in helping the tea research community obtain better sequencing and assembly of tea plant transcriptomes under conditions of interest and may thus help to answer major biological questions currently facing the tea industry.
Affiliation(s)
- Fang-Dong Li
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China; School of Science, Anhui Agricultural University, Hefei, 230036, China
- Wei Tong
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- En-Hua Xia
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Chao-Ling Wei
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
6. Raizada A, Souframanien J. Transcriptome sequencing, de novo assembly, characterisation of wild accession of blackgram (Vigna mungo var. silvestris) as a rich resource for development of molecular markers and validation of SNPs by high resolution melting (HRM) analysis. BMC PLANT BIOLOGY 2019; 19:358. [PMID: 31419947 PMCID: PMC6697964 DOI: 10.1186/s12870-019-1954-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/08/2019] [Accepted: 07/31/2019] [Indexed: 05/07/2023]
Abstract
BACKGROUND Blackgram [Vigna mungo (L.) Hepper] is an important legume crop of Asia with limited genomic resources. We report a comprehensive set of genic simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers developed using Illumina MiSeq sequencing of the transcriptome, and its application in genetic variation analysis and mapping. RESULTS Transcriptome sequencing of immature seeds of wild blackgram, V. mungo var. silvestris, by Illumina MiSeq technology generated 1.9 × 10^7 reads, which were assembled into 40,178 transcripts (TCS) with an average length of 446 bp covering 2.97 GB of the genome. A total of 38,753 coding sequences (CDS) were predicted from the 40,178 TCS, and 28,984 CDS were annotated through BLASTX and mapped to the GO and KEGG databases, resulting in 140 unique pathways. Tri-nucleotide repeats were the most abundant (39.9%), followed by di-nucleotide repeats (30.2%). About 60.3% and 37.6% of SSR motifs were present in the coding sequences (CDS) and untranslated regions (UTRs), respectively. Among SNPs, the most abundant substitution type was transitions (Ts) (61%), followed by transversions (Tv) (39%), with a Ts/Tv ratio of 1.58. A total of 2306 DEGs were identified by RNA-Seq between the wild accession and the cultivar, and validation was done by quantitative reverse transcription polymerase chain reaction. In this study, we genotyped SNPs with a validation rate of 78.87% by High Resolution Melting (HRM) assay. CONCLUSION In the present study, 1621 genic-SSR and 1844 SNP markers were developed from the immature seed transcriptome sequence of blackgram, and 31 genic-SSR markers were used to study genetic variation among different blackgram accessions. The markers developed here enrich the genomic resources available for blackgram and will aid breeding programmes.
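The Ts/Tv figure reported in this abstract can be illustrated with a minimal Python sketch. It assumes SNPs are available as (reference, alternate) base pairs; the function names `is_transition` and `ts_tv_ratio` are hypothetical and are not taken from the cited study, whose actual pipeline is not described here.

```python
# Minimal sketch (hypothetical helper names, not the authors' pipeline).
# A transition swaps two purines (A<->G) or two pyrimidines (C<->T);
# every other single-base substitution is a transversion.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """Return True if the ref -> alt substitution is a transition."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snps) -> float:
    """Compute the Ts/Tv ratio from an iterable of (ref, alt) base pairs."""
    ts = 0
    tv = 0
    for ref, alt in snps:
        if ref.upper() == alt.upper():
            continue  # not a substitution, skip it
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv


# Two transitions (A>G, C>T) and one transversion (A>C).
example = [("A", "G"), ("C", "T"), ("A", "C")]
print(ts_tv_ratio(example))
```

Applied to the rounded percentages quoted above, 61/39 gives roughly 1.56, so the reported value of 1.58 presumably reflects the unrounded SNP counts.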
Affiliation(s)
- Avi Raizada
  - Nuclear Agriculture and Biotechnology Division, BARC, Trombay, Mumbai, Trombay, 400085, India
  - Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, Anushakti Nagar, 400094, India
- J Souframanien
  - Nuclear Agriculture and Biotechnology Division, BARC, Trombay, Mumbai, Trombay, 400085, India
  - Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, Anushakti Nagar, 400094, India
7. Wang Y, Shahid MQ, Ghouri F, Ercişli S, Baloch FS. Development of EST-based SSR and SNP markers in Gastrodia elata (herbal medicine) by sequencing, de novo assembly and annotation of the transcriptome. 3 Biotech 2019; 9:292. [PMID: 31321198 DOI: 10.1007/s13205-019-1823-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/01/2018] [Accepted: 06/23/2019] [Indexed: 01/28/2023]
Abstract
Tianma (Gastrodia elata Blume) has unique biological characteristics and high medicinal value. The wild resource of G. elata is being overutilized and should be conserved, as it is already included in the list of endangered species in China. The population size of cultivated G. elata is small because of a domestication bottleneck. Therefore, it is of utmost importance to develop high-quality varieties and conserve wild resources of G. elata. In this study, we sequenced tuber transcriptomes of three major cultivated sub-species of Gastrodia elata, namely G. elata Bl. f. elata, G. elata Bl. f. glauca S. Chow, and G. elata Bl. f. viridis, and obtained about 7.8G clean data. The assembled high-quality reads of the three sub-species were clustered into 56,884 unigenes. Of these, 31,224 (54.89%), 25,733 (45.24%), 22,629 (39.78%), and 11,856 (20.84%) unigenes were annotated by the Nr, Swiss-Prot, Eukaryotic Ortholog Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. A total of 3766 EST-SSRs and 128,921 SNPs were identified from the unigenes. The results offer not only a large number of genes responsible for growth, development, and the metabolism of bioactive components, but also a large number of molecular markers for future studies on the conservation genetics and molecular breeding of G. elata.
8. Hawkins C, Yu LX. Recent progress in alfalfa (Medicago sativa L.) genomics and genomic selection. 2018. [DOI: 10.1016/j.cj.2018.01.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 11/15/2022]
9. You Q, Yang X, Peng Z, Xu L, Wang J. Development and Applications of a High Throughput Genotyping Tool for Polyploid Crops: Single Nucleotide Polymorphism (SNP) Array. FRONTIERS IN PLANT SCIENCE 2018; 9:104. [PMID: 29467780 PMCID: PMC5808122 DOI: 10.3389/fpls.2018.00104] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 07/25/2017] [Accepted: 01/19/2018] [Indexed: 05/18/2023]
Abstract
Polyploid species play significant roles in agriculture and food production. Many crop species are polyploid, such as potato, wheat, strawberry, and sugarcane. Genotyping has been a daunting task for genetic studies of polyploid crops, which lag far behind the diploid crop species. The single nucleotide polymorphism (SNP) array is considered to be one of the high-throughput, relatively cost-efficient, and automated genotyping approaches. However, there are significant challenges for SNP identification in complex, polyploid genomes, which have seriously slowed SNP discovery and array development in polyploid species. Ploidy is a significant factor impacting SNP qualities and validation rates of SNP markers in SNP arrays, which have proven to be a very important tool for genetic studies and molecular breeding. In this review, we (1) discussed the pros and cons of SNP arrays in general for high-throughput genotyping, (2) presented the challenges of and solutions to SNP calling in polyploid species, (3) summarized the SNP selection criteria and considerations of SNP array design for polyploid species, (4) illustrated SNP array applications in several different polyploid crop species, then (5) discussed challenges, available software, and their accuracy comparisons for genotype calling based on SNP array data in polyploids, and finally (6) provided a series of SNP array design and genotype calling recommendations. This review presents a complete overview of SNP array development and applications in polyploid crops, which will benefit research in molecular breeding and genetics of crops with complex genomes.
Affiliation(s)
- Qian You
  - Key Laboratory of Sugarcane Biology and Genetic Breeding Ministry of Agriculture, Fujian Agriculture and Forestry University, Fuzhou, China
  - Agronomy Department, University of Florida, Gainesville, FL, United States
- Xiping Yang
  - Agronomy Department, University of Florida, Gainesville, FL, United States
- Ze Peng
  - Agronomy Department, University of Florida, Gainesville, FL, United States
- Liping Xu
  - Key Laboratory of Sugarcane Biology and Genetic Breeding Ministry of Agriculture, Fujian Agriculture and Forestry University, Fuzhou, China
  - *Correspondence: Liping Xu
- Jianping Wang
  - Agronomy Department, University of Florida, Gainesville, FL, United States
  - Plant Molecular and Cellular Biology Program, Genetics Institute, University of Florida, Gainesville, FL, United States
  - Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, China
  - Jianping Wang
10. Yu LX. Identification of Single-Nucleotide Polymorphic Loci Associated with Biomass Yield under Water Deficit in Alfalfa (Medicago sativa L.) Using Genome-Wide Sequencing and Association Mapping. FRONTIERS IN PLANT SCIENCE 2017; 8:1152. [PMID: 28706532 PMCID: PMC5489703 DOI: 10.3389/fpls.2017.01152] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/02/2017] [Accepted: 06/15/2017] [Indexed: 05/08/2023]
Abstract
Alfalfa is a forage crop grown worldwide and is important due to its high biomass production and nutritional value. However, the production of alfalfa is challenged by adverse environmental factors such as drought and other stresses. Developing drought-resistant alfalfa is an important breeding target for enhancing alfalfa productivity in arid and semi-arid regions. In the present study, we used genotyping-by-sequencing and genome-wide association to identify marker loci associated with biomass yield under drought in the field in a panel of diverse alfalfa germplasm. A total of 28 markers at 22 genetic loci were associated with yield under water deficit, whereas only four markers were associated with the same trait under well-watered conditions. Comparisons of marker-trait associations between water deficit and well-watered conditions showed no similarity except for one association. Most of the markers were identical across harvest periods within the treatment, although different levels of significance were found among the three harvests. The loci associated with biomass yield under water deficit were located throughout all chromosomes in the alfalfa genome, in agreement with previous reports. Our results suggest that biomass yield under drought is a complex quantitative trait with polygenic inheritance and may involve a different mechanism compared to that under non-stress conditions. BLAST searches of the flanking sequences of the associated loci against DNA databases revealed several stress-responsive genes linked to the drought resistance loci, including leucine-rich repeat receptor-like kinase, B3 DNA-binding domain protein, translation initiation factor IF2, and phospholipase-like protein. With further investigation, those markers closely linked to drought resistance can be used for marker-assisted selection (MAS) to accelerate the development of new alfalfa cultivars with improved resistance to drought and other abiotic stresses.
Affiliation(s)
- Long-Xi Yu
  - United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction Testing and Research, Prosser, WA, United States
11. Transcriptome Sequencing of Diverse Peanut (Arachis) Wild Species and the Cultivated Species Reveals a Wealth of Untapped Genetic Variability. G3-GENES GENOMES GENETICS 2016; 6:3825-3836. [PMID: 27729436 PMCID: PMC5144954 DOI: 10.1534/g3.115.026898] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 12/30/2022]
Abstract
To test the hypothesis that the cultivated peanut species possesses almost no molecular variability, we sequenced a diverse panel of 22 Arachis accessions representing Arachis hypogaea botanical classes, A-, B-, and K-genome diploids, a synthetic amphidiploid, and a tetraploid wild species. RNASeq was performed on pools of three tissues, and de novo assembly was performed. Realignment of individual accession reads to transcripts of the cultivar OLin identified 306,820 biallelic SNPs. Among 10 naturally occurring tetraploid accessions, 40,382 unique homozygous SNPs were identified in 14,719 contigs. In eight diploid accessions, 291,115 unique SNPs were identified in 26,320 contigs. The average SNP rate among the 10 cultivated tetraploids was 0.5, and among eight diploids was 9.2 per 1000 bp. Diversity analysis indicated grouping of diploids according to genome classification, and cultivated tetraploids by subspecies. Cluster analysis of variants indicated that sequences of B genome species were the most similar to the tetraploids, and the next closest diploid accession belonged to the A genome species. A subset of 66 SNPs selected from the dataset was validated; of 782 SNP calls, 636 (81.32%) were confirmed using an allele-specific discrimination assay. We conclude that substantial genetic variability exists among wild species. Additionally, significant but lesser variability at the molecular level occurs among accessions of the cultivated species. This survey is the first to report significant SNP-level diversity among transcripts, and may explain some of the phenotypic differences observed in germplasm surveys. Understanding SNP variants in the Arachis accessions will aid in developing markers for selection.
12. Kottapalli P, Ulloa M, Kottapalli KR, Payton P, Burke J. SNP Marker Discovery in Pima Cotton (Gossypium barbadense L.) Leaf Transcriptomes. GENOMICS INSIGHTS 2016; 9:51-60. [PMID: 27721653 PMCID: PMC5049682 DOI: 10.4137/gei.s40377] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 06/10/2016] [Revised: 08/22/2016] [Accepted: 08/24/2016] [Indexed: 11/17/2022]
Abstract
The objective of this study was to explore the known narrow genetic diversity and discover single-nucleotide polymorphic (SNP) markers for marker-assisted breeding within Pima cotton (Gossypium barbadense L.) leaf transcriptomes. cDNA from 25-day-old plants of three diverse cotton genotypes [Pima S6 (PS6), Pima S7 (PS7), and Pima 3-79 (P3-79)] was sequenced on an Illumina sequencing platform. A total of 28.9 million reads (average read length of 138 bp) were generated by sequencing cDNA libraries of these three genotypes. The de novo assembly of reads generated transcriptome sets of 26,369 contigs for PS6, 25,870 contigs for PS7, and 24,796 contigs for P3-79. A Pima leaf reference transcriptome consisting of 42,695 contigs was generated. More than 10,000 single-nucleotide polymorphisms (SNPs) were identified between the genotypes, with 100% SNP frequency and a minimum of eight sequencing reads. The most prevalent SNP substitutions were C-T and A-G in these cotton genotypes. The putative SNPs identified can be utilized for characterizing genetic diversity, genotyping, and eventually in Pima cotton breeding through marker-assisted selection.
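The SNP-calling thresholds quoted in this abstract (100% SNP frequency and a minimum of eight sequencing reads) can be sketched as a simple per-site filter. This is only an illustration under stated assumptions: it interprets "100% SNP frequency" as every read at the site in a genotype carrying the variant allele, and the function name `passes_filter` and its parameters are hypothetical rather than taken from the study.

```python
# Minimal sketch of a depth-and-frequency SNP filter (hypothetical names).
# Assumption: "100% SNP frequency" means all reads covering the site in a
# genotype support the variant allele; "minimum of eight sequencing reads"
# is treated as a minimum read depth at that site.
def passes_filter(alt_reads: int, total_reads: int,
                  min_depth: int = 8, min_freq: float = 1.0) -> bool:
    """Return True if a site meets the depth and allele-frequency thresholds."""
    if total_reads <= 0 or total_reads < min_depth:
        return False
    return alt_reads / total_reads >= min_freq


# Hypothetical sites given as (variant-supporting reads, total reads).
sites = [(8, 8), (7, 7), (9, 10), (12, 12)]
kept = [site for site in sites if passes_filter(*site)]
print(kept)
```

Relaxing `min_freq` below 1.0 would admit heterozygous or mixed sites, which is why a strict threshold is a conservative choice when comparing inbred genotypes.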
Affiliation(s)
- Pratibha Kottapalli
  - Center for Biotechnology and Genomics, Texas Tech University, Lubbock, TX, USA
- Mauricio Ulloa
  - USDA-ARS, PA, CSRL, Plant Stress and Germplasm Development Research, Lubbock, TX, USA
- Paxton Payton
  - USDA-ARS, PA, CSRL, Plant Stress and Germplasm Development Research, Lubbock, TX, USA
- John Burke
  - USDA-ARS, PA, CSRL, Plant Stress and Germplasm Development Research, Lubbock, TX, USA
13. Recent Perspective of Next Generation Sequencing: Applications in Molecular Plant Biology and Crop Improvement. 2016. [DOI: 10.1007/s40011-016-0770-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/11/2022]
14. Wang J, Zhao Y, Ray I, Song M. Transcriptome responses in alfalfa associated with tolerance to intensive animal grazing. Sci Rep 2016; 6:19438. [PMID: 26763747 PMCID: PMC4725929 DOI: 10.1038/srep19438] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 07/10/2015] [Accepted: 12/02/2015] [Indexed: 01/15/2023]
Abstract
Tolerance of alfalfa (Medicago sativa L.) to animal grazing varies widely within the species. However, the molecular mechanisms influencing the grazing tolerant phenotype remain uncharacterized. The objective of this study was to identify genes and pathways that control grazing response in alfalfa. We analyzed whole-plant de novo transcriptomes from grazing tolerant and intolerant populations of M. sativa ssp. falcata subjected to grazing by sheep. Among the Gene Ontology terms which were identified as grazing responsive in the tolerant plants and differentially enriched between the tolerant and intolerant populations (both grazed), most were associated with the ribosome and translation-related activities, cell wall processes, and response to oxygen levels. Twenty-one grazing responsive pathways were identified that also exhibited differential expression between the tolerant and intolerant populations. These pathways were associated with secondary metabolite production, primary carbohydrate metabolic pathways, shikimate derivative dependent pathways, ribosomal subunit composition, hormone signaling, wound response, cell wall formation, and anti-oxidant defense. Sequence polymorphisms were detected among several differentially expressed homologous transcripts between the tolerant and intolerant populations. These differentially responsive genes and pathways constitute potential response mechanisms for grazing tolerance in alfalfa. They also provide potential targets for molecular breeding efforts to develop grazing-tolerant cultivars of alfalfa.
Affiliation(s)
- Junjie Wang
- College of Ecology and Environmental Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Yan Zhao
- College of Ecology and Environmental Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Ian Ray
- Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM, USA
| | - Mingzhou Song
- Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
| |
|
15
|
Annicchiarico P, Nazzicari N, Li X, Wei Y, Pecetti L, Brummer EC. Accuracy of genomic selection for alfalfa biomass yield in different reference populations. BMC Genomics 2015; 16:1020. [PMID: 26626170 PMCID: PMC4667460 DOI: 10.1186/s12864-015-2212-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2015] [Accepted: 11/13/2015] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Genomic selection based on genotyping-by-sequencing (GBS) data could accelerate alfalfa yield gains, if it displayed moderate ability to predict parent breeding values. Its interest would be enhanced by predicting ability also for germplasm/reference populations other than those for which it was defined. Predicting accuracy may be influenced by statistical models, SNP calling procedures and missing data imputation strategies. RESULTS Landrace and variety material from two genetically-contrasting reference populations, i.e., 124 elite genotypes adapted to the Po Valley (sub-continental climate; PV population) and 154 genotypes adapted to Mediterranean-climate environments (Me population), were genotyped by GBS and phenotyped in separate environments for dry matter yield of their dense-planted half-sib progenies. Both populations showed no sub-population genetic structure. Predictive accuracy was higher by joint rather than separate SNP calling for the two data sets, and using random forest imputation of missing data. Highest accuracy was obtained using Support Vector Regression (SVR) for PV, and Ridge Regression BLUP and SVR for Me germplasm. Bayesian methods (Bayes A, Bayes B and Bayesian Lasso) tended to be less accurate. Random Forest Regression was the least accurate model. Accuracy attained about 0.35 for Me in the range of 0.30-0.50 missing data, and 0.32 for PV at 0.50 missing data, using at least 10,000 SNP markers. Cross-population predictions based on a smaller subset of common SNPs implied a relative loss of accuracy of about 25% for Me and 30% for PV. Genome-wide association analyses based on large subsets of M. truncatula-aligned markers revealed many SNPs with modest association with yield, and some genome areas hosting putative QTLs. A comparison of genomic vs. conventional selection for parent breeding value assuming 1-year vs. 5-year selection cycles, respectively, indicated over three-fold greater predicted yield gain per unit time for genomic selection. CONCLUSIONS Genomic selection for alfalfa yield is promising, based on its moderate prediction accuracy, moderate value of cross-population predictions, and lack of sub-population structure. There is limited scope for searching individual QTLs with overwhelming effect on yield. Some of our results can contribute to better design of genomic selection experiments for alfalfa and other crops with similar mating systems.
Affiliation(s)
- Paolo Annicchiarico
- Council for Agricultural Research and Economics (CREA), Research Centre for Fodder Crops and Dairy Productions, 29 viale Piacenza, 26900, Lodi, Italy.
| | - Nelson Nazzicari
- Council for Agricultural Research and Economics (CREA), Research Centre for Fodder Crops and Dairy Productions, 29 viale Piacenza, 26900, Lodi, Italy.
| | - Xuehui Li
- Department of Plant Sciences, North Dakota State University, 1340 Administration Avenue, Fargo, ND, 58108, USA.
| | - Yanling Wei
- Plant Sciences Department, University of California, Davis, Plant Breeding Center, One Shields Avenue, Davis, CA, 95616, USA.
| | - Luciano Pecetti
- Council for Agricultural Research and Economics (CREA), Research Centre for Fodder Crops and Dairy Productions, 29 viale Piacenza, 26900, Lodi, Italy.
| | - E Charles Brummer
- Plant Sciences Department, University of California, Davis, Plant Breeding Center, One Shields Avenue, Davis, CA, 95616, USA.
| |
|
17
|
Miao Z, Xu W, Li D, Hu X, Liu J, Zhang R, Tong Z, Dong J, Su Z, Zhang L, Sun M, Li W, Du Z, Hu S, Wang T. De novo transcriptome analysis of Medicago falcata reveals novel insights about the mechanisms underlying abiotic stress-responsive pathway. BMC Genomics 2015; 16:818. [PMID: 26481731 PMCID: PMC4615886 DOI: 10.1186/s12864-015-2019-x] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2015] [Accepted: 10/07/2015] [Indexed: 11/21/2022] Open
Abstract
Background The entire world is facing a deteriorating environment. Understanding the mechanisms underlying plant responses to external abiotic stresses is important for breeding stress-tolerant crops and herbages. Phytohormones play critical regulatory roles in plants in the response to external and internal cues to regulate growth and development. Medicago falcata is one of the stress-tolerant candidate leguminous species and is able to fix atmospheric nitrogen. This ability allows leguminous plants to grow in nitrogen deficient soils. Methods We performed Illumina sequencing of cDNA prepared from abiotic stress treated M. falcata. Sequenced reads were assembled to provide a transcriptome resource. Transcripts were annotated using BLAST searches against the NCBI non-redundant database and gene ontology definitions were assigned. A comparison among the three abiotic stress treated samples was carried out. The expression of transcripts was confirmed with qRT-PCR. Results We present an abiotic stress-responsive M. falcata transcriptome using next-generation sequencing data from samples grown under standard, dehydration, high salinity, and cold conditions. We combined reads from all samples and de novo assembled 98,515 transcripts to build the M. falcata gene index. A comprehensive analysis of the transcriptome revealed abiotic stress-responsive mechanisms underlying the metabolism and core signalling components of major phytohormones. We identified nod factor signalling pathways during early symbiotic nodulation that are modified by abiotic stresses. Additionally, a global comparison of homology between the M. falcata and M. truncatula transcriptomes, along with five other leguminous species, revealed a high level of global sequence conservation within the family. Conclusions M. falcata is shown to be a model candidate for studying abiotic stress-responsive mechanisms in legumes. 
This global gene expression analysis provides new insights into the biochemical and molecular mechanisms involved in the acclimation to abiotic stresses. Our data provides many gene candidates that might be used for herbage and crop breeding. Additionally, FalcataBase (http://bioinformatics.cau.edu.cn/falcata/) was built for storing these data.
Affiliation(s)
- Zhenyan Miao
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China. Present address: Department of Agronomy, Purdue University, West Lafayette, IN, USA.
| | - Wei Xu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100029, China.
| | - Daofeng Li
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China. Present address: Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA.
| | - Xiaona Hu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Jiaxing Liu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Rongxue Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Zongyong Tong
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Jiangli Dong
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Zhen Su
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Liwei Zhang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| | - Min Sun
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100029, China.
| | - Wenjie Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100029, China.
| | - Zhenglin Du
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100029, China.
| | - Songnian Hu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100029, China.
| | - Tao Wang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
| |
|
18
|
Christmas MJ, Biffin E, Lowe AJ. Transcriptome sequencing, annotation and polymorphism detection in the hop bush, Dodonaea viscosa. BMC Genomics 2015; 16:803. [PMID: 26474753 PMCID: PMC4609105 DOI: 10.1186/s12864-015-1987-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2015] [Accepted: 10/06/2015] [Indexed: 01/09/2023] Open
Abstract
Background The hop bush, Dodonaea viscosa, is a trans-oceanic species distributed over six continents. It evolved in Australia where it is found over a wide range of habitat types and is an ecologically important species. Limited genomic resources are currently available for this species, thus our understanding of its evolutionary history and ecological adaptation is restricted. Here, we present a comprehensive transcriptome dataset for future genomic studies into this species. Methods We performed Illumina sequencing of cDNA prepared from leaf tissue collected from seven populations of D. viscosa ssp. angustissima and spatulata distributed along an environmental gradient in South Australia. Sequenced reads were assembled to provide a transcriptome resource. Contiguous sequences (contigs) were annotated using BLAST searches against the NCBI non-redundant database and gene ontology definitions were assigned. Single nucleotide polymorphisms were detected for the establishment of a genetic marker set. A comparison between the two subspecies was also carried out. Results Illumina sequencing returned 268,672,818 sequence reads, which were de novo assembled into 105,125 contigs. Contigs with significant BLAST alignments (E value < 1e-5) numbered at 44,191, with 38,311 of these having their most significant hits to sequences from land plant species. Gene Ontology terms were assigned to 28,440 contigs and KEGG analysis identified 146 pathways that the gene products from 5,070 contigs are potentially involved in. The subspecies comparison identified 8,494 fixed SNP differences across 3,979 contiguous sequences, indicating a level of genetic differentiation between them. Across all samples, 248,235 SNPs were detected. Conclusions We have established a significant genomic data resource for D. viscosa, providing a comprehensive transcriptomic reference. Genetic differences among morphologically distinct subspecies were found. 
A wide range of putative gene regions were identified along with a large set of variable SNP markers, providing a basis for studies into the evolution and ecological adaptation of D. viscosa.
Affiliation(s)
- Matthew J Christmas
- Environment Institute and School of Biological Sciences, The University of Adelaide, North Terrace, Adelaide, 5005, SA, Australia.
| | - Ed Biffin
- Environment Institute and School of Biological Sciences, The University of Adelaide, North Terrace, Adelaide, 5005, SA, Australia.
| | - Andrew J Lowe
- Environment Institute and School of Biological Sciences, The University of Adelaide, North Terrace, Adelaide, 5005, SA, Australia.
| |
|
19
|
O'Rourke JA, Fu F, Bucciarelli B, Yang SS, Samac DA, Lamb JFS, Monteros MJ, Graham MA, Gronwald JW, Krom N, Li J, Dai X, Zhao PX, Vance CP. The Medicago sativa gene index 1.2: a web-accessible gene expression atlas for investigating expression differences between Medicago sativa subspecies. BMC Genomics 2015; 16:502. [PMID: 26149169 PMCID: PMC4492073 DOI: 10.1186/s12864-015-1718-7] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2015] [Accepted: 06/24/2015] [Indexed: 11/19/2022] Open
Abstract
Background Alfalfa (Medicago sativa L.) is the primary forage legume crop species in the United States and plays essential economic and ecological roles in agricultural systems across the country. Modern alfalfa is the result of hybridization between tetraploid M. sativa ssp. sativa and M. sativa ssp. falcata. Due to its large and complex genome, there are few genomic resources available for alfalfa improvement. Results A de novo transcriptome assembly from two alfalfa subspecies, M. sativa ssp. sativa (B47) and M. sativa ssp. falcata (F56) was developed using Illumina RNA-seq technology. Transcripts from roots, nitrogen-fixing root nodules, leaves, flowers, elongating stem internodes, and post-elongation stem internodes were assembled into the Medicago sativa Gene Index 1.2 (MSGI 1.2) representing 112,626 unique transcript sequences. Nodule-specific and transcripts involved in cell wall biosynthesis were identified. Statistical analyses identified 20,447 transcripts differentially expressed between the two subspecies. Pair-wise comparisons of each tissue combination identified 58,932 sequences differentially expressed in B47 and 69,143 sequences differentially expressed in F56. Comparing transcript abundance in floral tissues of B47 and F56 identified expression differences in sequences involved in anthocyanin and carotenoid synthesis, which determine flower pigmentation. Single nucleotide polymorphisms (SNPs) unique to each M. sativa subspecies (110,241) were identified. Conclusions The Medicago sativa Gene Index 1.2 increases the expressed sequence data available for alfalfa by ninefold and can be expanded as additional experiments are performed. The MSGI 1.2 transcriptome sequences, annotations, expression profiles, and SNPs were assembled into the Alfalfa Gene Index and Expression Database (AGED) at http://plantgrn.noble.org/AGED/, a publicly available genomic resource for alfalfa improvement and legume research. 
Affiliation(s)
- Jamie A O'Rourke
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA.
| | - Fengli Fu
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA.
| | | | - S Sam Yang
- USDA-ARS-Plant Science Research Unit, St. Paul, MN, 55108, USA. Present address: Monsanto Company, Molecular Breeding Technology, Chesterfield, MO, 63167, USA.
| | - Deborah A Samac
- USDA-ARS-Plant Science Research Unit, St. Paul, MN, 55108, USA.
| | - JoAnn F S Lamb
- USDA-ARS-Plant Science Research Unit, St. Paul, MN, 55108, USA.
| | | | - Michelle A Graham
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA.
| | - John W Gronwald
- USDA-ARS-Plant Science Research Unit, St. Paul, MN, 55108, USA.
| | - Nick Krom
- Samuel Roberts Noble Foundation, Ardmore, OK, 73401, USA.
| | - Jun Li
- Samuel Roberts Noble Foundation, Ardmore, OK, 73401, USA.
| | - Xinbin Dai
- Samuel Roberts Noble Foundation, Ardmore, OK, 73401, USA.
| | - Patrick X Zhao
- Samuel Roberts Noble Foundation, Ardmore, OK, 73401, USA.
| | - Carroll P Vance
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA; USDA-ARS-Plant Science Research Unit, St. Paul, MN, 55108, USA.
| |
|
20
|
Abstract
High-throughput next-generation sequence-based genotyping and single nucleotide polymorphism (SNP) detection opens the door for emerging genomics-based breeding strategies such as genome-wide association analysis and genomic selection. In polyploids, SNP detection is confounded by a highly similar homeologous sequence where a polymorphism between subgenomes must be differentiated from a SNP. We have developed and implemented a novel tool called SWEEP: Sliding Window Extraction of Explicit Polymorphisms. SWEEP uses subgenome polymorphism haplotypes as contrast to identify true SNPs between genotypes. The tool is a single command script that calls a series of modules based on user-defined options and takes sorted/indexed bam files or vcf files as input. Filtering options are highly flexible and include filtering based on sequence depth, alternate allele ratio, and SNP quality on top of the SWEEP filtering procedure. Using real and simulated data we show that SWEEP outperforms current SNP filtering methods for polyploids. SWEEP can be used for high-quality SNP discovery in polyploid crops.
|
21
|
Clevenger J, Chavarro C, Pearl SA, Ozias-Akins P, Jackson SA. Single Nucleotide Polymorphism Identification in Polyploids: A Review, Example, and Recommendations. MOLECULAR PLANT 2015; 8:831-46. [PMID: 25676455 DOI: 10.1016/j.molp.2015.02.002] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/30/2014] [Revised: 01/21/2015] [Accepted: 02/01/2015] [Indexed: 05/23/2023]
Abstract
Understanding the relationship between genotype and phenotype is a major biological question and being able to predict phenotypes based on molecular genotypes is integral to molecular breeding. Whole-genome duplications have shaped the history of all flowering plants and present challenges to elucidating the relationship between genotype and phenotype, especially in neopolyploid species. Although single nucleotide polymorphisms (SNPs) have become popular tools for genetic mapping, discovery and application of SNPs in polyploids has been difficult. Here, we summarize common experimental approaches to SNP calling, highlighting recent polyploid successes. To examine the impact of software choice on these analyses, we called SNPs among five peanut genotypes using different alignment programs (BWA-mem and Bowtie 2) and variant callers (SAMtools, GATK, and Freebayes). Alignments produced by Bowtie 2 and BWA-mem and analyzed in SAMtools shared 24.5% concordant SNPs, and SAMtools, GATK, and Freebayes shared 1.4% concordant SNPs. A subsequent analysis of simulated Brassica napus chromosome 1A and 1C genotypes demonstrated that, of the three software programs, SAMtools performed with the highest sensitivity and specificity on Bowtie 2 alignments. These results, however, are likely to vary among species, and we therefore propose a series of best practices for SNP calling in polyploids.
Affiliation(s)
- Josh Clevenger
- Institute of Plant Breeding, Genetics & Genomics, University of Georgia, Tifton, GA 31793, USA
| | - Carolina Chavarro
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Stephanie A Pearl
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Peggy Ozias-Akins
- Institute of Plant Breeding, Genetics & Genomics, University of Georgia, Tifton, GA 31793, USA.
| | - Scott A Jackson
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA.
| |
|
22
|
Molecular Diversity and Population Structure of a Worldwide Collection of Cultivated Tetraploid Alfalfa (Medicago sativa subsp. sativa L.) Germplasm as Revealed by Microsatellite Markers. PLoS One 2015; 10:e0124592. [PMID: 25901573 PMCID: PMC4406709 DOI: 10.1371/journal.pone.0124592] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2014] [Accepted: 03/16/2015] [Indexed: 01/03/2023] Open
Abstract
Information on genetic diversity and population structure of a tetraploid alfalfa collection might be valuable in effective use of the genetic resources. A set of 336 worldwide genotypes of tetraploid alfalfa (Medicago sativa subsp. sativa L.) was genotyped using 85 genome-wide distributed SSR markers to reveal the genetic diversity and population structure in alfalfa. Genetic diversity analysis identified a total of 1056 alleles across 85 marker loci. The average expected heterozygosity and polymorphism information content values were 0.677 and 0.638, respectively, showing high levels of genetic diversity in the cultivated tetraploid alfalfa germplasm. Comparison of genetic characteristics across chromosomes indicated that regions of chromosomes 2 and 3 had the highest genetic diversity. A higher genetic diversity was detected in alfalfa landraces than in wild materials and cultivars. Two populations were identified by the model-based population structure, principal coordinate and neighbor-joining analyses, corresponding to China and other parts of the world. However, the lack of a strict correlation between clustering and geographic origin suggested extensive exchange of alfalfa germplasm across diverse geographic regions. The quantitative analysis of the genetic diversity and population structure in this study could be useful for genetic and genomic analysis and utilization of the genetic variation in alfalfa breeding.
|
23
|
Zhang S, Shi Y, Cheng N, Du H, Fan W, Wang C. De novo characterization of fall dormant and nondormant alfalfa (Medicago sativa L.) leaf transcriptome and identification of candidate genes related to fall dormancy. PLoS One 2015; 10:e0122170. [PMID: 25799491 PMCID: PMC4370819 DOI: 10.1371/journal.pone.0122170] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2014] [Accepted: 02/08/2015] [Indexed: 12/03/2022] Open
Abstract
Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide. Fall dormancy is an adaptive character related to the biomass production and winter survival in alfalfa. The physiological, biochemical and molecular mechanisms causing fall dormancy and the related genes have not been well studied. In this study, we sequenced two standard varieties of alfalfa (dormant and non-dormant) at two time points and generated approximately 160 million high quality paired-end sequence reads using sequencing by synthesis (SBS) technology. The de novo transcriptome assembly generated a set of 192,875 transcripts with an average length of 856 bp representing about 165.1 Mb of the alfalfa leaf transcriptome. After assembly, 111,062 (57.6%) transcripts were annotated against the NCBI non-redundant database. A total of 30,165 (15.6%) transcripts were mapped to 323 Kyoto Encyclopedia of Genes and Genomes pathways. We also identified 41,973 simple sequence repeats, which can be used to generate markers for alfalfa, and 1,541 transcription factors were identified across 1,350 transcripts. Gene expression comparisons between dormant and non-dormant alfalfa at different time points were performed, and we identified several differentially expressed genes potentially related to fall dormancy. Gene Ontology and pathway information were also identified. We sequenced and assembled the leaf transcriptome of alfalfa related to fall dormancy, and also identified some genes of interest involved in the fall dormancy mechanism. Thus, our research focused on studying fall dormancy in alfalfa through transcriptome sequencing. The sequencing and gene expression data generated in this study may be used further to elucidate the complete mechanisms governing fall dormancy in alfalfa.
Affiliation(s)
- Senhao Zhang
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, Henan 450002, China
| | - Yinghua Shi
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, Henan 450002, China
| | - Ningning Cheng
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, Henan 450002, China
| | - Hongqi Du
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, Henan 450002, China
| | - Wenna Fan
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, Henan 450002, China
| | - Chengzhang Wang
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, Henan 450002, China
| |
|
24
|
A saturated genetic linkage map of autotetraploid alfalfa (Medicago sativa L.) developed using genotyping-by-sequencing is highly syntenous with the Medicago truncatula genome. G3-GENES GENOMES GENETICS 2014; 4:1971-9. [PMID: 25147192 PMCID: PMC4199703 DOI: 10.1534/g3.114.012245] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
A genetic linkage map is a valuable tool for quantitative trait locus mapping, map-based gene cloning, comparative mapping, and whole-genome assembly. Alfalfa, one of the most important forage crops in the world, is autotetraploid, allogamous, and highly heterozygous, characteristics that have impeded the construction of a high-density linkage map using traditional genetic marker systems. Using genotyping-by-sequencing (GBS), we constructed low-cost, reasonably high-density linkage maps for both maternal and paternal parental genomes of an autotetraploid alfalfa F1 population. The resulting maps contain 3591 single-nucleotide polymorphism markers on 64 linkage groups across both parents, with an average density of one marker per 1.5 and 1.0 cM for the maternal and paternal haplotype maps, respectively. Chromosome assignments were made based on homology of markers to the M. truncatula genome. Four linkage groups representing the four haplotypes of each alfalfa chromosome were assigned to each of the eight Medicago chromosomes in both the maternal and paternal parents. The alfalfa linkage groups were highly syntenous with M. truncatula, and clearly identified the known translocation between Chromosomes 4 and 8. In addition, a small inversion on Chromosome 1 was identified between M. truncatula and M. sativa. GBS enabled us to develop a saturated linkage map for alfalfa that greatly improved genome coverage relative to previous maps and that will facilitate investigation of genome structure. GBS could be used in breeding populations to accelerate molecular breeding in alfalfa.
25
Galata M, Sarker LS, Mahmoud SS. Transcriptome profiling, and cloning and characterization of the main monoterpene synthases of Coriandrum sativum L. PHYTOCHEMISTRY 2014; 102:64-73. [PMID: 24636455 DOI: 10.1016/j.phytochem.2014.02.016] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 10/10/2013] [Revised: 12/23/2013] [Accepted: 02/18/2014] [Indexed: 05/06/2023]
Abstract
Terpenoids are a large and diverse class of specialized metabolites that are essential for the growth and development of plants, and have tremendous industrial applications. The mericarps of Coriandrum sativum L. (coriander) produce an essential oil (EO) rich in monoterpenes, volatile C10 terpenoids. To investigate EO metabolism, the transcriptome of coriander mericarps at three developmental stages (early, mid, late) was sequenced via Illumina technology and a transcript library was produced. To validate the usability of the transcriptome sequences, two terpene synthase candidate genes, CsγTRPS and CsLINS, encoding 558 and 562 amino acid proteins, were expressed in bacteria, and the recombinant proteins were purified by Ni-NTA affinity chromatography. The 65.16 kDa (CsγTRPS) and 65.91 kDa (CsLINS) recombinant proteins catalyzed the conversion of geranyl diphosphate, the precursor to monoterpenes, to γ-terpinene and (S)-linalool, respectively, with apparent Vmax values of 2.24 ± 0.16 (CsγTRPS) and 19.63 ± 1.05 (CsLINS) pkat/mg and apparent Km values of 66.25 ± 13 (CsγTRPS) and 2.5 ± 0.6 (CsLINS) μM. Together, CsγTRPS and CsLINS account for the majority of EO constituents in coriander mericarps. Investigation of the coriander transcriptome, and knowledge gained from these experiments, will facilitate future studies concerning essential and fatty acid oil production in coriander. They also enable efforts to improve the coriander oils through metabolic engineering or plant breeding.
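The apparent Vmax and Km values reported above can be read through the Michaelis-Menten rate law, v = Vmax[S] / (Km + [S]). The following is a minimal sketch, assuming both enzymes follow simple Michaelis-Menten kinetics (the abstract gives only the two parameters, not the fitting procedure). The function name, the dictionary keys, and the substrate concentration are illustrative and do not come from the paper.

```python
def michaelis_menten_rate(vmax: float, km: float, substrate: float) -> float:
    """Return v = Vmax * [S] / (Km + [S]); the result has the units of vmax."""
    if km <= 0 or substrate < 0:
        raise ValueError("km must be positive and substrate non-negative")
    return vmax * substrate / (km + substrate)


# Apparent parameters as reported in the abstract (Vmax in pkat/mg, Km in uM).
ENZYMES = {
    "CsgTRPS": {"vmax": 2.24, "km": 66.25},
    "CsLINS": {"vmax": 19.63, "km": 2.5},
}

if __name__ == "__main__":
    gpp_um = 10.0  # hypothetical geranyl diphosphate concentration, in uM
    for name, params in ENZYMES.items():
        rate = michaelis_menten_rate(params["vmax"], params["km"], gpp_um)
        print(f"{name}: {rate:.2f} pkat/mg at {gpp_um} uM GPP")
```

Under this model, each enzyme runs at half its Vmax when the substrate concentration equals its Km, so the much lower Km reported for CsLINS means it approaches saturation at far lower geranyl diphosphate concentrations than CsγTRPS.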
Affiliation(s)
- Mariana Galata: Department of Biology, University of British Columbia Okanagan Campus, 3333 University Way, Kelowna, BC V1V 1V7, Canada
- Lukman S Sarker: Department of Biology, University of British Columbia Okanagan Campus, 3333 University Way, Kelowna, BC V1V 1V7, Canada
- Soheil S Mahmoud: Department of Biology, University of British Columbia Okanagan Campus, 3333 University Way, Kelowna, BC V1V 1V7, Canada
26
Duarte J, Rivière N, Baranger A, Aubert G, Burstin J, Cornet L, Lavaud C, Lejeune-Hénaut I, Martinant JP, Pichon JP, Pilet-Nayel ML, Boutet G. Transcriptome sequencing for high throughput SNP development and genetic mapping in Pea. BMC Genomics 2014; 15:126. [PMID: 24521263 PMCID: PMC3925251 DOI: 10.1186/1471-2164-15-126] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Received: 09/13/2013] [Accepted: 02/05/2014] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Pea has a complex genome of 4.3 Gb for which only limited genomic resources are available to date. Although SNP markers are now highly valuable for research and modern breeding, only a few are described and used in pea for genetic diversity and linkage analysis. RESULTS: We developed a large resource by cDNA sequencing of 8 genotypes representative of modern breeding material using the Roche 454 technology, combining both long reads (400 bp) and high coverage (3.8 million reads, reaching a total of 1,369 megabases). Sequencing data were assembled and generated a 68 K unigene set, from which 41 K were annotated from their best blast hit against the model species Medicago truncatula. Annotated contigs showed an even distribution along M. truncatula pseudochromosomes, suggesting a good representation of the pea genome. 10 K pea contigs were found to be polymorphic among the genetic material surveyed, corresponding to 35 K SNPs. We validated a subset of 1538 SNPs through the GoldenGate assay, proving their ability to structure a diversity panel of breeding germplasm. Among them, 1340 were genetically mapped and used to build a new consensus map comprising a total of 2070 markers. Based on blast analysis, we could establish 1252 bridges between our pea consensus map and the pseudochromosomes of M. truncatula, which provides new insight on synteny between the two species. CONCLUSIONS: Our approach created significant new resources in pea, i.e. the most comprehensive genetic map to date tightly linked to the model species M. truncatula and a large SNP resource for both academic research and breeding.
Affiliation(s)
- Jorge Duarte: Biogemma, route d’Ennezat, CS 90126, Chappes 63720, France
- Alain Baranger: INRA UMR 1349 IGEPP, BP35327, Le Rheu Cedex 35653, France
- Grégoire Aubert: INRA UMR 1347 Agroécologie, Bat. Mendel, 17 rue Sully BP 86510, Dijon 21065, France
- Judith Burstin: INRA UMR 1347 Agroécologie, Bat. Mendel, 17 rue Sully BP 86510, Dijon 21065, France
- Laurent Cornet: Biogemma, route d’Ennezat, CS 90126, Chappes 63720, France
- Clément Lavaud: INRA UMR 1349 IGEPP, BP35327, Le Rheu Cedex 35653, France
- Jean-Pierre Martinant: Limagrain Europe, centre de recherche route d’Ennezat, CS 3911, Chappes 63720, France
- Gilles Boutet: INRA UMR 1349 IGEPP, BP35327, Le Rheu Cedex 35653, France
27
Li X, Han Y, Wei Y, Acharya A, Farmer AD, Ho J, Monteros MJ, Brummer EC. Development of an alfalfa SNP array and its use to evaluate patterns of population structure and linkage disequilibrium. PLoS One 2014; 9:e84329. [PMID: 24416217 PMCID: PMC3887001 DOI: 10.1371/journal.pone.0084329] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 07/29/2013] [Accepted: 11/14/2013] [Indexed: 11/18/2022] Open
Abstract
A large set of genome-wide markers and a high-throughput genotyping platform can facilitate the genetic dissection of complex traits and accelerate molecular breeding applications. Previously, we identified about 0.9 million SNP markers by sequencing transcriptomes of 27 diverse alfalfa genotypes. From this SNP set, we developed an Illumina Infinium array containing 9,277 SNPs. Using this array, we genotyped 280 diverse alfalfa genotypes and several genotypes from related species. About 81% (7,476) of the SNPs met the criteria for quality control and showed polymorphisms. The alfalfa SNP array also showed a high level of transferability for several closely related Medicago species. Principal component analysis and model-based clustering showed clear population structure corresponding to subspecies and ploidy levels. Within cultivated tetraploid alfalfa, genotypes from dormant and nondormant cultivars were largely assigned to different clusters; genotypes from semidormant cultivars were split between the groups. The extent of linkage disequilibrium (LD) across all genotypes rapidly decayed to 26 Kbp at r² = 0.2, but the rate varied across ploidy levels and subspecies. A high level of consistency in LD was found between and within the two subpopulations of cultivated dormant and nondormant alfalfa, suggesting that genome-wide association studies (GWAS) and genomic selection (GS) could be conducted using alfalfa genotypes from throughout the fall dormancy spectrum. However, the relatively low LD levels would require a large number of markers to fully saturate the genome.
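The LD threshold quoted above is expressed as a squared correlation between pairs of loci. As a minimal sketch of what that statistic measures, the textbook definition for two biallelic loci with known (phased) haplotype frequencies is shown below. This is an assumption for illustration only: the abstract does not describe the estimator the authors applied to their array genotypes, and estimation in autotetraploids is more involved. The function and parameter names are hypothetical.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("both loci must be polymorphic (0 < p < 1)")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


if __name__ == "__main__":
    # Perfectly associated alleles versus independent alleles.
    print(ld_r2(0.5, 0.5, 0.5))
    print(ld_r2(0.25, 0.5, 0.5))
```

An LD decay distance such as the one in the abstract is typically obtained by computing a pairwise statistic like this for many marker pairs and finding the physical distance at which its average falls below the chosen threshold.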
Affiliation(s)
- Xuehui Li: Forage Improvement Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Yuanhong Han: Forage Improvement Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Yanling Wei: Forage Improvement Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Ananta Acharya: Forage Improvement Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Andrew D. Farmer: National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Julie Ho: Forage Genetics International, Davis, California, United States of America
- Maria J. Monteros: Forage Improvement Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- E. Charles Brummer: Forage Improvement Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
28
Hirsch CD, Evans J, Buell CR, Hirsch CN. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes. Brief Funct Genomics 2014; 13:257-67. [PMID: 24395692 DOI: 10.1093/bfgp/elt051] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 11/14/2022] Open
Abstract
Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security.
29
Liu Z, Chen T, Ma L, Zhao Z, Zhao PX, Nan Z, Wang Y. Global transcriptome sequencing using the Illumina platform and the development of EST-SSR markers in autotetraploid alfalfa. PLoS One 2013; 8:e83549. [PMID: 24349529 PMCID: PMC3861513 DOI: 10.1371/journal.pone.0083549] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 03/20/2013] [Accepted: 11/05/2013] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Alfalfa is the most widely cultivated forage legume and one of the most economically valuable crops in the world. The large size and complexity of the alfalfa genome have delayed the development of genomic resources for alfalfa research. Second-generation Illumina transcriptome sequencing is an efficient method for generating a global transcriptome sequence dataset for gene discovery and molecular marker development in alfalfa. METHODOLOGY/PRINCIPAL FINDINGS: More than 28 million sequencing reads (5.64 Gb of clean nucleotides) were generated by Illumina paired-end sequencing from 15 different alfalfa tissue samples. In total, 40,433 unigenes with an average length of 803 bp were obtained by de novo assembly. Based on a sequence similarity search of known proteins, a total of 36,684 (90.73%) unigenes were annotated. In addition, 1,649 potential EST-SSRs were identified as potential molecular markers from unigenes with lengths exceeding 1 kb. A total of 100 pairs of PCR primers were randomly selected to validate the assembly quality and develop EST-SSR markers from genomic DNA. Of these primer pairs, 82 were able to amplify sequences in initial screening tests, and 27 primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among 10 alfalfa accessions. CONCLUSIONS/SIGNIFICANCE: The present study provided global sequence data for autotetraploid alfalfa and demonstrated that the Illumina platform is a fast and effective approach to EST-SSR marker development in alfalfa. These transcriptome datasets will serve as a valuable public information platform to accelerate studies of the alfalfa genome.
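To make the EST-SSR screening step concrete, here is a minimal sketch of how perfect simple sequence repeats can be located in an assembled unigene with a regular expression. It is not the tool or the criteria used in the study, which the abstract does not specify. The motif length range (2 to 6 bp), the default minimum repeat count, and the function name are all assumptions chosen for illustration.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 5):
    """Return (start, motif, repeat_count) for perfect SSRs with 2-6 bp motifs.

    min_repeats is a hypothetical threshold; real pipelines usually set a
    different minimum for each motif length.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # A lazily matched motif followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    unigene = "GGC" + "AT" * 6 + "CCG"  # toy sequence, not real alfalfa data
    print(find_ssrs(unigene))
```

Candidate loci found this way would still need flanking primers and wet-lab validation, which is the step the abstract summarizes with its 100 tested primer pairs.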
Affiliation(s)
- Zhipeng Liu: State Key Laboratory of Grassland Agro-ecosystems, School of Pastoral Agricultural Science and Technology, Lanzhou University, Lanzhou, China
- Tianlong Chen: State Key Laboratory of Grassland Agro-ecosystems, School of Pastoral Agricultural Science and Technology, Lanzhou University, Lanzhou, China
- Lichao Ma: State Key Laboratory of Grassland Agro-ecosystems, School of Pastoral Agricultural Science and Technology, Lanzhou University, Lanzhou, China
- Zhiguang Zhao: Key Laboratory of Cell Activities and Stress Adaptations, Ministry of Education, School of Life Sciences, Lanzhou University, Lanzhou, China
- Patrick X. Zhao: Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Zhibiao Nan: State Key Laboratory of Grassland Agro-ecosystems, School of Pastoral Agricultural Science and Technology, Lanzhou University, Lanzhou, China
- Yanrong Wang: State Key Laboratory of Grassland Agro-ecosystems, School of Pastoral Agricultural Science and Technology, Lanzhou University, Lanzhou, China
30
SNP discovery in European anchovy (Engraulis encrasicolus, L) by high-throughput transcriptome and genome sequencing. PLoS One 2013; 8:e70051. [PMID: 23936375 PMCID: PMC3731364 DOI: 10.1371/journal.pone.0070051] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 03/20/2013] [Accepted: 06/13/2013] [Indexed: 11/29/2022] Open
Abstract
Increased throughput in sequencing technologies has facilitated the acquisition of detailed genomic information in non-model species. The focus of this research was to discover and validate SNPs derived from the transcriptome of the European anchovy (Engraulis encrasicolus), a species with no available reference genome, using next generation sequencing technologies. A cDNA library was constructed from four tissues of ten fish individuals corresponding to three populations of E. encrasicolus, and Roche 454 GS FLX Titanium sequencing yielded 19,367 contigs. Additionally, the European anchovy genome was sequenced for the same ten individuals using an Illumina HiSeq2000. Using a computational pipeline for combining transcriptome and genome information, a total of 18,994 SNPs met the necessary minor allele frequency and depth filters. A series of further stringent filters was applied to identify those SNPs likely to succeed in genotyping assays, and to filter out those in potentially duplicated genome regions. A novel method for detecting potential intron-exon boundaries in areas of putative SNPs has also been applied in silico to improve genotyping success. In all, 2,317 filtered putative transcriptome SNPs suitable for genotyping primer design were identified. From those, a subset of 530 was selected, with the genotyping results showing the highest conversion and validation rates (91.3% and 83.2%, respectively) reported to date for a non-model species. This study represents a promising strategy to discover genotypable SNPs in the exome of non-model organisms. The genomic resource generated for E. encrasicolus, both in terms of sequences and novel markers, will be informative for research into this species with applications including traceability studies, population genetic analyses and aquaculture.
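The first filtering stage mentioned above, on minor allele frequency and read depth, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the thresholds or describe the pipeline's data model, so the `SnpCall` class, the read-count-based frequency estimate, and the default cut-offs (`min_depth=10`, `min_maf=0.1`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCall:
    """Pooled read counts supporting each allele at one putative SNP."""
    snp_id: str
    ref_reads: int
    alt_reads: int

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def minor_allele_frequency(self) -> float:
        # Approximates allele frequency by the fraction of reads per allele.
        if self.depth == 0:
            return 0.0
        return min(self.ref_reads, self.alt_reads) / self.depth


def passes_filters(call: SnpCall, min_depth: int = 10, min_maf: float = 0.1) -> bool:
    """Keep a SNP only if it has enough coverage and a common enough minor allele."""
    return call.depth >= min_depth and call.minor_allele_frequency >= min_maf


if __name__ == "__main__":
    calls = [SnpCall("snp1", 15, 5), SnpCall("snp2", 4, 4), SnpCall("snp3", 99, 1)]
    print([c.snp_id for c in calls if passes_filters(c)])
```

In this toy run only `snp1` is retained: `snp2` has too little coverage and `snp3` has a minor allele that is too rare, which is the kind of low-confidence call such filters are meant to remove before assay design.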