1
Ruperao P, Bajaj P, Subramani R, Yadav R, Reddy Lachagari VB, Lekkala SP, Rathore A, Archak S, Angadi UB, Singh R, Singh K, Mayes S, Rangan P. A pilot-scale comparison between single and double-digest RAD markers generated using GBS strategy in sesame (Sesamum indicum L.). PLoS One 2023; 18:e0286599. [PMID: 37267340 DOI: 10.1371/journal.pone.0286599] [Received: 02/21/2023] [Accepted: 05/19/2023] [Indexed: 06/04/2023]
Abstract
Restriction site-associated DNA sequencing (RAD-seq) protocols, using either single-digest or double-digest methods, are widely used to reduce genome sequence representation. In this study, we genotyped a sesame population of 48 samples at pilot scale to compare the single-digest and double-digest RAD-seq (sdRAD-seq and ddRAD-seq) methods. We analysed the short-read data generated by both protocols and assessed how their performance affects downstream analysis using various parameters. The distinct k-mer count and gene presence-absence variation (PAV) differed significantly between the sesame samples studied. Additionally, the variants called from the two datasets (sdRAD-seq and ddRAD-seq) differed significantly. The combined variants from both datasets helped identify the most diverse samples and possible sub-groups in the sesame population. The most diverse samples identified in each analysis (k-mer, gene PAV, SNP count, heterozygosity, NJ and PCA) may be representative samples holding most of the diversity of the small sesame population used in this study. We discuss strategies, with suggested modifications, for using RAD-seq efficiently on large datasets of thousands of samples in molecular analyses such as diversity, population structure and core development studies.
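As an illustration of the distinct k-mer count used as one of the comparison metrics, the following is a minimal sketch of how such a count could be computed from a sample's reads. It is not the authors' pipeline: the function names, the choice of canonical k-mers (a k-mer and its reverse complement counted once), and the skipping of k-mers containing ambiguous bases are all assumptions made for this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def distinct_kmers(reads, k):
    """Collect the distinct canonical k-mers found in a list of reads.

    Assumption for this sketch: a k-mer and its reverse complement are
    treated as the same k-mer, and k-mers with non-ACGT bases are skipped.
    """
    valid = set("ACGT")
    seen = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= valid:
                seen.add(min(kmer, reverse_complement(kmer)))
    return seen


# Hypothetical toy reads, not data from the study
sample_reads = ["ACGTAC", "GTACGT"]
print(len(distinct_kmers(sample_reads, 3)))
```

Comparing this count per sample between the two library types would give one rough measure of how much distinct sequence each protocol recovers.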
Affiliation(s)
- Pradeep Ruperao
  - Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Prasad Bajaj
  - Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Rajkumar Subramani
  - ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi, India
- Rashmi Yadav
  - ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi, India
- Sunil Archak
  - ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi, India
- Ulavappa B Angadi
  - ICAR-Indian Agricultural Statistical Research Institute, New Delhi, India
- Rakesh Singh
  - ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi, India
- Kuldeep Singh
  - Genebank, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Sean Mayes
  - Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Parimalan Rangan
  - ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi, India
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, Australia
2
Wang L, Yang J, Zhang H, Tao Q, Zhang Y, Dang Z, Zhang F, Luo Z. Sequence coverage required for accurate genotyping by sequencing in polyploid species. Mol Ecol Resour 2021; 22:1417-1426. [PMID: 34826191 DOI: 10.1111/1755-0998.13558] [Received: 09/11/2021] [Revised: 11/12/2021] [Accepted: 11/15/2021] [Indexed: 11/29/2022]
Abstract
Polyploidy plays an important role in the evolution of eukaryotes, especially flowering plants. Many ecologically or agronomically important plant and crop species are polyploids, including sycamore maple (tetraploid) and the world's second and third largest food crops, wheat (hexaploid) and potato (tetraploid), as are economically important aquaculture animals such as Atlantic salmon and trout. Next-generation sequencing data make it possible to assign a genotype at a sequence variant site, an approach known as genotyping by sequencing (GBS). GBS has stimulated enormous interest in population-based genomics studies in almost all diploid and many polyploid organisms. DNA sequence polymorphisms are codominant and thus fully informative about the underlying genotype at the polymorphic site, making GBS a straightforward task in diploids. However, sequence data are often uninformative in polyploid species, making GBS far more challenging in polyploids. This paper presents novel and rigorous statistical methods for predicting the number of sequence reads needed to ensure accurate GBS at a polymorphic site covered by the reads in polyploids. It shows that a dozen reads can ensure a 95% probability of recovering all constituent alleles of any tetraploid genotype, but that several hundred reads are needed to accurately uncover the genotype with a probability of 90%, contradicting proposals in the literature to perform GBS with low-coverage sequence data. The theoretical prediction was tested using RAD-seq data from tetraploid potato cultivars. The paper provides polyploid experimentalists with theoretical guidance and methods for designing and conducting their sequence-based studies.
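The "dozen reads" figure can be made plausible with a simplified calculation, sketched below. This is not the statistical method of the paper: it assumes a biallelic site, error-free reads, and each read sampling one of the chromosome copies uniformly and independently. Under those assumptions, a tetraploid genotype carrying d copies of one allele fails to show both alleles only if all n reads come from the same allele, so the probability of seeing both is 1 - (d/4)^n - (1 - d/4)^n. The function names are hypothetical.

```python
def prob_all_alleles_seen(n_reads, dosage, ploidy=4):
    """Probability that n_reads reads show every allele present at a
    biallelic site, given `dosage` copies of one allele out of `ploidy`.

    Simplifying assumptions (not from the paper): no sequencing error and
    uniform, independent sampling of chromosome copies.
    """
    if not 0 < dosage < ploidy:
        return 1.0  # homozygous: only one allele exists, any read shows it
    p = dosage / ploidy
    return 1.0 - p ** n_reads - (1.0 - p) ** n_reads


def min_reads(dosage, target=0.95, ploidy=4):
    """Smallest read count reaching the target probability."""
    n = 1
    while prob_all_alleles_seen(n, dosage, ploidy) < target:
        n += 1
    return n


for d in (1, 2, 3):
    print(d, min_reads(d))
```

Under this model the hardest case is a single copy of one allele (dosage 1 or 3), which needs 11 reads for 95%, in line with the abstract's "a dozen". Seeing both alleles is a much weaker requirement than determining the dosage itself, which is where the abstract's figure of several hundred reads applies.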
Affiliation(s)
- Lin Wang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Jixuan Yang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Hong Zhang
  - Department of Statistics and Finance, University of Science and Technology of China, Hefei, China
- Qin Tao
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Yuxin Zhang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Zhenyu Dang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Fengjun Zhang
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Zewei Luo
  - Laboratory of Population and Quantitative Genetics, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
  - School of Biosciences, University of Birmingham, Birmingham, UK