1
Huang PH, Wang TR, Li M, Fang OY, Su RP, Meng HH, Song YG, Li J. Different reference genomes determine different results: Comparing SNP calling in RAD-seq of Engelhardia roxburghiana using different reference genomes. Plant Science: An International Journal of Experimental Plant Biology 2024; 344:112109. [PMID: 38704094 DOI: 10.1016/j.plantsci.2024.112109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 04/23/2024] [Accepted: 04/30/2024] [Indexed: 05/06/2024]
Abstract
Advances in next-generation sequencing (NGS) have significantly reduced the cost and improved the efficiency of obtaining single nucleotide polymorphism (SNP) markers, particularly through restriction site-associated DNA sequencing (RAD-seq). Meanwhile, progress in whole-genome sequencing has led to an increasing number of reference genomes being used in SNP calling. This study used RAD-seq data from 242 individuals of Engelhardia roxburghiana, a tropical tree of the walnut family (Juglandaceae), with SNP calling conducted using the STACKS pipeline. We compared two reference-based approaches, namely using the genome of a closely related species as the reference versus using the genome of the species itself, to evaluate their respective merits and limitations. Our findings indicate a substantial discrepancy in the number of SNPs obtained: using a closely related species as the reference yielded approximately an order of magnitude fewer SNPs than using the species itself. By contrast, the missing rates of individuals and sites in the final SNP sets showed no significant difference between the two scenarios. The results suggest that the reference genome of the species itself should be prioritized in RAD-seq studies. However, if it is unavailable, the genome of a closely related species is a feasible alternative because of its wide applicability and low missing rate. This study enriches the understanding of how the choice of reference genome affects SNP acquisition.
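The per-individual and per-site missing rates mentioned in the abstract can be illustrated with a minimal sketch. It assumes a genotype matrix with one row per SNP site and one column per individual, where `None` marks a missing call; the function names and the toy data are hypothetical and are not taken from the study or from the STACKS pipeline.

```python
from typing import List, Optional

# Assumed layout: rows are SNP sites, columns are individuals.
# Each cell holds an alternate-allele count (0, 1, 2) or None for a missing call.
GenotypeMatrix = List[List[Optional[int]]]


def site_missing_rates(genotypes: GenotypeMatrix) -> List[float]:
    """Fraction of individuals with a missing call, for each site (row)."""
    return [sum(call is None for call in row) / len(row) for row in genotypes]


def individual_missing_rates(genotypes: GenotypeMatrix) -> List[float]:
    """Fraction of sites with a missing call, for each individual (column)."""
    n_sites = len(genotypes)
    n_individuals = len(genotypes[0])
    return [
        sum(genotypes[site][ind] is None for site in range(n_sites)) / n_sites
        for ind in range(n_individuals)
    ]


# Hypothetical toy data: 4 sites x 3 individuals.
toy = [
    [0, 1, None],
    [None, 2, None],
    [1, 1, 0],
    [0, None, None],
]

print(site_missing_rates(toy))
print(individual_missing_rates(toy))
```

Comparing two SNP sets in this way means computing both rate vectors for each set and then testing whether their distributions differ; the statistical test used in the study is not stated in the abstract.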
Affiliation(s)
- Pei-Han Huang
  - Plant Phylogenetics and Conservation Group, Center for Integrative Conservation & Yunnan Key Laboratory for Conservation of Tropical Rainforests and Asian Elephants, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, China; Eastern China Conservation Centre for Wild Endangered Plant Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Tian-Rui Wang
  - Eastern China Conservation Centre for Wild Endangered Plant Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China; Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Min Li
  - Plant Phylogenetics and Conservation Group, Center for Integrative Conservation & Yunnan Key Laboratory for Conservation of Tropical Rainforests and Asian Elephants, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Ou-Yan Fang
  - Plant Phylogenetics and Conservation Group, Center for Integrative Conservation & Yunnan Key Laboratory for Conservation of Tropical Rainforests and Asian Elephants, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Ren-Ping Su
  - Plant Phylogenetics and Conservation Group, Center for Integrative Conservation & Yunnan Key Laboratory for Conservation of Tropical Rainforests and Asian Elephants, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Hong-Hu Meng
  - Plant Phylogenetics and Conservation Group, Center for Integrative Conservation & Yunnan Key Laboratory for Conservation of Tropical Rainforests and Asian Elephants, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, China; Southeast Asia Biodiversity Research Institute, Chinese Academy of Sciences, Nay Pyi Taw 05282, Myanmar
- Yi-Gang Song
  - Eastern China Conservation Centre for Wild Endangered Plant Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Jie Li
  - Plant Phylogenetics and Conservation Group, Center for Integrative Conservation & Yunnan Key Laboratory for Conservation of Tropical Rainforests and Asian Elephants, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, China
2
Genomic Hatchery Introgression in Brown Trout (Salmo trutta L.): Development of a Diagnostic SNP Panel for Monitoring the Impacted Mediterranean Rivers. Genes (Basel) 2022; 13:genes13020255. [PMID: 35205298 PMCID: PMC8872556 DOI: 10.3390/genes13020255] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/02/2021] [Revised: 01/05/2022] [Accepted: 01/27/2022] [Indexed: 02/01/2023]
Abstract
Brown trout (Salmo trutta L.) populations have been restocked during recent decades to satisfy angling demand and counterbalance the decline of wild populations. Millions of fertile brown trout individuals were released into Mediterranean and Atlantic rivers from hatcheries with homogeneous central European stocks. Consequently, many native gene pools have become endangered by introgressive hybridization with those hatchery stocks. Different genetic tools (e.g., the LDH-C* locus, microsatellites) have been used to identify and evaluate the degree of introgression, starting from pure native and restocking reference populations. However, due to the high genetic structuring of brown trout, the "native pool" is hard to define. Additionally, although the LDH-C* locus is useful for determining the degree of introgression at the population level, it is far from accurate at the individual level, especially when several generations have passed since the releases. Accordingly, a more powerful and cost-effective tool is essential for appropriate monitoring aimed at recovering native brown trout gene pools. Here, we used 2b restriction site-associated DNA sequencing (2b-RADseq) and Stacks 2 with a reference genome to identify single-nucleotide polymorphisms (SNPs) diagnostic for hatchery-native fish discrimination in the Atlantic and Mediterranean drainages of the Iberian Peninsula. A final set of 20 SNPs was validated by genotyping on a MassARRAY® System and contrasting the data with the whole SNP dataset, using samples with different degrees of introgression from those previously recorded. A heterogeneous impact of introgression was confirmed among and within river basins, and it was highest in the Mediterranean Slope. The SNP tool reported here should be assessed in a broader set of samples from Southern Europe, considering its potential for monitoring recovery plans.
3
Development of SNP Set for the Marker-Assisted Selection of Guar (Cyamopsis tetragonoloba (L.) Taub.) Based on a Custom Reference Genome Assembly. Plants 2021; 10:plants10102063. [PMID: 34685872 PMCID: PMC8539970 DOI: 10.3390/plants10102063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/24/2021] [Revised: 09/20/2021] [Accepted: 09/27/2021] [Indexed: 12/11/2022]
Abstract
Guar gum, a polysaccharide derived from guar seeds, is widely used in a variety of industrial applications, including oil and gas production. Although guar is mostly propagated in India, interest in guar as a new industrial legume crop is increasing worldwide, demanding the development of effective tools for marker-assisted selection. In this paper, we report a wide-ranging set of 4907 common SNPs and 327 InDels generated from RADseq genotyping data of 166 guar plants of different geographical origins. A custom guar reference genome was assembled and used for variant calling. A consensus set of variants was built using three bioinformatic pipelines for short variant discovery. The developed molecular markers were used in a genome-wide association study, resulting in the discovery of six markers linked to the variation of an important agronomic trait: the percentage of pods matured by the harvest date under long light day conditions. One of the associated variants was found inside a putative transcript sequence homologous to an ABC transporter in Arabidopsis, which has been shown to play an important role in D-myo-inositol phosphate metabolism. Earlier, we suggested that genes involved in myo-inositol phosphate metabolism have a significant impact on the early flowering of guar plants. Hence, we believe that the developed SNP set allows for the identification of confident molecular markers of important agrobiological traits.
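One common way to build a consensus set from several variant callers is to keep only the variants reported by every caller. The sketch below illustrates that idea as a plain set intersection; the abstract does not state how the consensus was actually built, so the intersection rule, the `(chromosome, position, ref, alt)` variant key, the function name, and the toy call sets are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Assumed variant key: (chromosome, 1-based position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def consensus_variants(*call_sets: Iterable[Variant]) -> Set[Variant]:
    """Return the variants present in every call set (strict intersection)."""
    if not call_sets:
        return set()
    result = set(call_sets[0])
    for calls in call_sets[1:]:
        result &= set(calls)
    return result


# Hypothetical output of three pipelines on the same samples.
pipeline_a = [("chr1", 101, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "GA")]
pipeline_b = [("chr1", 101, "A", "G"), ("chr2", 40, "G", "GA"), ("chr2", 77, "T", "C")]
pipeline_c = [("chr1", 101, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "GA")]

common = consensus_variants(pipeline_a, pipeline_b, pipeline_c)
print(sorted(common))
```

A strict intersection favors precision over recall; a looser rule, such as keeping variants found by at least two of the three callers, would retain more markers at the cost of more false positives.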