1
Kvapilova K, Misenko P, Radvanszky J, Brzon O, Budis J, Gazdarica J, Pos O, Korabecna M, Kasny M, Szemes T, Kvapil P, Paces J, Kozmik Z. Validated WGS and WES protocols proved saliva-derived gDNA as an equivalent to blood-derived gDNA for clinical and population genomic analyses. BMC Genomics 2024; 25:187. [PMID: 38365587 PMCID: PMC10873937 DOI: 10.1186/s12864-024-10080-0] [Received: 09/05/2023] [Accepted: 02/02/2024] [Indexed: 02/18/2024]
Abstract
BACKGROUND: Whole exome sequencing (WES) and whole genome sequencing (WGS) have become standard methods in human clinical diagnostics as well as in population genomics (POPGEN). Blood-derived genomic DNA (gDNA) is routinely used in the clinical environment. Conversely, many POPGEN studies and commercial tests benefit from easy saliva sampling. Here, we evaluated the quality of variant call sets and the level of genotype concordance of single nucleotide variants (SNVs) and small insertions and deletions (indels) for WES and WGS using paired blood- and saliva-derived gDNA isolates employing genomic reference-based validated protocols.
METHODS: The genomic reference standard Coriell NA12878 was repeatedly analyzed using optimized WES and WGS protocols, and data calls were compared with the truth dataset published by the Genome in a Bottle Consortium. gDNA was extracted from the paired blood and saliva samples of 10 participants and processed using the same protocols. A comparison of paired blood-saliva call sets was performed in the context of WGS and WES genomic reference-based technical validation results.
RESULTS: The quality pattern of called variants obtained from genomic-reference-based technical replicates correlates with data calls of paired blood-saliva-derived samples in all levels of tested examinations despite a higher rate of non-human contamination found in the saliva samples. The F1 score of 10 blood-to-saliva-derived comparisons ranged between 0.8030-0.9998 for SNVs and between 0.8883-0.9991 for small-indels in the case of the WGS protocol, and between 0.8643-0.999 for SNVs and between 0.7781-1.000 for small-indels in the case of the WES protocol.
CONCLUSION: Saliva may be considered an equivalent material to blood for genetic analysis for both WGS and WES under strict protocol conditions. The accuracy of sequencing metrics and variant-detection accuracy is not affected by choosing saliva as the gDNA source instead of blood but much more significantly by the genomic context, variant types, and the sequencing technology used.
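For readers unfamiliar with the F1 score reported in this abstract, the following is a minimal sketch of how such a concordance figure can be computed from two variant call sets. It is not the authors' pipeline: the helper names `compare_call_sets` and `f1_score`, the toy variants, and the choice to treat the blood call set as the reference are all assumptions made for illustration.

```python
# Illustrative only: toy call sets, hypothetical helper names.
# A variant is represented as (chromosome, position, ref allele, alt allele).

def compare_call_sets(reference: set, query: set) -> tuple:
    """Count true positives, false positives and false negatives."""
    tp = len(reference & query)   # called in both sets
    fp = len(query - reference)   # called only in the query set
    fn = len(reference - query)   # called only in the reference set
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Assumption: blood-derived calls act as the reference, saliva as the query.
blood = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "GA"),
}
saliva = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 900, "T", "C"),
}

tp, fp, fn = compare_call_sets(blood, saliva)
f1 = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} F1={f1:.4f}")
```

Because F1 is symmetric in false positives and false negatives, swapping which sample is treated as the reference does not change the score in this simple exact-match setting.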
Affiliation(s)
- Katerina Kvapilova
  - Faculty of Science, Charles University, Albertov 6, Prague, 128 00, Czech Republic
  - Institute of Applied Biotechnologies a.s, Služeb 4, Prague, 108 00, Czech Republic
- Pavol Misenko
  - Geneton s.r.o, Ilkovičova 8, Bratislava, 841 04, Slovakia
- Jan Radvanszky
  - Geneton s.r.o, Ilkovičova 8, Bratislava, 841 04, Slovakia
  - Institute of Clinical and Translational Research, Biomedical Research Center of the Slovak Academy of Sciences, Dúbravská Cesta 9, Bratislava, 845 05, Slovakia
  - Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, Ilkovičova 3278/6, Karlova Ves, Bratislava, 841 04, Slovakia
  - Comenius University Science Park, Comenius University, Ilkovičova 8, Karlova Ves, Bratislava, 841 04, Slovakia
- Ondrej Brzon
  - Institute of Applied Biotechnologies a.s, Služeb 4, Prague, 108 00, Czech Republic
- Jaroslav Budis
  - Geneton s.r.o, Ilkovičova 8, Bratislava, 841 04, Slovakia
  - Comenius University Science Park, Comenius University, Ilkovičova 8, Karlova Ves, Bratislava, 841 04, Slovakia
  - Slovak Centre for Scientific and Technical Information, Staré Mesto, Lamačská Cesta 8A, Bratislava, 811 04, Slovakia
- Juraj Gazdarica
  - Geneton s.r.o, Ilkovičova 8, Bratislava, 841 04, Slovakia
  - Comenius University Science Park, Comenius University, Ilkovičova 8, Karlova Ves, Bratislava, 841 04, Slovakia
  - Slovak Centre for Scientific and Technical Information, Staré Mesto, Lamačská Cesta 8A, Bratislava, 811 04, Slovakia
- Ondrej Pos
  - Geneton s.r.o, Ilkovičova 8, Bratislava, 841 04, Slovakia
  - Comenius University Science Park, Comenius University, Ilkovičova 8, Karlova Ves, Bratislava, 841 04, Slovakia
- Marie Korabecna
  - Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, Albertov 4, Prague, 128 00, Czech Republic
- Martin Kasny
  - Institute of Applied Biotechnologies a.s, Služeb 4, Prague, 108 00, Czech Republic
  - Department of Botany and Zoology, Faculty of Science, Masaryk University, Kotlářská 2, Brno, 611 37, Czech Republic
- Tomas Szemes
  - Geneton s.r.o, Ilkovičova 8, Bratislava, 841 04, Slovakia
  - Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, Ilkovičova 3278/6, Karlova Ves, Bratislava, 841 04, Slovakia
  - Comenius University Science Park, Comenius University, Ilkovičova 8, Karlova Ves, Bratislava, 841 04, Slovakia
- Petr Kvapil
  - Institute of Applied Biotechnologies a.s, Služeb 4, Prague, 108 00, Czech Republic
- Jan Paces
  - Laboratory of Genomics and Bioinformatics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vídeňská 1083, Prague, 142 20, Czech Republic
- Zbynek Kozmik
  - Laboratory of Transcriptional Regulation, Institute of Molecular Genetics of the Czech Academy of Sciences, Vídeňská 1083, Prague, 142 20, Czech Republic
2
Marenne G, Ludwig TE, Bocher O, Herzig AF, Aloui C, Tournier-Lasserve E, Génin E. RAVAQ: An integrative pipeline from quality control to region-based rare variant association analysis. Genet Epidemiol 2022; 46:256-265. [PMID: 35419876 DOI: 10.1002/gepi.22450] [Received: 10/13/2021] [Revised: 02/04/2022] [Accepted: 03/15/2022] [Indexed: 11/07/2022]
Abstract
Next-generation sequencing technologies have opened up the possibility to sequence large samples of cases and controls to test for association with rare variants. To limit cost and increase sample sizes, data from controls could be used in multiple studies and might thus be generated on different sequencing platforms. This could pose some problems of comparability between cases and controls due to batch effects that could be confounding factors, leading to false-positive association signals. To limit batch effects and ensure comparability of datasets, stringent quality controls are required. We propose an integrative five-step pipeline, RAVAQ, that (a) performs a specific three-step quality control taking into account the case-control status to ensure data comparability, (b) selects qualifying variants as defined by the user, and (c) performs rare variant association tests per genomic region. The RAVAQ pipeline is wrapped in an R package. It is user-friendly and flexible in its arguments to adapt to the specificity of each research project. We provide examples showing how RAVAQ improves rare variant association tests. The default RAVAQ quality control outperformed the widely used Variant Quality Score Recalibration method, removing inflation due to spurious signals. RAVAQ is open source and freely available at https://gitlab.com/gmarenne/ravaq.
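To make steps (b) and (c) of this abstract more concrete, here is a rough Python sketch of selecting qualifying variants and then testing each genomic region for an excess of rare-variant carriers among cases. It does not reproduce RAVAQ or its R interface. The function names, the allele-frequency threshold, the toy data, and the use of a one-sided Fisher exact test on carrier counts are all assumptions chosen to keep the example self-contained. The quality-control steps in (a) are not shown.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cases/controls, columns are carriers/non-carriers. Returns the
    probability of seeing at least `a` carriers among cases under the
    hypergeometric null with fixed margins.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n_total, n_carriers)


def select_qualifying(variants: list, max_maf: float) -> list:
    """Keep variants at or below a user-defined frequency threshold."""
    return [v for v in variants if v["maf"] <= max_maf]


def region_carrier_tests(variants, cases, controls, max_maf=0.01):
    """Collapse qualifying variants per region and test carrier enrichment."""
    carriers_by_region = {}
    for v in select_qualifying(variants, max_maf):
        carriers_by_region.setdefault(v["region"], set()).update(v["carriers"])
    p_values = {}
    for region, carriers in carriers_by_region.items():
        a = len(carriers & cases)      # case carriers
        c = len(carriers & controls)   # control carriers
        p_values[region] = fisher_exact_greater(
            a, len(cases) - a, c, len(controls) - c
        )
    return p_values


# Hypothetical toy data: sample identifiers and regions are invented.
cases = {"case1", "case2", "case3", "case4"}
controls = {"ctrl1", "ctrl2", "ctrl3", "ctrl4"}
variants = [
    {"region": "GENE_A", "maf": 0.001, "carriers": {"case1", "case2"}},
    {"region": "GENE_A", "maf": 0.002, "carriers": {"case3"}},
    # Common variant: excluded by the frequency filter.
    {"region": "GENE_A", "maf": 0.200, "carriers": {"ctrl1", "ctrl2", "case4"}},
    {"region": "GENE_B", "maf": 0.005, "carriers": {"case1", "ctrl1"}},
]

p_values = region_carrier_tests(variants, cases, controls)
for region, p in sorted(p_values.items()):
    print(f"{region}: p = {p:.4f}")
```

A carrier-count test is only one simple form of region-based analysis. It is used here because it needs nothing beyond the standard library, not because the abstract names it.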
Affiliation(s)
- Thomas E Ludwig
  - Inserm, Univ Brest, EFS, UMR 1078, GGB, Brest, France
  - CHU Brest, Brest, France
- Ozvan Bocher
  - Inserm, Univ Brest, EFS, UMR 1078, GGB, Brest, France
- Chaker Aloui
  - Université de Paris, NeuroDiderot, Inserm UMR 1141, Paris, France
- Elisabeth Tournier-Lasserve
  - Université de Paris, NeuroDiderot, Inserm UMR 1141, Paris, France
  - AP-HP, Service de Génétique Moléculaire Neurovasculaire, Hôpital Saint-Louis, Paris, France
- Emmanuelle Génin
  - Inserm, Univ Brest, EFS, UMR 1078, GGB, Brest, France
  - CHU Brest, Brest, France