1
Belova V, Vasiliadis I, Repinskaia Z, Samitova A, Shmitko A, Ponikarovskaya N, Suchalko O, Cheranev V, Peter S, Peter S, Andrey K, Rebrikov D, Korostin D. Comparative evaluation of four exome enrichment solutions in 2024: Agilent, Roche, Vazyme and Nanodigmbio. BMC Genomics 2025; 26:76. [PMID: 39871131 PMCID: PMC11770928 DOI: 10.1186/s12864-024-11196-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Accepted: 12/30/2024] [Indexed: 01/29/2025] Open
Abstract
Whole exome sequencing (WES) is essential for identifying genetic variants linked to diseases. This study compares four currently available exome enrichment kits: Agilent SureSelect Human All Exon v8, Roche KAPA HyperExome, Vazyme VAHTS Target Capture Core Exome Panel, and Nanodigmbio NEXome Plus Panel v1. We evaluated target design, coverage statistics, and variant calling accuracy across these four exome capture products. All kits showed high target coverage, with 10x coverage exceeding 97.5% and 20x coverage above 95%. Roche exhibited the most uniform coverage, indicated by the lowest fold-80 scores, while Nanodigmbio had more on-target reads due to fewer off-target reads. Variant calling performance, evaluated using the in-lab standard E701 DNA sample, showed high recall rates for all kits, especially Agilent v8. All kits achieved an F-measure above 95.87%. Nanodigmbio had the highest precision with the fewest false positives but a slightly lower F-measure than the other kits. This study also highlights the performance of new solutions from Vazyme (China) and Nanodigmbio (China), which were comparable to the Agilent v8 and Roche KAPA kits. These findings assist researchers and clinicians in selecting appropriate exome capture solutions.
Affiliation(s)
- Vera Belova
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia.
- Iuliia Vasiliadis
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Zhanna Repinskaia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Alina Samitova
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Anna Shmitko
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Natalya Ponikarovskaya
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Oleg Suchalko
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Valery Cheranev
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Shatalov Peter
- National Medical Research Radiological Centre of the Ministry of Health of the Russian Federation, Koroleva st. 4, Obninsk, 249036, Russia
- Shegai Peter
- National Medical Research Radiological Centre of the Ministry of Health of the Russian Federation, Koroleva st. 4, Obninsk, 249036, Russia
- Kaprin Andrey
- National Medical Research Radiological Centre of the Ministry of Health of the Russian Federation, Koroleva st. 4, Obninsk, 249036, Russia
- Denis Rebrikov
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
- Dmitriy Korostin
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Ostrovityanova str. 1, Moscow, 117997, Russia
2
Maróti Z, Ochieng PJ, Dombi J, Krész M, Kalmár T. Optimizing sequence data analysis using convolution neural network for the prediction of CNV bait positions. BMC Bioinformatics 2024; 25:389. [PMID: 39719572 DOI: 10.1186/s12859-024-06006-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 12/05/2024] [Indexed: 12/26/2024] Open
Abstract
BACKGROUND Accurate prediction of copy number variations (CNVs) from targeted capture next-generation sequencing (NGS) data relies on effective normalization of read coverage profiles. The normalization process is particularly challenging due to hidden systemic biases such as GC bias, which can significantly affect the sensitivity and specificity of CNV detection. In many cases, the kit manifests provide only the genome coordinates of the targeted regions, and the exact bait design of the oligo capture baits is not available. Although the on-target regions significantly overlap with the bait design, a lack of adequate information allows less accurate normalization of the coverage data. In this study, we propose a novel approach that utilizes a 1D convolution neural network (CNN) model to predict the positions of capture baits in complex whole-exome sequencing (WES) kits. By accurately identifying the exact positions of bait coordinates, our model enables precise normalization of GC bias across target regions, thereby allowing better CNV data normalization. RESULTS We evaluated the optimal hyperparameters, model architecture, and complexity to predict the likely positions of the oligo capture baits. Our analysis shows that the CNN models outperform the Dense NN for bait predictions. Batch normalization is the most important parameter for the stable training of CNN models. Our results indicate that the spatiality of the data plays an important role in the prediction performance. We have shown that combined input data, including experimental coverage, on-target information, and sequence data, are critical for bait prediction. Furthermore, comparison with the on-target information indicated that the CNN models performed better in predicting bait positions that exhibited a high degree of overlap (>90%) with the true bait positions. 
CONCLUSIONS This study highlights the potential of utilizing CNN-based approaches to optimize coverage data analysis and improve copy number data normalization. Subsequent CNV detection based on these predicted coordinates facilitates more accurate measurement of coverage profiles and better normalization for GC bias. As a result, this approach could reduce systemic bias and improve the sensitivity and specificity of CNV detection in genomic studies.
Affiliation(s)
- Zoltán Maróti
- Albert Szent-Györgyi Health Centre, University of Szeged, Korányi fasor 14-15, Szeged, H-6725, Csongrád-Csanád, Hungary.
- Peter Juma Ochieng
- Interdisciplinary Research Development and Innovation Center of Excellence, Institute of Informatics, University of Szeged, Árpád tér 2, Szeged, H-6720, Csongrád-Csanád, Hungary.
- HUN-REN SZTE Research Group on Artificial Intelligence, University of Szeged, Árpád tér 2, Szeged, H-6720, Csongrád-Csanád, Hungary.
- Institute of Informatics, University of Szeged, Árpád tér 2, Szeged, H-6720, Csongrád-Csanád, Hungary.
- József Dombi
- HUN-REN SZTE Research Group on Artificial Intelligence, University of Szeged, Árpád tér 2, Szeged, H-6720, Csongrád-Csanád, Hungary
- Institute of Informatics, University of Szeged, Árpád tér 2, Szeged, H-6720, Csongrád-Csanád, Hungary
- Miklós Krész
- InnoRenew CoE, Livade 6a, Izola, SI-6310, Slovenia
- Andrej Marušic Institute, University of Primorska, Muzejski trg 2, Koper, 6000, Slovenia
- Department of Applied Informatics, University of Szeged, Boldogasszony sgt. 6, Szeged, H-6725, Hungary
- Tibor Kalmár
- Albert Szent-Györgyi Health Centre, University of Szeged, Korányi fasor 14-15, Szeged, H-6725, Csongrád-Csanád, Hungary.
3
Moon Y, Hong CH, Kim YH, Kim JK, Ye SH, Kang EK, Choi HW, Cho H, Choi H, Lee DE, Choi Y, Kim TM, Heo SG, Han N, Hong KM. Enhancing Clinical Applications by Evaluation of Sensitivity and Specificity in Whole Exome Sequencing. Int J Mol Sci 2024; 25:13250. [PMID: 39769013 PMCID: PMC11678496 DOI: 10.3390/ijms252413250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Revised: 12/04/2024] [Accepted: 12/06/2024] [Indexed: 01/11/2025] Open
Abstract
The cost-effectiveness of whole exome sequencing (WES) remains controversial due to variant call variability, necessitating sensitivity and specificity evaluation. WES was performed by three companies (AA, BB, and CC) using reference standards composed of DNA from hydatidiform mole and individual blood at various ratios. Sensitivity was assessed by the detection rate of null-homozygote (N-H) alleles at expected variant allelic fractions, while false positive (FP) errors were counted for unexpected alleles. Sensitivity was approximately 20% for in-house results from BB and CC and around 5% for AA. Dynamic Read Analysis for GENomics (DRAGEN) analyses identified 1.34 to 1.71 times more variants, detecting over 96% of in-house variants, with sensitivity for common variants increasing to 5%. In-house FP errors varied significantly among companies (up to 13.97 times), while DRAGEN minimized this variation. Despite DRAGEN showing higher FP errors for BB and CC, the increased sensitivity highlights the importance of effective bioinformatic conditions. We also assessed the potential effects of target enrichment and proposed optimal cutoff values for the read depth and variant allele fraction in WES. Optimizing bioinformatic analysis based on sensitivity and specificity from reference standards can enhance variant detection and improve the clinical utility of WES.
Affiliation(s)
- Youngbeen Moon
- Bioinformatics Analysis Team, Research Core Center, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Chung Hwan Hong
- Cancer Molecular Biology Branch, Division of Cancer Biology, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Young-Ho Kim
- Diagnostic and Therapeutics Technology Branch, Division of Technology Convergence, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Jong-Kwang Kim
- Bioinformatics Analysis Team, Research Core Center, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Seo-Hyeon Ye
- Cancer Molecular Biology Branch, Division of Cancer Biology, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Eun-Kyung Kang
- Cancer Molecular Biology Branch, Division of Cancer Biology, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Hye Won Choi
- Cancer Molecular Biology Branch, Division of Cancer Biology, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Hyeri Cho
- Diagnostic and Therapeutics Technology Branch, Division of Technology Convergence, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Hana Choi
- Diagnostic and Therapeutics Technology Branch, Division of Technology Convergence, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Dong-eun Lee
- Biostatistics Collaboration Team, Research Core Center, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Yongdoo Choi
- Division of Technology Convergence, National Cancer Center, 323 Ilsan-ro, Goyang 10408, Gyeonggi-do, Republic of Korea
- Tae-Min Kim
- Department of Medical Informatics and Cancer Research Institute, College of Medicine, The Catholic University of Korea, Seoul 06591, Gyeonggi-do, Republic of Korea
- Seong Gu Heo
- Dana Farber Cancer Institute, Boston, MA 02215, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Harvard Medical School, Boston, MA 02115, USA
- Namshik Han
- Milner Therapeutics Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Cambridge Centre for AI in Medicine, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, UK
- Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Kyeong-Man Hong
- Bioinformatics Analysis Team, Research Core Center, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
- Cancer Molecular Biology Branch, Division of Cancer Biology, Research Institute, National Cancer Center, Goyang 10408, Gyeonggi-do, Republic of Korea
4
Mann BC, Loubser J, Omar S, Glanz C, Ektefaie Y, Jacobson KR, Warren RM, Farhat MR. Systematic review and meta-analysis of protocols and yield of direct from sputum sequencing of Mycobacterium tuberculosis. bioRxiv 2024:2024.12.04.625621. [PMID: 39677639 PMCID: PMC11642866 DOI: 10.1101/2024.12.04.625621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2024]
Abstract
Direct sputum whole genome sequencing (dsWGS) can revolutionize Mycobacterium tuberculosis (Mtb) diagnosis by enabling rapid detection of drug resistance and strain diversity without the biohazard of culture. We searched PubMed, Web of Science and Google Scholar, and identified 8 studies that met inclusion criteria for testing protocols for dsWGS. Using meta-regression, we identified several key factors positively associated with dsWGS success, including higher Mtb bacillary load, mechanical disruption, and enzymatic/chemical lysis. Specifically, smear grades of 3+ (OR = 14.7, 95% CI: 3.5, 62.1; p = 0.0005) were strongly associated with improved outcomes, whereas decontamination with sodium hydroxide (NaOH) was negatively associated (OR = 0.005, 95% CI: 0.001, 0.03; p = 7e-06), likely due to its harsh effects on Mtb cells. Furthermore, mechanical lysis (OR = 193.3, 95% CI: 11.7, 3197.8; p = 0.008) and enzymatic/chemical lysis (OR = 18.5, 95% CI: 1.9, 183.1; p = 0.02) were also strongly associated with improved dsWGS. Across the studies, we observed a high degree of variability in approaches to sputum pre-processing prior to dsWGS, highlighting the need for standardized best practices. In particular, we identify two priorities: optimizing pre-processing steps, including decontamination, with the exploration of alternatives to NaOH that better preserve Mtb cells and DNA; and establishing best practices for cell lysis during DNA extraction. Furthermore, considering the strong association between Mtb load and successful dsWGS, protocol improvements for optimal sputum sample collection, handling, and storage could also enhance the success rate of dsWGS.
Affiliation(s)
- B C Mann
- DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Depts of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- J Loubser
- DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Depts of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- S Omar
- DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Depts of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- C Glanz
- DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Depts of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Y Ektefaie
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- K R Jacobson
- Section of Infectious Diseases, Boston University School of Medicine, Boston, MA, USA
- R M Warren
- DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Depts of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- M R Farhat
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
5
Loh CA, Shields DA, Schwing A, Evrony GD. High-fidelity, large-scale targeted profiling of microsatellites. Genome Res 2024; 34:1008-1026. [PMID: 39013593 PMCID: PMC11368184 DOI: 10.1101/gr.278785.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 07/11/2024] [Indexed: 07/18/2024]
Abstract
Microsatellites are highly mutable sequences that can serve as markers for relationships among individuals or cells within a population. The accuracy and resolution of reconstructing these relationships depend on the fidelity of microsatellite profiling and the number of microsatellites profiled. However, current methods for targeted profiling of microsatellites incur significant "stutter" artifacts that interfere with accurate genotyping, and sequencing costs preclude whole-genome microsatellite profiling of a large number of samples. We developed a novel method for accurate and cost-effective targeted profiling of a panel of more than 150,000 microsatellites per sample, along with a computational tool for designing large-scale microsatellite panels. Our method addresses the greatest challenge for microsatellite profiling, "stutter" artifacts, with a low-temperature hybridization capture that significantly reduces these artifacts. We also developed a computational tool for accurate genotyping of the resulting microsatellite sequencing data that uses an ensemble approach integrating three microsatellite genotyping tools, which we optimize by analysis of de novo microsatellite mutations in human trios. Altogether, our suite of experimental and computational tools enables high-fidelity, large-scale profiling of microsatellites, which may find utility in diverse applications such as lineage tracing, population genetics, ecology, and forensics.
Affiliation(s)
- Caitlin A Loh
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Danielle A Shields
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Adam Schwing
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Gilad D Evrony
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
6
Abulí A, Antolín E, Borrell A, Garcia-Hoyos M, García Santiago F, Gómez Manjón I, Maíz N, González González C, Rodríguez-Revenga L, Valenzuena Palafoll I, Suela J. Guidelines for NGS procedures applied to prenatal diagnosis by the Spanish Society of Gynecology and Obstetrics and the Spanish Association of Prenatal Diagnosis. J Med Genet 2024; 61:727-733. [PMID: 38834294 DOI: 10.1136/jmg-2024-109878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 05/15/2024] [Indexed: 06/06/2024]
Abstract
OBJECTIVE This document addresses the clinical application of next-generation sequencing (NGS) technologies for prenatal genetic diagnosis and aims to establish clinical practice recommendations in Spain to ensure uniformity in implementing these technologies in prenatal care. METHODS A joint committee of expert obstetricians and geneticists was created to review the existing literature on fetal NGS for genetic diagnosis and to make recommendations for Spanish healthcare professionals. RESULTS This guideline summarises technical aspects of NGS technologies, clinical indications in the prenatal setting, considerations regarding findings to be reported, genetic counselling considerations, and data storage and protection policies. CONCLUSIONS This document provides updated recommendations for the use of NGS diagnostic tests in prenatal diagnosis. These recommendations should be periodically reviewed as knowledge of the clinical utility of NGS technologies applied during pregnancy advances.
Affiliation(s)
- Anna Abulí
- Clinical and Molecular Genetics, Vall d'Hebron University Hospital, Barcelona, Spain
- Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
- Eugenia Antolín
- Gynecology and Obstetrics, La Paz University Hospital, Madrid, Spain
- Antoni Borrell
- Gynecology and Obstetrics, Clinic Hospital of Barcelona, Barcelona, Spain
- Nerea Maíz
- Maternal-Fetal Medicine Research Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
- Obstetrics, Vall d'Hebron University Hospital, Barcelona, Spain
- Laia Rodríguez-Revenga
- Biochemistry and Molecular Genetics, Clinic Hospital of Barcelona, Barcelona, Spain
- August Pi Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
- Javier Suela
- Genetics, Sanitas Central Laboratory, Alcobendas, Spain
7
Yeo NKW, Lim CK, Yaung KN, Khoo NKH, Arkachaisri T, Albani S, Yeo JG. Genetic interrogation for sequence and copy number variants in systemic lupus erythematosus. Front Genet 2024; 15:1341272. [PMID: 38501057 PMCID: PMC10944961 DOI: 10.3389/fgene.2024.1341272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 02/20/2024] [Indexed: 03/20/2024] Open
Abstract
Early-onset systemic lupus erythematosus presents with more severe disease and is associated with a greater genetic burden, especially in patients of Black, Asian or Hispanic ancestry. Next-generation sequencing techniques, notably whole exome sequencing, have been extensively used in genomic interrogation studies to identify causal disease variants that are increasingly implicated in the development of autoimmunity. This Review discusses the known causal variants of polygenic and monogenic systemic lupus erythematosus and their implications under certain genetic disparities, while suggesting an age-based sequencing strategy to aid clinical diagnostics and patient management for improved patient care.
Affiliation(s)
- Nicholas Kim-Wah Yeo
- Translational Immunology Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore, Singapore
- Duke-NUS Medical School, Singapore, Singapore
- Che Kang Lim
- Duke-NUS Medical School, Singapore, Singapore
- Department of Clinical Translation Research, Singapore General Hospital, Singapore, Singapore
- Katherine Nay Yaung
- Translational Immunology Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore, Singapore
- Duke-NUS Medical School, Singapore, Singapore
- Nicholas Kim Huat Khoo
- Translational Immunology Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore, Singapore
- Thaschawee Arkachaisri
- Translational Immunology Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore, Singapore
- Duke-NUS Medical School, Singapore, Singapore
- Rheumatology and Immunology Service, KK Women’s and Children’s Hospital, Singapore, Singapore
- Salvatore Albani
- Translational Immunology Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore, Singapore
- Duke-NUS Medical School, Singapore, Singapore
- Rheumatology and Immunology Service, KK Women’s and Children’s Hospital, Singapore, Singapore
- Joo Guan Yeo
- Translational Immunology Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore, Singapore
- Duke-NUS Medical School, Singapore, Singapore
- Rheumatology and Immunology Service, KK Women’s and Children’s Hospital, Singapore, Singapore
8
Barbitoff YA, Ushakov MO, Lazareva TE, Nasykhova YA, Glotov AS, Predeus AV. Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges. Brief Bioinform 2024; 25:bbad508. [PMID: 38271481 PMCID: PMC10810331 DOI: 10.1093/bib/bbad508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.
Affiliation(s)
- Yury A Barbitoff
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
- Mikhail O Ushakov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Tatyana E Lazareva
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Yulia A Nasykhova
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Andrey S Glotov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Alexander V Predeus
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
9
Mann BC, Jacobson KR, Ghebrekristos Y, Warren RM, Farhat MR. Assessment and validation of enrichment and target capture approaches to improve Mycobacterium tuberculosis WGS from direct patient samples. J Clin Microbiol 2023; 61:e0038223. [PMID: 37728909 PMCID: PMC10595060 DOI: 10.1128/jcm.00382-23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/27/2023] [Accepted: 07/20/2023] [Indexed: 09/22/2023] Open
Abstract
Within-host Mycobacterium tuberculosis (Mtb) diversity may detect antibiotic resistance or predict tuberculosis treatment failure and is best captured through sequencing directly from sputum. Here, we compared three sample pre-processing steps for DNA decontamination and studied the yield of a new target enrichment protocol for optimal whole-genome sequencing (WGS) from direct patient samples. Mtb-positive NALC-NaOH-treated patient sputum sediments were pooled, heat inactivated, split into replicates, and treated by either a wash, DNase I, or benzonase digestion. Levels of contaminating host DNA and target Mtb DNA were assessed by quantitative PCR (qPCR), followed by WGS with and without custom dsDNA target enrichment. The pre-treatment sample had a high host-to-target ratio of DNA (6,168 ± 1,638 host copies/ng to 212.3 ± 59.4 Mtb copies/ng) that significantly decreased with all three treatments. Benzonase treatment resulted in the highest enrichment of Mtb DNA at 100-fold compared with control (3,422 ± 2,162 host copies/ng to 11,721 ± 7,096 Mtb copies/ng). The custom dsDNA probe panel successfully enriched libraries from as little as 0.45 pg of Mtb DNA (100 genome copies). Applied to direct sputum, the dsDNA target enrichment panel increased the percentage of sequencing reads mapping to the Mtb target for all three pre-processing methods. Comparing the results of the benzonase sample sequenced both with and without enrichment, the percentage of sequencing reads mapping to Mtb increased from 1.18% to 90.95%. We demonstrate a low limit of detection for a new custom dsDNA Mtb target enrichment panel that has a favorable cost profile. The results also demonstrate that pre-processing to remove contaminating extracellular DNA prior to cell lysis and DNA extraction improves the host-to-Mtb DNA ratio but is not adequate to support average coverage WGS without target capture.
Affiliation(s)
- B. C. Mann
- Department of Biomedical Sciences, DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- K. R. Jacobson
- Section of Infectious Diseases, Boston University School of Medicine, Boston, Massachusetts, USA
- Y. Ghebrekristos
- Department of Biomedical Sciences, DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- National Health Laboratory Service, Greenpoint Tuberculosis Laboratory, Cape Town, South Africa
- R. M. Warren
- Department of Biomedical Sciences, DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- M. R. Farhat
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
|
10
|
Zhou J, Zhang M, Li X, Wang Z, Pan D, Shi Y. Correction to: Performance comparison of four types of target enrichment baits for exome DNA sequencing. Hereditas 2023; 160:35. [PMID: 37670385 PMCID: PMC10481460 DOI: 10.1186/s41065-023-00296-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023] Open
Affiliation(s)
- Juan Zhou
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030, People's Republic of China
- Mancang Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030, People's Republic of China
- Xiaoqi Li
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030, People's Republic of China
- Zhuo Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030, People's Republic of China
- Dun Pan
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030, People's Republic of China
- Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030, People's Republic of China.
|
11
|
Tilemis FN, Marinakis NM, Veltra D, Svingou M, Kekou K, Mitrakos A, Tzetis M, Kosma K, Makrythanasis P, Traeger-Synodinos J, Sofocleous C. Germline CNV Detection through Whole-Exome Sequencing (WES) Data Analysis Enhances Resolution of Rare Genetic Diseases. Genes (Basel) 2023; 14:1490. [PMID: 37510394 PMCID: PMC10379589 DOI: 10.3390/genes14071490] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/20/2023] [Revised: 07/14/2023] [Accepted: 07/20/2023] [Indexed: 07/30/2023] Open
Abstract
Whole-Exome Sequencing (WES) has proven valuable in the characterization of underlying genetic defects in most rare diseases (RDs). Copy Number Variants (CNVs) were initially thought to escape detection. Recent technological advances enabled CNV calling from WES data with the use of accurate and highly sensitive bioinformatic tools. Amongst 920 patients referred for WES, 454 unresolved cases were further analysed using the ExomeDepth algorithm. CNVs were called, evaluated and categorized according to ACMG/ClinGen recommendations. Causative CNVs were identified in 40 patients, increasing the diagnostic yield of WES from 50.7% (466/920) to 55% (506/920). Twenty-two CNVs were available for validation and were all confirmed; of these, five were novel. Implementation of the ExomeDepth tool promoted effective identification of phenotype-relevant and/or novel CNVs. Among the advantages of calling CNVs from WES data, characterization of complex genotypes comprising both CNVs and SNVs minimizes cost and time to final diagnosis, while allowing differentiation between true or false homozygosity, as well as compound heterozygosity of variants in AR genes. The use of a specific algorithm for calling CNVs from WES data enables ancillary detection of different types of causative genetic variants, making WES a critical first-tier diagnostic test for patients with RDs.
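The diagnostic-yield figures in this abstract follow from simple counts, and a short check makes the arithmetic explicit. This is a sketch using only the numbers quoted above; the variable names are illustrative and not taken from the paper.

```python
# Counts quoted in the abstract.
total_patients = 920
solved_before_cnv = 466   # diagnosed by WES before CNV calling
cnv_diagnoses = 40        # additional patients with causative CNVs


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


unresolved = total_patients - solved_before_cnv
solved_after_cnv = solved_before_cnv + cnv_diagnoses

yield_before = percent(solved_before_cnv, total_patients)
yield_after = percent(solved_after_cnv, total_patients)
# Share of the re-analysed, previously unresolved cases that CNV calling solved.
yield_in_unresolved = percent(cnv_diagnoses, unresolved)

print(unresolved, solved_after_cnv, yield_before, yield_after, yield_in_unresolved)
```

The counts are internally consistent: 920 minus 466 gives the 454 unresolved cases, and adding the 40 CNV diagnoses gives 506 solved cases overall.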
Affiliation(s)
- Faidon-Nikolaos Tilemis
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Nikolaos M Marinakis
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Research University Institute for the Study and Prevention of Genetic and Malignant Disease of Childhood, St. Sophia's Children's Hospital, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Danai Veltra
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Research University Institute for the Study and Prevention of Genetic and Malignant Disease of Childhood, St. Sophia's Children's Hospital, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Maria Svingou
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Kyriaki Kekou
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Anastasios Mitrakos
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Research University Institute for the Study and Prevention of Genetic and Malignant Disease of Childhood, St. Sophia's Children's Hospital, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Maria Tzetis
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Konstantina Kosma
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Periklis Makrythanasis
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Department of Genetic Medicine and Development, Medical School, University of Geneva, 1211 Geneva, Switzerland
- Biomedical Research Foundation of the Academy of Athens, 11527 Athens, Greece
- Joanne Traeger-Synodinos
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Christalena Sofocleous
- Laboratory of Medical Genetics, St. Sophia's Children's Hospital, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
|
12
|
Yaldiz B, Kucuk E, Hampstead J, Hofste T, Pfundt R, Corominas Galbany J, Rinne T, Yntema HG, Hoischen A, Nelen M, Gilissen C. Twist exome capture allows for lower average sequence coverage in clinical exome sequencing. Hum Genomics 2023; 17:39. [PMID: 37138343 PMCID: PMC10155375 DOI: 10.1186/s40246-023-00485-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/17/2023] [Accepted: 04/20/2023] [Indexed: 05/05/2023] Open
Abstract
BACKGROUND Exome and genome sequencing are the predominant techniques in the diagnosis and research of genetic disorders. Sufficient, uniform and reproducible/consistent sequence coverage is a main determinant for the sensitivity to detect single-nucleotide (SNVs) and copy number variants (CNVs). Here we compared the ability to obtain comprehensive exome coverage for recent exome capture kits and genome sequencing techniques. RESULTS We compared three different widely used enrichment kits (Agilent SureSelect Human All Exon V5, Agilent SureSelect Human All Exon V7 and Twist Bioscience) as well as short-read and long-read WGS. We show that the Twist exome capture significantly improves complete coverage and coverage uniformity across coding regions compared to other exome capture kits. Twist performance is comparable to that of both short- and long-read whole genome sequencing. Additionally, we show that even at a reduced average coverage of 70× there is only minimal loss in sensitivity for SNV and CNV detection. CONCLUSION We conclude that exome sequencing with Twist represents a significant improvement and could be performed at lower sequence coverage compared to other exome capture techniques.
Affiliation(s)
- Burcu Yaldiz
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Erdi Kucuk
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Juliet Hampstead
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Tom Hofste
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Rolph Pfundt
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Jordi Corominas Galbany
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Tuula Rinne
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Helger G Yntema
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Alexander Hoischen
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Marcel Nelen
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Christian Gilissen
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands.
|
13
|
Yeh SH, Li CL, Lin YY, Ho MC, Wang YC, Tseng ST, Chen PJ. Hepatitis B Virus DNA Integration Drives Carcinogenesis and Provides a New Biomarker for HBV-related HCC. Cell Mol Gastroenterol Hepatol 2023; 15:921-929. [PMID: 36690297 PMCID: PMC9972564 DOI: 10.1016/j.jcmgh.2023.01.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 11/03/2022] [Revised: 12/24/2022] [Accepted: 01/02/2023] [Indexed: 01/25/2023]
Abstract
Hepatitis B virus (HBV) DNA integration is an incidental event in the virus replication cycle and occurs in less than 1% of infected hepatocytes during viral infection. However, HBV DNA is present in the genome of approximately 90% of HBV-related HCCs and is the most common somatic mutation. Whole genome sequencing of liver tissues from chronic hepatitis B patients showed integration occurring at random positions in human chromosomes; however, in the genomes of HBV-related HCC patients, there are integration hotspots. Both the enrichment of the HBV-integration proportion in HCC and the emergence of integration hotspots suggested a strong positive selection of HBV-integrated hepatocytes to progress to HCC. The activation of HBV integration hotspot genes, such as telomerase (TERT) or histone methyltransferase (MLL4/KMT2B), resembles insertional mutagenesis by oncogenic animal retroviruses. These candidate oncogenic genes might shed new light on HBV-related HCC biology and become targets for new cancer therapies. Finally, the HBV integrations in individual HCC contain unique sequences at the junctions, such as virus-host chimera DNA (vh-DNA) presumably being a signature molecule for individual HCC. HBV integration may thus provide a new cell-free tumor DNA biomarker to monitor residual HCC after curative therapies or to track the development of de novo HCC.
Affiliation(s)
- Shiou-Hwei Yeh
- Graduate Institute of Microbiology, National Taiwan University College of Medicine, Taipei, Taiwan; Department of Clinical Laboratory Sciences and Medical Biotechnology, National Taiwan University College of Medicine, Taipei, Taiwan; National Taiwan University Center for Genomic Medicine, National Taiwan University, Taipei, Taiwan
- Chiao-Ling Li
- Graduate Institute of Microbiology, National Taiwan University College of Medicine, Taipei, Taiwan
- You-Yu Lin
- Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan; Genome and Systems Biology Degree Program, National Taiwan University College of Life Science, Taipei, Taiwan
- Ming-Chih Ho
- Department of Surgery, National Taiwan University Hospital, Taipei, Taiwan
- Pei-Jer Chen
- National Taiwan University Center for Genomic Medicine, National Taiwan University, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan.
|
14
|
Yan B, Wang D, Vaisvila R, Sun Z, Ettwiller L. Methyl-SNP-seq reveals dual readouts of methylome and variome at molecule resolution while enabling target enrichment. Genome Res 2022; 32:2079-2091. [PMID: 36332968 PMCID: PMC9808626 DOI: 10.1101/gr.277080.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 10/31/2022] [Indexed: 11/06/2022]
Abstract
Covalent modifications of genomic DNA are crucial for most organisms to survive. Amplicon-based high-throughput sequencing technologies erase all DNA modifications to retain only sequence information for the four canonical nucleobases, necessitating specialized technologies for ascertaining epigenetic information. To also capture base modification information, we developed Methyl-SNP-seq, a technology that takes advantage of the complementarity of the double helix to extract the methylation and original sequence information from a single DNA molecule. More specifically, Methyl-SNP-seq uses bisulfite conversion of one of the strands to identify cytosine methylation while retaining the original four-bases sequence information on the other strand. As both strands are locked together to link the dual readouts on a single paired-end read, Methyl-SNP-seq allows detecting the methylation status of any DNA even without a reference genome. Because one of the strands retains the original four nucleotide composition, Methyl-SNP-seq can also be used in conjunction with standard sequence-specific probes for targeted enrichment and amplification. We show the usefulness of this technology in a broad spectrum of applications ranging from allele-specific methylation analysis in humans to identification of methyltransferase specificity in complex bacterial communities.
Affiliation(s)
- Bo Yan
- New England Biolabs, Incorporated, Ipswich, Massachusetts 01938, USA
- Duan Wang
- SLC Management, Wellesley Hills, Massachusetts 02481, USA
- Zhiyi Sun
- New England Biolabs, Incorporated, Ipswich, Massachusetts 01938, USA
|
15
|
Fitzgerald T, Birney E. CNest: A novel copy number association discovery method uncovers 862 new associations from 200,629 whole-exome sequence datasets in the UK Biobank. Cell Genomics 2022; 2:100167. [PMID: 36779085 PMCID: PMC9903682 DOI: 10.1016/j.xgen.2022.100167] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 11/21/2021] [Revised: 04/11/2022] [Accepted: 07/13/2022] [Indexed: 10/15/2022]
Abstract
Copy number variation (CNV) is known to influence human traits, having a rich history of research into common and rare genetic disease, and although CNV is accepted as an important class of genomic variation, progress on copy-number-based genome-wide association studies (GWASs) from next-generation sequencing (NGS) data has been limited. Here we present a novel method for large-scale copy number analysis from NGS data generating robust copy number estimates and allowing copy number GWASs (CN-GWASs) to be performed genome-wide in discovery mode. We provide a detailed analysis in the UK Biobank resource and a specifically designed software package. We use these methods to perform CN-GWAS analysis across 78 human traits, discovering over 800 genetic associations that are likely to contribute strongly to trait distributions. Finally, we compare CNV and SNP association signals across the same traits and samples, defining specific CNV association classes.
Affiliation(s)
- Tomas Fitzgerald
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
|
16
|
Corominas J, Smeekens SP, Nelen MR, Yntema HG, Kamsteeg EJ, Pfundt R, Gilissen C. Clinical exome sequencing - mistakes and caveats. Hum Mutat 2022; 43:1041-1055. [PMID: 35191116 PMCID: PMC9541396 DOI: 10.1002/humu.24360] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 07/10/2021] [Revised: 01/11/2022] [Accepted: 02/18/2022] [Indexed: 11/30/2022]
Abstract
Massive parallel sequencing technology has become the predominant technique for genetic diagnostics and research. Many genetic laboratories have wrestled with the challenges of setting up genetic testing workflows based on a completely new technology. The learning curve we went through as a laboratory was accompanied by growing pains while we gained new knowledge and expertise. Here we discuss some important mistakes that have been made in our laboratory through 10 years of clinical exome sequencing but that have given us important new insights on how to adapt our working methods. We provide these examples and the lessons that we learned to help other laboratories avoid making the same mistakes.
Affiliation(s)
- Jordi Corominas
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Sanne P Smeekens
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Marcel R Nelen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Helger G Yntema
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Erik-Jan Kamsteeg
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Rolph Pfundt
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Christian Gilissen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
|
17
|
Lozano N, Lanza VF, Suárez-González J, Herranz M, Sola-Campoy PJ, Rodríguez-Grande C, Buenestado-Serrano S, Ruiz-Serrano MJ, Tudó G, Alcaide F, Muñoz P, García de Viedma D, Pérez-Lago L. Detection of Minority Variants and Mixed Infections in Mycobacterium tuberculosis by Direct Whole-Genome Sequencing on Noncultured Specimens Using a Specific-DNA Capture Strategy. mSphere 2021; 6:e0074421. [PMID: 34908457 PMCID: PMC8673255 DOI: 10.1128/msphere.00744-21] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/02/2021] [Accepted: 11/24/2021] [Indexed: 12/01/2022] Open
Abstract
Detection of mixed Mycobacterium tuberculosis (MTB) infections is essential, particularly when resistance mutations are present in minority bacterial populations that may affect patients' disease evolution and treatment. Whole-genome sequencing (WGS) has extended the amount of key information available for the diagnosis of MTB infection, including the identification of mixed infections. Having genomic information at diagnosis for early intervention requires carrying out WGS directly on the clinical samples. However, few studies have been successful with this approach due to the low representation of MTB DNA in sputa. In this study, we evaluated the ability of a strategy based on specific MTB DNA enrichment by using a newly designed capture platform (MycoCap) to detect minority variants and mixed infections by WGS on controlled mixtures of MTB DNAs in a simulated sputum genetic background. A pilot study was carried out with 12 samples containing 98% of a DNA pool from sputa of patients without MTB infection and 2% of MTB DNA mixtures at different proportions. Our strategy allowed us to generate sequences with a quality equivalent to those obtained from culture: 62.5× depth coverage and 95% breadth coverage (for at least 20× reads). Assessment of minority variant detection was carried out by manual analysis and allowed us to identify heterozygous positions up to a 95:5 ratio. The strategy also automatically distinguished mixed infections up to a 90:10 proportion. Our strategy efficiently captures MTB DNA in a nonspecific genetic background, allows detection of minority variants and mixed infections, and is a promising tool for performing WGS directly on clinical samples. IMPORTANCE We present a new strategy to identify mixed infections and minority variants in Mycobacterium tuberculosis by whole-genome sequencing. The objective of the strategy is the direct detection in patient sputum; in this way, minority populations of resistant strains can be identified at the time of diagnosis, facilitating identification of the most appropriate treatment for the patient from the first moment. For this, a platform for capturing M. tuberculosis-specific DNA was designed to enrich the clinical sample and obtain quality sequences.
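The two quality figures quoted in this abstract, depth of coverage and breadth of coverage at a minimum depth, can be illustrated with a small sketch. It assumes per-base depths are already available as a list of integers; the toy values and function names are hypothetical and are not data or code from the study.

```python
from statistics import mean


def mean_depth(depths):
    """Average number of reads covering each position."""
    return mean(depths)


def breadth_at(depths, min_depth=20):
    """Fraction of positions covered by at least min_depth reads."""
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Hypothetical per-base depths for five positions (illustration only).
toy_depths = [0, 10, 20, 30, 40]

avg = mean_depth(toy_depths)
breadth = breadth_at(toy_depths, min_depth=20)

print(f"mean depth: {avg}x, breadth at 20x: {breadth:.0%}")
```

In these terms, the abstract reports a mean depth of 62.5× and a breadth of 95% at the 20× threshold.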
Affiliation(s)
- Nuria Lozano
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- Val F. Lanza
- Bioinformatics Unit IRYCIS, University Hospital Ramón y Cajal, Madrid, Spain
- CIBER Enfermedades Infecciosas, Madrid, Spain
- Julia Suárez-González
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Unidad de Genómica, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- Marta Herranz
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- CIBER Enfermedades Respiratorias, CIBERES, Madrid, Spain
- Pedro J. Sola-Campoy
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- Cristina Rodríguez-Grande
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- Sergio Buenestado-Serrano
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- María Jesús Ruiz-Serrano
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- Griselda Tudó
- Servei de Microbiologia, Hospital Clinic-CDB, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Fernando Alcaide
- Servicio de Microbiología, Hospital Universitario de Bellvitge-IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
- Department of Pathology and Experimental Therapy, University of Barcelona, L’Hospitalet de Llobregat, Barcelona, Spain
- Patricia Muñoz
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- CIBER Enfermedades Respiratorias, CIBERES, Madrid, Spain
- Departmento de Medicina, Universidad Complutense de Madrid, Madrid, Spain
- Darío García de Viedma
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
- CIBER Enfermedades Respiratorias, CIBERES, Madrid, Spain
- Laura Pérez-Lago
- Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
- Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain
|
18
|
CNV Detection from Exome Sequencing Data in Routine Diagnostics of Rare Genetic Disorders: Opportunities and Limitations. Genes (Basel) 2021; 12:1427. [PMID: 34573409 PMCID: PMC8472439 DOI: 10.3390/genes12091427] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 08/05/2021] [Revised: 09/08/2021] [Accepted: 09/09/2021] [Indexed: 12/15/2022] Open
Abstract
To assess the potential of detecting copy number variations (CNVs) directly from exome sequencing (ES) data in diagnostic settings, we developed a CNV-detection pipeline based on ExomeDepth software and applied it to ES data of 450 individuals. Initially, only CNVs affecting genes in the requested diagnostic gene panels were scored and tested against arrayCGH results. Pathogenic CNVs were detected in 18 individuals. Most detected CNVs were larger than 400 kb (11/18), but three individuals had small CNVs impacting one or a few exons only and were thus not detectable by arrayCGH. Conversely, two pathogenic CNVs were initially missed, as they impacted genes not included in the original gene panel analysed, and a third one was missed as it was in a poorly covered region. The overall combined diagnostic rate (SNVs + CNVs) in our cohort was 36%, with wide differences between clinical domains. We conclude that (1) the ES-based CNV pipeline detects efficiently large and small pathogenic CNVs, (2) the detection of CNV relies on uniformity of sequencing and good coverage, and (3) in patients who remain unsolved by the gene panel analysis, CNV analysis should be extended to all captured genes, as diagnostically relevant CNVs may occur everywhere in the genome.
|