1
Maruzani R, Brierley L, Jorgensen A, Fowler A. Benchmarking UMI-aware and standard variant callers for low frequency ctDNA variant detection. BMC Genomics 2024; 25:827. [PMID: 39227777 PMCID: PMC11370058 DOI: 10.1186/s12864-024-10737-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 08/22/2024] [Indexed: 09/05/2024] Open
Abstract
BACKGROUND Circulating tumour DNA (ctDNA) is a subset of cell free DNA (cfDNA) released by tumour cells into the bloodstream. Circulating tumour DNA has shown great potential as a biomarker to inform treatment in cancer patients. Collecting ctDNA is minimally invasive, and ctDNA reflects the entire genetic makeup of a patient's cancer. ctDNA variants in NGS data can be difficult to distinguish from sequencing and PCR artefacts due to low abundance, particularly in the early stages of cancer. Unique Molecular Identifiers (UMIs) are short sequences ligated to the sequencing library before amplification. These sequences are useful for filtering out low frequency artefacts. The utility of ctDNA as a cancer biomarker depends on accurate detection of cancer variants. RESULTS In this study, we benchmarked six variant calling tools, including two UMI-aware callers, for their ability to call ctDNA variants. The standard variant callers tested were Mutect2, bcftools, LoFreq and FreeBayes. The UMI-aware variant callers benchmarked were UMI-VarCal and UMIErrorCorrect. We used both datasets with known variants spiked in at low frequencies and datasets containing ctDNA, and generated synthetic UMI sequences for these datasets. Variant callers displayed different preferences for sensitivity and specificity. Mutect2 showed high sensitivity, while returning more privately called variants than any other caller in data without synthetic UMIs, an indicator of false positive variant discovery. In data encoded with synthetic UMIs, UMI-VarCal detected fewer putative false positive variants than all other callers in synthetic datasets. Mutect2 showed a balance between high sensitivity and specificity in data encoded with synthetic UMIs. CONCLUSIONS Our results indicate UMI-aware variant callers have potential to improve sensitivity and specificity in calling low frequency ctDNA variants over standard variant calling tools. There is a growing need for further development of UMI-aware variant calling tools if effective early detection methods for cancer using ctDNA samples are to be realised.
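The abstract above uses "privately called" variants (calls made by one tool and no other) as an indicator of likely false positives. The following is a minimal sketch of how such private calls could be counted, assuming each caller's output has already been reduced to a set of `(chrom, pos, ref, alt)` tuples. The function name `private_calls` and the toy variants are hypothetical and invented for illustration; this is not the authors' pipeline.

```python
from typing import Dict, Set, Tuple

# A variant is identified by chromosome, position, reference and alternate allele.
Variant = Tuple[str, int, str, str]


def private_calls(calls: Dict[str, Set[Variant]]) -> Dict[str, Set[Variant]]:
    """Return, for each caller, the variants that no other caller reported."""
    private = {}
    for caller, variants in calls.items():
        # Union of everything reported by the other callers.
        others = set().union(
            *(v for name, v in calls.items() if name != caller)
        )
        private[caller] = variants - others
    return private


# Toy call sets (invented data, not taken from the study).
calls = {
    "Mutect2": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")},
    "LoFreq": {("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")},
    "UMI-VarCal": {("chr3", 300, "G", "A")},
}

private = private_calls(calls)
for caller, variants in private.items():
    print(caller, len(variants))
```

In this toy input only the first caller has a private call. A high private-call count is suggestive of false positives rather than proof, since a private call can also be a true variant that the other tools missed.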
Affiliation(s)
- Rugare Maruzani
  - Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
- Liam Brierley
  - Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
  - MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Garscube Campus, 464 Bearsden Road, Glasgow, G61 1QH, UK
- Andrea Jorgensen
  - Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
- Anna Fowler
  - Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
2
Andersson D, Kebede FT, Escobar M, Österlund T, Ståhlberg A. Principles of digital sequencing using unique molecular identifiers. Mol Aspects Med 2024; 96:101253. [PMID: 38367531 DOI: 10.1016/j.mam.2024.101253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 01/26/2024] [Accepted: 02/03/2024] [Indexed: 02/19/2024]
Abstract
Massively parallel sequencing technologies have long been used in both basic research and clinical routine. The recent introduction of digital sequencing has made previously challenging applications possible by significantly improving sensitivity and specificity to now allow detection of rare sequence variants, even at single molecule level. Digital sequencing utilizes unique molecular identifiers (UMIs) to minimize sequencing-induced errors and quantification biases. Here, we discuss the principles of UMIs and how they are used in digital sequencing. We outline the properties of different UMI types and the consequences of various UMI approaches in relation to experimental protocols and bioinformatics. Finally, we describe how digital sequencing can be applied in specific research fields, focusing on cancer management where it can be used in screening of asymptomatic individuals, diagnosis, treatment prediction, prognostication, monitoring treatment efficacy and early detection of treatment resistance as well as relapse.
Affiliation(s)
- Daniel Andersson
  - Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 413 90, Gothenburg, Sweden
- Firaol Tamiru Kebede
  - Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 413 90, Gothenburg, Sweden
- Mandy Escobar
  - Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 413 90, Gothenburg, Sweden
- Tobias Österlund
  - Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 413 90, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90, Gothenburg, Sweden; Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, 413 45, Gothenburg, Sweden
- Anders Ståhlberg
  - Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 413 90, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90, Gothenburg, Sweden; Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, 413 45, Gothenburg, Sweden
3
Xiang X, Lu B, Song D, Li J, Shu K, Pu D. Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data. Sci Rep 2023; 13:20444. [PMID: 37993475 PMCID: PMC10665316 DOI: 10.1038/s41598-023-47135-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 11/09/2023] [Indexed: 11/24/2023] Open
Abstract
Detection of low-frequency variants with high accuracy plays an important role in biomedical research and clinical practice. However, it is challenging to do so with next-generation sequencing (NGS) approaches due to the high error rates of NGS. To accurately distinguish low-level true variants from these errors, many statistical variant-calling tools for low-frequency variants have been proposed, but a systematic performance comparison of these tools has not yet been performed. Here, we evaluated four raw-reads-based variant callers (SiNVICT, outLyzer, Pisces, and LoFreq) and four UMI-based variant callers (DeepSNVMiner, MAGERI, smCounter2, and UMI-VarCal) considering their capability to call single nucleotide variants (SNVs) with allelic frequency as low as 0.025% in deep sequencing data. We analyzed a total of 54 simulated datasets with various sequencing depths and variant allele frequencies (VAFs), two reference datasets, and Horizon Tru-Q sample data. The results showed that the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers regarding detection limit. Sequencing depth had almost no effect on the UMI-based callers but significantly influenced the raw-reads-based callers. Regardless of the sequencing depth, MAGERI showed the fastest analysis, while smCounter2 consistently took the longest to finish the variant calling process. Overall, DeepSNVMiner and UMI-VarCal performed the best, with sensitivity and precision of 88% and 100% for DeepSNVMiner and 84% and 100% for UMI-VarCal. In conclusion, the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in terms of sensitivity and precision. We recommend using DeepSNVMiner and UMI-VarCal for low-frequency variant detection. The results provide important information regarding future directions for reliable low-frequency variant detection and algorithm development, which is critical in genetics-based medical research and clinical applications.
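The abstract above ranks callers by sensitivity and precision. As a minimal sketch of how these two metrics follow from a caller's output and a known truth set, assuming both are available as sets of variant identifiers: the function name and the toy counts below are hypothetical, chosen only so the arithmetic is easy to follow, and are not the study's data.

```python
from typing import Hashable, Set, Tuple


def sensitivity_precision(called: Set[Hashable], truth: Set[Hashable]) -> Tuple[float, float]:
    """Compute (sensitivity, precision) of a call set against a truth set."""
    tp = len(called & truth)   # true variants that were called
    fp = len(called - truth)   # calls that are not in the truth set
    fn = len(truth - called)   # true variants that were missed
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# Toy example: 25 known variants, of which the caller recovers 22
# and reports no additional (false positive) calls.
truth = {f"var{i}" for i in range(25)}
called = {f"var{i}" for i in range(22)}

sens, prec = sensitivity_precision(called, truth)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Sensitivity is TP / (TP + FN) and precision is TP / (TP + FP), so a caller can reach 100% precision while still missing true variants, which is the pattern the reported figures describe.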
Affiliation(s)
- Xudong Xiang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Bowen Lu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Dongyang Song
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Jie Li
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Kunxian Shu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Dan Pu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
4
Tébar-Martínez R, Martín-Arana J, Gimeno-Valiente F, Tarazona N, Rentero-Garrido P, Cervantes A. Strategies for improving detection of circulating tumor DNA using next generation sequencing. Cancer Treat Rev 2023; 119:102595. [PMID: 37390697 DOI: 10.1016/j.ctrv.2023.102595] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/20/2023] [Accepted: 06/23/2023] [Indexed: 07/02/2023]
Abstract
Cancer has become a global health issue and liquid biopsy has emerged as a non-invasive tool for various applications. In cancer, circulating tumor DNA (ctDNA) can be detected from cell-free DNA (cfDNA) obtained from plasma and has potential for early diagnosis, treatment, resistance, minimal residual disease detection, and tumoral heterogeneity identification. However, the low frequency of ctDNA requires techniques for accurate analysis. Multitarget assays such as Next Generation Sequencing (NGS) need improvement to achieve limits of detection that can identify the low frequency variants present in the cfDNA. In this review, we provide a general overview of the use of cfDNA and ctDNA in cancer, and discuss techniques developed to optimize NGS as a tool for ctDNA detection. We also summarize the results obtained using NGS strategies in both investigational and clinical contexts.
Affiliation(s)
- Roberto Tébar-Martínez
  - Department of Medical Oncology, INCLIVA Health Research Institute, University of Valencia, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain; Precision Medicine Unit, INCLIVA Health Research Institute, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain
- Jorge Martín-Arana
  - Department of Medical Oncology, INCLIVA Health Research Institute, University of Valencia, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain; Bioinformatics Unit, INCLIVA Health Research Institute, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain
- Francisco Gimeno-Valiente
  - Cancer Evolution and Genome Instability Laboratory, University College of London Cancer Institute, 72 Huntley St, WC1E 6DD London, United Kingdom
- Noelia Tarazona
  - Department of Medical Oncology, INCLIVA Health Research Institute, University of Valencia, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain; Health Institute Carlos III, CIBERONC, C/ Sinesio Delgado, 4, 28029 Madrid, Spain
- Pilar Rentero-Garrido
  - Precision Medicine Unit, INCLIVA Health Research Institute, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain
- Andrés Cervantes
  - Department of Medical Oncology, INCLIVA Health Research Institute, University of Valencia, C. de Menéndez y Pelayo, 4, 46010 Valencia, Spain; Health Institute Carlos III, CIBERONC, C/ Sinesio Delgado, 4, 28029 Madrid, Spain
5
Assaf N, Hanania N, Lefebvre C, Penther D, Salmeron G, Petitjean B, Terré C. Molecular characterization of adult IRF4 large B-cell lymphoma with spontaneous remission. Acta Oncol 2023; 62:948-952. [PMID: 37517001 DOI: 10.1080/0284186x.2023.2238546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 06/28/2023] [Indexed: 08/01/2023]
Affiliation(s)
- Nada Assaf
  - Department of Pathology and Laboratory Medicine, American University of Beirut Medical Center, Lebanon
- Noor Hanania
  - Department of Pathology and Laboratory Medicine, American University of Beirut Medical Center, Lebanon
- Christine Lefebvre
  - Laboratoire d'Hématologie Biologique, Centre Hospitalier Universitaire de Grenoble Alpes (CHUGA), France
- Géraldine Salmeron
  - Department of Hematology, Centre Hospitalier de Versailles, Le Chesnay, France
  - UMR1184, University Paris-Saclay, France
  - Department of Laboratory Medicine, Hematology, Centre Hospitalier de Versailles, Le Chesnay, France
- Bruno Petitjean
  - Anatomie et Cytologie Pathologiques, Centre Hospitalier Intercommunal de Poissy Saint Germain en Laye, France
- Christine Terré
  - Department of Laboratory Medicine, Hemato-Oncologic Cytogenetics, Centre Hospitalier de Versailles, Le Chesnay, France
6
Boßelmann CM, Leu C, Lal D. Technological and computational approaches to detect somatic mosaicism in epilepsy. Neurobiol Dis 2023:106208. [PMID: 37343892 DOI: 10.1016/j.nbd.2023.106208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2023] [Revised: 06/03/2023] [Accepted: 06/16/2023] [Indexed: 06/23/2023] Open
Abstract
Lesional epilepsy is a common and severe disease commonly associated with malformations of cortical development, including focal cortical dysplasia and hemimegalencephaly. Recent advances in sequencing and variant calling technologies have identified several genetic causes, including both short/single nucleotide and structural somatic variation. In this review, we aim to provide a comprehensive overview of the methodological advancements in this field while highlighting the unresolved technological and computational challenges that persist, including ultra-low variant allele fractions in bulk tissue, low availability of paired control samples, spatial variability of mutational burden within the lesion, and the issue of false-positive calls and validation procedures. Information from genetic testing in focal epilepsy may be integrated into clinical care to inform histopathological diagnosis, postoperative prognosis, and candidate precision therapies.
Affiliation(s)
- Christian M Boßelmann
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA
- Costin Leu
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Department of Clinical and Experimental Epilepsy, Institute of Neurology, University College London, London, UK
- Dennis Lal
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and M.I.T., Cambridge, MA, USA; Cologne Center for Genomics (CCG), University of Cologne, Cologne, Germany
7
Österlund T, Filges S, Johansson G, Ståhlberg A. UMIErrorCorrect and UMIAnalyzer: Software for Consensus Read Generation, Error Correction, and Visualization Using Unique Molecular Identifiers. Clin Chem 2022; 68:1425-1435. [PMID: 36031761 DOI: 10.1093/clinchem/hvac136] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/03/2022] [Accepted: 07/08/2022] [Indexed: 11/14/2022]
Abstract
BACKGROUND Targeted sequencing using unique molecular identifiers (UMIs) enables detection of rare variant alleles in challenging applications, such as cell-free DNA analysis from liquid biopsies. Standard bioinformatics pipelines for data processing and variant calling are not adapted for deep-sequencing data containing UMIs, are inflexible, and require multistep workflows or dedicated computing resources. METHODS We developed a bioinformatics pipeline using Python and an R package for data analysis and visualization. To validate our pipeline, we analyzed cell-free DNA reference material with known mutant allele frequencies (0%, 0.125%, 0.25%, and 1%) and public data sets. RESULTS We developed UMIErrorCorrect, a bioinformatics pipeline for analyzing sequencing data containing UMIs. UMIErrorCorrect only requires fastq files as inputs and performs alignment, UMI clustering, error correction, and variant calling. We also provide UMIAnalyzer, a graphical user interface, for data mining, visualization, variant interpretation, and report generation. UMIAnalyzer allows the user to adjust analysis parameters and study their effect on variant calling. We demonstrated the flexibility of UMIErrorCorrect by analyzing data from 4 different targeted sequencing protocols. We also show its ability to detect different mutant allele frequencies in standardized cell-free DNA reference material. UMIErrorCorrect outperformed existing pipelines for targeted UMI sequencing data in terms of variant detection sensitivity. CONCLUSIONS UMIErrorCorrect and UMIAnalyzer are comprehensive and customizable bioinformatics tools that can be applied to any type of library preparation protocol and enrichment chemistry using UMIs. Access to simple, generic, and open-source bioinformatics tools will facilitate the implementation of UMI-based sequencing approaches in basic research and clinical applications.
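The abstract above names UMI clustering, error correction, and consensus read generation as pipeline steps. The following is a minimal sketch of the general idea only, under simplifying assumptions that are not taken from the paper: reads are grouped by exact UMI match (no edit-distance clustering), all reads in a family are already aligned and of equal length, and the consensus is a per-position majority vote. The function name, parameters, and thresholds are hypothetical and do not reflect UMIErrorCorrect's actual implementation or interface.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def consensus_reads(
    reads: Iterable[Tuple[str, str]],
    min_family_size: int = 3,     # hypothetical threshold
    min_agreement: float = 0.6,   # hypothetical threshold
) -> Dict[str, str]:
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI family."""
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to correct errors reliably
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue  # sketch only: skip families with unequal read lengths
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Emit 'N' when no base reaches the agreement threshold.
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[umi] = "".join(bases)
    return consensus


# Toy reads: family AAGT contains one read with an error at position 3,
# family GGAC carries a consistent variant, family CCTA has a single read.
reads = [
    ("AAGT", "ACGTAC"), ("AAGT", "ACGTAC"), ("AAGT", "ACGAAC"),
    ("GGAC", "ACTTAC"), ("GGAC", "ACTTAC"), ("GGAC", "ACTTAC"),
    ("CCTA", "ACGTAC"),
]
result = consensus_reads(reads)
print(result)
```

The isolated mismatch in the first family is voted out, while the change present in every read of the second family survives. That is the property that lets UMI-based methods separate PCR and sequencing errors from variants present in the original molecule.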
Affiliation(s)
- Tobias Österlund
  - Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, Region Västra Götaland, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden; Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
- Stefan Filges
  - Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
- Gustav Johansson
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden; Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden; SiMSen Diagnostics AB, Gothenburg, Sweden
- Anders Ståhlberg
  - Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, Region Västra Götaland, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden; Sahlgrenska Center for Cancer Research, Department of Laboratory Medicine, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
8
Coto-Llerena M, Benjak A, Gallon J, Meier MA, Boldanova T, Terracciano LM, Ng CKY, Piscuoglio S. Circulating Cell-Free DNA Captures the Intratumor Heterogeneity in Multinodular Hepatocellular Carcinoma. JCO Precis Oncol 2022; 6:e2100335. [PMID: 35263170 PMCID: PMC8926063 DOI: 10.1200/po.21.00335] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/15/2022] Open
Abstract
Hepatocellular carcinoma (HCC) is a highly heterogeneous disease, with more than 40% of patients initially diagnosed with multinodular HCCs. Although circulating cell-free DNA (cfDNA) has been shown to effectively detect somatic mutations, little is known about its utility to capture intratumor heterogeneity in patients with multinodular HCC undergoing systemic treatment.
Affiliation(s)
- Mairene Coto-Llerena
  - Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland; Visceral Surgery and Precision Medicine Research Laboratory, Department of Biomedicine, University of Basel, Basel, Switzerland
- Andrej Benjak
  - Department for BioMedical Research (DBMR), University of Bern, Bern, Switzerland
- John Gallon
  - Visceral Surgery and Precision Medicine Research Laboratory, Department of Biomedicine, University of Basel, Basel, Switzerland
- Marie-Anne Meier
  - Hepatology Laboratory, Department of Biomedicine, University of Basel, Basel, Switzerland; Division of Gastroenterology and Hepatology, University Hospital Basel, Basel, Switzerland
- Tuyana Boldanova
  - Hepatology Laboratory, Department of Biomedicine, University of Basel, Basel, Switzerland; Division of Gastroenterology and Hepatology, University Hospital Basel, Basel, Switzerland
- Luigi M Terracciano
  - Department of Anatomic Pathology, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy; Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Milan, Italy
- Charlotte K Y Ng
  - Department for BioMedical Research (DBMR), University of Bern, Bern, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Salvatore Piscuoglio
  - Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland; Visceral Surgery and Precision Medicine Research Laboratory, Department of Biomedicine, University of Basel, Basel, Switzerland
9
Sater V, Viailly PJ, Lecroq T, Prieur-Gaston É, Bohers É, Viennot M, Ruminy P, Dauchel H, Vera P, Jardin F. UMI-Varcal: A Low-Frequency Variant Caller for UMI-Tagged Paired-End Sequencing Data. Methods Mol Biol 2022; 2493:235-245. [PMID: 35751818 DOI: 10.1007/978-1-0716-2293-3_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
The rapid transition from traditional sequencing methods to Next-Generation Sequencing (NGS) has allowed for a faster and more accurate detection of somatic variants (Single-Nucleotide Variant (SNV) and Copy Number Variation (CNV)) in tumor cells. NGS technologies require a succession of steps during which false variants can be silently added at low frequencies. Filtering these artifacts can be a rather difficult task especially when the experiments are designed to look for very low frequency variants. Recently, adding unique molecular barcodes called UMI (Unique Molecular Identifier) to the DNA fragments appears to be a very effective strategy to specifically filter out false variants from the variant calling results (Kukita et al. DNA Res 22(4):269-277, 2015; Newman et al. Nat Biotechnol 34(5):547-555, 2016; Schmitt et al. Proc Natl Acad Sci U S A 109(36):14508-14513). Here, we describe UMI-VarCal (Sater et al. Bioinformatics 36:2718-2724, 2020), which can use the UMI information from UMI-tagged reads to offer a faster and more accurate variant calling analysis.
Affiliation(s)
- Vincent Sater
  - Normandie Univ, UNIROUEN, LITIS EA 4108, Rouen, France
- Pierre-Julien Viailly
  - Department of Pathology, Centre Henri Becquerel, Rouen, France
  - INSERM U1245, University of Normandie UNIROUEN, Rouen, France
- Élodie Bohers
  - Department of Pathology, Centre Henri Becquerel, Rouen, France
  - INSERM U1245, University of Normandie UNIROUEN, Rouen, France
- Mathieu Viennot
  - Department of Pathology, Centre Henri Becquerel, Rouen, France
  - INSERM U1245, University of Normandie UNIROUEN, Rouen, France
- Philippe Ruminy
  - Department of Pathology, Centre Henri Becquerel, Rouen, France
  - INSERM U1245, University of Normandie UNIROUEN, Rouen, France
- Pierre Vera
  - Normandie Univ, UNIROUEN, LITIS EA 4108, Rouen, France
  - Department of Pathology, Centre Henri Becquerel, Rouen, France
- Fabrice Jardin
  - Department of Pathology, Centre Henri Becquerel, Rouen, France
  - INSERM U1245, University of Normandie UNIROUEN, Rouen, France
10
Bohers E, Viailly PJ, Jardin F. cfDNA Sequencing: Technological Approaches and Bioinformatic Issues. Pharmaceuticals (Basel) 2021; 14:ph14060596. [PMID: 34205827 PMCID: PMC8234829 DOI: 10.3390/ph14060596] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/14/2021] [Revised: 06/18/2021] [Accepted: 06/18/2021] [Indexed: 12/14/2022] Open
Abstract
In the era of precision medicine, it is crucial to identify molecular alterations that will guide the therapeutic management of patients. In this context, circulating tumoral DNA (ctDNA) released by the tumor in body fluids, like blood, and carrying its molecular characteristics is becoming a powerful biomarker for non-invasive detection and monitoring of cancer. Major recent technological advances, especially in terms of sequencing, have made possible its analysis, the challenge still being its reliable early detection. Different parameters, from the pre-analytical phase to the choice of sequencing technology and bioinformatic tools can influence the sensitivity of ctDNA detection.
11
Hynst J, Navrkalova V, Pal K, Pospisilova S. Bioinformatic strategies for the analysis of genomic aberrations detected by targeted NGS panels with clinical application. PeerJ 2021; 9:e10897. [PMID: 33850640 PMCID: PMC8019320 DOI: 10.7717/peerj.10897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Accepted: 01/13/2021] [Indexed: 01/21/2023] Open
Abstract
Molecular profiling of tumor samples has acquired importance in cancer research, but currently also plays an important role in the clinical management of cancer patients. Rapid identification of genomic aberrations improves diagnosis, prognosis and effective therapy selection. This can be attributed mainly to the development of next-generation sequencing (NGS) methods, especially targeted DNA panels. Such panels enable a relatively inexpensive and rapid analysis of various aberrations with clinical impact specific to particular diagnoses. In this review, we discuss the experimental approaches and bioinformatic strategies available for the development of an NGS panel for a reliable analysis of selected biomarkers. Compliance with defined analytical steps is crucial to ensure accurate and reproducible results. In addition, a careful validation procedure has to be performed before the application of NGS targeted assays in routine clinical practice. With more focus on bioinformatics, we emphasize the need for thorough pipeline validation and management in relation to the particular experimental setting as an integral part of the NGS method establishment. A robust and reproducible bioinformatic analysis running on powerful machines is essential for proper detection of genomic variants in clinical settings since distinguishing between experimental noise and real biological variants is fundamental. This review summarizes state-of-the-art bioinformatic solutions for careful detection of the SNV/Indels and CNVs for targeted sequencing resulting in translation of sequencing data into clinically relevant information. Finally, we share our experience with the development of a custom targeted NGS panel for an integrated analysis of biomarkers in lymphoproliferative disorders.
Affiliation(s)
- Jakub Hynst
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic; Department of Medical Genetics and Genomics, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
- Veronika Navrkalova
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
- Karol Pal
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Hematology, University Hospital Schleswig-Holstein, Kiel, Germany
- Sarka Pospisilova
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic; Department of Medical Genetics and Genomics, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
12
Camus V, Jardin F. Cell-Free DNA for the Management of Classical Hodgkin Lymphoma. Pharmaceuticals (Basel) 2021; 14:ph14030207. [PMID: 33801462 PMCID: PMC7998645 DOI: 10.3390/ph14030207] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/13/2021] [Revised: 02/25/2021] [Accepted: 02/26/2021] [Indexed: 12/12/2022] Open
Abstract
Cell-free DNA (cfDNA) testing is an emerging “liquid biopsy” tool for noninvasive lymphoma detection, and an increasing amount of data is now available to use this technique with accuracy, especially in classical Hodgkin lymphoma (cHL). The advantages of cfDNA include simplicity of repeated blood sample acquisition over time; dynamic, noninvasive, and quantitative analysis; fast turnover time; reasonable cost; and established consistency with results from tumor genomic DNA. cfDNA analysis offers an easy method for genotyping the overall molecular landscape of pediatric and adult cHL and may help in cases of diagnostic difficulties between cHL and other lymphomas. cfDNA levels are correlated with clinical, prognostic, and metabolic features, and may serve as a therapeutic response evaluation tool and as a minimal residual disease (MRD) biomarker in complement to positron emission tomography (PET). Indeed, cfDNA real-time monitoring by fast high-throughput techniques enables the prompt detection of refractory disease or may help to address PET residual hypermetabolic situations during or at the end of treatment. The major recent works presented and described here demonstrated the clinically meaningful applicability of cfDNA testing in diagnostic and theranostic settings, but also in disease risk assessment, therapeutic molecular response, and monitoring of cHL treatments.
Affiliation(s)
- Vincent Camus
- Correspondence: ; Tel.: +33(0)-2-32-08-29-47; Fax: +33-(0)-2-32-08-22-83
|
13
|
Woerner AE, Mandape S, King JL, Muenzler M, Crysup B, Budowle B. Reducing noise and stutter in short tandem repeat loci with unique molecular identifiers. Forensic Sci Int Genet 2020; 51:102459. [PMID: 33429137 DOI: 10.1016/j.fsigen.2020.102459] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/11/2020] [Revised: 10/28/2020] [Accepted: 12/21/2020] [Indexed: 12/24/2022]
Abstract
Unique molecular identifiers (UMIs) are a promising approach to contend with errors generated during PCR and massively parallel sequencing (MPS). With UMI technology, random molecular barcodes are ligated to template DNA molecules prior to PCR, allowing PCR and sequencing errors to be tracked and corrected bioinformatically. UMIs have the potential to be particularly informative for the interpretation of short tandem repeats (STRs). Traditional MPS approaches may simply lead to the observation of alleles that are consistent with the hypothesis of stutter, whereas with UMIs stutter products may be bioinformatically re-associated with their parental alleles and subsequently removed. Herein, a bioinformatics pipeline named strumi is described that is designed for the analysis of STRs that are tagged with UMIs. Unlike other tools, strumi is an alignment-free, machine-learning-driven algorithm that clusters individual MPS reads into UMI families, infers consensus super-reads that represent each family, and provides an estimate of the resulting haplotype's accuracy. Super-reads, in turn, approximate independent measurements not of the PCR products, but of the original template molecules, both in terms of quantity and sequence identity. Provisional assessments show that naïve threshold-based approaches generate super-reads that are accurate (∼97 % haplotype accuracy, compared to ∼78 % when UMIs are not used), and the application of a more nuanced machine learning approach increases the accuracy to ∼99.5 % depending on the level of certainty desired. With these features, UMIs may greatly simplify probabilistic genotyping systems and reduce uncertainty. However, the ability to interpret alleles at trace levels also permits the interpretation, characterization and quantification of contamination as well as somatic variation (including somatic stutter), which may present newfound challenges.
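The naïve threshold-based idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not strumi's implementation: the function name `consensus_super_reads`, the parameters `min_family_size` and `min_agreement`, their default values, and the toy reads are all assumptions made for illustration. Reads are grouped into families by exact UMI match, and a family yields a super-read only if it is large enough and its most common sequence reaches the agreement threshold.

```python
from collections import Counter, defaultdict


def consensus_super_reads(reads, min_family_size=3, min_agreement=0.6):
    """Collapse (umi, sequence) pairs into one consensus per UMI family.

    Hypothetical helper: families with fewer than min_family_size reads, or
    whose most common sequence falls below min_agreement, are dropped.
    Returns {umi: (consensus_sequence, agreement_fraction)}.
    """
    # Group reads by exact UMI match (no UMI error correction in this sketch).
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    super_reads = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        # The most frequent whole-read sequence is taken as the consensus.
        top_seq, top_count = Counter(seqs).most_common(1)[0]
        agreement = top_count / len(seqs)
        if agreement >= min_agreement:
            super_reads[umi] = (top_seq, agreement)
    return super_reads


# Toy data (assumed): a TCTA repeat locus with one stutter-like product.
reads = [
    ("AACG", "TCTATCTATCTA"),
    ("AACG", "TCTATCTATCTA"),
    ("AACG", "TCTATCTA"),      # one repeat unit short, outvoted in its family
    ("GGTC", "TCTATCTA"),
    ("GGTC", "TCTATCTA"),
    ("GGTC", "TCTATCTA"),
    ("TTAG", "TCTATCTATCTA"),  # singleton family, too small to keep
]
result = consensus_super_reads(reads)
```

In this toy input the shorter read in family `AACG` is outvoted by its two siblings, while the same short sequence in family `GGTC` is kept because every read in that family supports it. That is the sense in which a super-read approximates the original template molecule rather than the PCR products.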
Affiliation(s)
- August E Woerner
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Sammed Mandape
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Jonathan L King
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Melissa Muenzler
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Benjamin Crysup
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
|
14
|
Frequent Germline and Somatic Single Nucleotide Variants in the Promoter Region of the Ribosomal RNA Gene in Japanese Lung Adenocarcinoma Patients. Cells 2020; 9:cells9112409. [PMID: 33153169 PMCID: PMC7692307 DOI: 10.3390/cells9112409] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/08/2020] [Revised: 10/28/2020] [Accepted: 10/29/2020] [Indexed: 12/25/2022] Open
Abstract
Ribosomal RNA (rRNA), the most abundant non-coding RNA species, is a major component of the ribosome. Impaired ribosome biogenesis causes the dysfunction of protein synthesis and diseases called “ribosomopathies,” including genetic disorders with cancer risk. However, the potential role of rRNA gene (rDNA) alterations in cancer is unknown. We investigated germline and somatic single-nucleotide variants (SNVs) in the rDNA promoter region (positions −248 to +100, relative to the transcription start site) in 82 lung adenocarcinomas (LUAC). Twenty-nine tumors (35.4%) carried germline SNVs, and eight tumors (9.8%) harbored somatic SNVs. Interestingly, the presence of germline SNVs between positions +1 and +100 (n = 12; 14.6%) was associated with significantly shorter recurrence-free survival (RFS) and overall survival (OS) by univariate analysis (p < 0.05, respectively), and was an independent prognostic factor for RFS and OS by multivariate analysis. LUAC cell line PC9, carrying rDNA promoter SNV at position +49, showed significantly higher ribosome biogenesis than H1650 cells without SNV. Upon nucleolar stress induced by actinomycin D, PC9 retained significantly higher ribosome biogenesis than H1650. These results highlight the possible functional role of SNVs at specific sites of the rDNA promoter region in ribosome biogenesis, the progression of LUAC, and their potential prognostic value.
|
15
|
Sater V, Viailly PJ, Lecroq T, Ruminy P, Bérard C, Prieur-Gaston É, Jardin F. UMI-Gen: A UMI-based read simulator for variant calling evaluation in paired-end sequencing NGS libraries. Comput Struct Biotechnol J 2020; 18:2270-2280. [PMID: 32952940 PMCID: PMC7484502 DOI: 10.1016/j.csbj.2020.08.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/05/2020] [Revised: 08/03/2020] [Accepted: 08/05/2020] [Indexed: 11/02/2022] Open
Abstract
MOTIVATION With Next Generation Sequencing becoming more affordable every year, NGS technologies have asserted themselves as the fastest and most reliable way to detect Single Nucleotide Variants (SNV) and Copy Number Variations (CNV) in cancer patients. These technologies can be used to sequence DNA at very high depths, allowing the detection of abnormalities present in tumor cells at very low frequencies. Multiple variant callers are publicly available and are usually efficient at calling variants. However, when frequencies drop below 1%, the specificity of these tools suffers greatly, as true variants at very low frequencies can easily be confused with sequencing or PCR artifacts. The recent use of Unique Molecular Identifiers (UMI) in NGS experiments has offered a way to accurately separate true variants from artifacts. UMI-based variant callers are slowly replacing raw-read-based variant callers as the standard method for accurate detection of variants at very low frequencies. However, the benchmarking reported in the tools' publications is usually performed on real biological data in which the true variants are not known, making it difficult to assess their accuracy. RESULTS We present UMI-Gen, a UMI-based read simulator for targeted sequencing paired-end data. UMI-Gen generates reference reads covering the targeted regions at a user-customizable depth. Then, using a number of control files, it estimates the background error rate at each position and modifies the generated reads to mimic real biological data. Finally, it inserts real variants into the reads from a list provided by the user. AVAILABILITY The entire pipeline is available at https://gitlab.com/vincent-sater/umigen under MIT license.
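The three steps described in this abstract (generate reference reads at a chosen depth, apply a per-position background error rate, then insert known variants) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not UMI-Gen's code or interface: `simulate_reads` and its parameters are hypothetical names, the error rates are supplied directly rather than estimated from control files, and UMI tagging and paired-end structure are omitted.

```python
import random


def simulate_reads(reference, depth, error_rates, variants, seed=0):
    """Return `depth` full-length copies of `reference` with noise and variants.

    Hypothetical helper (all names are assumptions):
      error_rates: per-position probability of a random substitution
      variants:    list of (position, alt_base, allele_fraction)
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(depth):
        read = list(reference)
        # Background noise: substitute a different base with the given rate.
        for pos, rate in enumerate(error_rates):
            if rng.random() < rate:
                read[pos] = rng.choice([b for b in "ACGT" if b != reference[pos]])
        reads.append(read)

    # Known variants are inserted last, into an exact number of reads, so the
    # ground-truth allele fraction is not altered by the noise step.
    for pos, alt, fraction in variants:
        carriers = rng.sample(range(depth), round(depth * fraction))
        for i in carriers:
            reads[i][pos] = alt
    return ["".join(read) for read in reads]


# Usage with assumed values: a 0.5% variant on top of 0.1% background noise.
reference = "ACGTACGTAC"
reads = simulate_reads(
    reference,
    depth=1000,
    error_rates=[0.001] * len(reference),
    variants=[(4, "T", 0.005)],
)
```

Because the inserted variants are known in advance, the calls made by a variant caller on such data can be scored against a ground truth, which is the benchmarking gap the abstract describes for real biological data.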
Affiliation(s)
- Vincent Sater
- University of Rouen Normandy UNIROUEN, LITIS EA 4108, 76000 Rouen, France; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France
- Pierre-Julien Viailly
- Department of Pathology, Centre Henri Becquerel, 76000 Rouen, France; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France
- Thierry Lecroq
- University of Rouen Normandy UNIROUEN, LITIS EA 4108, 76000 Rouen, France
- Philippe Ruminy
- Department of Pathology, Centre Henri Becquerel, 76000 Rouen, France; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France
- Caroline Bérard
- University of Rouen Normandy UNIROUEN, LITIS EA 4108, 76000 Rouen, France
- Fabrice Jardin
- Department of Pathology, Centre Henri Becquerel, 76000 Rouen, France; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France
|