1
Wang S, Zhu X, Wang X, Liu Y, Zhao M, Chang Z, Wang X, Shao Y, Wang J. TMBstable: a variant caller controls performance variation across heterogeneous sequencing samples. Brief Bioinform 2024; 25:bbae159. [PMID: 38632951 PMCID: PMC11024516 DOI: 10.1093/bib/bbae159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/31/2024] [Accepted: 03/25/2024] [Indexed: 04/19/2024] Open
Abstract
In cancer genomics, variant calling has advanced, but traditional mean accuracy evaluations are inadequate for biomarkers like tumor mutation burden, which vary significantly across samples, affecting immunotherapy patient selection and threshold settings. In this study, we introduce TMBstable, an innovative method that dynamically selects optimal variant calling strategies for specific genomic regions using a meta-learning framework, distinguishing it from traditional callers with uniform sample-wide strategies. The process begins with segmenting the sample into windows and extracting meta-features for clustering, followed by using a pre-trained meta-model to select suitable algorithms for each cluster, thereby addressing strategy-sample mismatches, reducing performance fluctuations and ensuring consistent performance across various samples. We evaluated TMBstable using both simulated and real non-small cell lung cancer and nasopharyngeal carcinoma samples, comparing it with advanced callers. The assessment, focusing on stability measures, such as the variance and coefficient of variation in false positive rate, false negative rate, precision and recall, involved 300 simulated and 106 real tumor samples. Benchmark results showed TMBstable's superior stability with the lowest variance and coefficient of variation across performance metrics, highlighting its effectiveness in analyzing the counting-based biomarker. The TMBstable algorithm can be accessed at https://github.com/hello-json/TMBstable for academic usage only.
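The stability measures named in the abstract (variance and coefficient of variation of a per-sample metric) can be illustrated with a minimal sketch. This is not the TMBstable implementation: the function names, the caller labels, and the recall values are hypothetical, and the choice of population (rather than sample) variance is an assumption, since the abstract does not specify it.

```python
from statistics import mean, pstdev, pvariance


def stability(values):
    """Return (variance, coefficient of variation) of per-sample metric values.

    Assumption: population variance and standard deviation are used.
    """
    m = mean(values)
    var = pvariance(values)
    # The coefficient of variation is the standard deviation divided by the mean.
    cv = pstdev(values) / m if m != 0 else float("nan")
    return var, cv


def most_stable(metric_by_caller):
    """Return the name of the caller with the lowest coefficient of variation."""
    return min(metric_by_caller, key=lambda name: stability(metric_by_caller[name])[1])


# Hypothetical per-sample recall values for two callers (illustrative only).
recall_by_caller = {
    "caller_a": [0.90, 0.92, 0.91, 0.89],
    "caller_b": [0.99, 0.70, 0.95, 0.80],
}

if __name__ == "__main__":
    for name, values in recall_by_caller.items():
        var, cv = stability(values)
        print(f"{name}: variance={var:.5f}, CV={cv:.3f}")
    print("most stable:", most_stable(recall_by_caller))
```

A caller with a higher mean recall can still rank as less stable under this measure if its recall swings widely between samples, which is the distinction the abstract draws between mean accuracy and stability.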
Affiliation(s)
- Shenjie Wang
  - School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Xiaoyan Zhu
  - School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Xuwen Wang
  - School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Yuqian Liu
  - School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Minchao Zhao
  - Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
- Zhili Chang
  - Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
- Xiaonan Wang
  - Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
- Yang Shao
  - Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
  - School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Jiayin Wang
  - School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
2
Rheinnecker M, Fröhlich M, Rübsam M, Paramasivam N, Heilig CE, Fröhling S, Schlenk RF, Hutter B, Hübschmann D. ZygosityPredictor. Bioinformatics Advances 2024; 4:vbae017. [PMID: 38560552 PMCID: PMC10980564 DOI: 10.1093/bioadv/vbae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 02/05/2024] [Accepted: 02/02/2024] [Indexed: 04/04/2024]
Abstract
Summary: ZygosityPredictor provides functionality to evaluate how many copies of a gene are affected by mutations in next generation sequencing data. In cancer samples, the tool processes both somatic and germline mutations. In particular, ZygosityPredictor computes the number of affected copies for single nucleotide variants and small insertions and deletions (Indels). In addition, the tool integrates information at gene level via phasing of several variants and subsequent logic to derive how strongly a gene is affected by mutations and provides a measure of confidence. This information is of particular interest in precision oncology, e.g. when assessing whether unmutated copies of tumor-suppressor genes remain.
Availability and implementation: ZygosityPredictor was implemented as an R-package and is available via Bioconductor at https://bioconductor.org/packages/ZygosityPredictor. Detailed documentation is provided in the vignette including application to an example genome.
Affiliation(s)
- Marco Rheinnecker
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Martina Fröhlich
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
- Marc Rübsam
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Nagarajan Paramasivam
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Christoph E Heilig
  - German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
  - Division of Translational Medical Oncology, NCT Heidelberg and DKFZ, 69120 Heidelberg, Germany
- Stefan Fröhling
  - German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
  - Division of Translational Medical Oncology, NCT Heidelberg and DKFZ, 69120 Heidelberg, Germany
- Richard F Schlenk
  - German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
  - Department of Medical Oncology, NCT Heidelberg, Heidelberg University Hospital, 69120 Heidelberg, Germany
  - Department of Hematology, Oncology and Rheumatology, Heidelberg University Hospital, 69120 Heidelberg, Germany
  - NCT Trial Center, NCT Heidelberg, Heidelberg University Hospital and DKFZ, 69120 Heidelberg, Germany
- Barbara Hutter
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
- Daniel Hübschmann
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM), Pattern Recognition and Digital Medicine Group, 69120 Heidelberg, Germany
  - Innovation and Service Unit for Bioinformatics and Precision Medicine, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
3
Leenanitikul J, Chanchaem P, Mankhong S, Denariyakoon S, Fongchaiya V, Arayataweegool A, Angspatt P, Wongchanapai P, Prapanpoj V, Chatamra K, Pisitkun T, Sriswasdi S, Wongkongkathep P. Concordance between whole exome sequencing of circulating tumor DNA and tumor tissue. PLoS One 2023; 18:e0292879. [PMID: 37878600 PMCID: PMC10599540 DOI: 10.1371/journal.pone.0292879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Accepted: 10/02/2023] [Indexed: 10/27/2023] Open
Abstract
Next generation sequencing of circulating tumor DNA (ctDNA) has been used as a noninvasive alternative for cancer diagnosis and characterization of the tumor mutational landscape. However, low ctDNA fraction and other factors can limit the ability of ctDNA analysis to capture tumor-specific and actionable variants. In this study, whole-exome sequencing (WES) was performed on paired ctDNA and tumor biopsy samples from 15 cancer patients to assess the extent of concordance between mutational profiles derived from the two source materials. We found that a ctDNA fraction of up to 16.4% can still be insufficient for detecting tumor-specific variants and that good concordance with tumor biopsy is consistently achieved at higher ctDNA fractions. Most importantly, ctDNA analysis can consistently capture tumor heterogeneity and detect key cancer-related genes even in a patient with both primary and metastatic tumors.
Affiliation(s)
- Julanee Leenanitikul
  - Bioinformatics and Computational Biology Program, Chulalongkorn University, Bangkok, Thailand
- Prangwalai Chanchaem
  - Research Unit of Systems Microbiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Suwanan Mankhong
  - Research Unit of Systems Microbiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Sikrit Denariyakoon
  - The Queen Sirikit Center for Breast Cancer, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Valla Fongchaiya
  - The Queen Sirikit Center for Breast Cancer, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Areeya Arayataweegool
  - The Queen Sirikit Center for Breast Cancer, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Pattama Angspatt
  - Division of Medical Oncology, Department of Medicine, Faculty of Medicine, Chulalongkorn University and the King Chulalongkorn Memorial Hospital, Bangkok, Thailand
- Ploytuangporn Wongchanapai
  - Division of Medical Oncology, Department of Medicine, Faculty of Medicine, Chulalongkorn University and the King Chulalongkorn Memorial Hospital, Bangkok, Thailand
- Kris Chatamra
  - The Queen Sirikit Center for Breast Cancer, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Trairak Pisitkun
  - Center of Excellence in Systems Biology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
  - Research Affairs, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Sira Sriswasdi
  - Research Affairs, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
  - Center of Excellence in Computational Molecular Biology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Piriya Wongkongkathep
  - Center of Excellence in Systems Biology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
  - Research Affairs, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
4
Kazdal D, Menzel M, Budczies J, Stenzinger A. [Molecular tumor diagnostics as the driving force behind precision oncology]. Dtsch Med Wochenschr 2023; 148:1157-1165. [PMID: 37657453 DOI: 10.1055/a-1937-0347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/03/2023]
Abstract
Molecular pathological diagnostics plays a central role in personalized oncology and requires multidisciplinary teamwork. It is just as relevant for the individual patient who is being treated with an approved therapy method or an individual treatment attempt as it is for prospective clinical studies that require the identification of specific therapeutic target structures or complex biomarkers for study inclusion. It is also of crucial importance for the generation of real-world data, which is becoming increasingly important for drug development. Future developments will be significantly shaped by improvements in scalable molecular diagnostics, in which increasingly complex and multi-layered data sets must be quickly converted into clinically useful information. One focus will be on the development of adaptive diagnostic strategies in order to be able to depict the enormous plasticity of a cancer disease over time.
5
Martínez-Jiménez F, Movasati A, Brunner SR, Nguyen L, Priestley P, Cuppen E, Van Hoeck A. Pan-cancer whole-genome comparison of primary and metastatic solid tumours. Nature 2023; 618:333-341. [PMID: 37165194 PMCID: PMC10247378 DOI: 10.1038/s41586-023-06054-z] [Citation(s) in RCA: 43] [Impact Index Per Article: 43.0] [Received: 06/28/2022] [Accepted: 04/05/2023] [Indexed: 05/12/2023]
Abstract
Metastatic cancer remains an almost inevitably lethal disease [1-3]. A better understanding of disease progression and response to therapies therefore remains of utmost importance. Here we characterize the genomic differences between early-stage untreated primary tumours and late-stage treated metastatic tumours using a harmonized pan-cancer analysis (or reanalysis) of two unpaired primary [4] and metastatic [5] cohorts of 7,108 whole-genome-sequenced tumours. Metastatic tumours in general have a lower intratumour heterogeneity and a conserved karyotype, displaying only a modest increase in mutations, although frequencies of structural variants are elevated overall. Furthermore, highly variable tumour-specific contributions of mutational footprints of endogenous (for example, SBS1 and APOBEC) and exogenous mutational processes (for example, platinum treatment) are present. The majority of cancer types had either moderate genomic differences (for example, lung adenocarcinoma) or highly consistent genomic portraits (for example, ovarian serous carcinoma) when comparing early-stage and late-stage disease. Breast, prostate, thyroid and kidney renal clear cell carcinomas and pancreatic neuroendocrine tumours are clear exceptions to the rule, displaying an extensive transformation of their genomic landscape in advanced stages. Exposure to treatment further scars the tumour genome and introduces an evolutionary bottleneck that selects for known therapy-resistant drivers in approximately half of treated patients. Our data showcase the potential of pan-cancer whole-genome analysis to identify distinctive features of late-stage tumours and provide a valuable resource to further investigate the biological basis of cancer and resistance to therapies.
Affiliation(s)
- Francisco Martínez-Jiménez
  - Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
  - Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
  - Hartwig Medical Foundation, Amsterdam, The Netherlands
- Ali Movasati
  - Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
- Sascha Remy Brunner
  - Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
- Luan Nguyen
  - Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
  - Hartwig Medical Foundation Australia, Sydney, New South Wales, Australia
- Peter Priestley
  - Hartwig Medical Foundation Australia, Sydney, New South Wales, Australia
- Edwin Cuppen
  - Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
  - Hartwig Medical Foundation, Amsterdam, The Netherlands
- Arne Van Hoeck
  - Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
6
Muñoz-Barrera A, Rubio-Rodríguez LA, Díaz-de Usera A, Jáspez D, Lorenzo-Salazar JM, González-Montelongo R, García-Olivares V, Flores C. From Samples to Germline and Somatic Sequence Variation: A Focus on Next-Generation Sequencing in Melanoma Research. Life (Basel) 2022; 12:1939. [PMID: 36431075 PMCID: PMC9695713 DOI: 10.3390/life12111939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 11/12/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022] Open
Abstract
Next-generation sequencing (NGS) applications have flourished in the last decade, permitting the identification of cancer driver genes and profoundly expanding the possibilities of genomic studies of cancer, including melanoma. Here we aimed to present a technical review across many of the methodological approaches brought by the use of NGS applications with a focus on assessing germline and somatic sequence variation. We provide cautionary notes and discuss key technical details involved in library preparation, the most common problems with the samples, and guidance to circumvent them. We also provide an overview of the sequence-based methods for cancer genomics, exposing the pros and cons of targeted sequencing vs. exome or whole-genome sequencing (WGS), the fundamentals of the most common commercial platforms, and a comparison of throughputs and key applications. Details of the steps and the main software involved in the bioinformatics processing of the sequencing results, from preprocessing to variant prioritization and filtering, are also provided in the context of the full spectrum of genetic variation (SNVs, indels, CNVs, structural variation, and gene fusions). Finally, we put the emphasis on selected bioinformatic pipelines behind (a) short-read WGS identification of small germline and somatic variants, (b) detection of gene fusions from transcriptomes, and (c) de novo assembly of genomes from long-read WGS data. Overall, we provide comprehensive guidance across the main methodological procedures involved in obtaining sequencing results for the most common short- and long-read NGS platforms, highlighting key applications in melanoma research.
Affiliation(s)
- Adrián Muñoz-Barrera
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Luis A. Rubio-Rodríguez
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Ana Díaz-de Usera
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- David Jáspez
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- José M. Lorenzo-Salazar
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Rafaela González-Montelongo
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Víctor García-Olivares
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Carlos Flores
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
  - CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain