1
Chi WY, Hu Y, Huang HC, Kuo HH, Lin SH, Kuo CTJ, Tao J, Fan D, Huang YM, Wu AA, Hung CF, Wu TC. Molecular targets and strategies in the development of nucleic acid cancer vaccines: from shared to personalized antigens. J Biomed Sci 2024; 31:94. [PMID: 39379923 PMCID: PMC11463125 DOI: 10.1186/s12929-024-01082-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 09/01/2024] [Indexed: 10/10/2024]
Abstract
Recent breakthroughs in cancer immunotherapies have emphasized the importance of harnessing the immune system for treating cancer. Vaccines, which have traditionally been used to promote protective immunity against pathogens, are now being explored as a method to target cancer neoantigens. Over the past few years, extensive preclinical research and more than a hundred clinical trials have been dedicated to investigating various approaches to neoantigen discovery and vaccine formulations, encouraging the development of personalized medicine. Nucleic acids (DNA and mRNA) have become a particularly promising platform for the development of these cancer immunotherapies. This shift towards nucleic acid-based personalized vaccines has been facilitated by advancements in molecular techniques for identifying neoantigens, antigen prediction methodologies, and the development of new vaccine platforms. Generating these personalized vaccines involves a comprehensive pipeline that includes sequencing of patient tumor samples, data analysis for antigen prediction, and tailored vaccine manufacturing. In this review, we will discuss the various shared and personalized antigens used for cancer vaccine development and introduce strategies for identifying neoantigens through the characterization of gene mutation, transcription, translation and post-translational modifications associated with oncogenesis. In addition, we will focus on the most up-to-date nucleic acid vaccine platforms, discuss the limitations of cancer vaccines as well as provide potential solutions, and raise key clinical and technical considerations in vaccine development.
Affiliation(s)
- Wei-Yu Chi
- Physiology, Biophysics and Systems Biology Graduate Program, Weill Cornell Medicine, New York, NY, USA
- Yingying Hu
- Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Hsin-Che Huang
- Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Hui-Hsuan Kuo
- Pharmacology PhD Program, Weill Cornell Medicine, New York, NY, USA
- Shu-Hong Lin
- Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- The University of Texas Graduate School of Biomedical Sciences at Houston and MD Anderson Cancer Center, Houston, TX, USA
- Chun-Tien Jimmy Kuo
- Division of Pharmaceutics and Pharmacology, College of Pharmacy, The Ohio State University, Columbus, OH, USA
- Julia Tao
- Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Darrell Fan
- Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Yi-Min Huang
- Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Annie A Wu
- Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Chien-Fu Hung
- Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Department of Obstetrics and Gynecology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- T-C Wu
- Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Department of Obstetrics and Gynecology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Department of Molecular Microbiology and Immunology, Bloomberg School of Public Health, Johns Hopkins School of Medicine, Baltimore, MD, USA
2
Dhanda SK, Mahajan S, Manoharan M. Neoepitopes prediction strategies: an integration of cancer genomics and immunoinformatics approaches. Brief Funct Genomics 2023; 22:1-8. [PMID: 36398967 DOI: 10.1093/bfgp/elac041] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/28/2022] [Revised: 09/28/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022]
Abstract
A major near-term medical impact of the genomic technology revolution will be the elucidation of mechanisms of cancer pathogenesis, leading to improvements in the diagnosis of cancer and the selection of cancer treatment. Next-generation sequencing technologies have accelerated the characterization of a tumor, leading to the comprehensive discovery of all the major alterations in a given cancer genome, followed by the translation of this information using computational and immunoinformatics approaches to cancer diagnostics and therapeutic efforts. In the current article, we review various components of cancer immunoinformatics applied to a series of fields of cancer research, including computational tools for cancer mutation detection, cancer mutation and immunological databases, and computational vaccinology.
Affiliation(s)
- Sandeep Kumar Dhanda
- Department of Oncology, St Jude Children's Research Hospital, Memphis, TN 38105, USA
- Swapnil Mahajan
- DeepKnomics Labs Private Limited, 7014 Prestige Garden Bay, IVRI Road, Avalahalli, Behind CRPF Campus, Yelahanka, Bangalore 560064, India
- Malini Manoharan
- DeepKnomics Labs Private Limited, 7014 Prestige Garden Bay, IVRI Road, Avalahalli, Behind CRPF Campus, Yelahanka, Bangalore 560064, India
3
Brock S, Jackson DB, Soldatos TG, Hornischer K, Schäfer A, Diella F, Emmert MY, Hoerstrup SP. Whole patient knowledge modeling of COVID-19 symptomatology reveals common molecular mechanisms. Frontiers in Molecular Medicine 2023; 2:1035290. [PMID: 39086962 PMCID: PMC11285600 DOI: 10.3389/fmmed.2022.1035290] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/02/2022] [Accepted: 12/12/2022] [Indexed: 08/02/2024]
Abstract
Infection with SARS-CoV-2 coronavirus causes systemic, multi-faceted COVID-19 disease. However, knowledge connecting its intricate clinical manifestations with molecular mechanisms remains fragmented. Deciphering the molecular basis of COVID-19 at the whole-patient level is paramount to the development of effective therapeutic approaches. With this goal in mind, we followed an iterative, expert-driven process to compile data published prior to and during the early stages of the pandemic into a comprehensive COVID-19 knowledge model. Recent updates to this model have also validated multiple earlier predictions, suggesting the importance of such knowledge frameworks in hypothesis generation and testing. Overall, our findings suggest that SARS-CoV-2 perturbs several specific mechanisms, unleashing a pathogenesis spectrum, ranging from "a perfect storm" triggered by acute hyper-inflammation, to accelerated aging in protracted "long COVID-19" syndromes. In this work, we briefly report on these findings, which we share with the community via 1) a synopsis of key evidence associating COVID-19 symptoms and plausible mechanisms, with details presented within 2) the accompanying "COVID-19 Explorer" webserver, developed specifically for this purpose (found at https://covid19.molecularhealth.com). We anticipate that our model will continue to facilitate clinico-molecular insights across organ systems together with hypothesis generation for the testing of potential repurposing drug candidates, new pharmacological targets and clinically relevant biomarkers. Our work suggests that whole patient knowledge models of human disease can potentially expedite the development of new therapeutic strategies and support evidence-driven clinical hypothesis generation and decision making.
Affiliation(s)
- Theodoros G. Soldatos
- Molecular Health GmbH, Heidelberg, Germany
- SRH Hochschule, University of Applied Science, Heidelberg, Germany
- Maximilian Y. Emmert
- Institute for Regenerative Medicine, University of Zurich, Zurich, Switzerland
- Wyss Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Department of Cardiothoracic and Vascular Surgery, German Heart Institute Berlin, Berlin, Germany
- Department of Cardiovascular Surgery, Charité Universitätsmedizin Berlin, Berlin, Germany
- Simon P. Hoerstrup
- Institute for Regenerative Medicine, University of Zurich, Zurich, Switzerland
- Wyss Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
4
Brock S, Soldatos TG, Jackson DB, Diella F, Hornischer K, Schäfer A, Hoerstrup SP, Emmert MY. The COVID-19 explorer-An integrated, whole patient knowledge model of COVID-19 disease. Frontiers in Molecular Medicine 2022; 2:1035215. [PMID: 39086977 PMCID: PMC11285624 DOI: 10.3389/fmmed.2022.1035215] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2022] [Accepted: 11/07/2022] [Indexed: 08/02/2024]
Abstract
Since early 2020 the COVID-19 pandemic has paralyzed the world, resulting in more than half a billion infections and over 6 million deaths within a 28-month period. Knowledge about the disease remains largely disjointed, especially when considering the molecular mechanisms driving the diversity of clinical manifestations and symptoms. Despite the recent availability of vaccines, there remains an urgent need to develop effective treatments for cases of severe disease, especially in the face of novel virus variants. The complexity of the situation is exacerbated by the emergence of COVID-19 as a complex and multifaceted systemic disease affecting independent tissues and organs throughout the body. The development of effective treatment strategies is therefore predicated on an integrated understanding of the underlying disease mechanisms and their potentially causative link to the diversity of observed clinical phenotypes. To address this need, we utilized a computational technology (the Dataome platform) to build an integrated clinico-molecular view on the most important COVID-19 clinical phenotypes. Our results provide the first integrated, whole-patient model of COVID-19 symptomatology that connects the molecular lifecycle of SARS-CoV-2 with microvesicle-mediated intercellular communication and the contact activation and kallikrein-kinin systems. The model not only explains the clinical pleiotropy of COVID-19, but also provides an evidence-driven framework for drug development/repurposing and the identification of critical risk factors. The associated knowledge is provided in the form of the open source COVID-19 Explorer (https://covid19.molecularhealth.com), enabling the global community to explore and analyze the key molecular features of systemic COVID-19 and associated implications for research priorities and therapeutic strategies. 
Our work suggests that knowledge modeling solutions may offer important utility in expediting the global response to future health emergencies.
Affiliation(s)
- Theodoros G. Soldatos
- Molecular Health GmbH, Heidelberg, Germany
- SRH Hochschule, University of Applied Science, Heidelberg, Germany
- Simon P. Hoerstrup
- Institute for Regenerative Medicine, University of Zurich, Zurich, Switzerland
- Wyss Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Maximilian Y. Emmert
- Institute for Regenerative Medicine, University of Zurich, Zurich, Switzerland
- Wyss Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Department of Cardiothoracic and Vascular Surgery, German Heart Institute Berlin, Berlin, Germany
- Department of Cardiovascular Surgery, Charité Universitätsmedizin Berlin, Berlin, Germany
5
Zhou Y, Zhao XC, Wang LQ, Chen CW, Hsu MH, Liao WT, Deng X, Yan Q, Zhao GP, Chen CL, Zhang L, Chiu CH. Detecting Genetic Variation of Colonizing Streptococcus agalactiae Genomes in Humans: A Precision Protocol. Frontiers in Bioinformatics 2022; 2:813599. [PMID: 36304301 PMCID: PMC9580942 DOI: 10.3389/fbinf.2022.813599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 05/19/2022] [Indexed: 11/14/2022]
Abstract
Deciphering the genotypic diversity of within-individual pathogens and verifying the evolutionary model can help elucidate resistant genotypes, virulent subpopulations, and the mechanism of opportunistic pathogenicity. However, observed polymorphic mutations (PMs) are rare and, because of their low frequency, difficult to detect in the “dominant-lineage” model of bacterial infection. Four pooled group B Streptococcus (GBS) samples were collected from the genital tracts of healthy pregnant women, and the pooled samples and the isogenic controls were genomically sequenced. Using the PMcalling program, we detected the PMs in samples and compared the results between two technical duplicates, GBS-M001T and GBS-M001C. When tested with simulated datasets, the PMcalling program showed high sensitivity, especially for low-frequency PMs, and reasonable specificity. The genomic sequence data from pooled samples of GBS colonizing carrier pregnant women were analyzed, and few high-frequency PMs and some low-frequency PMs were discovered, indicating a dominant-lineage evolution model. The PMs were mainly nonsynonymous and enriched in quorum sensing, glycolysis/gluconeogenesis, ATP-binding cassette (ABC) transporters, etc., suggesting antimicrobial or environmental selective pressure. The re-analysis of the published Burkholderia dolosa data showed a diverse-community model, and only a few low-frequency PMs were shared between different individuals. Genes of the general control non-repressible 5-related N-acetyltransferases family, major facilitator superfamily (MFS) transporter, and ABC transporter were positive selection candidates. Our findings indicate a previously unreported nature of the dominant-lineage model of GBS colonization in healthy women, and a previously unobserved mutation pool in a colonized microbial community, possibly maintained by selection pressure.
Affiliation(s)
- Yan Zhou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- *Correspondence: Yan Zhou; Liang Zhang; Cheng-Hsun Chiu
- Xue-Chao Zhao
- The Institutes of Biology and Medical Sciences, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Lin-Qi Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Cheng-Wen Chen
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- Mei-Hua Hsu
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Wan-Ting Liao
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Xiao Deng
- The Institutes of Biology and Medical Sciences, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Qing Yan
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- Guo-Ping Zhao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- Chyi-Liang Chen
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Liang Zhang
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- *Correspondence: Yan Zhou; Liang Zhang; Cheng-Hsun Chiu
- Cheng-Hsun Chiu
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- *Correspondence: Yan Zhou; Liang Zhang; Cheng-Hsun Chiu
6
Kısakol B, Sarıhan Ş, Ergün MA, Baysan M. Detailed evaluation of cancer sequencing pipelines in different microenvironments and heterogeneity levels. Turk J Biol 2021; 45:114-126. [PMID: 33907494 PMCID: PMC8068765 DOI: 10.3906/biy-2008-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2020] [Accepted: 02/03/2021] [Indexed: 11/25/2022]
Abstract
The importance of next generation sequencing (NGS) in cancer research is rising as access to this key technology becomes easier for researchers. The sequence data created by NGS technologies must be processed by various bioinformatics algorithms within a pipeline in order to convert raw data to meaningful information. Mapping and variant calling are the two main steps of these analysis pipelines, and many algorithms are available for these steps. Therefore, detailed benchmarking of these algorithms in different scenarios is crucial for the efficient utilization of sequencing technologies. In this study, we compared the performance of twelve pipelines (three mapping and four variant discovery algorithms) with recommended settings to capture single nucleotide variants. We observed significant discrepancy in variant calls among tested pipelines for different heterogeneity levels in real and simulated samples, with overall high specificity and low sensitivity. In addition to the individual evaluation of pipelines, we also constructed and tested the performance of pipeline combinations. In these analyses, we observed that certain pipelines complement each other much better than others and outperform individual pipelines. This suggests that adhering to a single pipeline is not optimal for cancer sequencing analysis and that sample heterogeneity should be considered in algorithm optimization.
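The idea that some pipelines complement each other better than others can be illustrated with a small sketch. It is not the authors' benchmarking code: the pipeline names, the toy variants, and the choice to combine call sets by union are all assumptions made for illustration. Each pipeline's output is treated as a set of (chromosome, position, ref, alt) tuples and compared against a known truth set.

```python
from itertools import combinations

def sensitivity(calls, truth):
    """Fraction of true variants that were called."""
    return len(calls & truth) / len(truth) if truth else 0.0

def precision(calls, truth):
    """Fraction of called variants that are true."""
    return len(calls & truth) / len(calls) if calls else 0.0

def best_pair_by_union(call_sets, truth):
    """Score every pair of pipelines on the union of their calls.

    Returns (sensitivity, precision, name_a, name_b) for the best pair,
    ranking by sensitivity first and precision second.
    """
    scored = []
    for a, b in combinations(sorted(call_sets), 2):
        merged = call_sets[a] | call_sets[b]
        scored.append((sensitivity(merged, truth), precision(merged, truth), a, b))
    return max(scored)

# Hypothetical toy data: four true SNVs and one false positive.
v1, v2 = ("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")
v3, v4 = ("chr2", 50, "T", "G"), ("chr3", 10, "C", "A")
fp1 = ("chr4", 999, "G", "A")
truth = {v1, v2, v3, v4}

call_sets = {
    "mapperA+callerX": {v1, v2},
    "mapperA+callerY": {v1, v2, fp1},
    "mapperB+callerX": {v3, v4},
}

best = best_pair_by_union(call_sets, truth)
print(best)
```

In this toy case the two pipelines that each find a different half of the truth set form the best pair, although neither is better than the others on its own. An intersection instead of a union would trade sensitivity for precision.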
Affiliation(s)
- Batuhan Kısakol
- Department of Physiology and Medical Physics, Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, Ireland
- Şahin Sarıhan
- Computer Engineering Department, Faculty of Engineering, Marmara University, İstanbul, Turkey
- Mehmet Arif Ergün
- Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University, İstanbul, Turkey
- Mehmet Baysan
- Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University, İstanbul, Turkey
7
Wardell CP, Ashby C, Bauer MA. FiNGS: high quality somatic mutations using filters for next generation sequencing. BMC Bioinformatics 2021; 22:77. [PMID: 33602113 PMCID: PMC7890800 DOI: 10.1186/s12859-021-03995-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/26/2019] [Accepted: 02/02/2021] [Indexed: 01/15/2023]
Abstract
Background: Somatic variant callers are used to find mutations in sequencing data from cancer samples. They are very sensitive and have high recall, but may also produce low-precision data with a large proportion of false positives. Further ad hoc filtering is commonly performed after variant calling and before further analysis. Improving the filtering of somatic variants in a reproducible way represents an unmet need. We have developed Filters for Next Generation Sequencing (FiNGS), software written specifically to address these filtering issues. Results: Using publicly available sequencing data sets for development and testing, we demonstrate that FiNGS reliably improves upon the precision of default variant caller outputs and performs better than other tools designed for the same task. Conclusions: FiNGS provides researchers with a tool to reproducibly filter somatic variants that is simple to both deploy and use, with filters and thresholds that are fully configurable by the user. It ingests and emits standard variant call format (VCF) files and will slot into existing sequencing pipelines. It allows users to develop and implement their own filtering strategies and to share them easily with others.
Affiliation(s)
- Christopher Paul Wardell
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA
- Cody Ashby
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA
- Michael Anton Bauer
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA
8
Richters MM, Xia H, Campbell KM, Gillanders WE, Griffith OL, Griffith M. Best practices for bioinformatic characterization of neoantigens for clinical utility. Genome Med 2019; 11:56. [PMID: 31462330 PMCID: PMC6714459 DOI: 10.1186/s13073-019-0666-2] [Citation(s) in RCA: 129] [Impact Index Per Article: 25.8] [Received: 05/08/2019] [Accepted: 08/16/2019] [Indexed: 12/13/2022]
Abstract
Neoantigens are newly formed peptides created from somatic mutations that are capable of inducing tumor-specific T cell recognition. Recently, researchers and clinicians have leveraged next generation sequencing technologies to identify neoantigens and to create personalized immunotherapies for cancer treatment. To create a personalized cancer vaccine, neoantigens must be computationally predicted from matched tumor-normal sequencing data, and then ranked according to their predicted capability in stimulating a T cell response. This candidate neoantigen prediction process involves multiple steps, including somatic mutation identification, HLA typing, peptide processing, and peptide-MHC binding prediction. The general workflow has been utilized for many preclinical and clinical trials, but there is no current consensus approach and few established best practices. In this article, we review recent discoveries, summarize the available computational tools, and provide analysis considerations for each step, including neoantigen prediction, prioritization, delivery, and validation methods. In addition to reviewing the current state of neoantigen analysis, we provide practical guidance, specific recommendations, and extensive discussion of critical concepts and points of confusion in the practice of neoantigen characterization for clinical use. Finally, we outline necessary areas of development, including the need to improve HLA class II typing accuracy, to expand software support for diverse neoantigen sources, and to incorporate clinical response data to improve neoantigen prediction algorithms. The ultimate goal of neoantigen characterization workflows is to create personalized vaccines that improve patient outcomes in diverse cancer types.
Affiliation(s)
- Megan M Richters
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Huiming Xia
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Katie M Campbell
- Division of Hematology and Oncology, Medical Plaza Driveway, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, 90024, USA
- William E Gillanders
- Department of Surgery, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Obi L Griffith
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Department of Genetics, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Malachi Griffith
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Department of Genetics, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA
9
Anzar I, Sverchkova A, Stratford R, Clancy T. NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer. BMC Med Genomics 2019; 12:63. [PMID: 31096972 PMCID: PMC6524241 DOI: 10.1186/s12920-019-0508-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/28/2018] [Accepted: 04/22/2019] [Indexed: 12/30/2022]
Abstract
Background: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially intra-tumor genetic heterogeneity coupled with the problem of sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have previously helped to increase specificity, although this comes at the cost of sensitivity. Methods: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples. Results: A robust and exhaustive evaluation of NeoMutate’s performance based on 5-fold cross validation experiments, in addition to 3 independent tests, demonstrated a substantially improved variant detection accuracy compared to any of its individual composite variant callers and consensus calling of multiple tools. Conclusions: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.
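The trade-off attributed above to consensus voting can be made concrete with a minimal sketch. This shows only the baseline voting rule that the abstract contrasts with NeoMutate, not NeoMutate itself; the caller names, variant labels, and truth set are hypothetical toy values.

```python
from collections import Counter

def consensus_calls(call_sets, min_votes):
    """Keep variants reported by at least `min_votes` callers."""
    votes = Counter(v for calls in call_sets.values() for v in calls)
    return {v for v, n in votes.items() if n >= min_votes}

# Hypothetical toy data: three true variants and one false positive.
truth = {"v1", "v2", "v3"}
callers = {
    "caller_a": {"v1", "v2", "fp1"},
    "caller_b": {"v1", "v3"},
    "caller_c": {"v1", "v2"},
}

any_caller = consensus_calls(callers, min_votes=1)  # union of all calls
majority = consensus_calls(callers, min_votes=2)    # two of three agree

# Raising the vote threshold removes the false positive (better specificity)
# but also drops the true variant that only one caller found (worse sensitivity).
false_positives_any = any_caller - truth
false_positives_majority = majority - truth
missed_majority = truth - majority
```

A learned ensemble, as the abstract describes, replaces the fixed vote threshold with a model trained on caller outputs and additional features; that step is not sketched here.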
Affiliation(s)
- Irantzu Anzar
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
- Angelina Sverchkova
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
- Richard Stratford
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
- Trevor Clancy
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
10
Calling Variants in the Clinic: Informed Variant Calling Decisions Based on Biological, Clinical, and Laboratory Variables. Comput Struct Biotechnol J 2019; 17:561-569. [PMID: 31049166 PMCID: PMC6482431 DOI: 10.1016/j.csbj.2019.04.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/21/2018] [Revised: 03/12/2019] [Accepted: 04/03/2019] [Indexed: 01/10/2023]
Abstract
Deep sequencing genomic analysis is becoming increasingly common in clinical research and practice, enabling accurate identification of diagnostic, prognostic, and predictive determinants. Variant calling, distinguishing between true mutations and experimental errors, is a central task of genomic analysis and often requires sophisticated statistical, computational, and/or heuristic techniques. Although variant callers seek to overcome noise inherent in biological experiments, variant calling can be significantly affected by outside factors including those used to prepare, store, and analyze samples. The goal of this review is to discuss known experimental features, such as sample preparation, library preparation, and sequencing, alongside diverse biological and clinical variables, and evaluate their effect on variant caller selection and optimization.