1
|
Dhanda SK, Mahajan S, Manoharan M. Neoepitopes prediction strategies: an integration of cancer genomics and immunoinformatics approaches. Brief Funct Genomics 2023; 22:1-8. [PMID: 36398967 DOI: 10.1093/bfgp/elac041] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 09/28/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022] Open
Abstract
A major near-term medical impact of the genomic technology revolution will be the elucidation of mechanisms of cancer pathogenesis, leading to improvements in the diagnosis of cancer and the selection of cancer treatment. Next-generation sequencing technologies have accelerated the characterization of a tumor, leading to the comprehensive discovery of all the major alterations in a given cancer genome, followed by the translation of this information using computational and immunoinformatics approaches to cancer diagnostics and therapeutic efforts. In the current article, we review various components of cancer immunoinformatics applied to a series of fields of cancer research, including computational tools for cancer mutation detection, cancer mutation and immunological databases, and computational vaccinology.
Collapse
Affiliation(s)
- Sandeep Kumar Dhanda
- Department of Oncology, St Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Swapnil Mahajan
- DeepKnomics Labs Private Limited, 7014 Prestige Garden Bay, IVRI Road, Avalahalli, Behind CRPF Campus, Yelahanka, Bangalore 560064, India
| | - Malini Manoharan
- DeepKnomics Labs Private Limited, 7014 Prestige Garden Bay, IVRI Road, Avalahalli, Behind CRPF Campus, Yelahanka, Bangalore 560064, India
| |
Collapse
|
2
|
Zhou Y, Zhao XC, Wang LQ, Chen CW, Hsu MH, Liao WT, Deng X, Yan Q, Zhao GP, Chen CL, Zhang L, Chiu CH. Detecting Genetic Variation of Colonizing Streptococcus agalactiae Genomes in Humans: A Precision Protocol. FRONTIERS IN BIOINFORMATICS 2022; 2:813599. [PMID: 36304301 PMCID: PMC9580942 DOI: 10.3389/fbinf.2022.813599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Accepted: 05/19/2022] [Indexed: 11/14/2022] Open
Abstract
Deciphering the genotypic diversity of within-individual pathogens and verifying the evolutionary model can help elucidate resistant genotypes, virulent subpopulations, and the mechanism of opportunistic pathogenicity. However, observed polymorphic mutations (PMs) are rare and difficult to be detected in the “dominant-lineage” model of bacterial infection due to the low frequency. The four pooled group B Streptococcus (GBS) samples were collected from the genital tracts of healthy pregnant women, and the pooled samples and the isogenic controls were genomically sequenced. Using the PMcalling program, we detected the PMs in samples and compared the results between two technical duplicates, GBS-M001T and GBS-M001C. Tested with simulated datasets, the PMcalling program showed high sensitivity especially in low-frequency PMs and reasonable specificity. The genomic sequence data from pooled samples of GBS colonizing carrier pregnant women were analyzed, and few high-frequency PMs and some low-frequency PMs were discovered, indicating a dominant-lineage evolution model. The PMs mainly were nonsynonymous and enriched in quorum sensing, glycolysis/gluconeogenesis, ATP-binding cassette (ABC) transporters, etc., suggesting antimicrobial or environmental selective pressure. The re-analysis of the published Burkholderia dolosa data showed a diverse-community model, and only a few low-frequency PMs were shared between different individuals. Genes of general control non-repressible 5-related N-acetyltransferases family, major facilitator superfamily (MFS) transporter, and ABC transporter were positive selection candidates. Our findings indicate an unreported nature of the dominant-lineage model of GBS colonization in healthy women, and a formerly not observed mutation pool in a colonized microbial community, possibly maintained by selection pressure.
Collapse
Affiliation(s)
- Yan Zhou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- *Correspondence: Yan Zhou, ; Liang Zhang, ; Cheng-Hsun Chiu,
| | - Xue-Chao Zhao
- The Institutes of Biology and Medical Sciences, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Lin-Qi Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
| | - Cheng-Wen Chen
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
| | - Mei-Hua Hsu
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
| | - Wan-Ting Liao
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
| | - Xiao Deng
- The Institutes of Biology and Medical Sciences, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Qing Yan
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
| | - Guo-Ping Zhao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
| | - Chyi-Liang Chen
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
| | - Liang Zhang
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- *Correspondence: Yan Zhou, ; Liang Zhang, ; Cheng-Hsun Chiu,
| | - Cheng-Hsun Chiu
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Taiwan
- *Correspondence: Yan Zhou, ; Liang Zhang, ; Cheng-Hsun Chiu,
| |
Collapse
|
3
|
Kısakol B, Sarıhan Ş, Ergün MA, Baysan M. Detailed evaluation of cancer sequencing pipelines in different microenvironments and heterogeneity levels. ACTA ACUST UNITED AC 2021; 45:114-126. [PMID: 33907494 PMCID: PMC8068765 DOI: 10.3906/biy-2008-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Accepted: 02/03/2021] [Indexed: 11/25/2022]
Abstract
The importance of next generation sequencing (NGS) rises in cancer research as accessing this key technology becomes easier for researchers. The sequence data created by NGS technologies must be processed by various bioinformatics algorithms within a pipeline in order to convert raw data to meaningful information. Mapping and variant calling are the two main steps of these analysis pipelines, and many algorithms are available for these steps. Therefore, detailed benchmarking of these algorithms in different scenarios is crucial for the efficient utilization of sequencing technologies. In this study, we compared the performance of twelve pipelines (three mapping and four variant discovery algorithms) with recommended settings to capture single nucleotide variants. We observed significant discrepancy in variant calls among tested pipelines for different heterogeneity levels in real and simulated samples with overall high specificity and low sensitivity. Additional to the individual evaluation of pipelines, we also constructed and tested the performance of pipeline combinations. In these analyses, we observed that certain pipelines complement each other much better than others and display superior performance than individual pipelines. This suggests that adhering to a single pipeline is not optimal for cancer sequencing analysis and sample heterogeneity should be considered in algorithm optimization.
Collapse
Affiliation(s)
- Batuhan Kısakol
- Department of Physiology and Medical Physics, Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin Ireland
| | - Şahin Sarıhan
- Computer Engineering Department, Faculty of Engineering, Marmara University, İstanbul, Turkey Turkey
| | - Mehmet Arif Ergün
- Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University,İstanbul Turkey
| | - Mehmet Baysan
- Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University,İstanbul Turkey
| |
Collapse
|
4
|
Wardell CP, Ashby C, Bauer MA. FiNGS: high quality somatic mutations using filters for next generation sequencing. BMC Bioinformatics 2021; 22:77. [PMID: 33602113 PMCID: PMC7890800 DOI: 10.1186/s12859-021-03995-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2019] [Accepted: 02/02/2021] [Indexed: 01/15/2023] Open
Abstract
Background Somatic variant callers are used to find mutations in sequencing data from cancer samples. They are very sensitive and have high recall, but also may produce low precision data with a large proportion of false positives. Further ad hoc filtering is commonly performed after variant calling and before further analysis. Improving the filtering of somatic variants in a reproducible way represents an unmet need. We have developed Filters for Next Generation Sequencing (FiNGS), software written specifically to address these filtering issues. Results Developed and tested using publicly available sequencing data sets, we demonstrate that FiNGS reliably improves upon the precision of default variant caller outputs and performs better than other tools designed for the same task. Conclusions FiNGS provides researchers with a tool to reproducibly filter somatic variants that is simple to both deploy and use, with filters and thresholds that are fully configurable by the user. It ingests and emits standard variant call format (VCF) files and will slot into existing sequencing pipelines. It allows users to develop and implement their own filtering strategies and simple sharing of these with others.
Collapse
Affiliation(s)
- Christopher Paul Wardell
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA.
| | - Cody Ashby
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA
| | - Michael Anton Bauer
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA
| |
Collapse
|
5
|
Richters MM, Xia H, Campbell KM, Gillanders WE, Griffith OL, Griffith M. Best practices for bioinformatic characterization of neoantigens for clinical utility. Genome Med 2019; 11:56. [PMID: 31462330 PMCID: PMC6714459 DOI: 10.1186/s13073-019-0666-2] [Citation(s) in RCA: 125] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2019] [Accepted: 08/16/2019] [Indexed: 12/13/2022] Open
Abstract
Neoantigens are newly formed peptides created from somatic mutations that are capable of inducing tumor-specific T cell recognition. Recently, researchers and clinicians have leveraged next generation sequencing technologies to identify neoantigens and to create personalized immunotherapies for cancer treatment. To create a personalized cancer vaccine, neoantigens must be computationally predicted from matched tumor-normal sequencing data, and then ranked according to their predicted capability in stimulating a T cell response. This candidate neoantigen prediction process involves multiple steps, including somatic mutation identification, HLA typing, peptide processing, and peptide-MHC binding prediction. The general workflow has been utilized for many preclinical and clinical trials, but there is no current consensus approach and few established best practices. In this article, we review recent discoveries, summarize the available computational tools, and provide analysis considerations for each step, including neoantigen prediction, prioritization, delivery, and validation methods. In addition to reviewing the current state of neoantigen analysis, we provide practical guidance, specific recommendations, and extensive discussion of critical concepts and points of confusion in the practice of neoantigen characterization for clinical use. Finally, we outline necessary areas of development, including the need to improve HLA class II typing accuracy, to expand software support for diverse neoantigen sources, and to incorporate clinical response data to improve neoantigen prediction algorithms. The ultimate goal of neoantigen characterization workflows is to create personalized vaccines that improve patient outcomes in diverse cancer types.
Collapse
Affiliation(s)
- Megan M Richters
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
| | - Huiming Xia
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
| | - Katie M Campbell
- Division of Hematology and Oncology, Medical Plaza Driveway, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, 90024, USA
| | - William E Gillanders
- Department of Surgery, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Obi L Griffith
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA.
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- Department of Genetics, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA.
| | - Malachi Griffith
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA.
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- Department of Genetics, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA.
| |
Collapse
|
6
|
Anzar I, Sverchkova A, Stratford R, Clancy T. NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer. BMC Med Genomics 2019; 12:63. [PMID: 31096972 PMCID: PMC6524241 DOI: 10.1186/s12920-019-0508-5] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Accepted: 04/22/2019] [Indexed: 12/30/2022] Open
Abstract
Background The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing involves a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially, tumor genetic intra-heterogeneity coupled with the problem of sequencing and alignment artifacts, makes somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have previously helped to increase specificity, although comes at the cost of sensitivity. Methods In light of this, we have developed the NeoMutate framework which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples. Results A robust and exhaustive evaluation of NeoMutate’s performance based on 5-fold cross validation experiments, in addition to 3 independent tests, demonstrated a substantially improved variant detection accuracy compared to any of its individual composite variant callers and consensus calling of multiple tools. Conclusions We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer. Electronic supplementary material The online version of this article (10.1186/s12920-019-0508-5) contains supplementary material, which is available to authorized users.
Collapse
Affiliation(s)
- Irantzu Anzar
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
| | - Angelina Sverchkova
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
| | - Richard Stratford
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
| | - Trevor Clancy
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway.
| |
Collapse
|
7
|
Calling Variants in the Clinic: Informed Variant Calling Decisions Based on Biological, Clinical, and Laboratory Variables. Comput Struct Biotechnol J 2019; 17:561-569. [PMID: 31049166 PMCID: PMC6482431 DOI: 10.1016/j.csbj.2019.04.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2018] [Revised: 03/12/2019] [Accepted: 04/03/2019] [Indexed: 01/10/2023] Open
Abstract
Deep sequencing genomic analysis is becoming increasingly common in clinical research and practice, enabling accurate identification of diagnostic, prognostic, and predictive determinants. Variant calling, distinguishing between true mutations and experimental errors, is a central task of genomic analysis and often requires sophisticated statistical, computational, and/or heuristic techniques. Although variant callers seek to overcome noise inherent in biological experiments, variant calling can be significantly affected by outside factors including those used to prepare, store, and analyze samples. The goal of this review is to discuss known experimental features, such as sample preparation, library preparation, and sequencing, alongside diverse biological and clinical variables, and evaluate their effect on variant caller selection and optimization.
Collapse
|