1
Huang AY, Lee EA. Identification of Somatic Mutations From Bulk and Single-Cell Sequencing Data. FRONTIERS IN AGING 2022; 2:800380. [PMID: 35822012 PMCID: PMC9261417 DOI: 10.3389/fragi.2021.800380] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/23/2021] [Accepted: 12/08/2021] [Indexed: 12/26/2022]
Abstract
Somatic mutations are DNA variants that occur after the fertilization of zygotes and accumulate during the developmental and aging processes in the human lifespan. Somatic mutations have long been known to cause cancer, and more recently have been implicated in a variety of non-cancer diseases. The patterns of somatic mutations, or mutational signatures, also shed light on the underlying mechanisms of the mutational process. Advances in next-generation sequencing over the decades have enabled genome-wide profiling of DNA variants in a high-throughput manner; however, unlike germline mutations, somatic mutations are carried only by a subset of the cell population. Thus, sensitive bioinformatic methods are required to distinguish mutant alleles from sequencing and base calling errors in bulk tissue samples. An alternative way to study somatic mutations, especially those present in an extremely small number of cells or even in a single cell, is to sequence single-cell genomes after whole-genome amplification (WGA); however, it is critical and technically challenging to exclude numerous technical artifacts arising during error-prone and uneven genome amplification in current WGA methods. To address these challenges, multiple bioinformatic tools have been developed. In this review, we summarize the latest progress in methods for identification of somatic mutations and the challenges that remain to be addressed in the future.
Affiliation(s)
- August Yue Huang
- Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Boston, MA, United States
- Eunjung Alice Lee
- Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Boston, MA, United States
2
Xu J, Yang P, Xue S, Sharma B, Sanchez-Martin M, Wang F, Beaty KA, Dehan E, Parikh B. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet 2019; 138:109-124. [PMID: 30671672 PMCID: PMC6373233 DOI: 10.1007/s00439-019-01970-5] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 09/27/2018] [Accepted: 01/02/2019] [Indexed: 02/07/2023]
Abstract
In the field of cancer genomics, the broad availability of genetic information offered by next-generation sequencing technologies and rapid growth in biomedical publication has led to the advent of the big-data era. Integration of artificial intelligence (AI) approaches such as machine learning, deep learning, and natural language processing (NLP) to tackle the challenges of scalability and high dimensionality of data and to transform big data into clinically actionable knowledge is expanding and becoming the foundation of precision medicine. In this paper, we review the current status and future directions of AI application in cancer genomics within the context of workflows to integrate genomic analysis for precision cancer care. The existing solutions of AI and their limitations in cancer genetic testing and diagnostics such as variant calling and interpretation are critically analyzed. Publicly available tools or algorithms for key NLP technologies in the literature mining for evidence-based clinical recommendations are reviewed and compared. In addition, the present paper highlights the challenges to AI adoption in digital healthcare with regard to data requirements, algorithmic transparency, reproducibility, and real-world assessment, and discusses the importance of preparing patients and physicians for modern digitized healthcare. We believe that AI will remain the main driver to healthcare transformation toward precision medicine, yet the unprecedented challenges posed should be addressed to ensure safety and beneficial impact to healthcare.
Affiliation(s)
- Jia Xu
- IBM Watson Health, Cambridge, MA, USA
- Shang Xue
- IBM Watson Health, Cambridge, MA, USA
- Fang Wang
- IBM Watson Health, Cambridge, MA, USA
3
What Does This Mutation Mean? The Tools and Pitfalls of Variant Interpretation in Lymphoid Malignancies. Int J Mol Sci 2018; 19:ijms19041251. [PMID: 29677173 PMCID: PMC5979354 DOI: 10.3390/ijms19041251] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/15/2018] [Revised: 04/09/2018] [Accepted: 04/14/2018] [Indexed: 01/21/2023]
Abstract
High throughput sequencing (HTS) is increasingly important in determining cancer diagnoses, with subsequent prognostic and therapeutic implications. The biology of cancer is becoming increasingly deciphered and it is clear that therapy needs to be individually tailored. Whilst translational research plays an important role in lymphoid malignancies, few guidelines exist to guide biologists and routine laboratories through this constantly evolving field. In this article, we review the challenges of interpreting HTS in lymphoid malignancies and provide a toolkit to interpret single nucleotide variants obtained from HTS. We define the pre-analytical issues such as sequencing DNA obtained from formalin-fixed and paraffin-embedded tissue (FFPE), the acquisition of germline DNA, or the bioinformatic pitfalls, the analytical issues encountered and how to manage them. We describe the main constitutional and cancer databases, their characteristics and limitations, with an emphasis on variant interpretation in lymphoid malignancies. Finally, we discuss the challenges of predictions that one can make using in silico or in vitro modelling, pharmacogenomic screening, and the limits of those prediction tools. This description of the current status in genomic interpretation highlights the need for new large databases and international collaboration in the lymphoma field.
4
Anjanappa M, Hao Y, Simpson ER, Bhat-Nakshatri P, Nelson JB, Tersey SA, Mirmira RG, Cohen-Gadol AA, Saadatzadeh MR, Li L, Fang F, Nephew KP, Miller KD, Liu Y, Nakshatri H. A system for detecting high impact-low frequency mutations in primary tumors and metastases. Oncogene 2017; 37:185-196. [PMID: 28892047 DOI: 10.1038/onc.2017.322] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/27/2017] [Revised: 08/01/2017] [Accepted: 08/02/2017] [Indexed: 12/14/2022]
Abstract
Tumor complexity and intratumor heterogeneity contribute to subclonal diversity. Despite advances in next-generation sequencing (NGS) and bioinformatics, detecting rare mutations in primary tumors and metastases contributing to subclonal diversity is a challenge for precision genomics. Here, in order to identify rare mutations, we adapted a recently described epithelial reprograming assay for short-term propagation of epithelial cells from primary and metastatic tumors. Using this approach, we expanded minor clones and obtained epithelial cell-specific DNA/RNA for quantitative NGS analysis. Comparative Ampliseq Comprehensive Cancer Panel sequence analyses were performed on DNA from unprocessed breast tumor and tumor cells propagated from the same tumor. We identified previously uncharacterized mutations present only in the cultured tumor cells, a subset of which has been reported in brain metastatic but not primary breast tumors. In addition, whole-genome sequencing identified mutations enriched in liver metastases of various cancers, including Notch pathway mutations/chromosomal inversions in 5/5 liver metastases, irrespective of cancer types. Mutations/rearrangements in FHIT, involved in purine metabolism, were detected in 4/5 liver metastases, and the same four liver metastases shared mutations in 32 genes, including mutations of different HLA-DR family members affecting OX40 signaling pathway, which could impact the immune response to metastatic cells. Pathway analyses of all mutated genes in liver metastases showed aberrant tumor necrosis factor and transforming growth factor signaling in metastatic cells. Epigenetic regulators including KMT2C/MLL3 and ARID1B, which are mutated in >50% of hepatocellular carcinomas, were also mutated in liver metastases. Thus, irrespective of cancer types, organ-specific metastases may share common genomic aberrations. 
Since recent studies show independent evolution of primary tumors and metastases and in most cases mutation burden is higher in metastases than primary tumors, the method described here may allow early detection of subclonal somatic alterations associated with metastatic progression and potentially identify therapeutically actionable, metastasis-specific genomic aberrations.
Affiliation(s)
- M Anjanappa
- Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Y Hao
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, IN, USA
- E R Simpson
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, IN, USA
- P Bhat-Nakshatri
- Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- J B Nelson
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- S A Tersey
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- R G Mirmira
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- A A Cohen-Gadol
- Department of Neurosurgery, Indiana University School of Medicine, Indianapolis, IN, USA
- M R Saadatzadeh
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- L Li
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, IN, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, IN, USA
- F Fang
- Medical Science Program, Indiana University, Bloomington, IN, USA
- K P Nephew
- Medical Science Program, Indiana University, Bloomington, IN, USA
- K D Miller
- Division of Hematology/Oncology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
- Y Liu
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, IN, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, IN, USA
- H Nakshatri
- Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, USA; Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, USA; Roudebush VA Medical Center, Indianapolis, IN, USA