1
Zheng J, Li T, Ye H, Jiang Z, Jiang W, Yang H, Wu Z, Xie Z. Comprehensive identification of pathogenic variants in retinoblastoma by long- and short-read sequencing. Cancer Lett 2024; 598:217121. [PMID: 39009069 DOI: 10.1016/j.canlet.2024.217121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 06/16/2024] [Accepted: 07/11/2024] [Indexed: 07/17/2024]
Abstract
Retinoblastoma (RB) is the most common intraocular malignancy in childhood. The causal variants in RB have mostly been characterized by short-read sequencing (SRS) analysis, which has technical limitations in identifying structural variants (SVs) and phasing information. Long-read sequencing (LRS) technology has advantages over SRS in detecting SVs, phased genetic variants, and methylation. In this study, we comprehensively characterized the genetic landscape of RB using combinatorial LRS and SRS of 16 RB tumors and 16 matched blood samples. We detected a total of 232 somatic SVs, with an average of 14.5 SVs per sample across the whole genome in our cohort. We identified 20 distinct pathogenic variants disrupting the RB1 gene, including three novel small variants and five somatic SVs. More somatic SVs were detected by LRS than by SRS (140 vs. 122) in RB samples with WGS data, particularly insertions (18 vs. 1). Furthermore, our analysis shows that, with the exception of one sample that lacked methylation data, all samples presented biallelic inactivation of RB1 in various forms, including two cases with a biallelic hypermethylated promoter and four cases with compound heterozygous mutations that were missed in SRS analysis. By inferring the relative timing of somatic events, we reveal a genetic progression in which RB1 disruption occurs early and is followed by copy number changes, including amplifications of Chr2p and deletions of Chr16q, during RB tumorigenesis. Altogether, we characterize the comprehensive genetic landscape of RB, providing novel insights into the genetic alterations and mechanisms contributing to RB initiation and development. Our work also establishes a framework to analyze the genomic landscape of cancers based on LRS data.
Affiliation(s)
- Jingjing Zheng
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Tong Li
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Huijing Ye
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zehang Jiang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Wenbing Jiang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Huasheng Yang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhikun Wu
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China

2
Ji S, Zhu T, Sethia A, Wang W. Accelerated somatic mutation calling for whole-genome and whole-exome sequencing data from heterogenous tumor samples. Genome Res 2024; 34:633-641. [PMID: 38589250 PMCID: PMC11146589 DOI: 10.1101/gr.278456.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 04/03/2024] [Indexed: 04/10/2024]
Abstract
Accurate detection of somatic mutations in DNA sequencing data is a fundamental prerequisite for cancer research. Previous analytical challenges were overcome by consensus mutation calling from four to five popular callers. This, however, increases the already nontrivial computing time from individual callers. Here, we launch MuSE 2, powered by multistep parallelization and efficient memory allocation, to resolve the computing time bottleneck. MuSE 2 is 50 times faster than MuSE 1 and 8 to 80 times faster than other popular callers. Our benchmark study suggests that combining MuSE 2 and the recently accelerated Strelka2 achieves high efficiency and accuracy in analyzing large cancer genomic data sets.
Affiliation(s)
- Shuangxi Ji
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Tong Zhu
  - NVIDIA Corporation, Santa Clara, California 95051, USA
- Ankit Sethia
  - NVIDIA Corporation, Santa Clara, California 95051, USA
- Wenyi Wang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA

3
Keskus A, Bryant A, Ahmad T, Yoo B, Aganezov S, Goretsky A, Donmez A, Lansdon LA, Rodriguez I, Park J, Liu Y, Cui X, Gardner J, McNulty B, Sacco S, Shetty J, Zhao Y, Tran B, Narzisi G, Helland A, Cook DE, Chang PC, Kolesnikov A, Carroll A, Molloy EK, Pushel I, Guest E, Pastinen T, Shafin K, Miga KH, Malikic S, Day CP, Robine N, Sahinalp C, Dean M, Farooqi MS, Paten B, Kolmogorov M. Severus: accurate detection and characterization of somatic structural variation in tumor genomes using long reads. medRxiv 2024:2024.03.22.24304756. [PMID: 38585974 PMCID: PMC10996739 DOI: 10.1101/2024.03.22.24304756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Most current studies rely on short-read sequencing to detect somatic structural variation (SV) in cancer genomes. Long-read sequencing offers the advantage of better mappability and long-range phasing, which results in substantial improvements in germline SV detection. However, current long-read SV detection methods do not generalize well to the analysis of somatic SVs in tumor genomes with complex rearrangements, heterogeneity, and aneuploidy. Here, we present Severus: a method for the accurate detection of different types of somatic SVs using a phased breakpoint graph approach. To benchmark various short- and long-read SV detection methods, we sequenced five tumor/normal cell line pairs with Illumina, Nanopore, and PacBio sequencing platforms; on this benchmark Severus showed the highest F1 scores (harmonic mean of the precision and recall) as compared to long-read and short-read methods. We then applied Severus to three clinical cases of pediatric cancer, demonstrating concordance with known genetic findings as well as revealing clinically relevant cryptic rearrangements missed by standard genomic panels.
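The F1 score mentioned in the abstract above is the harmonic mean of precision and recall. A minimal sketch of that calculation follows, assuming hypothetical counts of true-positive, false-positive, and false-negative SV calls; the function name and the numbers are illustrative only and are not taken from the Severus benchmark.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the harmonic mean of precision and recall.

    tp, fp, fn are counts of true-positive, false-positive and
    false-negative calls against a truth set (hypothetical inputs).
    """
    if tp == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of reported calls that are correct
    recall = tp / (tp + fn)     # fraction of true events that were reported
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only, not benchmark results.
example = f1_score(tp=80, fp=20, fn=20)
print(f"F1 = {example:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a caller cannot reach a high F1 score by maximizing precision at the expense of recall, or the reverse.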
Affiliation(s)
- Ayse Keskus
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Asher Bryant
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Tanveer Ahmad
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Byunggil Yoo
  - Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Anton Goretsky
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Ataberk Donmez
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Lisa A. Lansdon
  - Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Isabel Rodriguez
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
- Jimin Park
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Yuelin Liu
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Xiwen Cui
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Samuel Sacco
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Jyoti Shetty
  - Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Yongmei Zhao
  - Sequencing Facility Bioinformatics Group, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Bao Tran
  - Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Erin K. Molloy
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Irina Pushel
  - Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Erin Guest
  - Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Tomi Pastinen
  - Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Kishwar Shafin
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
- Karen H. Miga
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Salem Malikic
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Chi-Ping Day
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Cenk Sahinalp
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
- Michael Dean
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
- Midhat S. Farooqi
  - Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Mikhail Kolmogorov
  - Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA

4
Paulin LF, Fan J, O'Neill K, Pleasance E, Porter VL, Jones SJM, Sedlazeck FJ. The benefit of a complete reference genome for cancer structural variant analysis. medRxiv 2024:2024.03.15.24304369. [PMID: 38562786 PMCID: PMC10984048 DOI: 10.1101/2024.03.15.24304369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
The complexities of cancer genomes are becoming more easily interpreted due to advancements in sequencing technologies and improved bioinformatic analysis. Structural variants (SVs) represent an important subset of somatic events in tumors. While detection of SVs has been markedly improved by the development of long-read sequencing, somatic variant identification and annotation remain challenging. We hypothesized that use of a completed human reference genome (CHM13-T2T) would improve somatic SV calling. Our findings in a tumour/normal matched benchmark sample and two patient samples show that CHM13-T2T improves SV detection and prioritization accuracy compared to GRCh38, with a notable reduction in false positive calls. We also overcame the lack of annotation resources for CHM13-T2T by lifting over CHM13-T2T-aligned reads to the GRCh38 genome, thereby combining improved alignment with advanced annotations. In this process, we assessed the current SV benchmark set for COLO829/COLO829BL across four replicates sequenced at different centers with different long-read technologies. We discovered instability of this cell line across these replicates; 346 SVs (1.13%) were only discoverable in a single replicate. We identify 49 somatic SVs, which appear to be stable as they are consistently present across the four replicates. As such, we propose this consensus set as an updated benchmark for somatic SV calling and include both GRCh38 and CHM13-T2T coordinates in our benchmark. The benchmark is available at: 10.5281/zenodo.10819636. Our work demonstrates new approaches to optimize somatic SV prioritization in cancer with potential improvements in other genetic diseases.
Affiliation(s)
- Luis F Paulin
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Jeremy Fan
  - Canada's Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC, Canada
- Kieran O'Neill
  - Canada's Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC, Canada
- Erin Pleasance
  - Canada's Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC, Canada
- Vanessa L Porter
  - Canada's Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Steven J M Jones
  - Canada's Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Fritz J Sedlazeck
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Department of Computer Science, Rice University, Houston, TX, USA

5
Simpson JT. Detecting Somatic Mutations Without Matched Normal Samples Using Long Reads. bioRxiv 2024:2024.02.26.582089. [PMID: 38464143 PMCID: PMC10925087 DOI: 10.1101/2024.02.26.582089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
DNA sequencing of tumours to identify somatic mutations has become a critical tool to guide the type of treatment given to cancer patients. The gold standard for mutation calling is comparing sequencing data from the tumour to a matched normal sample to avoid mis-classifying inherited SNPs as mutations. This procedure works extremely well, but in certain situations only a tumour sample is available. While approaches have been developed to find mutations without a matched normal, they have limited accuracy or require specific types of input data (e.g. ultra-deep sequencing). Here we explore the application of single molecule long read sequencing to calling somatic mutations without matched normal samples. We develop a simple theoretical framework to show how haplotype phasing is an important source of information for determining whether a variant is a somatic mutation. We then use simulations to assess the range of experimental parameters (tumour purity, sequencing depth) where this approach is effective. These ideas are developed into a prototype somatic mutation caller, smrest, and its use is demonstrated on two highly mutated cancer cell lines. Finally, we argue that this approach has potential to measure clinically important biomarkers that are based on the genome-wide distribution of mutations: tumour mutation burden and mutation signatures.
Affiliation(s)
- Jared T. Simpson
  - Ontario Institute for Cancer Research, Toronto, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada

6
Smolka M, Paulin LF, Grochowski CM, Horner DW, Mahmoud M, Behera S, Kalef-Ezra E, Gandhi M, Hong K, Pehlivan D, Scholz SW, Carvalho CMB, Proukakis C, Sedlazeck FJ. Detection of mosaic and population-level structural variants with Sniffles2. Nat Biotechnol 2024:10.1038/s41587-023-02024-y. [PMID: 38168980 PMCID: PMC11217151 DOI: 10.1038/s41587-023-02024-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 04/04/2022] [Accepted: 10/11/2023] [Indexed: 01/05/2024]
Abstract
Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing repeat-aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SVs showed remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.
Affiliation(s)
- Moritz Smolka
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Luis F Paulin
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Dominic W Horner
  - Department of Clinical and Movement Neurosciences, Royal Free Campus, Queen Square Institute of Neurology, University College London, London, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Medhat Mahmoud
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Sairam Behera
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Ester Kalef-Ezra
  - Department of Clinical and Movement Neurosciences, Royal Free Campus, Queen Square Institute of Neurology, University College London, London, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Mira Gandhi
  - Pacific Northwest Research Institute (PNRI), Seattle, WA, USA
- Karl Hong
  - Bionano Genomics, San Diego, CA, USA
- Davut Pehlivan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Division of Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Sonja W Scholz
  - Neurodegenerative Diseases Research Unit, National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA
  - Department of Neurology, Johns Hopkins University Medical Center, Baltimore, MD, USA
- Claudia M B Carvalho
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Pacific Northwest Research Institute (PNRI), Seattle, WA, USA
- Christos Proukakis
  - Department of Clinical and Movement Neurosciences, Royal Free Campus, Queen Square Institute of Neurology, University College London, London, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
  - Department of Computer Science, Rice University, Houston, TX, USA

7
Majidian S, Agustinho DP, Chin CS, Sedlazeck FJ, Mahmoud M. Genomic variant benchmark: if you cannot measure it, you cannot improve it. Genome Biol 2023; 24:221. [PMID: 37798733 PMCID: PMC10552390 DOI: 10.1186/s13059-023-03061-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/28/2022] [Accepted: 09/18/2023] [Indexed: 10/07/2023] Open
Abstract
Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity.
Affiliation(s)
- Sina Majidian
  - Department of Computational Biology, University of Lausanne, 1015, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Fritz J Sedlazeck
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX, 77030, USA
  - Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, USA
- Medhat Mahmoud
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX, 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA

8
Li C, Chen L, Pan G, Zhang W, Li SC. Deciphering complex breakage-fusion-bridge genome rearrangements with Ambigram. Nat Commun 2023; 14:5528. [PMID: 37684230 PMCID: PMC10491683 DOI: 10.1038/s41467-023-41259-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Accepted: 08/28/2023] [Indexed: 09/10/2023] Open
Abstract
Breakage-fusion-bridge (BFB) is a complex rearrangement that leads to tumor malignancy. Existing models for detecting BFBs rely on the ideal BFB hypothesis, ruling out the possibility of BFBs entangled with other structural variations, that is, complex BFBs. We propose an algorithm, Ambigram, to identify complex BFBs and reconstruct the rearranged structure of the local genome during the cancer subclone evolution process. Ambigram handles data from short, linked, long, and single-cell sequences, and optical mapping technologies. Ambigram successfully deciphers the gold- or silver-standard complex BFBs against the state-of-the-art in multiple cancers. Ambigram dissects the intratumor heterogeneity of complex BFB events with single-cell reads from melanoma and gastric cancer. Furthermore, applying Ambigram to liver and cervical cancer data suggests that the BFB mechanism may mediate oncovirus integrations. BFB also exists in noncancer genomics. Investigating the complete human genome reference with Ambigram suggests that the BFB mechanism may be involved in two genome reorganizations of Homo sapiens during evolution. Moreover, Ambigram discovers the signals of recurrent foldback inversions and complex BFBs in whole-genome data from the 1000 Genomes Project and congenital heart diseases, respectively.
Affiliation(s)
- Chaohui Li
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Lingxi Chen
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Guangze Pan
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Wenqian Zhang
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Shuai Cheng Li
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China

9
Shiraishi Y, Koya J, Chiba K, Okada A, Arai Y, Saito Y, Shibata T, Kataoka K. Precise characterization of somatic complex structural variations from tumor/control paired long-read sequencing data with nanomonsv. Nucleic Acids Res 2023; 51:e74. [PMID: 37336583 PMCID: PMC10415145 DOI: 10.1093/nar/gkad526] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/14/2023] [Revised: 05/23/2023] [Accepted: 06/07/2023] [Indexed: 06/21/2023] Open
Abstract
We present our novel software, nanomonsv, for detecting somatic structural variations (SVs) at single-base resolution using tumor and matched control long-read sequencing data. The current version of nanomonsv includes two detection modules, the Canonical SV module and the Single breakend SV module. Using tumor/control paired long-read sequencing data from three cancer cell lines and their matched lymphoblastoid lines, we demonstrate that the Canonical SV module can identify somatic SVs that can be captured by short-read technologies with higher precision and recall than existing methods. In addition, we have developed a workflow to classify mobile element insertions while elucidating their in-depth properties, such as 5' truncations, internal inversions, and source sites for 3' transductions. Furthermore, the Single breakend SV module enables the detection of complex SVs that can only be identified by long reads, such as SVs involving highly repetitive centromeric sequences, and LINE1- and virus-mediated rearrangements. In summary, our approaches applied to cancer long-read sequencing data can reveal various features of somatic SVs and will lead to a better understanding of mutational processes and functional consequences of somatic SVs.
Affiliation(s)
- Yuichi Shiraishi
  - Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Junji Koya
  - Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
- Kenichi Chiba
  - Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Ai Okada
  - Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Yasuhito Arai
  - Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
- Yuki Saito
  - Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
  - Department of Gastroenterology, Keio University School of Medicine, Tokyo, Japan
- Tatsuhiro Shibata
  - Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
  - Laboratory of Molecular Medicine, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Keisuke Kataoka
  - Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
  - Department of Hematology, Keio University School of Medicine, Tokyo, Japan

10
Diesh C, Stevens GJ, Xie P, De Jesus Martinez T, Hershberg EA, Leung A, Guo E, Dider S, Zhang J, Bridge C, Hogue G, Duncan A, Morgan M, Flores T, Bimber BN, Haw R, Cain S, Buels RM, Stein LD, Holmes IH. JBrowse 2: a modular genome browser with views of synteny and structural variation. Genome Biol 2023; 24:74. [PMID: 37069644 PMCID: PMC10108523 DOI: 10.1186/s13059-023-02914-z] [Citation(s) in RCA: 68] [Impact Index Per Article: 68.0] [Received: 08/24/2022] [Accepted: 03/20/2023] [Indexed: 04/19/2023] Open
Abstract
We present JBrowse 2, a general-purpose genome annotation browser offering enhanced visualization of complex structural variation and evolutionary relationships. It retains core features of JBrowse while adding new views for synteny, dotplots, breakpoints, gene fusions, and whole-genome overviews. It allows users to share sessions, open multiple genomes, and navigate between views. It can be embedded in a web page, used as a standalone application, or run from Jupyter notebooks or R sessions. These improvements are enabled by a ground-up redesign using modern web technology. We describe application functionality, use cases, performance benchmarks, and implementation notes for web administrators and developers.
Affiliation(s)
- Colin Diesh
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Garrett J Stevens
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Peter Xie
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Elliot A. Hershberg
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Angel Leung
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Emma Guo
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Shihab Dider
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Junjun Zhang
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Caroline Bridge
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Gregory Hogue
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Andrew Duncan
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Matthew Morgan
  - Center for Applied Systems and Software, 224 Milne Computer Center, 1800 SW Campus Way, Oregon State University, Corvallis, OR 97331 USA
- Tia Flores
  - Center for Applied Systems and Software, 224 Milne Computer Center, 1800 SW Campus Way, Oregon State University, Corvallis, OR 97331 USA
- Benjamin N. Bimber
  - Oregon National Primate Research Center, Oregon Health and Science University, Beaverton, OR 97006 USA
- Robin Haw
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Scott Cain
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Robert M. Buels
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Lincoln D. Stein
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Ian H. Holmes
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA

11
Cuppen E, Elemento O, Rosenquist R, Nikic S, IJzerman M, Zaleski ID, Frederix G, Levin LÅ, Mullighan CG, Buettner R, Pugh TJ, Grimmond S, Caldas C, Andre F, Custers I, Campo E, van Snellenberg H, Schuh A, Nakagawa H, von Kalle C, Haferlach T, Fröhling S, Jobanputra V. Implementation of Whole-Genome and Transcriptome Sequencing Into Clinical Cancer Care. JCO Precis Oncol 2022; 6:e2200245. [PMID: 36480778 PMCID: PMC10166391 DOI: 10.1200/po.22.00245] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 07/14/2022] [Revised: 09/30/2022] [Accepted: 09/21/2022] [Indexed: 12/13/2022] Open
Abstract
PURPOSE: The combination of whole-genome and transcriptome sequencing (WGTS) is expected to transform diagnosis and treatment for patients with cancer. WGTS is a comprehensive precision diagnostic test that is starting to replace the standard of care for oncology molecular testing in health care systems around the world; however, the implementation and widescale adoption of this best-in-class testing is lacking.
METHODS: Here, we address the barriers in integrating WGTS for cancer diagnostics and treatment selection and answer questions regarding utility in different cancer types, cost-effectiveness and affordability, and other practical considerations for WGTS implementation.
RESULTS: We review the current studies implementing WGTS in health care systems and provide a synopsis of the clinical evidence and insights into practical considerations for WGTS implementation. We reflect on regulatory, costs, reimbursement, and incidental findings aspects of this test.
CONCLUSION: WGTS is an appropriate comprehensive clinical test for many tumor types and can replace multiple, cascade testing approaches currently performed. Decreasing sequencing cost, increasing number of clinically relevant aberrations and discovery of more complex biomarkers of treatment response, should pave the way for health care systems and laboratories in implementing WGTS into clinical practice, to transform diagnosis and treatment for patients with cancer.
Affiliation(s)
- Edwin Cuppen
- Hartwig Medical Foundation, Amsterdam, the Netherlands
- Center for Molecular Medicine and Oncode Institute, University Medical Center, Utrecht, the Netherlands
- Olivier Elemento
- Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY
- Richard Rosenquist
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Clinical Genetics, Karolinska University Hospital, Solna, Sweden
- Svetlana Nikic
- Illumina Productos de España, S.L.U., Plaza Pablo Ruiz Picasso, Madrid, Spain
- Maarten IJzerman
- Erasmus School of Health Policy & Management, Erasmus University, Rotterdam, the Netherlands
- Centre for Cancer Research, University of Melbourne, Melbourne, Australia
- Isabelle Durand Zaleski
- Université de Paris, CRESS, INSERM, INRA, URCEco, AP-HP, Hôpital de l'Hôtel Dieu, Paris, France
- Geert Frederix
- Julius Center for Health Sciences and Primary Care, University Medical Center, Utrecht, the Netherlands
- Lars-Åke Levin
- Department of Health, Medicine and Caring Sciences (HMV), Linköping University, Linköping, Sweden
- Trevor J. Pugh
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Sean Grimmond
- Centre for Cancer Research, University of Melbourne, Melbourne, Australia
- Carlos Caldas
- Cancer Research UK Cambridge Institute and Department of Oncology, University of Cambridge, Cambridge, United Kingdom
- Elias Campo
- Institut d’Investigacions Biomèdiques August Pi I Sunyer (IDIBAPS), Barcelona, Spain
- Centro de Investigación Biomédica en Red, Cáncer (CIBERONC), Madrid, Spain
- Hematopathology Unit, Hospital Clínic of Barcelona, Barcelona, Spain
- University of Barcelona, Barcelona, Spain
- Anna Schuh
- University of Oxford, Oxford, United Kingdom
- Hidewaki Nakagawa
- Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
- Christof von Kalle
- Berlin Institute of Health at Charité—Universitätsmedizin Berlin, Clinical Study Center, Berlin, Germany
- Stefan Fröhling
- Division of Translational Medical Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Vaidehi Jobanputra
- New York Genome Center; Department of Pathology, Columbia University Irving Medical Center, New York, NY
|