1
Ren H, Chen X, Wang J, Chen Y, Hafiz A, Xiao Q, Fu S, Madireddy A, Li WV, Shi X, Cao J. Temporal and structural patterns of hepatitis B virus integrations in hepatocellular carcinoma. J Med Virol 2023; 95:e29187. [PMID: 37877809 PMCID: PMC11131385 DOI: 10.1002/jmv.29187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 09/25/2023] [Accepted: 10/11/2023] [Indexed: 10/26/2023]
Abstract
Chronic infection of hepatitis B virus (HBV) is the major cause of hepatocellular carcinoma (HCC). Notably, 90% of HBV-positive HCC cases exhibit detectable HBV integrations, hinting at the potential early entanglement of these viral integrations in tumorigenesis and their subsequent oncogenic implications. Nevertheless, the precise chronology of integration events during HCC tumorigenesis, alongside their sequential structural patterns, has remained elusive thus far. In this study, we applied whole-genome sequencing to multiple biopsies extracted from six HBV-positive HCC cases. Through this approach, we identified point mutations and viral integrations, offering a blueprint for the intricate tumor phylogeny of these samples. The emergent narrative paints a rich tapestry of diverse evolutionary trajectories characterizing the analyzed tumors. We uncovered oncogenic integration events in some samples that appear to happen before and during the initiation stage of tumor development based on their locations in reconstituted trajectories. Furthermore, we conducted additional long-read sequencing of selected samples and unveiled integration-bridged chromosome rearrangements and tandem repeats of the HBV sequence within integrations. In summary, this study revealed premalignant oncogenic and sequential complex integrations and highlighted the contributions of HBV integrations to HCC development and genome instability.
Affiliation(s)
- Haozhen Ren
  - Department of Hepatobiliary Surgery, the Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
  - Hepatobiliary Institute, Nanjing University, Nanjing, China
- Xun Chen
  - Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
- Jinglin Wang
  - Department of Hepatobiliary Surgery, the Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
  - Hepatobiliary Institute, Nanjing University, Nanjing, China
- Ying Chen
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ
- Alex Hafiz
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ
- Qian Xiao
  - Institute of Modern Biology, Nanjing University, Nanjing, China
- Shiwei Fu
  - Department of Statistics, University of California, Riverside, Riverside, CA
- Advaitha Madireddy
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ
- Wei Vivian Li
  - Department of Statistics, University of California, Riverside, Riverside, CA
- Xiaolei Shi
  - Department of Hepatobiliary Surgery, the Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
  - Hepatobiliary Institute, Nanjing University, Nanjing, China
- Jian Cao
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ
2
Stephens Z, O’Brien D, Dehankar M, Roberts LR, Iyer RK, Kocher JP. Exogene: A performant workflow for detecting viral integrations from paired-end next-generation sequencing data. PLoS One 2021; 16:e0250915. [PMID: 34550971 PMCID: PMC8457494 DOI: 10.1371/journal.pone.0250915] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/14/2021] [Accepted: 07/08/2021] [Indexed: 01/14/2023]
Abstract
The integration of viruses into the human genome is known to be associated with tumorigenesis in many cancers, but the accurate detection of integration breakpoints from short read sequencing data is made difficult by human-viral homologies, viral genome heterogeneity, coverage limitations, and other factors. To address this, we present Exogene, a sensitive and efficient workflow for detecting viral integrations from paired-end next generation sequencing data. Exogene's read filtering and breakpoint detection strategies yield integration coordinates that are highly concordant with long read validation. We demonstrate this concordance across 6 TCGA Hepatocellular carcinoma (HCC) tumor samples, identifying integrations of hepatitis B virus that are also supported by long reads. Additionally, we applied Exogene to targeted capture data from 426 previously studied HCC samples, achieving 98.9% concordance with existing methods and identifying 238 high-confidence integrations that were not previously reported. Exogene is applicable to multiple types of paired-end sequence data, including genome, exome, RNA-Seq and targeted capture.
Affiliation(s)
- Zachary Stephens
  - Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL, United States of America
- Daniel O’Brien
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States of America
- Mrunal Dehankar
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States of America
- Lewis R. Roberts
  - Department of Internal Medicine, Mayo Clinic, Rochester, MN, United States of America
- Ravishankar K. Iyer
  - Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL, United States of America
- Jean-Pierre Kocher
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States of America
3
Chen X, Li D. Sequencing facility and DNA source associated patterns of virus-mappable reads in whole-genome sequencing data. Genomics 2021; 113:1189-1198. [PMID: 33301893 PMCID: PMC7856238 DOI: 10.1016/j.ygeno.2020.12.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/25/2020] [Revised: 11/25/2020] [Accepted: 12/04/2020] [Indexed: 12/12/2022]
Abstract
Numerous viral sequences have been reported in the whole-genome sequencing (WGS) data of human blood. However, it is not clear to what degree the virus-mappable reads represent true viral sequences rather than random-mapping or noise originating from sample preparation, sequencing processes, or other sources. Identification of patterns of virus-mappable reads may generate novel indicators for evaluating the origins of these viral sequences. We characterized paired-end unmapped reads and reads aligned to viral references in human WGS datasets, then compared patterns of the virus-mappable reads among DNA sources and sequencing facilities which produced these datasets. We then examined potential origins of the source- and facility-associated viral reads. The proportions of clean unmapped reads among the seven sequencing facilities were significantly different (P < 2 × 10⁻¹⁶). We identified 260,339 reads that were mappable to a total of 99 viral references in 2535 samples. The majority (86.7%) of these virus-mappable reads (corresponding to 47 viral references), which can be classified into four groups based on their distinct patterns, were strongly associated with sequencing facility or DNA source (adjusted P value < 0.01). Possible origins of these reads include artificial sequences in library preparation, recombinant vectors in cell culture, and phages co-contaminated with their host bacteria. The sequencing facility-associated virus-mappable reads and patterns were repeatedly observed in other datasets produced in the same facilities. We have constructed an analytic framework and profiled the unmapped reads mappable to viral references. The results provide a new understanding of sequencing facility- and DNA source-associated batch effects in deep sequencing data and may facilitate improved bioinformatics filtering of reads.
Affiliation(s)
- Xun Chen
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT 05405, USA
- Dawei Li
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT 05405, USA
  - Department of Computer Science, University of Vermont, Burlington, VT 05405, USA
  - Neuroscience, Behavior, Health Initiative, University of Vermont, Burlington, VT 05405, USA
4
Chen X, Kost J, Sulovari A, Wong N, Liang WS, Cao J, Li D. A virome-wide clonal integration analysis platform for discovering cancer viral etiology. Genome Res 2019; 29:819-830. [PMID: 30872350 PMCID: PMC6499315 DOI: 10.1101/gr.242529.118] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 07/31/2018] [Accepted: 03/11/2019] [Indexed: 12/31/2022]
Abstract
Oncoviral infection is responsible for 12%–15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contamination and noncausal viruses complicate the process of identifying genuine oncoviruses. Here, we propose a novel strategy to address these challenges by performing virome-wide screening of early-stage clonal viral integrations. To implement this strategy, we developed VIcaller, a novel platform for identifying viral integrations that are derived from any characterized viruses and shared by a large proportion of tumor cells using whole-genome sequencing (WGS) data. The sensitivity and precision were confirmed with simulated and benchmark cancer data sets. By applying this platform to cancer WGS data sets with proven or speculated viral etiology, we newly identified or confirmed clonal integrations of hepatitis B virus (HBV), human papillomavirus (HPV), Epstein-Barr virus (EBV), and BK Virus (BKV), suggesting the involvement of these viruses in early stages of tumorigenesis in affected tumors, such as HBV in TERT and KMT2B (also known as MLL4) gene loci in liver cancer, HPV and BKV in bladder cancer, and EBV in non-Hodgkin's lymphoma. We also showed the capacity of VIcaller to identify integrations from some uncharacterized viruses. This is the first study to systematically investigate the strategy and method of virome-wide screening of clonal integrations to identify oncoviruses. Searching clonal viral integrations with our platform has the capacity to identify virus-caused cancers and discover cancer viral etiologies.
Affiliation(s)
- Xun Chen
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Arvis Sulovari
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Nathalie Wong
  - Department of Anatomical and Cellular Pathology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, NT, Hong Kong 999077, P.R. China
- Winnie S Liang
  - Translational Genomics Research Institute, Phoenix, Arizona 85004, USA
- Jian Cao
  - Division of Medical Oncology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
  - Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
- Dawei Li
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
  - Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA
  - Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA