1
Shukla HG, Chakraborty M, Emerson J. Genetic variation in recalcitrant repetitive regions of the Drosophila melanogaster genome. bioRxiv 2024:2024.06.11.598575. [PMID: 38915508 PMCID: PMC11195212 DOI: 10.1101/2024.06.11.598575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
Many essential functions of organisms are encoded in highly repetitive genomic regions, including histones involved in DNA packaging, centromeres that are core components of chromosome segregation, ribosomal RNA comprising the protein translation machinery, telomeres that ensure chromosome integrity, piRNA clusters encoding host defenses against selfish elements, and virtually the entire Y chromosome. These regions, formed by highly similar tandem arrays, pose significant challenges for experimental and informatic study, impeding sequence-level descriptions essential for understanding genetic variation. Here, we report the assembly and variation analysis of such repetitive regions in Drosophila melanogaster, offering significant improvements to the existing community reference assembly. Our work successfully recovers previously elusive segments, including complete reconstructions of the histone locus and the pericentric heterochromatin of the X chromosome, spanning the Stellate locus to the distal flank of the rDNA cluster. To infer structural changes in these regions where alignments are often not practicable, we introduce landmark anchors based on unique variants that are putatively orthologous. These regions display considerable structural variation between different D. melanogaster strains, exhibiting differences in copy number and organization of homologous repeat units between haplotypes. In the histone cluster, although we observe minimal genetic exchange indicative of crossing over, the variation patterns suggest mechanisms such as unequal sister chromatid exchange. We also examine the prevalence and scale of concerted evolution in the histone and Stellate clusters and discuss the mechanisms underlying these observed patterns.
Affiliation(s)
- Harsh G. Shukla
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
  - Graduate Program in Mathematical, Computational and Systems Biology, University of California Irvine, Irvine, California 92697, USA
- Mahul Chakraborty
  - Department of Biology, Texas A&M University, College Station, Texas 77843, USA
- J.J. Emerson
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA
2
Lin MJ, Iyer S, Chen NC, Langmead B. Measuring, visualizing, and diagnosing reference bias with biastools. Genome Biol 2024; 25:101. [PMID: 38641647 PMCID: PMC11027314 DOI: 10.1186/s13059-024-03240-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 04/04/2024] [Indexed: 04/21/2024] Open
Abstract
Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.
Affiliation(s)
- Mao-Jan Lin
  - Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Sheila Iyer
  - Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Nae-Chyun Chen
  - Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Ben Langmead
  - Department of Computer Science, Johns Hopkins University, Baltimore, USA
3
Zhou Q, Ji F, Lin D, Liu X, Zhu Z, Ruan J. KSNP: a fast de Bruijn graph-based haplotyping tool approaching data-in time cost. Nat Commun 2024; 15:3126. [PMID: 38605047 PMCID: PMC11009271 DOI: 10.1038/s41467-024-47562-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 04/04/2024] [Indexed: 04/13/2024] Open
Abstract
Long reads that cover more variants per read raise opportunities for accurate haplotype construction, whereas the genotype errors of single nucleotide polymorphisms pose great computational challenges for haplotyping tools. Here we introduce KSNP, an efficient haplotype construction tool based on the de Bruijn graph (DBG). KSNP leverages the ability of DBG in handling high-throughput erroneous reads to tackle the challenges. Compared to other notable tools in this field, KSNP achieves at least 5-fold speedup while producing comparable haplotype results. The time required for assembling human haplotypes is reduced to nearly the data-in time.
Affiliation(s)
- Qian Zhou
  - PengCheng Laboratory, Shenzhen, China
- Fahu Ji
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Dongxiao Lin
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Xianming Liu
  - PengCheng Laboratory, Shenzhen, China
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Zexuan Zhu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
- Jue Ruan
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
4
De Battista D, Yakymi R, Scheibe E, Sato S, Gerstein H, Markowitz TE, Lack J, Mereu R, Manieli C, Zamboni F, Farci P. Identification of Two Distinct Immune Subtypes in Hepatitis B Virus (HBV)-Associated Hepatocellular Carcinoma (HCC). Cancers (Basel) 2024; 16:1370. [PMID: 38611048 PMCID: PMC11011136 DOI: 10.3390/cancers16071370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/12/2024] [Accepted: 03/28/2024] [Indexed: 04/14/2024] Open
Abstract
HBV is the most common risk factor for HCC development, accounting for almost 50% of cases worldwide. Despite significant advances in immunotherapy, there is limited information on the HBV-HCC tumor microenvironment (TME), which may influence the response to checkpoint inhibitors. Here, we characterize the TME in a unique series of liver specimens from HBV-HCC patients to identify who might benefit from immunotherapy. By combining an extensive immunohistochemistry analysis with the transcriptomic profile of paired liver samples (tumor vs. nontumorous tissue) from 12 well-characterized Caucasian patients with HBV-HCC, we identified two distinct tumor subtypes that we defined immune-high and immune-low. The immune-high subtype, seen in half of the patients, is characterized by a high number of infiltrating B and T cells in association with stromal activation and a transcriptomic profile featuring inhibition of antigen presentation and CTL activation. All the immune-high tumors expressed high levels of CTLA-4 and low levels of PD-1, while PD-L1 was present only in four of six cases. In contrast, the immune-low subtype shows significantly lower lymphocyte infiltration and stromal activation. By whole exome sequencing, we documented that four out of six individuals with the immune-low subtype had missense mutations in the CTNNB1 gene, while only one patient had mutations in this gene in the immune-high subtype. Outside the tumor, there were no differences between the two subtypes. This study identifies two distinctive immune subtypes in HBV-associated HCC, regardless of the microenvironment observed in the surrounding nontumorous tissue, providing new insights into pathogenesis. These findings may be instrumental in the identification of patients who might benefit from immunotherapy.
Affiliation(s)
- Davide De Battista
  - Hepatic Pathogenesis Section, Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Rylee Yakymi
  - Hepatic Pathogenesis Section, Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Evangeline Scheibe
  - Hepatic Pathogenesis Section, Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Shinya Sato
  - Hepatic Pathogenesis Section, Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Hannah Gerstein
  - Hepatic Pathogenesis Section, Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Tovah E. Markowitz
  - Integrated Data Sciences Section, Research Technologies Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Justin Lack
  - NIAID Collaborative Bioinformatics Resource, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Roberto Mereu
  - Department of Surgery, Liver Transplantation Center, Azienda Ospedaliera Brotzu, 09047 Cagliari, Italy
- Cristina Manieli
  - Servizio di Anatomia Patologica, Azienda Ospedaliera Brotzu, 09047 Cagliari, Italy
- Fausto Zamboni
  - Department of Surgery, Liver Transplantation Center, Azienda Ospedaliera Brotzu, 09047 Cagliari, Italy
- Patrizia Farci
  - Hepatic Pathogenesis Section, Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
5
Ergun MA, Cinal O, Bakışlı B, Emül AA, Baysan M. COSAP: Comparative Sequencing Analysis Platform. BMC Bioinformatics 2024; 25:130. [PMID: 38532317 DOI: 10.1186/s12859-024-05756-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 03/20/2024] [Indexed: 03/28/2024] Open
Abstract
BACKGROUND: Recent improvements in sequencing technologies enabled detailed profiling of genomic features. These technologies mostly rely on short reads which are merged and compared to reference genome for variant identification. These operations should be done with computers due to the size and complexity of the data. The need for analysis software resulted in many programs for mapping, variant calling and annotation steps. Currently, most programs are either expensive enterprise software with proprietary code which makes access and verification very difficult or open-access programs that are mostly based on command-line operations without user interfaces and extensive documentation. Moreover, a high level of disagreement is observed among popular mapping and variant calling algorithms in multiple studies, which makes relying on a single algorithm unreliable. User-friendly open-source software tools that offer comparative analysis are an important need considering the growth of sequencing technologies. RESULTS: Here, we propose Comparative Sequencing Analysis Platform (COSAP), an open-source platform that provides popular sequencing algorithms for SNV, indel, structural variant calling, copy number variation, microsatellite instability and fusion analysis and their annotations. COSAP is packed with a fully functional user-friendly web interface and a backend server which allows full independent deployment for both individual and institutional scales. COSAP is developed as a workflow management system and designed to enhance cooperation among scientists with different backgrounds. It is publicly available at https://cosap.bio and https://github.com/MBaysanLab/cosap/. The source code of the frontend and backend services can be found at https://github.com/MBaysanLab/cosap-webapi/ and https://github.com/MBaysanLab/cosap_frontend/ respectively. All services are packed as Docker containers as well. Pipelines that combine algorithms can be customized and new algorithms can be added with minimal coding through modular structure. CONCLUSIONS: COSAP simplifies and speeds up the process of DNA sequencing analyses providing commonly used algorithms for SNV, indel, structural variant calling, copy number variation, microsatellite instability and fusion analysis as well as their annotations. COSAP is packed with a fully functional user-friendly web interface and a backend server which allows full independent deployment for both individual and institutional scales. Standardized implementations of popular algorithms in a modular platform make comparisons much easier to assess the impact of alternative pipelines which is crucial in establishing reproducibility of sequencing analyses.
Affiliation(s)
- Mehmet Arif Ergun
  - Department of Computer Engineering, Istanbul Technical University, 34469, Istanbul, Turkey
- Omer Cinal
  - Department of Computer Engineering, Istanbul Technical University, 34469, Istanbul, Turkey
- Berkant Bakışlı
  - Department of Computer Engineering, Istanbul Technical University, 34469, Istanbul, Turkey
- Abdullah Asım Emül
  - Department of Computer Engineering, Istanbul Technical University, 34469, Istanbul, Turkey
- Mehmet Baysan
  - Department of Computer Engineering, Istanbul Technical University, 34469, Istanbul, Turkey
6
Dufort y Álvarez G, Xargay-Ferrer M, Pagès-Zamora A, Ochoa I. EMVC-2: an efficient single-nucleotide variant caller based on expectation maximization. Bioinformatics 2024; 40:btad681. [PMID: 37963064 PMCID: PMC10919945 DOI: 10.1093/bioinformatics/btad681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 10/18/2023] [Accepted: 11/13/2023] [Indexed: 11/16/2023] Open
Abstract
MOTIVATION: Single-nucleotide variants (SNVs) are the most common type of genetic variation in the human genome. Accurate and efficient detection of SNVs from next-generation sequencing (NGS) data is essential for various applications in genomics and personalized medicine. However, SNV calling methods usually suffer from high computational complexity and limited accuracy. In this context, there is a need for new methods that overcome these limitations and provide fast reliable results. RESULTS: We present EMVC-2, a novel method for SNV calling from NGS data. EMVC-2 uses a multi-class ensemble classification approach based on the expectation-maximization algorithm that infers at each locus the most likely genotype from multiple labels provided by different learners. The inferred variants are then validated by a decision tree that filters out unlikely ones. We evaluate EMVC-2 on several publicly available real human NGS data for which the set of SNVs is available, and demonstrate that it outperforms state-of-the-art variant callers in terms of accuracy and speed, on average. AVAILABILITY AND IMPLEMENTATION: EMVC-2 is coded in C and Python, and is freely available for download at: https://github.com/guilledufort/EMVC-2. EMVC-2 is also available in Bioconda.
Affiliation(s)
- Martí Xargay-Ferrer
  - SPCOM Group, Universitat Politècnica de Catalunya – BarcelonaTech (UPC), 08034 Barcelona, Spain
- Alba Pagès-Zamora
  - SPCOM Group, Universitat Politècnica de Catalunya – BarcelonaTech (UPC), 08034 Barcelona, Spain
- Idoia Ochoa
  - Department of Electrical Engineering, Tecnun, University of Navarra, 20018 Donostia, Spain
7
Simpson JT. Detecting Somatic Mutations Without Matched Normal Samples Using Long Reads. bioRxiv 2024:2024.02.26.582089. [PMID: 38464143 PMCID: PMC10925087 DOI: 10.1101/2024.02.26.582089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
DNA sequencing of tumours to identify somatic mutations has become a critical tool to guide the type of treatment given to cancer patients. The gold standard for mutation calling is comparing sequencing data from the tumour to a matched normal sample to avoid mis-classifying inherited SNPs as mutations. This procedure works extremely well, but in certain situations only a tumour sample is available. While approaches have been developed to find mutations without a matched normal, they have limited accuracy or require specific types of input data (e.g. ultra-deep sequencing). Here we explore the application of single molecule long read sequencing to calling somatic mutations without matched normal samples. We develop a simple theoretical framework to show how haplotype phasing is an important source of information for determining whether a variant is a somatic mutation. We then use simulations to assess the range of experimental parameters (tumour purity, sequencing depth) where this approach is effective. These ideas are developed into a prototype somatic mutation caller, smrest, and its use is demonstrated on two highly mutated cancer cell lines. Finally, we argue that this approach has potential to measure clinically important biomarkers that are based on the genome-wide distribution of mutations: tumour mutation burden and mutation signatures.
Affiliation(s)
- Jared T. Simpson
  - Ontario Institute for Cancer Research, Toronto, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
8
Lin MJ, Iyer S, Chen NC, Langmead B. Measuring, visualizing and diagnosing reference bias with biastools. bioRxiv 2024:2023.09.13.557552. [PMID: 37745608 PMCID: PMC10515925 DOI: 10.1101/2023.09.13.557552] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/26/2023]
Abstract
Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios, i.e. (a) when the donor's variants are known and reads are simulated, (b) when donor variants are known and reads are real, and (c) when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.
Affiliation(s)
- Mao-Jan Lin
  - Department of Computer Science, Johns Hopkins University
- Sheila Iyer
  - Department of Computer Science, Johns Hopkins University
- Nae-Chyun Chen
  - Department of Computer Science, Johns Hopkins University
- Ben Langmead
  - Department of Computer Science, Johns Hopkins University
9
Barbitoff YA, Ushakov MO, Lazareva TE, Nasykhova YA, Glotov AS, Predeus AV. Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges. Brief Bioinform 2024; 25:bbad508. [PMID: 38271481 PMCID: PMC10810331 DOI: 10.1093/bib/bbad508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.
Affiliation(s)
- Yury A Barbitoff
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
  - Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
- Mikhail O Ushakov
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Tatyana E Lazareva
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Yulia A Nasykhova
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Andrey S Glotov
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Alexander V Predeus
  - Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
10
Gartner V, Redelings BD, Gaither C, Parr JB, Kalonji A, Phanzu F, Brazeau NF, Juliano JJ, Wray GA. Genomic insights into Plasmodium vivax population structure and diversity in central Africa. Malar J 2024; 23:27. [PMID: 38238806 PMCID: PMC10797969 DOI: 10.1186/s12936-024-04852-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 01/12/2024] [Indexed: 01/22/2024] Open
Abstract
BACKGROUND: Though Plasmodium vivax is the second most common malaria species to infect humans, it has not traditionally been considered a major human health concern in central Africa given the high prevalence of the human Duffy-negative phenotype that is believed to prevent infection. Increasing reports of asymptomatic and symptomatic infections in Duffy-negative individuals throughout Africa raise the possibility that P. vivax is evolving to evade host resistance, but there are few parasite samples with genomic data available from this part of the world. METHODS: Whole genome sequencing of one new P. vivax isolate from the Democratic Republic of the Congo (DRC) was performed and used in population genomics analyses to assess how this central African isolate fits into the global context of this species. RESULTS: Plasmodium vivax from DRC is similar to other African populations and is not closely related to the non-human primate parasite P. vivax-like. Evidence is found for a duplication of the gene PvDBP and a single copy of PvDBP2. CONCLUSION: These results suggest an endemic P. vivax population is present in central Africa. Intentional sampling of P. vivax across Africa would further contextualize this sample within African P. vivax diversity and shed light on the mechanisms of infection in Duffy negative individuals. These results are limited by the uncertainty of how representative this single sample is of the larger population of P. vivax in central Africa.
Affiliation(s)
- Valerie Gartner
  - Biology Department, Duke University, Durham, NC, 27708, USA
  - University Program in Genetics and Genomics, Duke University, Durham, NC, 27708, USA
- Benjamin D Redelings
  - Biology Department, Duke University, Durham, NC, 27708, USA
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, 66045, USA
  - Ronin Institute, Durham, NC, 27705, USA
- Albert Kalonji
  - SANRU Asbl, 149 A/B, Boulevard du 30 Juin, Kinshasa, Gombe, Democratic Republic of Congo
- Fernandine Phanzu
  - SANRU Asbl, 149 A/B, Boulevard du 30 Juin, Kinshasa, Gombe, Democratic Republic of Congo
- Gregory A Wray
  - Biology Department, Duke University, Durham, NC, 27708, USA
11
Ho GY, Vandenberg CJ, Lim R, Christie EL, Garsed DW, Lieschke E, Nesic K, Kondrashova O, Ratnayake G, Radke M, Penington JS, Carmagnac A, Heong V, Kyran EL, Zhang F, Traficante N, Huang R, Dobrovic A, Swisher EM, McNally O, Kee D, Wakefield MJ, Papenfuss AT, Bowtell DDL, Barker HE, Scott CL. The microtubule inhibitor eribulin demonstrates efficacy in platinum-resistant and refractory high-grade serous ovarian cancer patient-derived xenograft models. Ther Adv Med Oncol 2023; 15:17588359231208674. [PMID: 38028140 PMCID: PMC10666702 DOI: 10.1177/17588359231208674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 09/25/2023] [Indexed: 12/01/2023] Open
Abstract
Background: Despite initial response to platinum-based chemotherapy and PARP inhibitor therapy (PARPi), nearly all recurrent high-grade serous ovarian cancer (HGSC) will acquire lethal drug resistance; indeed, ~15% of individuals have de novo platinum-refractory disease. Objectives: To determine the potential of anti-microtubule agent (AMA) therapy (paclitaxel, vinorelbine and eribulin) in platinum-resistant or refractory (PRR) HGSC by assessing response in patient-derived xenograft (PDX) models of HGSC. Design and methods: Of 13 PRR HGSC PDX, six were primary PRR, derived from chemotherapy-naïve samples (one was BRCA2 mutant) and seven were from samples obtained following chemotherapy treatment in the clinic (five were mutant for either BRCA1 or BRCA2 (BRCA1/2), four with prior PARPi exposure), recapitulating the population of individuals with aggressive treatment-resistant HGSC in the clinic. Molecular analyses and in vivo treatment studies were undertaken. Results: Seven out of thirteen PRR PDX (54%) were sensitive to treatment with the AMA, eribulin (time to progressive disease (PD) ⩾100 days from the start of treatment) and 11 out of 13 PDX (85%) derived significant benefit from eribulin [time to harvest (TTH) for each PDX with p < 0.002]. In 5 out of 10 platinum-refractory HGSC PDX (50%) and one out of three platinum-resistant PDX (33%), eribulin was more efficacious than was cisplatin, with longer time to PD and significantly extended TTH (each PDX p < 0.02). Furthermore, four of these models were extremely sensitive to all three AMA tested, maintaining response until the end of the experiment (120d post-treatment start). Despite harbouring secondary BRCA2 mutations, two BRCA2-mutant PDX models derived from heavily pre-treated individuals were sensitive to AMA. PRR HGSC PDX models showing greater sensitivity to AMA had high proliferative indices and oncogene expression. Two PDX models, both with prior chemotherapy and/or PARPi exposure, were refractory to all AMA, one of which harboured the SLC25A40-ABCB1 fusion, known to upregulate drug efflux via MDR1. Conclusion: The efficacy observed for eribulin in PRR HGSC PDX was similar to that observed for paclitaxel, which transformed ovarian cancer clinical practice. Eribulin is therefore worthy of further consideration in clinical trials, particularly in ovarian carcinoma with early failure of carboplatin/paclitaxel chemotherapy.
Affiliation(s)
- Gwo Yaw Ho
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
  - The Royal Women’s Hospital, Parkville, VIC, Australia
  - School of Clinical Sciences, Monash University, Clayton Road, Clayton, VIC 3168, Australia
- Cassandra J. Vandenberg
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Ratana Lim
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Elizabeth L. Christie
  - Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Dale W. Garsed
  - Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Elizabeth Lieschke
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Ksenija Nesic
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Olga Kondrashova
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
  - QIMR Berghofer Medical Research Institute, Herston, QLD, Australia
- Marc Radke
  - University of Washington, Seattle, WA, USA
- Jocelyn S. Penington
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Amandine Carmagnac
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Valerie Heong
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Elizabeth L. Kyran
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Fan Zhang
  - Department of Surgery, Austin Health, University of Melbourne, Heidelberg, VIC, Australia
- Nadia Traficante
  - Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Alexander Dobrovic
  - Department of Surgery, Austin Health, University of Melbourne, Heidelberg, VIC, Australia
- Orla McNally
  - The Royal Women’s Hospital, Parkville, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
  - Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, Australia
- Damien Kee
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
  - Department of Medical Oncology, Austin Hospital, Heidelberg, VIC, Australia
- Matthew J. Wakefield
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, Australia
- Anthony T. Papenfuss
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
  - Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- David D. L. Bowtell
  - Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Holly E. Barker
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Clare L. Scott
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- The Royal Women’s Hospital, Parkville, VIC, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, Australia
|
12
|
Zhang B, Bassani-Sternberg M. Current perspectives on mass spectrometry-based immunopeptidomics: the computational angle to tumor antigen discovery. J Immunother Cancer 2023; 11:e007073. [PMID: 37899131 PMCID: PMC10619091 DOI: 10.1136/jitc-2023-007073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/21/2023] [Indexed: 10/31/2023] Open
Abstract
Identification of tumor antigens presented by the human leucocyte antigen (HLA) molecules is essential for the design of effective and safe cancer immunotherapies that rely on T cell recognition and killing of tumor cells. Mass spectrometry (MS)-based immunopeptidomics enables high-throughput, direct identification of HLA-bound peptides from a variety of cell lines, tumor tissues, and healthy tissues. It involves immunoaffinity purification of HLA complexes followed by MS profiling of the extracted peptides using data-dependent acquisition, data-independent acquisition, or targeted approaches. By incorporating DNA, RNA, and ribosome sequencing data into immunopeptidomics data analysis, the proteogenomic approach provides a powerful means for identifying tumor antigens encoded within the canonical open reading frames of annotated coding genes and non-canonical tumor antigens derived from presumably non-coding regions of our genome. We discuss emerging computational challenges in immunopeptidomics data analysis and tumor antigen identification, highlighting key considerations in the proteogenomics-based approach, including accurate DNA, RNA and ribosomal sequencing data analysis, careful incorporation of predicted novel protein sequences into the reference protein database, special quality control in MS data analysis due to the expanded and heterogeneous search space, cancer-specificity determination, and immunogenicity prediction. Advancements in technology and computation are continually enabling us to identify tumor antigens with higher sensitivity and accuracy, paving the way toward the development of more effective cancer immunotherapies.
Affiliation(s)
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
|
13
|
Liu E, Sudha P, Becker N, Jaouadi O, Suvannasankha A, Lee K, Abonour R, Abu Zaid M, Walker BA. Identifying novel mechanisms of biallelic TP53 loss refines poor outcome for patients with multiple myeloma. Blood Cancer J 2023; 13:144. [PMID: 37696786 PMCID: PMC10495448 DOI: 10.1038/s41408-023-00919-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 08/27/2023] [Accepted: 08/30/2023] [Indexed: 09/13/2023] Open
Abstract
Biallelic TP53 inactivation is the most important high-risk factor associated with poor survival in multiple myeloma. Classical biallelic TP53 inactivation has been defined as simultaneous mutation and copy number loss in most studies; however, numerous studies have demonstrated that other factors could lead to the inactivation of TP53. Here, we hypothesized that novel biallelic TP53 inactivated samples existed in the multiple myeloma population. A random forest regression model that exploited an expression signature of 16 differentially expressed genes between classical biallelic TP53 and TP53 wild-type samples was subsequently established and used to identify novel biallelic TP53 samples from monoallelic TP53 groups. The model reflected high accuracy and robust performance in newly diagnosed, relapsed, and refractory populations. Patient survival of classical and novel biallelic TP53 samples was consistently much worse than that of samples with mono-allelic or wild-type TP53 status. We also demonstrated that some predicted biallelic TP53 samples simultaneously had copy number loss and aberrant splicing, resulting in overexpression of high-risk transcript variants, leading to biallelic inactivation. We discovered that splice site mutation and overexpression of the splicing factor MED18 were reasons for aberrant splicing. Taken together, our study unveiled the complex transcriptome of TP53, some of which might benefit future studies targeting abnormal TP53.
Affiliation(s)
- Enze Liu
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Parvathi Sudha
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Nathan Becker
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Oumaima Jaouadi
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Attaya Suvannasankha
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Kelvin Lee
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Rafat Abonour
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Mohammad Abu Zaid
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
| | - Brian A Walker
- Melvin and Bren Simon Comprehensive Cancer Center, Division of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA.
- Center for Computational Biology and Bioinformatics, School of Medicine, Indiana University, Indianapolis, IN, USA.
|
14
|
Hodgins HP, Chen P, Lobb B, Wei X, Tremblay BJM, Mansfield MJ, Lee VCY, Lee PG, Coffin J, Duggan AT, Dolphin AE, Renaud G, Dong M, Doxey AC. Ancient Clostridium DNA and variants of tetanus neurotoxins associated with human archaeological remains. Nat Commun 2023; 14:5475. [PMID: 37673908 PMCID: PMC10482840 DOI: 10.1038/s41467-023-41174-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Accepted: 08/23/2023] [Indexed: 09/08/2023] Open
Abstract
The analysis of microbial genomes from human archaeological samples offers a historic snapshot of ancient pathogens and provides insights into the origins of modern infectious diseases. Here, we analyze metagenomic datasets from 38 human archaeological samples and identify bacterial genomic sequences related to modern-day Clostridium tetani, which produces the tetanus neurotoxin (TeNT) and causes the disease tetanus. These genomic assemblies had varying levels of completeness, and a subset of them displayed hallmarks of ancient DNA damage. Phylogenetic analyses revealed known C. tetani clades as well as potentially new Clostridium lineages closely related to C. tetani. The genomic assemblies encode 13 TeNT variants with unique substitution profiles, including a subgroup of TeNT variants found exclusively in ancient samples from South America. We experimentally tested a TeNT variant selected from an ancient Chilean mummy sample and found that it induced tetanus muscle paralysis in mice, with potency comparable to modern TeNT. Thus, our ancient DNA analysis identifies DNA from neurotoxigenic C. tetani in archaeological human samples, and a novel variant of TeNT that can cause disease in mammals.
Affiliation(s)
- Harold P Hodgins
- Department of Biology and the Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, ON, Canada
| | - Pengsheng Chen
- Department of Urology, Boston Children's Hospital, Boston, MA, USA
- Department of Surgery and Department of Microbiology, Harvard Medical School, Boston, MA, USA
| | - Briallen Lobb
- Department of Biology and the Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, ON, Canada
| | - Xin Wei
- Department of Biology and the Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, ON, Canada
| | - Benjamin J M Tremblay
- Department of Biology and the Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, ON, Canada
| | - Michael J Mansfield
- Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan
| | - Victoria C Y Lee
- Department of Biology and the Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, ON, Canada
| | - Pyung-Gang Lee
- Department of Urology, Boston Children's Hospital, Boston, MA, USA
- Department of Surgery and Department of Microbiology, Harvard Medical School, Boston, MA, USA
| | - Jeffrey Coffin
- Department of Anthropology, University of Waterloo, Waterloo, ON, Canada
| | - Ana T Duggan
- McMaster Ancient DNA Centre, Department of Anthropology, McMaster University, Hamilton, ON, Canada
| | - Alexis E Dolphin
- Department of Anthropology, University of Waterloo, Waterloo, ON, Canada
| | - Gabriel Renaud
- Department of Health Technology, Section of Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark.
| | - Min Dong
- Department of Urology, Boston Children's Hospital, Boston, MA, USA.
- Department of Surgery and Department of Microbiology, Harvard Medical School, Boston, MA, USA.
| | - Andrew C Doxey
- Department of Biology and the Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, ON, Canada.
|
15
|
Abdelmogod A, Papadopoulos L, Riordan S, Wong M, Weltman M, Lim R, McEvoy C, Fellowes A, Fox S, Bedő J, Penington J, Pham K, Hofmann O, Vissers JHA, Grimmond S, Ratnayake G, Christie M, Mitchell C, Murray WK, McClymont K, Luk P, Papenfuss AT, Kee D, Scott CL, Goldstein D, Barker HE. A Matched Molecular and Clinical Analysis of the Epithelioid Haemangioendothelioma Cohort in the Stafford Fox Rare Cancer Program and Contextual Literature Review. Cancers (Basel) 2023; 15:4378. [PMID: 37686662 PMCID: PMC10487006 DOI: 10.3390/cancers15174378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 08/21/2023] [Accepted: 08/29/2023] [Indexed: 09/10/2023] Open
Abstract
BACKGROUND Epithelioid haemangioendothelioma (EHE) is an ultra-rare malignant vascular tumour with a prevalence of 1 per 1,000,000. It is typically molecularly characterised by a WWTR1::CAMTA1 gene fusion in approximately 90% of cases, or a YAP1::TFE3 gene fusion in approximately 10% of cases. EHE cases are typically refractory to therapies, and no anticancer agents are reimbursed for EHE in Australia. METHODS We report a cohort of nine EHE cases with comprehensive histologic and molecular profiling from the Walter and Eliza Hall Institute of Medical Research Stafford Fox Rare Cancer Program (WEHI-SFRCP) collated via nation-wide referral to the Australian Rare Cancer (ARC) Portal. The diagnoses of EHE were confirmed by histopathological and immunohistochemical (IHC) examination. Molecular profiling was performed using the TruSight Oncology 500 assay, the TruSight RNA fusion panel, whole genome sequencing (WGS), or whole exome sequencing (WES). RESULTS Molecular analysis of RNA, DNA or both was possible in seven of nine cases. The WWTR1::CAMTA1 fusion was identified in five cases. The YAP1::TFE3 fusion was identified in one case, demonstrating unique morphology compared to cases with the more common WWTR1::CAMTA1 fusion. All tumours expressed typical endothelial markers CD31, ERG, and CD34 and were negative for pan-cytokeratin. Cases with a WWTR1::CAMTA1 fusion displayed high expression of CAMTA1 and the single case with a YAP1::TFE3 fusion displayed high expression of TFE3. Survival was highly variable and unrelated to molecular profile. CONCLUSIONS This cohort of EHE cases provides molecular and histopathological characterisation and matching clinical information that emphasises the molecular patterns and variable clinical outcomes and adds to our knowledge of this ultra-rare cancer. Such information from multiple studies will advance our understanding, potentially improving treatment options.
Affiliation(s)
- Arwa Abdelmogod
- Limestone Coast Local Health Network, Flinders University, Bedford Park, SA 5042, Australia;
| | - Lia Papadopoulos
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
- The Australian Rare Cancer Portal, BioGrid, Parkville, VIC 3051, Australia;
- Eastern Health Clinical School, Monash University, Box Hill, VIC 3128, Australia
| | - Stephen Riordan
- Prince of Wales Clinical School, University of NSW, Randwick, NSW 2031, Australia;
- Gastrointestinal and Liver Unit, Prince of Wales Hospital, Randwick, NSW 2031, Australia
| | - Melvin Wong
- Radiology Department, Prince of Wales Hospital, Randwick, NSW 2031, Australia;
| | - Martin Weltman
- Department of Gastroenterology, Nepean Hospital, Kingswood, NSW 2747, Australia;
| | - Ratana Lim
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
| | - Christopher McEvoy
- Peter MacCallum Cancer Centre, Melbourne, VIC 3000, Australia; (C.M.); (A.F.)
| | - Andrew Fellowes
- Peter MacCallum Cancer Centre, Melbourne, VIC 3000, Australia; (C.M.); (A.F.)
| | - Stephen Fox
- Peter MacCallum Cancer Centre, Melbourne, VIC 3000, Australia; (C.M.); (A.F.)
| | - Justin Bedő
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
| | - Jocelyn Penington
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
| | - Kym Pham
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Melbourne, VIC 3010, Australia; (K.P.); (O.H.); (J.H.A.V.); (S.G.)
| | - Oliver Hofmann
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Melbourne, VIC 3010, Australia; (K.P.); (O.H.); (J.H.A.V.); (S.G.)
| | - Joseph H. A. Vissers
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Melbourne, VIC 3010, Australia; (K.P.); (O.H.); (J.H.A.V.); (S.G.)
| | - Sean Grimmond
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Melbourne, VIC 3010, Australia; (K.P.); (O.H.); (J.H.A.V.); (S.G.)
| | | | | | - Catherine Mitchell
- Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, VIC 3000, Australia; (C.M.); (W.K.M.)
| | - William K. Murray
- Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, VIC 3000, Australia; (C.M.); (W.K.M.)
| | - Kelly McClymont
- Sullivan Nicolaides Pathology, Brisbane, QLD 4000, Australia;
| | - Peter Luk
- Royal Prince Alfred Hospital, Camperdown, NSW 2050, Australia;
| | - Anthony T. Papenfuss
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
- Department of Gastroenterology, Nepean Hospital, Kingswood, NSW 2747, Australia;
- Sir Peter MacCallum Cancer Centre, Department of Oncology, University of Melbourne, Parkville, VIC 3000, Australia
| | - Damien Kee
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
- The Australian Rare Cancer Portal, BioGrid, Parkville, VIC 3051, Australia;
- Sir Peter MacCallum Cancer Centre, Department of Oncology, University of Melbourne, Parkville, VIC 3000, Australia
- Austin Health, Heidelberg, VIC 3084, Australia
| | - Clare L. Scott
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
- The Australian Rare Cancer Portal, BioGrid, Parkville, VIC 3051, Australia;
- The Royal Women’s Hospital, Parkville, VIC 3052, Australia;
- Sir Peter MacCallum Cancer Centre, Department of Oncology, University of Melbourne, Parkville, VIC 3000, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC 3010, Australia
- Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC 3010, Australia
| | - David Goldstein
- The Australian Rare Cancer Portal, BioGrid, Parkville, VIC 3051, Australia;
- Eastern Health Clinical School, Monash University, Box Hill, VIC 3128, Australia
- Nelune Center, Prince of Wales Hospital, Randwick, NSW 2031, Australia
| | - Holly E. Barker
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; (L.P.); (R.L.); (J.B.); (J.P.); (A.T.P.); (D.K.); (C.L.S.)
- Department of Medical Biology, University of Melbourne, Melbourne, VIC 3010, Australia
|
16
|
Berger SI, Pitsava G, Cohen AJ, Délot EC, LoTempio J, Andrew EH, Martin GM, Marmolejos S, Albert J, Meltzer B, Fraser J, Regier DS, Kahn-Kirby AH, Smith E, Knoblach S, Ko A, Fusaro VA, Vilain E. Increased diagnostic yield from negative whole genome-slice panels using automated reanalysis. Clin Genet 2023; 104:377-383. [PMID: 37194472 PMCID: PMC10524710 DOI: 10.1111/cge.14360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/04/2023] [Accepted: 05/04/2023] [Indexed: 05/18/2023]
Abstract
We evaluated the diagnostic yield using genome-slice panel reanalysis in the clinical setting using an automated phenotype/gene ranking system. We analyzed whole genome sequencing (WGS) data produced from clinically ordered panels built as bioinformatic slices for 16 clinically diverse, undiagnosed cases referred to the Pediatric Mendelian Genomics Research Center, an NHGRI-funded GREGoR Consortium site. Genome-wide reanalysis was performed using Moon™, a machine-learning-based tool for variant prioritization. In five out of 16 cases, we discovered a potentially clinically significant variant. In four of these cases, the variant was found in a gene not included in the original panel due to phenotypic expansion of a disorder or incomplete initial phenotyping of the patient. In the fifth case, the gene containing the variant was included in the original panel, but being a complex structural rearrangement with intronic breakpoints outside the clinically analyzed regions, it was not initially identified. Automated genome-wide reanalysis of clinical WGS data generated during targeted panels testing yielded a 25% increase in diagnostic findings and a possibly clinically relevant finding in one additional case, underscoring the added value of analyses versus those routinely performed in the clinical setting.
Affiliation(s)
- Seth I. Berger
- Children’s National Rare Disease Institute, Division of Genetics and Metabolism, Washington, DC, USA
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | - Georgia Pitsava
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | - Andrea J. Cohen
- Children’s National Rare Disease Institute, Division of Genetics and Metabolism, Washington, DC, USA
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Emmanuèle C. Délot
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
- Department of Genomics and Precision Medicine, George Washington University, Washington, DC, USA
| | - Jonathan LoTempio
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
- Department of Genomics and Precision Medicine, George Washington University, Washington, DC, USA
| | - Erin Hallie Andrew
- Children’s National Rare Disease Institute, Division of Genetics and Metabolism, Washington, DC, USA
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | | | - Sofia Marmolejos
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | - Jessica Albert
- Molecular Diagnostics Laboratories, Children’s National Hospital, Washington, DC, USA
| | - Beatrix Meltzer
- Molecular Diagnostics Laboratories, Children’s National Hospital, Washington, DC, USA
| | - Jamie Fraser
- Children’s National Rare Disease Institute, Division of Genetics and Metabolism, Washington, DC, USA
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | - Debra S. Regier
- Children’s National Rare Disease Institute, Division of Genetics and Metabolism, Washington, DC, USA
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | | | | | - Susan Knoblach
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | - Arthur Ko
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
| | | | - Eric Vilain
- Center for Genetic Medicine Research, Children’s National Research Institute, Washington, DC, USA
- Department of Genomics and Precision Medicine, George Washington University, Washington, DC, USA
- Institute for Clinical and Translational Science, University of California, Irvine, CA, USA
|
17
|
Nesic K, Krais JJ, Vandenberg CJ, Wang Y, Patel P, Cai KQ, Kwan T, Lieschke E, Ho GY, Barker HE, Bedo J, Casadei S, Farrell A, Radke M, Shield-Artin K, Penington JS, Geissler F, Kyran E, Zhang F, Dobrovic A, Olesen I, Kristeleit R, Oza A, Ratnayake G, Traficante N, DeFazio A, Bowtell DDL, Harding TC, Lin K, Swisher EM, Kondrashova O, Scott CL, Johnson N, Wakefield MJ. BRCA1 secondary splice-site mutations drive exon-skipping and PARP inhibitor resistance. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.03.20.23287465. [PMID: 36993400 PMCID: PMC10055590 DOI: 10.1101/2023.03.20.23287465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
BRCA1 splice isoforms Δ11 and Δ11q can contribute to PARP inhibitor (PARPi) resistance by splicing-out the mutation-containing exon, producing truncated, partially-functional proteins. However, the clinical impact and underlying drivers of BRCA1 exon skipping remain undetermined. We analyzed nine ovarian and breast cancer patient derived xenografts (PDX) with BRCA1 exon 11 frameshift mutations for exon skipping and therapy response, including a matched PDX pair derived from a patient pre- and post-chemotherapy/PARPi. BRCA1 exon 11 skipping was elevated in PARPi resistant PDX tumors. Two independent PDX models acquired secondary BRCA1 splice site mutations (SSMs), predicted in silico to drive exon skipping. Predictions were confirmed using qRT-PCR, RNA sequencing, western blots and BRCA1 minigene modelling. SSMs were also enriched in post-PARPi ovarian cancer patient cohorts from the ARIEL2 and ARIEL4 clinical trials. We demonstrate that SSMs drive BRCA1 exon 11 skipping and PARPi resistance, and should be clinically monitored, along with frame-restoring secondary mutations.
Affiliation(s)
- Ksenija Nesic
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | | | - Cassandra J. Vandenberg
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | | | | | | | - Tanya Kwan
- Clovis Oncology Inc., San Francisco, CA, USA
| | - Elizabeth Lieschke
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Gwo-Yaw Ho
- School of Clinical Sciences, Monash University, Clayton, Victoria, Australia
| | - Holly E. Barker
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Justin Bedo
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | | | - Andrew Farrell
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Marc Radke
- University of Washington, Seattle, WA, USA
| | - Kristy Shield-Artin
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Jocelyn S. Penington
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Franziska Geissler
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Elizabeth Kyran
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Fan Zhang
- University of Melbourne Department of Surgery, Austin Health, Heidelberg, Victoria, Australia
| | - Alexander Dobrovic
- University of Melbourne Department of Surgery, Austin Health, Heidelberg, Victoria, Australia
| | - Inger Olesen
- The Andrew Love Cancer Centre, Barwon Health, Geelong, Victoria, Australia
| | - Rebecca Kristeleit
- Department of Oncology, Guys and St Thomas’ NHS Foundation Trust, London, UK
- National Institute for Health Research, University College London Hospitals Clinical Research Facility, London, UK
| | - Amit Oza
- Princess Margaret Cancer Center, Toronto, ON, Canada
| | | | - Nadia Traficante
- Sir Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, VIC, Australia
| | | | - Anna DeFazio
- The Daffodil Centre, The University of Sydney, a joint venture with Cancer Council New South Wales, Sydney, New South Wales, Australia
- The Westmead Institute for Medical Research, Sydney, New South Wales, Australia
- Department of Gynecological Oncology, Westmead Hospital, Western Sydney Local Health District, New South Wales, Australia
| | - David D. L. Bowtell
- Sir Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, VIC, Australia
| | | | - Kevin Lin
- Clovis Oncology Inc., San Francisco, CA, USA
| | | | - Olga Kondrashova
- QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - Clare L. Scott
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Royal Women’s Hospital, Parkville, VIC, Australia
- Sir Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, VIC, Australia
- Department of Obstetrics and Gynecology, University of Melbourne, Parkville, VIC, Australia
| | | | - Matthew J. Wakefield
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Department of Obstetrics and Gynecology, University of Melbourne, Parkville, VIC, Australia
|
18
|
Seah YM, Stewart MK, Hoogestraat D, Ryder M, Cookson BT, Salipante SJ, Hoffman NG. In Silico Evaluation of Variant Calling Methods for Bacterial Whole-Genome Sequencing Assays. J Clin Microbiol 2023; 61:e0184222. [PMID: 37428072 PMCID: PMC10446864 DOI: 10.1128/jcm.01842-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 06/18/2023] [Indexed: 07/11/2023] Open
Abstract
Identification and analysis of clinically relevant strains of bacteria increasingly relies on whole-genome sequencing. The downstream bioinformatics steps necessary for calling variants from short-read sequences are well-established but seldom validated against haploid genomes. We devised an in silico workflow to introduce single nucleotide polymorphisms (SNPs) and indels into bacterial reference genomes, and computationally generate sequencing reads based on the mutated genomes. We then applied the method to Mycobacterium tuberculosis H37Rv, Staphylococcus aureus NCTC 8325, and Klebsiella pneumoniae HS11286, and used the synthetic reads as truth sets for evaluating several popular variant callers. Insertions proved especially challenging for most variant callers to correctly identify, relative to deletions and single nucleotide polymorphisms. With adequate read depth, however, variant callers that use high quality soft-clipped reads and base mismatches to perform local realignment consistently had the highest precision and recall in identifying insertions and deletions ranging from 1 to 50 bp. The remaining variant callers had lower recall values associated with identification of insertions greater than 20 bp.
Affiliation(s)
- Yee Mey Seah
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Mary K. Stewart
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Daniel Hoogestraat
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Molly Ryder
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Brad T. Cookson
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Department of Microbiology, University of Washington, Seattle, Washington, USA
- Stephen J. Salipante
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Noah G. Hoffman
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
19
Smith P, Bradley T, Gavarró LM, Goranova T, Ennis DP, Mirza HB, De Silva D, Piskorz AM, Sauer CM, Al-Khalidi S, Funingana IG, Reinius MAV, Giannone G, Lewsley LA, Stobo J, McQueen J, Bryson G, Eldridge M, Macintyre G, Markowetz F, Brenton JD, McNeish IA. The copy number and mutational landscape of recurrent ovarian high-grade serous carcinoma. Nat Commun 2023; 14:4387. [PMID: 37474499 PMCID: PMC10359414 DOI: 10.1038/s41467-023-39867-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/21/2022] [Accepted: 06/28/2023] [Indexed: 07/22/2023]
Abstract
The drivers of recurrence and resistance in ovarian high grade serous carcinoma remain unclear. We investigate the acquisition of resistance by collecting tumour biopsies from a cohort of 276 women with relapsed ovarian high grade serous carcinoma in the BriTROC-1 study. Panel sequencing shows close concordance between diagnosis and relapse, with only four discordant cases. There is also very strong concordance in copy number between diagnosis and relapse, with no significant difference in purity, ploidy or focal somatic copy number alterations, even when stratified by platinum sensitivity or prior chemotherapy lines. Copy number signatures are strongly correlated with immune cell infiltration, whilst diagnosis samples from patients with primary platinum resistance have increased rates of CCNE1 and KRAS amplification and copy number signature 1 exposure. Our data show that the ovarian high grade serous carcinoma genome is remarkably stable between diagnosis and relapse and acquired chemotherapy resistance does not select for common copy number drivers.
Affiliation(s)
- Philip Smith
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Thomas Bradley
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Teodora Goranova
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Darren P Ennis
- Ovarian Cancer Action Research Centre, Department of Surgery and Cancer, Imperial College London, London, UK
- Hasan B Mirza
- Ovarian Cancer Action Research Centre, Department of Surgery and Cancer, Imperial College London, London, UK
- Dilrini De Silva
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Anna M Piskorz
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Carolin M Sauer
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Ionut-Gabriel Funingana
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Marika A V Reinius
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Gaia Giannone
- Ovarian Cancer Action Research Centre, Department of Surgery and Cancer, Imperial College London, London, UK
- Liz-Anne Lewsley
- CRUK Glasgow Clinical Trials Unit, Institute of Cancer Sciences, University of Glasgow, Glasgow, UK
- Jamie Stobo
- CRUK Glasgow Clinical Trials Unit, Institute of Cancer Sciences, University of Glasgow, Glasgow, UK
- John McQueen
- CRUK Glasgow Clinical Trials Unit, Institute of Cancer Sciences, University of Glasgow, Glasgow, UK
- Gareth Bryson
- Department of Histopathology, Queen Elizabeth University Hospital, Glasgow, UK
- Matthew Eldridge
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Geoff Macintyre
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- Centro Nacional de Investigaciones Oncológicas, Madrid, Spain
- James D Brenton
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK.
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK.
- Iain A McNeish
- Ovarian Cancer Action Research Centre, Department of Surgery and Cancer, Imperial College London, London, UK.
20
Boßelmann CM, Leu C, Lal D. Technological and computational approaches to detect somatic mosaicism in epilepsy. Neurobiol Dis 2023:106208. [PMID: 37343892 DOI: 10.1016/j.nbd.2023.106208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2023] [Revised: 06/03/2023] [Accepted: 06/16/2023] [Indexed: 06/23/2023]
Abstract
Lesional epilepsy is a common and severe disease commonly associated with malformations of cortical development, including focal cortical dysplasia and hemimegalencephaly. Recent advances in sequencing and variant calling technologies have identified several genetic causes, including both short/single nucleotide and structural somatic variation. In this review, we aim to provide a comprehensive overview of the methodological advancements in this field while highlighting the unresolved technological and computational challenges that persist, including ultra-low variant allele fractions in bulk tissue, low availability of paired control samples, spatial variability of mutational burden within the lesion, and the issue of false-positive calls and validation procedures. Information from genetic testing in focal epilepsy may be integrated into clinical care to inform histopathological diagnosis, postoperative prognosis, and candidate precision therapies.
Affiliation(s)
- Christian M Boßelmann
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA
- Costin Leu
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Department of Clinical and Experimental Epilepsy, Institute of Neurology, University College London, London, UK.
- Dennis Lal
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and M.I.T., Cambridge, MA, USA; Cologne Center for Genomics (CCG), University of Cologne, Cologne, DE, USA
21
Chen NC, Kolesnikov A, Goel S, Yun T, Chang PC, Carroll A. Improving variant calling using population data and deep learning. BMC Bioinformatics 2023; 24:197. [PMID: 37173615 PMCID: PMC10182612 DOI: 10.1186/s12859-023-05294-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2022] [Accepted: 04/17/2023] [Indexed: 05/15/2023]
Abstract
Large-scale population variant data is often used to filter and aid interpretation of variant calls in a single sample. These approaches do not incorporate population information directly into the process of variant calling, and are often limited to filtering which trades recall for precision. In this study, we develop population-aware DeepVariant models with a new channel encoding allele frequencies from the 1000 Genomes Project. This model reduces variant calling errors, improving both precision and recall in single samples, and reduces rare homozygous and pathogenic clinvar calls cohort-wide. We assess the use of population-specific or diverse reference panels, finding the greatest accuracy with diverse panels, suggesting that large, diverse panels are preferable to individual populations, even when the population matches sample ancestry. Finally, we show that this benefit generalizes to samples with different ancestry from the training data even when the ancestry is also excluded from the reference panel.
Affiliation(s)
- Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
22
Dall G, Vandenberg CJ, Nesic K, Ratnayake G, Zhu W, Vissers JHA, Bedő J, Penington J, Wakefield MJ, Kee D, Carmagnac A, Lim R, Shield-Artin K, Milesi B, Lobley A, Kyran EL, O'Grady E, Tram J, Zhou W, Nugawela D, Stewart KP, Caldwell R, Papadopoulos L, Ng AP, Dobrovic A, Fox SB, McNally O, Power JD, Meniawy T, Tan TH, Collins IM, Klein O, Barnett S, Olesen I, Hamilton A, Hofmann O, Grimmond S, Papenfuss AT, Scott CL, Barker HE. Targeting homologous recombination deficiency in uterine leiomyosarcoma. J Exp Clin Cancer Res 2023; 42:112. [PMID: 37143137 PMCID: PMC10157936 DOI: 10.1186/s13046-023-02687-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 04/25/2023] [Indexed: 05/06/2023]
Abstract
BACKGROUND: Uterine leiomyosarcoma (uLMS) is a rare and aggressive gynaecological malignancy, with individuals with advanced uLMS having a five-year survival of <10%. Mutations in the homologous recombination (HR) DNA repair pathway have been observed in ~10% of uLMS cases, with reports of some individuals benefiting from poly (ADP-ribose) polymerase (PARP) inhibitor (PARPi) therapy, which targets this DNA repair defect. In this report, we screened individuals with uLMS, accrued nationally, for mutations in the HR repair pathway and explored new approaches to therapeutic targeting.
METHODS: A cohort of 58 individuals with uLMS were screened for HR Deficiency (HRD) using whole genome sequencing (WGS), whole exome sequencing (WES) or NGS panel testing. Individuals identified to have HRD uLMS were offered PARPi therapy and clinical outcome details collected. Patient-derived xenografts (PDX) were generated for therapeutic targeting.
RESULTS: All 13 uLMS samples analysed by WGS had a dominant COSMIC mutational signature 3; 11 of these had high genome-wide loss of heterozygosity (LOH) (>0.2) but only two samples had a CHORD score >50%, one of which had a homozygous pathogenic alteration in an HR gene (deletion in BRCA2). A further three samples harboured homozygous HRD alterations (all deletions in BRCA2), detected by WES or panel sequencing, with 5/58 (9%) individuals having HRD uLMS. All five individuals gained access to PARPi therapy. Two of three individuals with mature clinical follow up achieved a complete response or durable partial response (PR) with the subsequent addition of platinum to PARPi upon minor progression during initial PR on PARPi. Corresponding PDX responses were most rapid, complete and sustained with the PARP1-specific PARPi, AZD5305, compared with either olaparib alone or olaparib plus cisplatin, even in a paired sample of a BRCA2-deleted PDX, derived following PARPi therapy in the patient, which had developed PARPi-resistance mutations in PRKDC, encoding DNA-PKcs.
CONCLUSIONS: Our work demonstrates the value of identifying HRD for therapeutic targeting by PARPi and platinum in individuals with the aggressive rare malignancy, uLMS and suggests that individuals with HRD uLMS should be included in trials of PARP1-specific PARPi.
Affiliation(s)
- Genevieve Dall
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Cassandra J Vandenberg
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia.
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia.
- Ksenija Nesic
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Wenying Zhu
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Parkville, VIC, 3010, Australia
- Joseph H A Vissers
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Parkville, VIC, 3010, Australia
- Justin Bedő
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- School of Computing and Information Systems, the University of Melbourne, Parkville, VIC, 3010, Australia
- Jocelyn Penington
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Matthew J Wakefield
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Damien Kee
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Olivia Newton-John Cancer Research Institute, Heidelberg, VIC, 3084, Australia
- Austin Health, Heidelberg, VIC, 3084, Australia
- Australian Rare Cancer Portal, BioGrid Australia, Melbourne Health, Parkville, VIC, 3052, Australia
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Amandine Carmagnac
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Ratana Lim
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Kristy Shield-Artin
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Briony Milesi
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Royal Women's Hospital, Parkville, VIC, 3052, Australia
- Amanda Lobley
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Royal Women's Hospital, Parkville, VIC, 3052, Australia
- Elizabeth L Kyran
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Emily O'Grady
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Joshua Tram
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Warren Zhou
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Devindee Nugawela
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Kym Pham Stewart
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Parkville, VIC, 3010, Australia
- Reece Caldwell
- Australian Rare Cancer Portal, BioGrid Australia, Melbourne Health, Parkville, VIC, 3052, Australia
- Lia Papadopoulos
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Australian Rare Cancer Portal, BioGrid Australia, Melbourne Health, Parkville, VIC, 3052, Australia
- Ashley P Ng
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Royal Melbourne Hospital, Parkville, VIC, 3052, Australia
- Stephen B Fox
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Orla McNally
- Royal Women's Hospital, Parkville, VIC, 3052, Australia
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, 3010, Australia
- Jeremy D Power
- Launceston General Hospital, Launceston, TAS, 7250, Australia
- Tarek Meniawy
- University of Western Australia, Perth, WA, 6009, Australia
- Teng Han Tan
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Ian M Collins
- SouthWest Healthcare, Warrnambool, VIC, 3280, Australia
- Faculty of Health, School of Medicine, Deakin University, Warrnambool, VIC, 3280, Australia
- Oliver Klein
- Olivia Newton-John Cancer Research Institute, Heidelberg, VIC, 3084, Australia
- Austin Health, Heidelberg, VIC, 3084, Australia
- Stephen Barnett
- Royal Melbourne Hospital, Parkville, VIC, 3052, Australia
- Western Hospital, Footscray, VIC, 3011, Australia
- Inger Olesen
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- University Hospital Geelong, Geelong, VIC, 3220, Australia
- Anne Hamilton
- Royal Women's Hospital, Parkville, VIC, 3052, Australia
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Oliver Hofmann
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Parkville, VIC, 3010, Australia
- Sean Grimmond
- Centre for Cancer Research and Department of Clinical Pathology, University of Melbourne, Parkville, VIC, 3010, Australia
- Anthony T Papenfuss
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Clare L Scott
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
- Royal Women's Hospital, Parkville, VIC, 3052, Australia
- Australian Rare Cancer Portal, BioGrid Australia, Melbourne Health, Parkville, VIC, 3052, Australia
- Peter MacCallum Cancer Centre and Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, 3010, Australia
- Royal Melbourne Hospital, Parkville, VIC, 3052, Australia
- Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, 3010, Australia
- Holly E Barker
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC, 3052, Australia
23
Olson ND, Wagner J, Dwarshuis N, Miga KH, Sedlazeck FJ, Salit M, Zook JM. Variant calling and benchmarking in an era of complete human genome sequences. Nat Rev Genet 2023:10.1038/s41576-023-00590-0. [PMID: 37059810 DOI: 10.1038/s41576-023-00590-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Accepted: 02/22/2023] [Indexed: 04/16/2023]
Abstract
Genetic variant calling from DNA sequencing has enabled understanding of germline variation in hundreds of thousands of humans. Sequencing technologies and variant-calling methods have advanced rapidly, routinely providing reliable variant calls in most of the human genome. We describe how advances in long reads, deep learning, de novo assembly and pangenomes have expanded access to variant calls in increasingly challenging, repetitive genomic regions, including medically relevant regions, and how new benchmark sets and benchmarking methods illuminate their strengths and limitations. Finally, we explore the possible future of more complete characterization of human genome variation in light of the recent completion of a telomere-to-telomere human genome reference assembly and human pangenomes, and we consider the innovations needed to benchmark their newly accessible repetitive regions and complex variants.
Affiliation(s)
- Nathan D Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Nathan Dwarshuis
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Fritz J Sedlazeck
- Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX, USA
- Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA.
24
Hogle SL, Ruusulehto L, Cairns J, Hultman J, Hiltunen T. Localized coevolution between microbial predator and prey alters community-wide gene expression and ecosystem function. The ISME Journal 2023; 17:514-524. [PMID: 36658394 PMCID: PMC10030642 DOI: 10.1038/s41396-023-01361-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 01/03/2023] [Accepted: 01/06/2023] [Indexed: 01/20/2023]
Abstract
Closely interacting microbial species pairs (e.g., predator and prey) can become coadapted via reciprocal natural selection. A fundamental challenge in evolutionary ecology is to untangle how coevolution in small species groups affects and is affected by biotic interactions in diverse communities. We conducted an experiment with a synthetic 30-species bacterial community where we experimentally manipulated the coevolutionary history of a ciliate predator and one bacterial prey species from the community. Altering the coevolutionary history of the focal prey species had little effect on community structure or carrying capacity in the presence or absence of the coevolved predator. However, community metabolic potential (represented by per-cell ATP concentration) was significantly higher in the presence of both the coevolved focal predator and prey. This ecosystem-level response was mirrored by community-wide transcriptional shifts that resulted in the differential regulation of nutrient acquisition and surface colonization pathways across multiple bacterial species. Our findings show that the disruption of localized coevolution between species pairs can reverberate through community-wide transcriptional networks even while community composition remains largely unchanged. We propose that these altered expression patterns may signal forthcoming evolutionary and ecological change.
Affiliation(s)
- Shane L Hogle
- Department of Biology, University of Turku, Turku, Finland.
- Liisa Ruusulehto
- Department of Microbiology, University of Helsinki, Helsinki, Finland
- Johannes Cairns
- Department of Computer Science, University of Helsinki, Helsinki, Finland
- Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland
- Jenni Hultman
- Department of Microbiology, University of Helsinki, Helsinki, Finland
- Natural Resources Institute Finland, Helsinki, Finland
- Teppo Hiltunen
- Department of Biology, University of Turku, Turku, Finland.
- Department of Microbiology, University of Helsinki, Helsinki, Finland.
25
Cai Y, Chen R, Gao S, Li W, Liu Y, Su G, Song M, Jiang M, Jiang C, Zhang X. Artificial intelligence applied in neoantigen identification facilitates personalized cancer immunotherapy. Front Oncol 2023; 12:1054231. [PMID: 36698417 PMCID: PMC9868469 DOI: 10.3389/fonc.2022.1054231] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/26/2022] [Accepted: 12/16/2022] [Indexed: 01/10/2023]
Abstract
The field of cancer neoantigen investigation has developed swiftly in the past decade. Predicting novel and true neoantigens derived from large multi-omics data became difficult but critical challenges. The rise of Artificial Intelligence (AI) or Machine Learning (ML) in biomedicine application has brought benefits to strengthen the current computational pipeline for neoantigen prediction. ML algorithms offer powerful tools to recognize the multidimensional nature of the omics data and therefore extract the key neoantigen features enabling a successful discovery of new neoantigens. The present review aims to outline the significant technology progress of machine learning approaches, especially the newly deep learning tools and pipelines, that were recently applied in neoantigen prediction. In this review article, we summarize the current state-of-the-art tools developed to predict neoantigens. The standard workflow includes calling genetic variants in paired tumor and blood samples, and rating the binding affinity between mutated peptide, MHC (I and II) and T cell receptor (TCR), followed by characterizing the immunogenicity of tumor epitopes. More specifically, we highlight the outstanding feature extraction tools and multi-layer neural network architectures in typical ML models. It is noted that more integrated neoantigen-predicting pipelines are constructed with hybrid or combined ML algorithms instead of conventional machine learning models. In addition, the trends and challenges in further optimizing and integrating the existing pipelines are discussed.
Affiliation(s)
- Yu Cai
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Rui Chen
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Shenghan Gao
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Wenqing Li
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Yuru Liu
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Guodong Su
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Mingming Song
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Mengju Jiang
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Chao Jiang
- Department of Neurology, The Second Affiliated Hospital of Xi’an Medical University, Xi’an, Shaanxi, China. *Correspondence: Chao Jiang; Xi Zhang
- Xi Zhang
- School of Medicine, Northwest University, Xi’an, Shaanxi, China. *Correspondence: Chao Jiang; Xi Zhang
26
Han S, Kim K, Park S, Lee AJ, Chun H, Jung I. scAVENGERS: a genotype-based deconvolution of individuals in multiplexed single-cell ATAC-seq data without reference genotypes. NAR Genom Bioinform 2022; 4:lqac095. [PMID: 36601579 PMCID: PMC9803874 DOI: 10.1093/nargab/lqac095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 11/09/2022] [Accepted: 12/11/2022] [Indexed: 01/01/2023]
Abstract
Genetic differences inferred from sequencing reads can be used for demultiplexing of pooled single-cell RNA-seq (scRNA-seq) data across multiple donors without WGS-based reference genotypes. However, such methods could not be directly applied to single-cell ATAC-seq (scATAC-seq) data owing to the lower read coverage for each variant compared to scRNA-seq. We propose a new software, scATAC-seq Variant-based EstimatioN for GEnotype ReSolving (scAVENGERS), which resolves this issue by calling more individual-specific germline variants and using an optimized mixture model for the scATAC-seq. The benchmark conducted with three synthetic multiplexed scATAC-seq datasets of peripheral blood mononuclear cells and prefrontal cortex tissues showed outstanding performance compared to existing methods in terms of accuracy, doublet detection, and a portion of donor-assigned cells. Furthermore, analyzing the effect of the improved sections provided insight into handling pooled single-cell data in the future. Our source code of the devised software is available at GitHub: https://github.com/kaistcbfg/scAVENGERS.
Affiliation(s)
- Seungbeom Han
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Kyukwang Kim
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Seongwan Park
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Andrew J Lee
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Hyonho Chun
- Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Inkyung Jung
- To whom correspondence should be addressed. Tel: +82 42 350 7315; Fax: +82 42 350 2610
27
Woerner AE, Mandape S, Kapema KB, Duque TM, Smuts A, King JL, Crysup B, Wang X, Huang M, Ge J, Budowle B. Optimized variant calling for estimating kinship. Forensic Sci Int Genet 2022; 61:102785. [DOI: 10.1016/j.fsigen.2022.102785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 08/07/2022] [Accepted: 09/29/2022] [Indexed: 11/16/2022]
28
Nguyen J, Saffari P, Pollack A, Vennam S, Gong X, West R, Pollack J. New Ameloblastoma Cell Lines Enable Preclinical Study of Targeted Therapies. J Dent Res 2022; 101:1517-1525. [PMID: 35689405 PMCID: PMC9608093 DOI: 10.1177/00220345221100773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Ameloblastoma (AB) is an odontogenic tumor that arises from ameloblast-lineage cells. Although relatively uncommon and rarely metastatic, AB tumors are locally invasive and destructive to the jawbone and surrounding structures. Standard-of-care surgical resection often leads to disfigurement, and many tumors will locally recur, necessitating increasingly challenging surgeries. Recent genomic studies of AB have uncovered oncogenic driver mutations, including in the mitogen-activated protein kinase (MAPK) and Hedgehog signaling pathways. Medical therapies targeting those drivers would be a highly desirable alternative or addition to surgery; however, a paucity of existing AB cell lines has stymied clinical translation. To bridge this gap, here we report the establishment of 6 new AB cell lines (generated by "conditional reprogramming") and their genomic characterization that reveals driver mutations in FGFR2, KRAS, NRAS, BRAF, PIK3CA, and SMO. Furthermore, in proof-of-principle studies, we use the new cell lines to investigate AB oncogene dependency and drug sensitivity. Among our findings, AB cells with KRAS or NRAS mutation (MAPK pathway) are exquisitely sensitive to MEK inhibition, which propels ameloblast differentiation. AB cells with activating SMO-L412F mutation (Hedgehog pathway) are insensitive to vismodegib; however, a distinct small-molecule SMO inhibitor, BMS-833923, significantly reduces both downstream Hedgehog signaling and tumor cell viability. The novel cell line resource enables preclinical studies and promises to speed the translation of new molecularly targeted therapies for the management of ameloblastoma and related odontogenic neoplasms.
Affiliation(s)
- J. Nguyen
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - P.S. Saffari
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - A.S. Pollack
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - S. Vennam
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - X. Gong
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - R.B. West
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - J.R. Pollack
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
|
29
|
Hunt M, Letcher B, Malone KM, Nguyen G, Hall MB, Colquhoun RM, Lima L, Schatz MC, Ramakrishnan S, Iqbal Z. Minos: variant adjudication and joint genotyping of cohorts of bacterial genomes. Genome Biol 2022; 23:147. [PMID: 35791022 PMCID: PMC9254434 DOI: 10.1186/s13059-022-02714-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/06/2021] [Accepted: 06/20/2022] [Indexed: 12/30/2022]
Abstract
There are many short-read variant-calling tools, with different strengths and weaknesses. We present a tool, Minos, which combines outputs from arbitrary variant callers, increasing recall without loss of precision. We benchmark on 62 samples from three bacterial species and an outbreak of 385 Mycobacterium tuberculosis samples. Minos also enables joint genotyping; we demonstrate on a large (N=13k) M. tuberculosis cohort, building a map of non-synonymous SNPs and indels in a region where all such variants are assumed to cause rifampicin resistance. We quantify the correlation with phenotypic resistance and then replicate in a second cohort (N=10k).
Affiliation(s)
- Martin Hunt
- EMBL-EBI, Cambridge, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | | | | | | | | | - Rachel M Colquhoun
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
| | | | - Michael C Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
|
30
|
Fernandez G, Yubero D, Palau F, Armstrong J. Molecular Modelling Hurdle in the Next-Generation Sequencing Era. Int J Mol Sci 2022; 23:7176. [PMID: 35806177 PMCID: PMC9266691 DOI: 10.3390/ijms23137176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 06/24/2022] [Accepted: 06/27/2022] [Indexed: 12/10/2022]
Abstract
There are challenges in the genetic diagnosis of rare diseases, and pursuing an optimal strategy to identify the cause of the disease is one of the main objectives of any clinical genomics unit. A range of techniques are currently used to characterize the genomic variability within the human genome to detect causative variants of specific disorders. With the introduction of next-generation sequencing (NGS) in the clinical setting, geneticists can study single-nucleotide variants (SNVs) throughout the entire exome/genome. In turn, the number of variants to be evaluated per patient has increased significantly, and more information has to be processed and analyzed to determine a proper diagnosis. Roughly 50% of patients with a Mendelian genetic disorder are diagnosed using NGS, but a fair number of patients still suffer a diagnostic odyssey. Due to the inherent diversity of the human population, as more exomes or genomes are sequenced, variants of uncertain significance (VUSs) will increase exponentially. Thus, assigning relevance to a VUS (non-synonymous as well as synonymous) in an undiagnosed patient becomes crucial to assess the proper diagnosis. Multiple algorithms have been used to predict how a specific mutation might affect the protein’s function, but they are far from accurate enough to be conclusive. In this work, we highlight the difficulties of genomic variability determined by NGS that have arisen in diagnosing rare genetic diseases, and how molecular modelling has to be a key component to elucidate the relevance of a specific mutation in the protein’s loss of function or malfunction. We suggest that the creation of a multi-omics data model should improve the classification of pathogenicity for a significant amount of the detected genomic variability. Moreover, we argue how it should be incorporated systematically in the process of variant evaluation to be useful in the clinical setting and the diagnostic pipeline.
Affiliation(s)
- Guerau Fernandez
- Department of Genetic and Molecular Medicine—IPER, Hospital Sant Joan de Déu, Institut de Recerca Sant Joan de Déu, 08950 Barcelona, Spain; (G.F.); (F.P.); (J.A.)
- Center for Biomedical Research Network on Rare Diseases (CIBERER), ISCIII, 08950 Barcelona, Spain
| | - Dèlia Yubero
- Department of Genetic and Molecular Medicine—IPER, Hospital Sant Joan de Déu, Institut de Recerca Sant Joan de Déu, 08950 Barcelona, Spain; (G.F.); (F.P.); (J.A.)
- Center for Biomedical Research Network on Rare Diseases (CIBERER), ISCIII, 08950 Barcelona, Spain
- Correspondence: ; Tel.: +34-93-600-9451; Fax: +34-93-600-9760
| | - Francesc Palau
- Department of Genetic and Molecular Medicine—IPER, Hospital Sant Joan de Déu, Institut de Recerca Sant Joan de Déu, 08950 Barcelona, Spain; (G.F.); (F.P.); (J.A.)
- Center for Biomedical Research Network on Rare Diseases (CIBERER), ISCIII, 08950 Barcelona, Spain
- Division of Pediatrics, University of Barcelona School of Medicine and Health Sciences, 08007 Barcelona, Spain
| | - Judith Armstrong
- Department of Genetic and Molecular Medicine—IPER, Hospital Sant Joan de Déu, Institut de Recerca Sant Joan de Déu, 08950 Barcelona, Spain; (G.F.); (F.P.); (J.A.)
- Center for Biomedical Research Network on Rare Diseases (CIBERER), ISCIII, 08950 Barcelona, Spain
|
31
|
Sarwal V, Niehus S, Ayyala R, Kim M, Sarkar A, Chang S, Lu A, Rajkumar N, Darfci-Maher N, Littman R, Chhugani K, Soylev A, Comarova Z, Wesel E, Castellanos J, Chikka R, Distler MG, Eskin E, Flint J, Mangul S. A comprehensive benchmarking of WGS-based deletion structural variant callers. Brief Bioinform 2022; 23:6618239. [PMID: 35753701 DOI: 10.1093/bib/bbac221] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/08/2021] [Revised: 04/30/2022] [Accepted: 05/11/2022] [Indexed: 01/10/2023]
Abstract
Advances in whole-genome sequencing (WGS) promise to enable accurate and comprehensive structural variant (SV) discovery. Dissecting SVs from WGS data presents a substantial number of challenges, and a plethora of SV detection methods have been developed. Currently, evidence that investigators can use to select appropriate SV detection tools is lacking. In this article, we have evaluated the performance of SV detection tools on mouse and human WGS data using a comprehensive polymerase chain reaction-confirmed gold standard set of SVs and the genome-in-a-bottle variant set, respectively. In contrast to the previous benchmarking studies, our gold standard dataset included a complete set of SVs, allowing us to report both precision and sensitivity rates of the SV detection methods. Our study investigates the ability of the methods to detect deletions, thus providing an optimistic estimate of SV detection performance, as the SV detection methods that fail to detect deletions are likely to miss more complex SVs. We found that SV detection tools varied widely in their performance, with several methods providing a good balance between sensitivity and precision. Additionally, we have determined the SV callers best suited for low- and ultralow-pass sequencing data as well as for different deletion length categories.
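As an illustration of the two rates named in this abstract, the following is a minimal sketch and not the benchmark's own code. It assumes a called deletion counts as correct only when its (chromosome, start, end) tuple exactly matches a gold-standard entry, which is a simplification chosen for brevity; the function name `precision_and_sensitivity` and the sample coordinates are hypothetical.

```python
def precision_and_sensitivity(called, gold):
    """Compare two sets of (chrom, start, end) deletions.

    Assumption: a call is a true positive only on an exact tuple match.
    Returns (precision, sensitivity); a rate is 0.0 when its denominator is 0.
    """
    true_positives = len(called & gold)
    false_positives = len(called - gold)
    false_negatives = len(gold - called)

    n_called = true_positives + false_positives
    n_gold = true_positives + false_negatives
    precision = true_positives / n_called if n_called else 0.0
    sensitivity = true_positives / n_gold if n_gold else 0.0
    return precision, sensitivity


# Hypothetical toy data: three calls, four gold-standard deletions.
called = {("chr1", 100, 200), ("chr1", 500, 900), ("chr2", 10, 50)}
gold = {("chr1", 100, 200), ("chr2", 10, 50), ("chr3", 1, 5), ("chr3", 7, 9)}
precision, sensitivity = precision_and_sensitivity(called, gold)
```

Reporting both rates requires a gold standard that is complete, since false positives can only be counted when every true deletion is known.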
Affiliation(s)
- Varuni Sarwal
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA.,Indian Institute of Technology Delhi, Hauz Khas, New Delhi, Delhi 110016, India
| | - Sebastian Niehus
- Berlin Institute of Health (BIH), Anna-Louisa-Karsch-Str. 2, 10178 Berlin, Germany.,Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, 10117 Berlin, Germany
| | - Ram Ayyala
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Minyoung Kim
- Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089
| | - Aditya Sarkar
- School of Computing and Electrical Engineering, Indian Institute of Technology Mandi, Kamand, Mandi, Himachal Pradesh 175001, India
| | - Sei Chang
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Angela Lu
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Neha Rajkumar
- Department of Bioengineering, University of California Los Angeles, Los Angeles, CA, 90095
| | - Nicholas Darfci-Maher
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Russell Littman
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Karishma Chhugani
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California 1985 Zonal Avenue Los Angeles, CA 90089-9121
| | - Arda Soylev
- Department of Computer Engineering, Konya Food and Agriculture University, Konya, Turkey
| | - Zoia Comarova
- Department of Civil and Environmental Engineering, University of Southern California, Los Angeles, CA, United States
| | - Emily Wesel
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Jacqueline Castellanos
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Rahul Chikka
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Margaret G Distler
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
| | - Eleazar Eskin
- Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA.,Department of Human Genetics, David Geffen School of Medicine at UCLA, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA, 90095, USA.,Department of Computational Medicine, David Geffen School of Medicine at UCLA, 73-235 CHS, Los Angeles, CA, 90095, USA
| | - Jonathan Flint
- Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, 760 Westwood Plaza, Los Angeles, CA 90095, USA
| | - Serghei Mangul
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California 1985 Zonal Avenue Los Angeles, CA 90089-9121
|
32
|
Sousos N, Ní Leathlobhair M, Simoglou Karali C, Louka E, Bienz N, Royston D, Clark SA, Hamblin A, Howard K, Mathews V, George B, Roy A, Psaila B, Wedge DC, Mead AJ. In utero origin of myelofibrosis presenting in adult monozygotic twins. Nat Med 2022; 28:1207-1211. [PMID: 35637336 PMCID: PMC9205768 DOI: 10.1038/s41591-022-01793-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 06/13/2021] [Accepted: 03/22/2022] [Indexed: 12/11/2022]
Abstract
The latency between acquisition of an initiating somatic driver mutation by a single-cell and clinical presentation with cancer is largely unknown. We describe a remarkable case of monozygotic twins presenting with CALR mutation-positive myeloproliferative neoplasms (MPNs) (aged 37 and 38 years), with a clinical phenotype of primary myelofibrosis. The CALR mutation was absent in T cells and dermal fibroblasts, confirming somatic acquisition. Whole-genome sequencing lineage tracing revealed a common clonal origin of the CALR-mutant MPN clone, which occurred in utero followed by twin-to-twin transplacental transmission and subsequent similar disease latency. Index sorting and single-colony genotyping revealed phenotypic hematopoietic stem cells (HSCs) as the likely MPN-propagating cell. Furthermore, neonatal blood spot analysis confirmed in utero origin of the JAK2V617F mutation in a patient presenting with polycythemia vera (aged 34 years). These findings provide a unique window into the prolonged evolutionary dynamics of MPNs and fitness advantage exerted by MPN-associated driver mutations in HSCs.
Affiliation(s)
- Nikolaos Sousos
- Medical Research Council (MRC) Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
- Cancer and Haematology Centre, Churchill Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
| | - Máire Ní Leathlobhair
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Ludwig Institute for Cancer Research, University of Oxford, Oxford, UK
- Department of Microbiology, Moyne Institute of Preventive Medicine, School of Genetics and Microbiology, Trinity College Dublin, Dublin, Ireland
| | - Christina Simoglou Karali
- Medical Research Council (MRC) Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Eleni Louka
- Medical Research Council (MRC) Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Nicola Bienz
- Haematology Service, Wexham Park Hospital, Frimley Health NHS Foundation Trust, Slough, UK
| | - Daniel Royston
- Department of Cellular Pathology, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
| | - Sally-Ann Clark
- Flow Cytometry Facility, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
| | - Angela Hamblin
- Cancer and Haematology Centre, Churchill Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Kieran Howard
- National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Vikram Mathews
- Department of Haematology, Christian Medical College, Vellore, India
| | - Biju George
- Department of Haematology, Christian Medical College, Vellore, India
| | - Anindita Roy
- Medical Research Council (MRC) Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
- Department of Paediatrics, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
| | - Bethan Psaila
- Medical Research Council (MRC) Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
- Cancer and Haematology Centre, Churchill Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
| | - David C Wedge
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
- Manchester Cancer Research Centre, The University of Manchester, Manchester, UK.
| | - Adam J Mead
- Medical Research Council (MRC) Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, National Institute for Health Research Biomedical Research Centre, University of Oxford, Oxford, UK.
- Cancer and Haematology Centre, Churchill Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, UK.
|
33
|
Dodani DD, Nguyen MH, Morin RD, Marra MA, Corbett RD. Combinatorial and Machine Learning Approaches for Improved Somatic Variant Calling From Formalin-Fixed Paraffin-Embedded Genome Sequence Data. Front Genet 2022; 13:834764. [PMID: 35571031 PMCID: PMC9092826 DOI: 10.3389/fgene.2022.834764] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Accepted: 03/18/2022] [Indexed: 11/13/2022]
Abstract
Formalin fixation of paraffin-embedded tissue samples is a well-established method for preserving tissue and is routinely used in clinical settings. Although formalin-fixed, paraffin-embedded (FFPE) tissues are deemed crucial for research and clinical applications, the fixation process results in molecular damage to nucleic acids, thus confounding their use in genome sequence analysis. Methods to improve genomic data quality from FFPE tissues have emerged, but there remains significant room for improvement. Here, we use whole-genome sequencing (WGS) data from matched Fresh Frozen (FF) and FFPE tissue samples to optimize a sensitive and precise FFPE single nucleotide variant (SNV) calling approach. We present methods to reduce the prevalence of false-positive SNVs by applying combinatorial techniques to five publicly available variant callers. We also introduce FFPolish, a novel variant classification method that efficiently classifies FFPE-specific false-positive variants. Our combinatorial and statistical techniques improve precision and F1 scores compared to the results of publicly available tools when tested individually.
Affiliation(s)
- Dollina D Dodani
- The Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Matthew H Nguyen
- The Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Ryan D Morin
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada.,Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
| | - Marco A Marra
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada.,Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Richard D Corbett
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada
|
34
|
Niu YN, Roberts EG, Denisko D, Hoffman MM. Assessing and assuring interoperability of a genomics file format. Bioinformatics 2022; 38:3327-3336. [PMID: 35575355 PMCID: PMC9237710 DOI: 10.1093/bioinformatics/btac327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Revised: 03/30/2022] [Accepted: 05/11/2022] [Indexed: 12/01/2022]
Abstract
Motivation Bioinformatics software tools operate largely through the use of specialized genomics file formats. Often these formats lack formal specification, making it difficult or impossible for the creators of these tools to robustly test them for correct handling of input and output. This causes problems in interoperability between different tools that, at best, wastes time and frustrates users. At worst, interoperability issues could lead to undetected errors in scientific results. Results We developed a new verification system, Acidbio, which tests for correct behavior in bioinformatics software packages. We crafted tests to unify correct behavior when tools encounter various edge cases—potentially unexpected inputs that exemplify the limits of the format. To analyze the performance of existing software, we tested the input validation of 80 Bioconda packages that parsed the Browser Extensible Data (BED) format. We also used a fuzzing approach to automatically perform additional testing. Of 80 software packages examined, 75 achieved less than 70% correctness on our test suite. We categorized multiple root causes for the poor performance of different types of software. Fuzzing detected other errors that the manually designed test suite could not. We also created a badge system that developers can use to indicate more precisely which BED variants their software accepts and to advertise the software’s performance on the test suite. Availability and implementation Acidbio is available at https://github.com/hoffmangroup/acidbio. Supplementary information Supplementary data are available at Bioinformatics online.
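To make the idea of edge-case input validation concrete, here is a minimal sketch of a checker for one BED data line. It is not Acidbio's test suite and does not reproduce its rules. It assumes tab-separated fields, between 3 and 12 of them, and checks only the first three (chrom, chromStart, chromEnd); the function name `validate_bed_line` and the error messages are hypothetical.

```python
def validate_bed_line(line):
    """Return a list of problems found in one BED data line (empty if none).

    Assumptions: fields are tab-separated, 3 to 12 fields are allowed, and
    only the three mandatory fields are inspected.
    """
    fields = line.rstrip("\n").split("\t")
    if not 3 <= len(fields) <= 12:
        return [f"expected 3 to 12 tab-separated fields, got {len(fields)}"]

    problems = []
    chrom, start_text, end_text = fields[:3]
    if not chrom:
        problems.append("empty chrom field")

    def is_plain_integer(text):
        # ASCII digits only, so signs, decimals, and Unicode digits are rejected.
        return text.isascii() and text.isdigit()

    if not (is_plain_integer(start_text) and is_plain_integer(end_text)):
        problems.append("chromStart and chromEnd must be non-negative integers")
        return problems

    if int(start_text) > int(end_text):
        problems.append("chromStart is greater than chromEnd")
    return problems


ok = validate_bed_line("chr1\t0\t100")
swapped = validate_bed_line("chr1\t200\t100")
negative = validate_bed_line("chr1\t-5\t100")
too_short = validate_bed_line("chr1\t100")
```

A parser that silently accepts inputs such as the last three would be the kind of behavior such a test suite is meant to surface.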
Affiliation(s)
- Yi Nian Niu
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada
| | - Eric G Roberts
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada
| | - Danielle Denisko
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada
| | - Michael M Hoffman
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada.,Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada.,Vector Institute, Toronto, ON, M5G 1M1, Canada
|
35
|
Barbitoff YA, Abasov R, Tvorogova VE, Glotov AS, Predeus AV. Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery. BMC Genomics 2022; 23:155. [PMID: 35193511 PMCID: PMC8862519 DOI: 10.1186/s12864-022-08365-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 07/29/2021] [Accepted: 02/03/2022] [Indexed: 12/30/2022]
Abstract
BACKGROUND Accurate variant detection in the coding regions of the human genome is a key requirement for molecular diagnostics of Mendelian disorders. Efficiency of variant discovery from next-generation sequencing (NGS) data depends on multiple factors, including reproducible coverage biases of NGS methods and the performance of read alignment and variant calling software. Although variant caller benchmarks are published constantly, no previous publications have leveraged the full extent of available gold standard whole-genome (WGS) and whole-exome (WES) sequencing datasets. RESULTS In this work, we systematically evaluated the performance of 4 popular short read aligners (Bowtie2, BWA, Isaac, and Novoalign) and 9 novel and well-established variant calling and filtering methods (Clair3, DeepVariant, Octopus, GATK, FreeBayes, and Strelka2) using a set of 14 "gold standard" WES and WGS datasets available from Genome In A Bottle (GIAB) consortium. Additionally, we have indirectly evaluated each pipeline's performance using a set of 6 non-GIAB samples of African and Russian ethnicity. In our benchmark, Bowtie2 performed significantly worse than other aligners, suggesting it should not be used for medical variant calling. When other aligners were considered, the accuracy of variant discovery mostly depended on the variant caller and not the read aligner. Among the tested variant callers, DeepVariant consistently showed the best performance and the highest robustness. Other actively developed tools, such as Clair3, Octopus, and Strelka2, also performed well, although their efficiency had greater dependence on the quality and type of the input data. We have also compared the consistency of variant calls in GIAB and non-GIAB samples. With few important caveats, best-performing tools have shown little evidence of overfitting. 
CONCLUSIONS The results show surprisingly large differences in the performance of cutting-edge tools even in high confidence regions of the coding genome. This highlights the importance of regular benchmarking of quickly evolving tools and pipelines. We also discuss the need for a more diverse set of gold standard genomes that would include samples of African, Hispanic, or mixed ancestry. Additionally, there is also a need for better variant caller assessment in the repetitive regions of the coding genome.
Affiliation(s)
- Yury A Barbitoff
- Bioinformatics Institute, St. Petersburg, Russia.,Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology and Reproductology, St. Petersburg, Russia.,Department of Genetics and Biotechnology, St. Petersburg State University, St. Petersburg, Russia
| | - Ruslan Abasov
- Bioinformatics Institute, St. Petersburg, Russia.,Dmitry Rogachev National Research Center of Pediatric Hematology-Oncology and Immunology, Moscow, Russia
| | - Varvara E Tvorogova
- Bioinformatics Institute, St. Petersburg, Russia.,Department of Genetics and Biotechnology, St. Petersburg State University, St. Petersburg, Russia
| | - Andrey S Glotov
- Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology and Reproductology, St. Petersburg, Russia
|
36
|
Establishment of reference standards for multifaceted mosaic variant analysis. Sci Data 2022; 9:35. [PMID: 35115554 PMCID: PMC8813952 DOI: 10.1038/s41597-022-01133-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Accepted: 12/20/2021] [Indexed: 11/21/2022]
Abstract
Detection of somatic mosaicism in non-proliferative cells is a new challenge in genome research; however, the accuracy of current detection strategies remains uncertain due to the lack of a ground truth. Herein, we sought to present a set of ultra-deep sequenced WES data based on reference standards generated by cell line mixtures, providing a total of 386,613 mosaic single-nucleotide variants (SNVs) and insertion-deletion mutations (INDELs) with variant allele frequencies (VAFs) ranging from 0.5% to 56%, as well as 35,113,417 non-variant and 19,936 germline variant sites as a negative control. The whole reference standard set mimics the cumulative aspect of mosaic variant acquisition such as in the early developmental stage owing to the progressive mixing of cell lines with established genotypes, ultimately unveiling 741 possible inter-sample relationships with respect to variant sharing and asymmetry in VAFs. We expect that our reference data will be essential for optimizing the current use of mosaic variant detection strategies and for developing algorithms to enable future improvements.
Measurement(s): genotype
Technology Type(s): DNA sequencing
Factor Type(s): genotyping
Sample Characteristic - Organism: Homo sapiens
Sample Characteristic - Environment: cell line
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16970041
|
37
|
Cooke DP, Wedge DC, Lunter G. Benchmarking small-variant genotyping in polyploids. Genome Res 2022; 32:403-408. [PMID: 34965940 PMCID: PMC8805713 DOI: 10.1101/gr.275579.121] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/30/2021] [Accepted: 12/19/2021] [Indexed: 11/24/2022]
Abstract
Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, in which genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method, Octopus, that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid data sets created using in silico mixtures of diploid Genome in a Bottle (GIAB) samples. We find that genotyping errors are abundant for typical sequencing depths but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana data sets.
Affiliation(s)
- Daniel P Cooke
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, United Kingdom
| | - David C Wedge
- Manchester Cancer Research Centre, University of Manchester, Manchester M20 4GJ, United Kingdom
| | - Gerton Lunter
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, United Kingdom
- Department of Epidemiology, University Medical Center Groningen, 9713 GZ Groningen, The Netherlands
|
38
|
Anzar I, Sverchkova A, Samarakoon P, Ellingsen EB, Gaudernack G, Stratford R, Clancy T. Personalized HLA typing leads to the discovery of novel HLA alleles and tumor‐specific HLA variants. HLA 2022; 99:313-327. [PMID: 35073457 PMCID: PMC9546058 DOI: 10.1111/tan.14562] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/21/2021] [Revised: 01/08/2022] [Accepted: 01/21/2022] [Indexed: 11/29/2022]
Abstract
Accurate and full‐length typing of the HLA region is important in many clinical and research settings. With the advent of next generation sequencing (NGS), several HLA typing algorithms have been developed, including many that are applicable to whole exome sequencing (WES). However, most of these solutions operate by providing the closest‐matched HLA allele among the known alleles in the IPD‐IMGT/HLA Database. These database‐matching approaches have demonstrated very high performance when typing well characterized HLA alleles. However, as they rely on the completeness of the HLA database, they are not optimal for detecting novel or less well characterized alleles. Furthermore, the database‐matching approaches are also not adequate in the context of cancer, where a comprehensive characterization of somatic HLA variation and expression patterns of a tumor's HLA locus may guide therapy and clinical outcome, because of the pivotal role HLA alleles play in tumor antigen recognition and immune escape. Here, we describe a personalized HLA typing approach applied to WES data that leverages the strengths of database‐matching approaches while simultaneously allowing for the discovery of novel HLA alleles and tumor‐specific HLA variants, through the systematic integration of germline and somatic variant calling. We applied this approach on WES from 10 metastatic melanoma patients and validated the HLA typing results using HLA targeted NGS sequencing from patients where at least one HLA germline candidate was detected on Class I HLA. Targeted NGS sequencing confirmed 100% performance for the 1st and 2nd fields. In total, five out of the six detected HLA germline variants were because of Class I ambiguities at the third or fourth fields, and their detection recovered the correct HLA allele genotype. The sixth germline variant led to the formal discovery of a novel Class I allele. Finally, we demonstrated a substantially improved somatic variant detection accuracy in HLA alleles with a 91% success rate in simulated experiments. The approach described here may allow the field to genotype more accurately using WES data, leading to the discovery of novel HLA alleles and helping to characterize the relationship between somatic variation in the HLA region and immunosurveillance.
Affiliation(s)
- Irantzu Anzar
- NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379 Oslo Norway
- Angelina Sverchkova
- NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379 Oslo Norway
- Pubudu Samarakoon
- NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379 Oslo Norway
- Gustav Gaudernack
- Ultimovacs ASA, Oslo Cancer Cluster, Ullernchausseen 64/66 Oslo Norway
- Richard Stratford
- NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379 Oslo Norway
- Trevor Clancy
- NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379 Oslo Norway
|
39
|
Sahraeian SME, Fang LT, Karagiannis K, Moos M, Smith S, Santana-Quintero L, Xiao C, Colgan M, Hong H, Mohiyuddin M, Xiao W. Achieving robust somatic mutation detection with deep learning models derived from reference data sets of a cancer sample. Genome Biol 2022; 23:12. [PMID: 34996510 PMCID: PMC8740374 DOI: 10.1186/s13059-021-02592-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/01/2021] [Accepted: 12/28/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Accurate detection of somatic mutations is challenging but critical in understanding cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detection approach, and demonstrated performance advantages on in silico data. RESULTS In this study, we use the first comprehensive and well-characterized somatic reference data sets from the SEQC2 consortium to investigate best practices for using a deep learning framework in cancer mutation detection. Using the high-confidence somatic mutations established for a cancer cell line by the consortium, we identify the best strategy for building robust models on multiple data sets derived from samples representing real scenarios; for example, a model trained on a combination of real and spike-in mutations had the highest average performance. CONCLUSIONS The strategy identified in our study achieved high robustness across multiple sequencing technologies for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages, with significant superiority over conventional detection approaches in general, as well as in challenging situations such as low coverage, low variant allele frequency, DNA damage, and difficult genomic regions.
Affiliation(s)
- Li Tai Fang
- Roche Sequencing Solutions, Santa Clara, CA, 95050, USA
- Konstantinos Karagiannis
- The Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA
- Malcolm Moos
- The Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA
- Sean Smith
- The Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA
- Luis Santana-Quintero
- The Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA
- Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Michael Colgan
- Office of Oncological Diseases, Office of New Drug, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA
- Huixiao Hong
- Bioinformatics branch, Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, AR, 72079, USA
- Wenming Xiao
- Office of Oncological Diseases, Office of New Drug, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA.
|
40
|
Cooke DP. Octopus: Genotyping and Haplotyping in Diverse Experimental Designs. Methods Mol Biol 2022; 2493:29-51. [PMID: 35751807 DOI: 10.1007/978-1-0716-2293-3_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Haplotype-based variant callers have become the de facto choice for genotyping from next-generation sequencing (NGS) as they are able to resolve read-mapper alignment errors and implicitly phase heterozygous variants. Here, I describe how the haplotype-based variant calling tool Octopus can be used for genotyping and haplotyping in several common experimental designs.
Affiliation(s)
- Daniel P Cooke
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK.
|
41
|
Zhao X, Hu AC, Wang S, Wang X. Calling small variants using universality with Bayes-factor-adjusted odds ratios. Brief Bioinform 2021; 23:6427501. [PMID: 34791010 DOI: 10.1093/bib/bbab458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2021] [Revised: 09/26/2021] [Accepted: 10/07/2021] [Indexed: 11/12/2022] Open
Abstract
The application of next-generation sequencing in research and particularly in clinical routine requires highly accurate variant calling. Here we describe UVC, a method for calling small variants of germline or somatic origin. By unifying opposite assumptions with sublation, we discovered the following two empirical laws to improve variant calling: allele fraction at high sequencing depth is inversely proportional to the cube root of variant-calling error rate, and odds ratios adjusted with Bayes factors can model various sequencing biases. UVC outperformed other variant callers on the GIAB germline truth sets, 192 scenarios of in silico mixtures simulating 192 combinations of tumor/normal sequencing depths and tumor/normal purities, the GIAB somatic truth sets derived from physical mixture, and the SEQC2 somatic reference sets derived from the breast-cancer cell-line HCC1395. UVC achieved 100% concordance with the manual review conducted by multiple independent researchers on a Qiagen 71-gene-panel dataset derived from 16 patients with colon adenoma. UVC outperformed other unique molecular identifier (UMI)-aware variant callers on the datasets used for publishing these variant callers. Performance was measured with sensitivity-specificity trade-off for called variants. The improved variant calls generated by UVC from previously published UMI-based sequencing data provided additional insight about DNA damage repair. UVC is open-sourced under the BSD 3-Clause license at https://github.com/genetronhealth/uvc and quay.io/genetronhealth/gcc-6-3-0-uvc-0-6-0-441a694.
Affiliation(s)
- Xiaofei Zhao
- Genetron Health (Beijing) Co. Ltd, Beijing 102208, China
- Allison C Hu
- Genetron Health (Beijing) Co. Ltd, Beijing 102208, China
- Sizhen Wang
- Genetron Health (Beijing) Co. Ltd, Beijing 102208, China
- Xiaoyue Wang
- State Key Laboratory of Medical Molecular Biology, Center for Bioinformatics, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100005, China
|
42
|
Ahmad T, Al Ars Z, Hofstee HP. VC@Scale: Scalable and high-performance variant calling on cluster environments. Gigascience 2021; 10:giab057. [PMID: 34494101 PMCID: PMC8424057 DOI: 10.1093/gigascience/giab057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/28/2021] [Revised: 06/05/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Recently, many new deep learning-based variant-calling methods like DeepVariant have emerged as more accurate compared with conventional variant-calling algorithms such as GATK HaplotypeCaller, Strelka2, and Freebayes, albeit at higher computational costs. Therefore, there is a need for more scalable and higher performance workflows of these deep learning methods. Almost all existing cluster-scaled variant-calling workflows that use Apache Spark/Hadoop as big data frameworks loosely integrate existing single-node pre-processing and variant-calling applications. Using Apache Spark just for distributing/scheduling data among loosely coupled applications or using I/O-based storage for storing the output of intermediate applications does not exploit the full benefit of Apache Spark in-memory processing. To achieve this, we propose a native Spark-based workflow that uses Python and Apache Arrow to enable efficient transfer of data between different workflow stages. This benefits from the ease of programmability of Python and the high efficiency of Arrow's columnar in-memory data transformations. RESULTS Here we present a scalable, parallel, and efficient implementation of next-generation sequencing data pre-processing and variant-calling workflows. Our design tightly integrates most pre-processing workflow stages, using Spark built-in functions to sort reads by coordinates and mark duplicates efficiently. Our approach outperforms state-of-the-art implementations by >2 times for the pre-processing stages, creating a scalable and high-performance solution for DeepVariant for both CPU-only and CPU + GPU clusters. CONCLUSIONS We show the feasibility and easy scalability of our approach to achieve high performance and efficient resource utilization for variant-calling analysis on high-performance computing clusters using the standardized Apache Arrow data representations. All code, scripts, and configurations used to run our implementations are publicly available and open sourced; see https://github.com/abs-tudelft/variant-calling-at-scale.
Affiliation(s)
- Tanveer Ahmad
- Faculty of Electrical Engineering, Mathematics and Computer Science, Quantum & Computer Engineering Department, Mekelweg 4, 2628 CD Delft, Netherlands
- Zaid Al Ars
- Faculty of Electrical Engineering, Mathematics and Computer Science, Quantum & Computer Engineering Department, Mekelweg 4, 2628 CD Delft, Netherlands
- H Peter Hofstee
- Faculty of Electrical Engineering, Mathematics and Computer Science, Quantum & Computer Engineering Department, Mekelweg 4, 2628 CD Delft, Netherlands
- IBM Austin, TX, USA
|
43
|
Musunuri R, Arora K, Corvelo A, Shah M, Shelton J, Zody MC, Narzisi G. Somatic variant analysis of linked-reads sequencing data with Lancet. Bioinformatics 2021; 37:1918-1919. [PMID: 33241313 DOI: 10.1093/bioinformatics/btaa888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2020] [Revised: 09/03/2020] [Accepted: 10/02/2020] [Indexed: 11/14/2022] Open
Abstract
SUMMARY We present a new version of the popular somatic variant caller, Lancet, that supports the analysis of linked-reads sequencing data. By seamlessly integrating barcodes and haplotype read assignments within the colored De Bruijn graph local-assembly framework, Lancet computes a barcode-aware coverage and identifies variants that disagree with the local haplotype structure. AVAILABILITY AND IMPLEMENTATION Lancet is implemented in C++ and available for academic and non-commercial research purposes as an open-source package at https://github.com/nygenome/lancet. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rajeeva Musunuri
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
- Kanika Arora
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
- André Corvelo
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
- Minita Shah
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
- Jennifer Shelton
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
- Michael C Zody
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
- Giuseppe Narzisi
- Computational Biology Lab, New York Genome Center, New York, NY 10013, USA
|
44
|
Hynst J, Navrkalova V, Pal K, Pospisilova S. Bioinformatic strategies for the analysis of genomic aberrations detected by targeted NGS panels with clinical application. PeerJ 2021; 9:e10897. [PMID: 33850640 PMCID: PMC8019320 DOI: 10.7717/peerj.10897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Accepted: 01/13/2021] [Indexed: 01/21/2023] Open
Abstract
Molecular profiling of tumor samples has acquired importance in cancer research, but currently also plays an important role in the clinical management of cancer patients. Rapid identification of genomic aberrations improves diagnosis, prognosis and effective therapy selection. This can be attributed mainly to the development of next-generation sequencing (NGS) methods, especially targeted DNA panels. Such panels enable a relatively inexpensive and rapid analysis of various aberrations with clinical impact specific to particular diagnoses. In this review, we discuss the experimental approaches and bioinformatic strategies available for the development of an NGS panel for a reliable analysis of selected biomarkers. Compliance with defined analytical steps is crucial to ensure accurate and reproducible results. In addition, a careful validation procedure has to be performed before the application of NGS targeted assays in routine clinical practice. With more focus on bioinformatics, we emphasize the need for thorough pipeline validation and management in relation to the particular experimental setting as an integral part of the NGS method establishment. A robust and reproducible bioinformatic analysis running on powerful machines is essential for proper detection of genomic variants in clinical settings since distinguishing between experimental noise and real biological variants is fundamental. This review summarizes state-of-the-art bioinformatic solutions for careful detection of the SNV/Indels and CNVs for targeted sequencing resulting in translation of sequencing data into clinically relevant information. Finally, we share our experience with the development of a custom targeted NGS panel for an integrated analysis of biomarkers in lymphoproliferative disorders.
Affiliation(s)
- Jakub Hynst
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic; Department of Medical Genetics and Genomics, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
- Veronika Navrkalova
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
- Karol Pal
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Hematology, University Hospital Schleswig-Holstein, Kiel, Germany
- Sarka Pospisilova
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic; Department of Medical Genetics and Genomics, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
|