1
Jia H, Tan S, Zhang YE. Chasing Sequencing Perfection: Marching Toward Higher Accuracy and Lower Costs. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae024. [PMID: 38991976 DOI: 10.1093/gpbjnl/qzae024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Revised: 01/25/2024] [Accepted: 01/29/2024] [Indexed: 07/13/2024]
Abstract
Next-generation sequencing (NGS), represented by Illumina platforms, has been an essential cornerstone of basic and applied research. However, the sequencing error rate of 1 per 1000 bp (10⁻³) represents a serious hurdle for research areas focusing on rare mutations, such as somatic mosaicism or microbe heterogeneity. By examining the high-fidelity sequencing methods developed in the past decade, we summarized three major factors underlying errors and the corresponding 12 strategies mitigating these errors. We then proposed a novel framework to classify 11 preexisting representative methods according to the corresponding combinatory strategies and identified three trends that emerged during methodological developments. We further extended this analysis to eight long-read sequencing methods, emphasizing error reduction strategies. Finally, we suggest two promising future directions that could achieve comparable or even higher accuracy with lower costs in both NGS and long-read sequencing.
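To make the scale of the problem concrete, the arithmetic below compares the expected number of true variant reads with the expected number of erroneous reads at a single site. It is a back-of-the-envelope sketch, not part of the reviewed work: the function name `expected_counts`, the depth, and the example variant frequency are illustrative assumptions, and errors are assumed to be independent and uniform across reads.

```python
def expected_counts(depth, vaf, error_rate=1e-3):
    """Expected read counts at one site (illustrative model only).

    depth      -- number of reads covering the site (assumed value)
    vaf        -- true variant allele frequency (assumed value)
    error_rate -- per-base error rate; 1e-3 is the figure quoted in the abstract
    """
    true_reads = depth * vaf
    # Reference-carrying reads that are miscalled as some other base.
    error_reads = depth * (1.0 - vaf) * error_rate
    return true_reads, error_reads


if __name__ == "__main__":
    true_reads, error_reads = expected_counts(depth=10_000, vaf=1e-4)
    print(f"true variant reads ~ {true_reads:.3f}, erroneous reads ~ {error_reads:.3f}")
```

Under these assumptions, a variant present at 10⁻⁴ contributes about one read at 10,000× depth, while miscalls contribute about ten, which is why raw read counts alone cannot separate such variants from noise.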
Affiliation(s)
- Hangxing Jia
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Shengjun Tan
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Yong E Zhang
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
2
Menon V, Brash DE. Next-generation sequencing methodologies to detect low-frequency mutations: "Catch me if you can". MUTATION RESEARCH. REVIEWS IN MUTATION RESEARCH 2023; 792:108471. [PMID: 37716438 PMCID: PMC10843083 DOI: 10.1016/j.mrrev.2023.108471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 09/06/2023] [Accepted: 09/07/2023] [Indexed: 09/18/2023]
Abstract
Mutations, the irreversible changes in an organism's DNA sequence, are present in tissues at a variant allele frequency (VAF) ranging from ∼10⁻⁸ per bp for a founder mutation to ∼10⁻³ for a histologically normal tissue sample containing several independent clones - compared to 1%-50% for a heterozygous tumor mutation or a polymorphism. The rarity of these events poses a challenge for accurate clinical diagnosis and prognosis, toxicology, and discovering new disease etiologies. Standard Next-Generation Sequencing (NGS) technologies report VAFs as low as 0.5% per nt, but reliably observing rarer precursor events requires additional sophistication to measure ultralow-frequency mutations. We detail the challenge; define terms used to characterize the results, which vary between laboratories and sometimes conflict between biologists and bioinformaticists; and describe recent innovations to improve standard NGS methodologies including: single-strand consensus sequence methods such as Safe-SeqS and SiMSen-Seq; tandem-strand consensus sequence methods such as o2n-Seq and SMM-Seq; and ultrasensitive parent-strand consensus sequence methods such as DuplexSeq, PacBio HiFi, SinoDuplex, OPUSeq, EcoSeq, BotSeqS, Hawk-Seq, NanoSeq, SaferSeq, and CODEC. Practical applications are also noted. Several methods quantify VAF down to 10⁻⁵ at a nt and mutation frequency (MF) in a target region down to 10⁻⁷ per nt. By expanding to > 1 Mb of sites never observed twice, thus forgoing VAF, other methods quantify MF < 10⁻⁹ per nt or < 15 errors per haploid genome. Clonal expansion cannot be directly distinguished from independent mutations by sequencing, so it is essential for a paper to report whether its MF counted only different mutations - the minimum independent-mutation frequency MFminI - or all mutations observed including recurrences - the larger maximum independent-mutation frequency MFmaxI which may reflect clonal expansion. Ultrasensitive methods reveal that, without their use, even mutations with VAF 0.5-1% are usually spurious.
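The distinction between the two mutation-frequency conventions can be illustrated with a short calculation. This is a minimal sketch based only on the definitions given in the abstract; the function name `mutation_frequencies`, the tuple layout of a mutation, and the example numbers are assumptions made for illustration, not the authors' code.

```python
from collections import Counter


def mutation_frequencies(observed_mutations, total_bases_sequenced):
    """Return (MFminI, MFmaxI) for a list of mutant observations.

    observed_mutations    -- one (position, ref, alt) tuple per mutant observation
                             (hypothetical layout chosen for this sketch)
    total_bases_sequenced -- total number of nucleotides interrogated
    """
    if total_bases_sequenced <= 0:
        raise ValueError("total_bases_sequenced must be positive")
    counts = Counter(observed_mutations)
    # MFminI: each distinct mutation counted once (recurrences ignored).
    mf_min = len(counts) / total_bases_sequenced
    # MFmaxI: every observation counted, so recurrences (possible clones) inflate it.
    mf_max = sum(counts.values()) / total_bases_sequenced
    return mf_min, mf_max


if __name__ == "__main__":
    # Three observations of the same C>T change plus one G>A change.
    observations = [(101, "C", "T")] * 3 + [(250, "G", "A")]
    mf_min, mf_max = mutation_frequencies(observations, 2_000_000)
    print(mf_min, mf_max)
```

In this example two distinct mutations are seen four times in total, so the second value is twice the first; the gap between the two is what may reflect clonal expansion.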
Affiliation(s)
- Vijay Menon
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT 06520-8040, USA.
- Douglas E Brash
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT 06520-8040, USA; Department of Dermatology, Yale School of Medicine, New Haven, CT 06520-8059, USA; Yale Cancer Center, Yale School of Medicine, New Haven, CT 06520-8028, USA.
3
Bae JH, Liu R, Roberts E, Nguyen E, Tabrizi S, Rhoades J, Blewett T, Xiong K, Gydush G, Shea D, An Z, Patel S, Cheng J, Sridhar S, Liu MH, Lassen E, Skytte AB, Grońska-Pęski M, Shoag JE, Evrony GD, Parsons HA, Mayer EL, Makrigiorgos GM, Golub TR, Adalsteinsson VA. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nat Genet 2023; 55:871-879. [PMID: 37106072 PMCID: PMC10181940 DOI: 10.1038/s41588-023-01376-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 03/02/2022] [Accepted: 03/21/2023] [Indexed: 04/29/2023]
Abstract
Detecting mutations from single DNA molecules is crucial in many fields but challenging. Next-generation sequencing (NGS) affords tremendous throughput but cannot directly sequence double-stranded DNA molecules ('single duplexes') to discern the true mutations on both strands. Here we present Concatenating Original Duplex for Error Correction (CODEC), which confers single duplex resolution to NGS. CODEC affords 1,000-fold higher accuracy than NGS, using up to 100-fold fewer reads than duplex sequencing. CODEC revealed mutation frequencies of 2.72 × 10⁻⁸ in sperm of a 39-year-old individual, and somatic mutations acquired with age in blood cells. CODEC detected genome-wide, clonal hematopoiesis mutations from single DNA molecules, single mutated duplexes from tumor genomes and liquid biopsies, microsatellite instability with 10-fold greater sensitivity and mutational signatures, and specific tumor mutations with up to 100-fold fewer reads. CODEC enables more precise genetic testing and reveals biologically significant mutations, which are commonly obscured by NGS errors.
Affiliation(s)
- Jin H Bae
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ruolin Liu
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Erica Nguyen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Shervin Tabrizi
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research at MIT, Cambridge, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Kan Xiong
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Douglas Shea
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Zhenyi An
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sahil Patel
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research at MIT, Cambridge, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Ju Cheng
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mei Hong Liu
- Center for Human Genetics and Genomics, Departments of Pediatrics and Neuroscience & Physiology, New York University Grossman School of Medicine, New York City, NY, USA
- Marta Grońska-Pęski
- Center for Human Genetics and Genomics, Departments of Pediatrics and Neuroscience & Physiology, New York University Grossman School of Medicine, New York City, NY, USA
- Jonathan E Shoag
- University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Case Comprehensive Cancer Center, Cleveland, OH, USA
- Gilad D Evrony
- Center for Human Genetics and Genomics, Departments of Pediatrics and Neuroscience & Physiology, New York University Grossman School of Medicine, New York City, NY, USA
- Todd R Golub
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
4
Distinguishing excess mutations and increased cell death based on variant allele frequencies. PLoS Comput Biol 2022; 18:e1010048. [PMID: 35468135 PMCID: PMC9071171 DOI: 10.1371/journal.pcbi.1010048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2021] [Revised: 05/05/2022] [Accepted: 03/22/2022] [Indexed: 12/03/2022]
Abstract
Tumors often harbor orders of magnitude more mutations than healthy tissues. The increased number of mutations may be due to an elevated mutation rate or frequent cell death and correspondingly rapid cell turnover, or a combination of the two. It is difficult to disentangle these two mechanisms based on widely available bulk sequencing data, where sequences from individual cells are intermixed and, thus, the cell lineage tree of the tumor cannot be resolved. Here we present a method that can simultaneously estimate the cell turnover rate and the rate of mutations from bulk sequencing data. Our method works by simulating tumor growth and finding the parameters with which the observed data can be reproduced with maximum likelihood. Applying this method to a real tumor sample, we find that both the mutation rate and the frequency of death may be high.
Tumors frequently harbor an elevated number of mutations, compared to healthy tissue. These extra mutations may be generated either by an increased mutation rate or the presence of cell death resulting in increased cellular turnover and additional cell divisions for tumor growth. Separating the effects of these two factors is a nontrivial problem. Here we present a method which can simultaneously estimate cell turnover rate and genomic mutation rate from bulk sequencing data. Our method is based on the estimation of the parameters of a generative model of tumor growth and mutations. Applying our method to a human hepatocellular carcinoma sample reveals an elevated per cell division mutation rate and high cell turnover.
5
Mielinis P, Sukackaitė R, Serapinaitė A, Samoilovas F, Alzbutas G, Matjošaitis K, Lubys A. MuA-based Molecular Indexing for Rare Mutation Detection by Next-Generation Sequencing. J Mol Biol 2021; 433:167209. [PMID: 34419430 DOI: 10.1016/j.jmb.2021.167209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 08/12/2021] [Accepted: 08/12/2021] [Indexed: 12/12/2022]
Abstract
Detection of low-frequency mutations in cancer genomes or other heterogeneous cell populations requires high-fidelity sequencing. Molecular barcoding is one of the key technologies that enables the differentiation of true mutations from errors, which can be caused by sequencing or library preparation processes. However, current approaches where barcodes are introduced via primer extension or adaptor ligation do not utilize the full power of barcoding, due to complicated library preparation workflows and biases. Here we demonstrate the remarkable tolerance of MuA transposase to the presence of multiple replacements in transposon sequence, and explore this unique feature to engineer the MuA transposome complex with randomised nucleotides in 12 transposon positions, which can be introduced as a barcode into the target molecule after the transposition event. We applied the approach of Unique MuA-based Molecular Indexing (UMAMI) to assess the power of rare mutation detection by shotgun sequencing on the Illumina platform. Our results show that UMAMI allows detection of rare mutations readily and reliably, and in this paper we report error rate values for a number of thermophilic DNA polymerases measured by using UMAMI.
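A barcode built from 12 randomised positions has a finite number of possible sequences, which bounds how many molecules can be tagged before two of them are likely to share a barcode. The sketch below works through that arithmetic. It assumes each position can be any of the four nucleotides with equal probability (the abstract says only that the positions are randomised), and the function names and the birthday-problem approximation are illustrative choices, not something taken from the paper.

```python
import math


def barcode_space(positions=12, alphabet=4):
    """Number of distinct barcodes, assuming every position is fully random."""
    return alphabet ** positions


def collision_probability(n_molecules, n_barcodes):
    """Approximate chance that at least two molecules share a barcode.

    Uses the standard birthday-problem approximation
    1 - exp(-n(n-1) / (2N)), assuming uniform, independent barcode choice.
    """
    return 1.0 - math.exp(-n_molecules * (n_molecules - 1) / (2.0 * n_barcodes))


if __name__ == "__main__":
    n_barcodes = barcode_space()
    print(n_barcodes)  # 4**12 = 16777216
    for n in (100, 1_000, 10_000):
        print(n, round(collision_probability(n, n_barcodes), 4))
```

In practice only molecules that overlap the same genomic position compete for barcodes, so the relevant `n_molecules` is the local coverage of original molecules rather than the size of the whole library.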
Affiliation(s)
- Paulius Mielinis
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania
- Rasa Sukackaitė
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania.
- Aistė Serapinaitė
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania
- Faustas Samoilovas
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania
- Gediminas Alzbutas
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania
- Karolis Matjošaitis
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania
- Arvydas Lubys
- Thermo Fisher Scientific Baltics UAB, V. A. Graičiūno 8, Vilnius LT-02241, Lithuania
6
Dia A, Cheeseman IH. Single-cell genome sequencing of protozoan parasites. Trends Parasitol 2021; 37:803-814. [PMID: 34172399 PMCID: PMC8364489 DOI: 10.1016/j.pt.2021.05.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/25/2021] [Revised: 05/21/2021] [Accepted: 05/26/2021] [Indexed: 12/27/2022]
Abstract
Despite considerable genetic variation within hosts, most parasite genome sequencing studies focus on bulk samples composed of millions of cells. Analysis of bulk samples is biased toward the dominant genotype, concealing cell-to-cell variation and rare variants. To tackle this, single-cell sequencing approaches have been developed and tailored to specific host-parasite systems. These are allowing the genetic diversity and kinship in complex parasite populations to be deciphered and for de novo genetic variation to be captured. Here, we outline the methodologies being used for single-cell sequencing of parasitic protozoans, such as Plasmodium and Leishmania spp., and how these tools are being applied to understand parasite biology.
Affiliation(s)
- Aliou Dia
- Host-Pathogen Interaction Program, Texas Biomedical Research Institute, San Antonio, TX, USA
- Ian H Cheeseman
- Host-Pathogen Interaction Program, Texas Biomedical Research Institute, San Antonio, TX, USA.
7
Genomic Mosaicism Formed by Somatic Variation in the Aging and Diseased Brain. Genes (Basel) 2021; 12:genes12071071. [PMID: 34356087 PMCID: PMC8305509 DOI: 10.3390/genes12071071] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/02/2021] [Revised: 07/09/2021] [Accepted: 07/12/2021] [Indexed: 12/22/2022]
Abstract
Over the past 20 years, analyses of single brain cell genomes have revealed that the brain is composed of cells with myriad distinct genomes: the brain is a genomic mosaic, generated by a host of DNA sequence-altering processes that occur somatically and do not affect the germline. As such, these sequence changes are not heritable. Some processes appear to occur during neurogenesis, when cells are mitotic, whereas others may also function in post-mitotic cells. Here, we review multiple forms of DNA sequence alterations that have now been documented: aneuploidies and aneusomies, smaller copy number variations (CNVs), somatic repeat expansions, retrotransposons, genomic cDNAs (gencDNAs) associated with somatic gene recombination (SGR), and single nucleotide variations (SNVs). A catch-all term of DNA content variation (DCV) has also been used to describe the overall phenomenon, which can include multiple forms within a single cell’s genome. A requisite step in the analyses of genomic mosaicism is ongoing technology development, which is also discussed. Genomic mosaicism alters one of the most stable biological molecules, DNA, which may have many repercussions, ranging from normal functions including effects of aging, to creating dysfunction that occurs in neurodegenerative and other brain diseases, most of which show sporadic presentation, unlinked to causal, heritable genes.
8
Gelbart M, Harari S, Ben-Ari Y, Kustin T, Wolf D, Mandelboim M, Mor O, Pennings PS, Stern A. Drivers of within-host genetic diversity in acute infections of viruses. PLoS Pathog 2020; 16:e1009029. [PMID: 33147296 PMCID: PMC7668575 DOI: 10.1371/journal.ppat.1009029] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/08/2020] [Revised: 11/16/2020] [Accepted: 10/04/2020] [Indexed: 12/01/2022]
Abstract
Genetic diversity is the fuel of evolution and facilitates adaptation to novel environments. However, our understanding of what drives differences in the genetic diversity during the early stages of viral infection is somewhat limited. Here, we use ultra-deep sequencing to interrogate 43 clinical samples taken from early infections of the human-infecting viruses HIV, RSV and CMV. Hundreds to thousands of virus templates were sequenced per sample, allowing us to reveal dramatic differences in within-host genetic diversity among virus populations. We found that increased diversity was mostly driven by presence of multiple divergent genotypes in HIV and CMV samples, which we suggest reflect multiple transmitted/founder viruses. Conversely, we detected an abundance of low frequency hyper-edited genomes in RSV samples, presumably reflecting defective virus genomes (DVGs). We suggest that RSV is characterized by higher levels of cellular co-infection, which allow for complementation and hence elevated levels of DVGs.
The few days or weeks following infection with a virus, termed acute infection, are critical for virus establishment. Here we sought to characterize what leads to differences in the genetic diversity of different viruses sampled during acute infection. We performed ultra-deep sequencing of hundreds to thousands of viral genomes from forty-three samples spanning three pathogenic human viruses: HIV, RSV and CMV. We found major differences in the genetic diversity of these different viruses, and in different patients infected with the same virus. We investigated the factors responsible for these differences. We found that the DNA virus CMV was less diverse, most likely since it has a lower mutation rate than the RNA viruses HIV and RSV. We also found that the samples with the highest genetic diversity, which included one CMV sample and two HIV samples, bore evidence for multiple genotype infection. In other words, patients from whom these samples were taken were infected with two different “strains” of the virus. Finally, we also found evidence that viral genomes of HIV, and in particular RSV, are edited by the innate immune system of the host, leading to the presence of defective virus genomes.
Affiliation(s)
- Maoz Gelbart
- The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Sheri Harari
- The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Ya’ara Ben-Ari
- The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Talia Kustin
- The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Dana Wolf
- Clinical Virology Unit, Hadassah Hebrew University Medical Center, Jerusalem, Israel
- The Lautenberg Center for General and Tumor Immunology, IMRIC, the Faculty of Medicine, the Hebrew University, Jerusalem, Israel
- Michal Mandelboim
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
- Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Orna Mor
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Pleuni S. Pennings
- Department of Biology, San Francisco State University, San Francisco, California, United States of America
- Adi Stern
- The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
9
Oota S. Somatic mutations - Evolution within the individual. Methods 2019; 176:91-98. [PMID: 31711929 DOI: 10.1016/j.ymeth.2019.11.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/31/2018] [Revised: 10/31/2019] [Accepted: 11/07/2019] [Indexed: 02/08/2023]
Abstract
With the rapid advancement of sequencing technologies over the last two decades, it is becoming feasible to detect rare variants from somatic tissue samples. Studying such somatic mutations can provide deep insights into various senescence-related diseases, including cancer, inflammation, and sporadic psychiatric disorders. While it is still a difficult task to identify true somatic mutations, relentless efforts to combine experimental and computational methods have made it possible to obtain reliable data. Furthermore, state-of-the-art machine learning approaches have drastically improved the efficiency and sensitivity of these methods. Meanwhile, we can regard somatic mutations as a counterpart of germline mutations, and it is possible to apply well-formulated mathematical frameworks developed for population genetics and molecular evolution to analyze this 'somatic evolution'. For example, retrospective cell lineage tracing is a promising technique to elucidate the mechanism of pre-diseases using single-cell RNA-sequencing (scRNA-seq) data.
Affiliation(s)
- Satoshi Oota
- Image Processing Research Team, Center for Advanced Photonics, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
10
Wang TT, Abelson S, Zou J, Li T, Zhao Z, Dick JE, Shlush LI, Pugh TJ, Bratman SV. High efficiency error suppression for accurate detection of low-frequency variants. Nucleic Acids Res 2019; 47:e87. [PMID: 31127310 PMCID: PMC6735726 DOI: 10.1093/nar/gkz474] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/23/2018] [Revised: 04/28/2019] [Accepted: 05/16/2019] [Indexed: 12/30/2022]
Abstract
Detection of cancer-associated somatic mutations has broad applications for oncology and precision medicine. However, this becomes challenging when cancer-derived DNA is in low abundance, such as in impure tissue specimens or in circulating cell-free DNA. Next-generation sequencing (NGS) is particularly prone to technical artefacts that can limit the accuracy for calling low-allele-frequency mutations. State-of-the-art methods to improve detection of low-frequency mutations often employ unique molecular identifiers (UMIs) for error suppression; however, these methods are highly inefficient as they depend on redundant sequencing to assemble consensus sequences. Here, we present a novel strategy to enhance the efficiency of UMI-based error suppression by retaining single reads (singletons) that can participate in consensus assembly. This 'Singleton Correction' methodology outperformed other UMI-based strategies in efficiency, leading to greater sensitivity with high specificity in a cell line dilution series. Significant benefits were seen with Singleton Correction at sequencing depths ≤16 000×. We validated the utility and generalizability of this approach in a cohort of >300 individuals whose peripheral blood DNA was subjected to hybrid capture sequencing at ∼5000× depth. Singleton Correction can be incorporated into existing UMI-based error suppression workflows to boost mutation detection accuracy, thus improving the cost-effectiveness and clinical impact of NGS.
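The general idea of UMI-based consensus calling, and of rescuing singletons rather than discarding them, can be sketched as follows. This is a simplified illustration and not the published Singleton Correction algorithm: the abstract does not state the rescue rule, so the rule used here (keep a singleton only where it agrees with reads carrying the same UMI from the opposite strand) is an assumption, as are the function names, the `(umi, strand, sequence)` input layout, and the 0.6 agreement threshold.

```python
from collections import Counter, defaultdict


def consensus(seqs, min_fraction=0.6):
    """Per-position majority vote over equal-length reads; 'N' where support is weak."""
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)


def collapse(reads):
    """Collapse reads into one sequence per (UMI, strand) family.

    reads -- iterable of (umi, strand, sequence); strand is '+' or '-', and
             sequences are assumed to be already oriented to the reference.
    Families with two or more reads get a majority consensus. A singleton is
    kept only if the opposite-strand family with the same UMI exists
    (assumed rescue rule), masking any disagreeing position with 'N'.
    """
    families = defaultdict(list)
    for umi, strand, seq in reads:
        families[(umi, strand)].append(seq)

    results = {}
    for (umi, strand), seqs in families.items():
        if len(seqs) >= 2:
            results[(umi, strand)] = consensus(seqs)
            continue
        partner = families.get((umi, "-" if strand == "+" else "+"))
        if partner:
            partner_seq = consensus(partner)
            results[(umi, strand)] = "".join(
                a if a == b else "N" for a, b in zip(seqs[0], partner_seq)
            )
        # Singletons with no supporting partner are dropped.
    return results


if __name__ == "__main__":
    example = [
        ("AAC", "+", "ACGT"), ("AAC", "+", "ACGT"), ("AAC", "+", "ACTT"),
        ("GGT", "+", "ACGA"), ("GGT", "-", "ACGA"),
        ("TTT", "+", "ACGT"),
    ]
    for key, seq in sorted(collapse(example).items()):
        print(key, seq)
```

In the example, the three-read family outvotes its single discordant base, the two `GGT` singletons support each other across strands and are retained, and the unsupported `TTT` singleton is discarded; a conventional workflow that requires redundant reads would have lost the `GGT` molecule as well.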
Affiliation(s)
- Ting Ting Wang
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Sagi Abelson
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Jinfeng Zou
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Tiantian Li
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Zhen Zhao
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- John E Dick
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Liran I Shlush
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Trevor J Pugh
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Scott V Bratman
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Radiation Oncology, University of Toronto, Toronto, Ontario, Canada
11
Yang X, Yang X, Chen J, Li S, Zeng Q, Huang AY, Ye AY, Yu Z, Wang S, Jiang Y, Wu X, Wu Q, Wei L, Zhang Y. ATP1A3 mosaicism in families with alternating hemiplegia of childhood. Clin Genet 2019; 96:43-52. [PMID: 30891744 PMCID: PMC6850116 DOI: 10.1111/cge.13539] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/22/2018] [Revised: 03/10/2019] [Accepted: 03/15/2019] [Indexed: 01/17/2023]
Abstract
Alternating hemiplegia of childhood (AHC) is a rare and severe neurodevelopmental disorder characterized by recurrent hemiplegic episodes. Most AHC cases are sporadic and caused by de novo ATP1A3 pathogenic variants. In this study, the aim was to identify the origin of ATP1A3 pathogenic variants in a Chinese cohort. In 105 probands including 101 sporadic and 4 familial cases, 98 patients with ATP1A3 pathogenic variants were identified, and 96.8% were confirmed as de novo. Micro-droplet digital polymerase chain reaction was applied for detecting ATP1A3 mosaicism in 80 available families. In blood samples, four asymptomatic parents (two fathers and two mothers) and one proband with a milder phenotype were identified as mosaic. Six (7.5%) cases of parental mosaicism were identified in multiple tissues, including the four previously identified in blood and two additional cases identified from paternal sperm. Mosaicism was identified in multiple tissues with varied mutant allele fractions (MAFs, 0.03%-33.03%). The results suggested that the MAF of mosaicism may be related to phenotype severity. This is the first systematic report of ATP1A3 mosaicism in AHC and showed mosaicism as an unrecognized source of previously considered "de novo" AHC. Identifying ATP1A3 mosaicism provides more evidence for estimating recurrence risk and has implications in genetic counseling of AHC.
Affiliation(s)
- Xiaoling Yang
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Xiaoxu Yang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Jiaoyang Chen
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Shupin Li
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Qi Zeng
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- August Y Huang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Adam Y Ye
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Zhe Yu
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Sheng Wang
- Dr Liping Wei's lab, National Institute of Biological Sciences, Beijing, China; College of Biological Sciences, China Agricultural University, Beijing, China
- Yuwu Jiang
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Xiru Wu
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Qixi Wu
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China; Human Genetic Resources Core Facility, School of Life Sciences, Peking University, Beijing, China
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Yuehua Zhang
- Department of Pediatrics, Peking University First Hospital, Beijing, China
12
Hu P, Martinez AF, Kruszka P, Berger S, Roessler E, Muenke M. Low-level parental mosaicism affects the recurrence risk of holoprosencephaly. Genet Med 2018; 21:1015-1020. [DOI: 10.1038/s41436-018-0261-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/07/2018] [Accepted: 07/26/2018] [Indexed: 11/09/2022]
13
Sloan DB, Broz AK, Sharbrough J, Wu Z. Detecting Rare Mutations and DNA Damage with Sequencing-Based Methods. Trends Biotechnol 2018; 36:729-740. [PMID: 29550161 PMCID: PMC6004327 DOI: 10.1016/j.tibtech.2018.02.009] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 01/24/2018] [Revised: 02/16/2018] [Accepted: 02/20/2018] [Indexed: 12/18/2022]
Abstract
There is a great need in biomedical and genetic research to detect DNA damage and de novo mutations, but doing so is inherently challenging because of the rarity of these events. The enormous capacity of current DNA sequencing technologies has opened the door for quantifying sequence variants present at low frequencies in vivo, such as within cancerous tissues. However, these sequencing technologies are error prone, resulting in high noise thresholds. Most DNA sequencing methods are also generally incapable of identifying chemically modified bases arising from DNA damage. In recent years, numerous specialized modifications to sequencing methods have been developed to address these shortcomings. Here, we review this landscape of emerging techniques, highlighting their respective strengths, weaknesses, and target applications.
Affiliation(s)
- Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Amanda K Broz
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Joel Sharbrough
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Zhiqiang Wu
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| |
|
14
|
Extracellular vesicles and ctDNA in lung cancer: biomarker sources and therapeutic applications. Cancer Chemother Pharmacol 2018; 82:171-183. [PMID: 29948020 DOI: 10.1007/s00280-018-3586-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 12/18/2017] [Accepted: 04/20/2018] [Indexed: 02/05/2023]
Abstract
Lung cancer is the leading cause of cancer death in the world. Recently, targeted therapy and anti-programmed cell death receptor 1 (PD-1) and anti-programmed cell death ligand 1 (PD-L1) immunotherapy have made great progress in treatment of lung cancer. However, responses to these therapies are variable, influenced by genetic alterations, high microsatellite instability and mismatch repair deficiency. Liquid biopsy of extracellular vesicles and circulating tumor DNA (ctDNA) emerges as a new promising non-invasive means that enables not only biomarker determination, but also continuous monitoring of cancer treatment. Notably, tumor extracellular vesicles play important roles in tumor formation and progression, and also serve as natural carriers for anti-tumor drugs and short-interfering RNA. In this review, we summarize the latest progress in understanding the relationships of extracellular vesicles and ctDNA in cancer biology, diagnosis and drug delivery. In particular, the application of extracellular vesicles and ctDNA in anti-PD-1/PD-L1 immunotherapy is discussed.
|
15
|
Ibrahim B, McMahon DP, Hufsky F, Beer M, Deng L, Mercier PL, Palmarini M, Thiel V, Marz M. A new era of virus bioinformatics. Virus Res 2018; 251:86-90. [PMID: 29751021 DOI: 10.1016/j.virusres.2018.05.009] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 04/17/2018] [Accepted: 05/07/2018] [Indexed: 01/09/2023]
Abstract
Despite the recognized excellence of virology and bioinformatics, these two communities have interacted surprisingly sporadically, aside from some pioneering work on HIV-1 and influenza. Bringing together the expertise of bioinformaticians and virologists is crucial, since very specific but fundamental computational approaches are required for virus research, particularly in an era of big data. Collaboration between virologists and bioinformaticians is necessary to improve existing analytical tools, cloud-based systems, computational resources, data sharing approaches, new diagnostic tools, and bioinformatic training. Here, we highlight current progress and discuss potential avenues for future developments in this promising era of virus bioinformatics. We end by presenting an overview of current technologies, and by outlining some of the major challenges and advantages that bioinformatics will bring to the field of virology.
Affiliation(s)
- Bashar Ibrahim
- European Virus Bioinformatics Center, Jena, Germany; RNA Bioinformatics and High Throughput Analysis Jena, Friedrich Schiller University Jena, Jena, Germany
| | - Dino P McMahon
- European Virus Bioinformatics Center, Jena, Germany; Host Parasite Evolution and Ecology, Institute of Biology, Free University of Berlin, Berlin, Germany; Department for Materials and Environment, BAM Federal Institute for Materials Research and Testing, Berlin, Germany
| | - Franziska Hufsky
- European Virus Bioinformatics Center, Jena, Germany; RNA Bioinformatics and High Throughput Analysis Jena, Friedrich Schiller University Jena, Jena, Germany
| | - Martin Beer
- European Virus Bioinformatics Center, Jena, Germany; Institute of Diagnostic Virology, Friedrich-Loeffler-Institute, Greifswald, Germany
| | - Li Deng
- European Virus Bioinformatics Center, Jena, Germany; Institute of Virology, Helmholtz Zentrum Munich, Munich, Germany
| | - Philippe Le Mercier
- European Virus Bioinformatics Center, Jena, Germany; Swiss-Prot Group, SIB,CMU, University of Geneva Medical School, Geneva, Switzerland
| | - Massimo Palmarini
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
| | - Volker Thiel
- European Virus Bioinformatics Center, Jena, Germany; Federal Department of Home Affairs, Institute of Virology and Immunology, Bern and Mittelhausen, Switzerland; Department of Infectious Diseases and Pathobiology, University of Bern, Bern, Switzerland
| | - Manja Marz
- European Virus Bioinformatics Center, Jena, Germany; RNA Bioinformatics and High Throughput Analysis Jena, Friedrich Schiller University Jena, Jena, Germany
| |
|
16
|
Genomic mosaicism in paternal sperm and multiple parental tissues in a Dravet syndrome cohort. Sci Rep 2017; 7:15677. [PMID: 29142202 PMCID: PMC5688122 DOI: 10.1038/s41598-017-15814-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 05/18/2017] [Accepted: 11/02/2017] [Indexed: 12/21/2022] Open
Abstract
Genomic mosaicism in parental gametes and peripheral tissues is an important consideration for genetic counseling. We studied a Chinese cohort affected by a severe epileptic disorder, Dravet syndrome (DS). There were 56 fathers who donated semen and 15 parents who donated multiple peripheral tissue samples. We used an ultra-sensitive quantification method, micro-droplet digital PCR (mDDPCR), to detect parental mosaicism of the proband’s pathogenic mutation in SCN1A, the causal gene of DS in 112 families. Ten of the 56 paternal sperm samples were found to exhibit mosaicism of the proband’s mutations, with mutant allelic fractions (MAFs) ranging from 0.03% to 39.04%. MAFs in the mosaic fathers’ sperm were significantly higher than those in their blood (p = 0.00098), even after conditional probability correction (p’ = 0.033). In three mosaic fathers, ultra-low fractions of mosaicism (MAF < 1%) were detected in the sperm samples. In 44 of 45 cases, mosaicism was also observed in other parental peripheral tissues. Hierarchical clustering showed that MAFs measured in the paternal sperm, hair follicles and urine samples were clustered closest together. Milder epileptic phenotypes were more likely to be observed in mosaic parents (p = 3.006e-06). Our study provides new insights for genetic counseling.
|