1
Jia H, Tan S, Zhang YE. Chasing Sequencing Perfection: Marching Toward Higher Accuracy and Lower Costs. Genomics, Proteomics & Bioinformatics 2024; 22:qzae024. [PMID: 38991976 PMCID: PMC11423848 DOI: 10.1093/gpbjnl/qzae024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Revised: 01/25/2024] [Accepted: 01/29/2024] [Indexed: 07/13/2024]
Abstract
Next-generation sequencing (NGS), represented by Illumina platforms, has been an essential cornerstone of basic and applied research. However, the sequencing error rate of 1 per 1000 bp (10⁻³) represents a serious hurdle for research areas focusing on rare mutations, such as somatic mosaicism or microbe heterogeneity. By examining the high-fidelity sequencing methods developed in the past decade, we summarized three major factors underlying errors and the corresponding 12 strategies mitigating these errors. We then proposed a novel framework to classify 11 preexisting representative methods according to the corresponding combinatory strategies and identified three trends that emerged during methodological developments. We further extended this analysis to eight long-read sequencing methods, emphasizing error reduction strategies. Finally, we suggest two promising future directions that could achieve comparable or even higher accuracy with lower costs in both NGS and long-read sequencing.
Affiliation(s)
- Hangxing Jia
  - CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Shengjun Tan
  - CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Yong E Zhang
  - CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
2
Chen Y, Zhang Y, Luo S, Yang X, Liu C, Zhang Q, Liu Y, Zhang X. Foldback-crRNA-Enhanced CRISPR/Cas13a System (FCECas13a) Enables Direct Detection of Ultrashort sncRNA. Anal Chem 2023; 95:15606-15613. [PMID: 37824705 DOI: 10.1021/acs.analchem.3c02687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2023]
Abstract
The CRISPR/Cas13a system has promising applications in clinical small noncoding RNA (sncRNA) detection because it is free from the interference of genomic DNA. However, detecting ultrashort sncRNAs (less than 20 nucleotides) has been challenging because the Cas13a nuclease requires longer crRNA-target RNA hybrids to be activated. Here, we report the development of a foldback-crRNA-enhanced CRISPR/Cas13a (FCECas13a) system that overcomes the limitations of the current CRISPR/Cas13a system in detecting ultrashort sncRNAs. The FCECas13a system employs a 3'-terminal foldback crRNA that hybridizes with the target ultrashort sncRNA, forming a double strand that "tricks" the Cas13a nuclease into activating the HEPN structural domain and generating trans-cleavage activity. The FCECas13a system can accurately detect miRNA720 (a sncRNA currently known as a tRNA-derived small RNA), which is only 17 nucleotides long, at concentrations as low as 15 fM within 20 min. This FCECas13a system opens new avenues for ultrashort sncRNA detection with significant implications for basic biological research, disease prognosis, and molecular diagnosis.
Affiliation(s)
- Yong Chen
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
  - Graphene Composite Research Center, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
- Yibin Zhang
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
- Siyuan Luo
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
- Xinyao Yang
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
- Conghui Liu
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
- Qianling Zhang
  - Graphene Composite Research Center, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
- Yizhen Liu
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
  - Shenzhen Key Laboratory of Nano-Biosensing Technology, Shenzhen 518060, Guangdong, P. R. China
- Xueji Zhang
  - Research Center for Nanosensor Molecular Diagnostic & Treatment Technology, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518060, Guangdong, P. R. China
  - Shenzhen Key Laboratory of Nano-Biosensing Technology, Shenzhen 518060, Guangdong, P. R. China
3
Nordentoft I, Birkenkamp-Demtröder K, Dyrskjøt L. NGS-Based Tumor-Informed Analysis of Circulating Tumor DNA. Methods Mol Biol 2023; 2684:179-197. [PMID: 37410235 DOI: 10.1007/978-1-0716-3291-8_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2023]
Abstract
Accurate circulating tumor DNA (ctDNA) detection has an immense biomarker potential in all phases of the cancer disease course. Presence of ctDNA in the blood has been shown to have prognostic value in various cancer types as it may reflect the actual tumor burden. There are two main methods to consider, a tumor-informed and a tumor-agnostic analysis of ctDNA. Both techniques exploit the short half-life of circulating cell-free DNA (cfDNA)/ctDNA for disease monitoring and ultimately future clinical treatment intervention. Urothelial carcinoma is characterized by a high mutation spectrum but very few hotspot mutations. This limits tumor agnostic usability of hotspot mutation or fixed sets of genes for ctDNA detection. Here we focus on a tumor-informed analysis for ultrasensitive patient- and tumor-specific ctDNA detection using personalized mutation panels, probes that bind to specific genomic sequences to enrich for the region of interest. In this chapter, we describe methods for purification of high-quality cfDNA and guidelines for designing tumor-informed customized capture panels for sensitive detection of ctDNA. Furthermore, a detailed protocol for library preparation and panel capture utilizing a double enrichment strategy with low amplification is described.
Affiliation(s)
- Iver Nordentoft
  - Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark
- Lars Dyrskjøt
  - Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
4
Lobo D, Linheiro R, Godinho R, Archer JP. On taming the effect of transcript level intra-condition count variation during differential expression analysis: A story of dogs, foxes and wolves. PLoS One 2022; 17:e0274591. [PMID: 36136981 PMCID: PMC9498955 DOI: 10.1371/journal.pone.0274591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Accepted: 08/31/2022] [Indexed: 11/22/2022] Open
Abstract
The evolution of RNA-seq technologies has yielded datasets of scientific value that are often generated as condition-associated biological replicates within expression studies. With expanding data archives, an opportunity arises to augment replicate numbers when conditions of interest overlap. Despite correction procedures for estimating transcript abundance, a source of ambiguity is transcript-level intra-condition count variation, as indicated by disjointed results between analysis tools. We present TVscript, a tool that removes reference-based transcripts associated with intra-condition count variation above specified thresholds, and we explore the effects of such variation on differential expression analysis. Initially, iterative differential expression analysis was performed on simulated counts, where levels of intra-condition variation and sets of overrepresented transcripts are explicitly specified. Then counts derived from inter- and intra-study data representing brain samples of dogs, wolves and foxes (wolves vs. dogs and aggressive vs. tame foxes) were used. For simulations, the sensitivity in detecting differentially expressed transcripts increased after removing hyper-variable transcripts, although at levels of intra-condition variation above 5% detection became unreliable. For real data, prior to applying TVscript, ≈20% of the transcripts identified as being differentially expressed were associated with high levels of intra-condition variation, an overrepresentation relative to the reference set. As transcripts harbouring such variation were removed pre-analysis, a discordance of 26 to 40% in the lists of differentially expressed transcripts is observed when compared to those obtained using the non-filtered reference. The removal of transcripts possessing intra-condition variation values within (and above) the 97th and 95th percentiles, for wolves vs. dogs and aggressive vs. tame foxes, maximized the sensitivity in detecting differentially expressed transcripts as a result of alterations within gene-wise dispersion estimates. Through analysis of our real data, support is provided for seven genes with potential for being involved in selection for tameness. TVscript is available at: https://sourceforge.net/projects/tvscript/.
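The percentile-based filtering described in this abstract can be illustrated with a minimal Python sketch. It is not TVscript's implementation: the variation score (the largest within-condition coefficient of variation), the nearest-rank percentile, the function names, and the toy counts are all assumptions made for illustration only.

```python
import math
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(counts)
    return pstdev(counts) / m if m > 0 else 0.0


def intra_condition_variation(counts_by_condition):
    """Largest within-condition CV across all conditions for one transcript."""
    return max(coefficient_of_variation(c) for c in counts_by_condition.values())


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile of a non-empty list, with pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def filter_hypervariable(table, pct=95):
    """Keep transcripts whose variation score is strictly below the cutoff,
    i.e. drop those within (and above) the chosen percentile."""
    scores = {t: intra_condition_variation(c) for t, c in table.items()}
    cutoff = nearest_rank_percentile(list(scores.values()), pct)
    return {t: c for t, c in table.items() if scores[t] < cutoff}


# Toy replicate counts (hypothetical values): transcript -> condition -> counts.
counts = {
    "t1": {"tame": [10, 10, 10], "aggressive": [20, 20, 20]},
    "t2": {"tame": [10, 11, 9], "aggressive": [20, 21, 19]},
    "t3": {"tame": [1, 100, 5], "aggressive": [20, 20, 20]},
    "t4": {"tame": [10, 12, 8], "aggressive": [20, 22, 18]},
}
kept = filter_hypervariable(counts, pct=95)
print(sorted(kept))
```

In this toy table only the transcript with one wildly inconsistent replicate group is removed before any downstream differential expression step. Note that if every transcript has the same score, the strict comparison removes all of them, so a real tool would need a guard for that case.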
Affiliation(s)
- Diana Lobo
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS, Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
  - * E-mail: (DL); (JPA)
- Raquel Linheiro
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- Raquel Godinho
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS, Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- John Patrick Archer
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS, Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
  - * E-mail: (DL); (JPA)
5
Lessons to Learn from the Gut Microbiota: A Focus on Amyotrophic Lateral Sclerosis. Genes (Basel) 2022; 13:genes13050865. [PMID: 35627250 PMCID: PMC9140531 DOI: 10.3390/genes13050865] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/22/2022] [Revised: 05/05/2022] [Accepted: 05/10/2022] [Indexed: 02/04/2023] Open
Abstract
The gut microbiota is able to modulate the development and homeostasis of the central nervous system (CNS) through the immune, circulatory, and neuronal systems. In turn, the CNS influences the gut microbiota through stress responses and at the level of the endocrine system. This bidirectional communication forms the “gut microbiota–brain axis” and has been postulated to play a role in the etiopathology of several neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS). Numerous studies in animal models of ALS and in patients have highlighted the close communication between the immune system and the gut microbiota and, therefore, it is possible that alterations in the gut microbiota may have a direct impact on neuronal function and survival in ALS patients. Consequently, if the gut dysbiosis does indeed play a role in ALS-related neurodegeneration, nutritional immunomodulatory interventions based on probiotics, prebiotics, and/or postbiotics could emerge as innovative therapeutic strategies. This review aimed to shed light on the impact of the gut microbiota in ALS disease and on the use of potential nutritional interventions based on different types of biotics to ameliorate ALS symptoms.
6
Jurasz H, Pawłowski T, Perlejewski K. Contamination Issue in Viral Metagenomics: Problems, Solutions, and Clinical Perspectives. Front Microbiol 2021; 12:745076. [PMID: 34745046 PMCID: PMC8564396 DOI: 10.3389/fmicb.2021.745076] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/22/2021] [Accepted: 09/17/2021] [Indexed: 12/16/2022] Open
Abstract
We describe the most common internal and external sources and types of contamination encountered in viral metagenomic studies and discuss their negative impact on sequencing results, particularly for low-biomass samples and clinical applications. We also propose some basic recommendations for reducing the background noise in viral shotgun metagenomic (SM) studies, which would limit the bias introduced by various classes of contaminants. Regardless of the specific viral SM protocol, contamination cannot be totally avoided; in particular, the issue of reagent contamination should always be addressed with high priority. There is an urgent need for the development and validation of standards for viral metagenomic studies especially if viral SM protocols will be more widely applied in diagnostics.
Affiliation(s)
- Henryk Jurasz
  - Department of Immunopathology of Infectious and Parasitic Diseases, Medical University of Warsaw, Warsaw, Poland
- Tomasz Pawłowski
  - Division of Psychotherapy and Psychosomatic Medicine, Department of Psychiatry, Wrocław Medical University, Wrocław, Poland
- Karol Perlejewski
  - Department of Immunopathology of Infectious and Parasitic Diseases, Medical University of Warsaw, Warsaw, Poland
7
Bias in RNA-seq Library Preparation: Current Challenges and Solutions. BioMed Research International 2021; 2021:6647597. [PMID: 33987443 PMCID: PMC8079181 DOI: 10.1155/2021/6647597] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 10/22/2020] [Accepted: 04/09/2021] [Indexed: 12/26/2022]
Abstract
Although RNA sequencing (RNA-seq) has become the most advanced technology for transcriptome analysis, it also faces various challenges. The RNA-seq workflow is extremely complicated, and bias is easily introduced. Such bias may damage the quality of RNA-seq datasets and lead to incorrect interpretation of sequencing results. Thus, a detailed understanding of the source and nature of these biases is essential for interpreting RNA-seq data, for finding methods to improve the quality of RNA-seq experiments, and for developing bioinformatics tools to compensate for these biases. Here, we discuss the sources of experimental bias in RNA-seq. For each type of bias, we discuss methods for improvement, in order to provide useful suggestions for researchers conducting RNA-seq experiments.
8
Doan RN, Miller MB, Kim SN, Rodin RE, Ganz J, Bizzotto S, Morillo KS, Huang AY, Digumarthy R, Zemmel Z, Walsh CA. MIPP-Seq: ultra-sensitive rapid detection and validation of low-frequency mosaic mutations. BMC Med Genomics 2021; 14:47. [PMID: 33579278 PMCID: PMC7881461 DOI: 10.1186/s12920-021-00893-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/03/2020] [Accepted: 02/03/2021] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: Mosaic mutations contribute to numerous human disorders. As such, the identification and precise quantification of mosaic mutations is essential for a wide range of research applications, clinical diagnoses, and early detection of cancers. Currently, the low-throughput nature of single allele assays (e.g., allele-specific ddPCR) commonly used for genotyping known mutations at very low alternate allelic fractions (AAFs) has limited the integration of low-level mosaic analyses into clinical and research applications. The growing importance of mosaic mutations requires a more rapid, low-cost solution for mutation detection and validation. METHODS: To overcome these limitations, we developed Multiple Independent Primer PCR Sequencing (MIPP-Seq), which combines the power of ultra-deep sequencing and truly independent assays. The accuracy of MIPP-Seq in quantifiably detecting and measuring extremely low allelic fractions was assessed using a combination of SNVs, insertions, and deletions at known allelic fractions in blood- and brain-derived DNA samples. RESULTS: The independent amplicon analyses of MIPP-Seq markedly reduce the impact of allelic dropout, amplification bias, and PCR-induced and sequencing artifacts. Using low DNA inputs of either 25 ng or 50 ng, MIPP-Seq provides sensitive and quantitative assessments of AAFs as low as 0.025% for SNVs, insertions, and deletions. CONCLUSIONS: MIPP-Seq provides an ultra-sensitive, low-cost approach for detecting and validating known and novel mutations in a highly scalable system with broad utility spanning both research and clinical diagnostic testing applications. The scalability of MIPP-Seq allows for multiplexing mutations and samples, which dramatically reduces the cost of variant validation when compared to methods like ddPCR. By leveraging the power of individual analyses of multiple unique and independent reactions, MIPP-Seq can validate and precisely quantitate extremely low AAFs across multiple tissues and mutational categories, including both indels and SNVs. Furthermore, using Illumina sequencing technology, MIPP-Seq provides a robust method for accurate detection of novel mutations at an extremely low AAF.
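As a rough illustration of why several independent amplicons help, the following Python sketch computes an alternate allelic fraction (AAF) per amplicon and only treats a variant as validated when enough amplicons agree. This is a sketch under stated assumptions, not the MIPP-Seq pipeline: the median summary, the `background` and `min_supporting` parameters, and the read counts are hypothetical.

```python
from statistics import median


def amplicon_aaf(alt_reads, total_reads):
    """Alternate allelic fraction for one independent amplicon."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def summarize_variant(amplicons, background=0.0001, min_supporting=2):
    """Combine per-amplicon AAFs for one variant.

    amplicons: list of (alt_reads, total_reads), one tuple per independent primer pair.
    background: assumed per-site error floor; an amplicon supports the variant
                only if its AAF exceeds this value (hypothetical threshold).
    """
    aafs = [amplicon_aaf(alt, total) for alt, total in amplicons]
    supporting = sum(1 for f in aafs if f > background)
    return {
        "aafs": aafs,
        "median_aaf": median(aafs),
        "supporting": supporting,
        "validated": supporting >= min_supporting,
    }


# Three independent amplicons over the same site (hypothetical read counts).
result = summarize_variant([(25, 100_000), (30, 100_000), (0, 100_000)])
print(result["median_aaf"], result["supporting"], result["validated"])
```

Here one amplicon shows no alternate reads, as could happen with allelic dropout, yet the other two still support the call; a single-assay design would have no way to tell these situations apart.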
Affiliation(s)
- Ryan N Doan
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
  - Allen Discovery Center for Human Brain Evolution, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
  - Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA, USA
- Michael B Miller
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
  - Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Sonia N Kim
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
  - Program in Biological and Biomedical Sciences, Harvard University, Boston, MA, USA
- Rachel E Rodin
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- Javier Ganz
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- Sara Bizzotto
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- Katherine S Morillo
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- August Yue Huang
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- Reethika Digumarthy
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- Zachary Zemmel
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
- Christopher A Walsh
  - Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Center for Life Sciences 15062, 300 Longwood Avenue, BCH3150, Boston, MA, 02115, USA
  - Allen Discovery Center for Human Brain Evolution, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, 20815, USA
  - Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA, USA
9
Dual Deep Sequencing Improves the Accuracy of Low-Frequency Somatic Mutation Detection in Cancer Gene Panel Testing. Int J Mol Sci 2020; 21:ijms21103530. [PMID: 32429412 PMCID: PMC7278996 DOI: 10.3390/ijms21103530] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/01/2020] [Revised: 05/14/2020] [Accepted: 05/14/2020] [Indexed: 02/07/2023] Open
Abstract
Cancer gene panel testing requires accurate detection of somatic mosaic mutations, as the test sample consists of a mixture of cancer cells and normal cells; each minor clone in the tumor also has different somatic mutations. Several studies have shown that the different types of software used for variant calling for next generation sequencing (NGS) can detect low-frequency somatic mutations. However, the accuracy of these somatic variant callers is unknown. We performed cancer gene panel testing in duplicate experiments using three different high-fidelity DNA polymerases in pre-capture amplification steps and analyzed by three different variant callers, Strelka2, Mutect2, and LoFreq. We selected six somatic variants that were detected in both experiments with more than two polymerases and by at least one variant caller. Among them, five single nucleotide variants were verified by CEL nuclease-mediated heteroduplex incision with polyacrylamide gel electrophoresis and silver staining (CHIPS) and Sanger sequencing. In silico analysis indicated that the FBXW7 and MAP3K1 missense mutations cause damage at the protein level. Comparing three somatic variant callers, we found that Strelka2 detected more variants than Mutect2 and LoFreq. We conclude that dual sequencing with Strelka2 analysis is useful for detection of accurate somatic mutations in cancer gene panel testing.
10
Ren Y, Zhang Y, Wang D, Liu F, Fu Y, Xiang S, Su L, Li J, Dai H, Huang B. SinoDuplex: An Improved Duplex Sequencing Approach to Detect Low-frequency Variants in Plasma cfDNA Samples. Genomics, Proteomics & Bioinformatics 2020; 18:81-90. [PMID: 32428603 PMCID: PMC7393544 DOI: 10.1016/j.gpb.2020.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/12/2019] [Revised: 11/11/2019] [Accepted: 04/30/2020] [Indexed: 01/31/2023]
Abstract
Accurate detection of low frequency mutations from plasma cell-free DNA in blood using targeted next generation sequencing technology has shown promising benefits in clinical settings. Duplex sequencing technology is the most commonly used approach in liquid biopsies. Unique molecular identifiers are attached to each double-stranded DNA template, followed by production of low-error consensus sequences to detect low frequency variants. However, high sequencing costs have hindered application of this approach in clinical practice. Here, we have developed an improved duplex sequencing approach called SinoDuplex, which utilizes a pool of adapters containing pre-defined barcode sequences to generate far fewer barcode combinations than with random sequences, and implemented a novel computational analysis algorithm to generate duplex consensus sequences more precisely. SinoDuplex increased the output of duplex sequencing technology, making it more cost-effective. We evaluated our approach using reference standard samples and cell-free DNA samples from lung cancer patients. Our results showed that SinoDuplex has high sensitivity and specificity in detecting very low allele frequency mutations. The source code for SinoDuplex is freely available at https://github.com/SinOncology/sinoduplex.
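The general duplex consensus idea summarized in this abstract (barcode each double-stranded template, build a consensus per strand, then keep only bases on which both strands agree) can be sketched in Python as follows. This is a simplified illustration, not SinoDuplex's algorithm: the barcode strings, the `min_fraction` cutoff, and the assumptions that reads in a family have equal length and that bottom-strand reads are already reverse-complemented into top-strand orientation are all hypothetical.

```python
from collections import Counter, defaultdict


def single_strand_consensus(reads, min_fraction=0.6):
    """Per-position majority vote over reads from one strand of one molecule.

    Positions where the top base falls below min_fraction are masked as 'N'.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(top, bottom):
    """Keep a base only where both strand consensuses agree; otherwise mask it."""
    return "".join(a if a == b and a != "N" else "N" for a, b in zip(top, bottom))


def call_duplex(tagged_reads):
    """tagged_reads: iterable of (barcode, strand, sequence), strand in {'top', 'bottom'}.

    Returns barcode -> duplex consensus for molecules observed on both strands.
    """
    families = defaultdict(lambda: {"top": [], "bottom": []})
    for barcode, strand, seq in tagged_reads:
        families[barcode][strand].append(seq)
    result = {}
    for barcode, family in families.items():
        if family["top"] and family["bottom"]:
            result[barcode] = duplex_consensus(
                single_strand_consensus(family["top"]),
                single_strand_consensus(family["bottom"]),
            )
    return result


# Hypothetical reads: one clean molecule, one single-strand-only molecule,
# and one molecule whose strands disagree at the third position.
reads = [
    ("AC-GT", "top", "ACGT"), ("AC-GT", "top", "ACGT"), ("AC-GT", "top", "ACTT"),
    ("AC-GT", "bottom", "ACGT"), ("AC-GT", "bottom", "ACGT"),
    ("TT-AA", "top", "GGGG"),
    ("CA-TG", "top", "ACTT"), ("CA-TG", "bottom", "ACGT"),
]
duplex = call_duplex(reads)
print(duplex)
```

An error present on only one strand is masked rather than reported as a variant, which is the property that lets duplex approaches push below the raw sequencing error rate. Drawing barcodes from a pre-defined pool, as the abstract describes, would additionally allow barcodes to be checked against that known set, a step not shown here.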
Affiliation(s)
- Yongzhe Ren
  - (1) College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China; (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China
- Yang Zhang
  - (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China
- Dandan Wang
  - (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China
- Fengying Liu
  - (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China
- Ying Fu
  - (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China
- Shaohua Xiang
  - (1) College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
- Li Su
  - (3) Department of Integrated Traditional and Western Medicine In Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China
- Jiancheng Li
  - (4) Department of Radiation Oncology, Fujian Medical University Cancer Hospital and Fujian Cancer Hospital, Fuzhou 350014, China
- Heng Dai
  - (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China
- Bingding Huang
  - (1) College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China; (2) Department of Research and Development, Sinotech Genomics Inc., Kanxing Road 3399, Shanghai 201314, China; Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Science, Shenzhen 518005, China
11
Katsiani A, Stainton D, Lamour K, Tzanetakis IE. The population structure of Rose rosette virus in the USA. J Gen Virol 2020; 101:676-684. [PMID: 32375952 DOI: 10.1099/jgv.0.001418] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Open
Abstract
Rose rosette virus (RRV) (genus Emaravirus) is the causal agent of the homonymous disease, the most destructive malady of roses in the USA. Although the importance of the disease is recognized, little sequence information and no full genomes are available for RRV, a multi-segmented RNA virus. To better understand the population structure of the virus, we implemented a Hi-Plex PCR amplicon high-throughput sequencing approach to sequence all 7 segments and to quantify polymorphisms in 91 RRV isolates collected from 16 states in the USA. Analysis revealed insertion/deletion (indel) polymorphisms primarily in the 5' and 3' non-coding regions, but also within coding regions, including some resulting in changes of protein length. Phylogenetic analysis showed little geographical structuring, suggesting that topography does not have a strong influence on virus evolution. Overall, the virus populations were homogeneous, possibly because of regular movement of plants, the recent emergence of RRV and/or because the virus is under strong purifying selection to preserve its integrity and biological functions.
Affiliation(s)
- Asimina Katsiani
  - Department of Entomology and Plant Pathology, Division of Agriculture, University of Arkansas System, Fayetteville AR 72701, USA
- Daisy Stainton
  - Department of Entomology and Plant Pathology, Division of Agriculture, University of Arkansas System, Fayetteville AR 72701, USA
- Kurt Lamour
  - Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN 37996, USA
- Ioannis E Tzanetakis
  - Department of Entomology and Plant Pathology, Division of Agriculture, University of Arkansas System, Fayetteville AR 72701, USA
12
Pérez-Losada M, Arenas M, Galán JC, Bracho MA, Hillung J, García-González N, González-Candelas F. High-throughput sequencing (HTS) for the analysis of viral populations. Infection, Genetics and Evolution 2020; 80:104208. [PMID: 32001386 DOI: 10.1016/j.meegid.2020.104208] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/30/2019] [Revised: 01/21/2020] [Accepted: 01/24/2020] [Indexed: 12/12/2022]
Abstract
The development of High-Throughput Sequencing (HTS) technologies is having a major impact on the genomic analysis of viral populations. Current HTS platforms can capture nucleic acid variation across millions of genes for both selected amplicons and full viral genomes. HTS has already facilitated the discovery of new viruses, hinted at new taxonomic classifications and provided a deeper and broader understanding of their diversity, population and genetic structure. Hence, HTS has already replaced standard Sanger sequencing in basic and applied research fields, but the next step is its implementation as a routine technology for the analysis of viruses in clinical settings. The most likely application of this implementation will be the analysis of viral genomics, because the huge population sizes, high mutation rates and very fast replacement of viral populations have demonstrated the limited information obtained with Sanger technology. In this review, we describe new technologies and provide guidelines for the high-throughput sequencing and genetic and evolutionary analyses of viral populations and metaviromes, including software applications. With the development of new HTS technologies, new and refurbished molecular and bioinformatic tools are also constantly being developed to process and integrate HTS data. These allow assembling viral genomes and inferring viral population diversity and dynamics. Finally, we also present several applications of these approaches to the analysis of viral clinical samples, including transmission clusters and outbreak characterization.
Affiliation(s)
- Marcos Pérez-Losada
- Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão 4485-661, Portugal
| | - Miguel Arenas
- Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain; Biomedical Research Center (CINBIO), University of Vigo, 36310 Vigo, Spain.
| | - Juan Carlos Galán
- Microbiology Service, Hospital Ramón y Cajal, Madrid, Spain; CIBER in Epidemiology and Public Health, Spain.
| | - Mª Alma Bracho
- CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain.
| | - Julia Hillung
- Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
| | - Neris García-González
- Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
| | - Fernando González-Candelas
- CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
| |
|
13
|
Zhou JL, Xu J, Jiao AG, Yang L, Chen J, Callac P, Liu Y, Wang SX. Patterns of PCR Amplification Artifacts of the Fungal Barcode Marker in a Hybrid Mushroom. Front Microbiol 2019; 10:2686. [PMID: 31803173 PMCID: PMC6877668 DOI: 10.3389/fmicb.2019.02686] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/22/2019] [Accepted: 11/05/2019] [Indexed: 11/16/2022] Open
Abstract
The polymerase chain reaction (PCR) is widely used in modern biology and medicine. However, PCR artifacts can complicate the interpretation of PCR-based results. The internal transcribed spacer (ITS) region of the ribosomal RNA gene cluster is the consensus fungal barcode marker and suspected PCR artifacts have been reported in many studies, especially for the analyses of environmental fungal samples. At present, the patterns of PCR artifacts in the whole fungal ITS region (ITS1+5.8S+ITS2) are not known. In this study, we analyzed the error rates of PCR at three template complexity levels using the divergent copies of ITS from the mushroom Agaricus subrufescens. Our results showed that PCR using the Phusion® High-Fidelity DNA Polymerase has a per nucleotide error rate of about 4 × 10⁻⁶ per replication. Among the detected mutations, transitions were much more frequent than transversions, insertions, and deletions. When divergent alleles were mixed as templates in the same reaction, a significant proportion (∼30%) of recombinant molecules were detected. The in vitro mixed-template results were comparable to those obtained from using the genomic DNA of the original mushroom specimen as template. Our results indicate that caution should be in place when interpreting ITS sequences from individual fungal specimens, especially those containing divergent ITS copies. Similar results could also happen to PCR-based analyses of other multicopy DNA fragments as well as single-copy DNA sequences with divergent alleles in diploid organisms.
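To give a feel for what a per-nucleotide error rate of about 4 × 10⁻⁶ per replication means in practice, the sketch below estimates the fraction of amplicon copies that carry at least one polymerase error. Only the error rate comes from the abstract. The amplicon length (700 bp) and the number of replications (30) are illustrative assumptions, and the model treats errors as independent and assumes every molecule has passed through all replications, so it is a pessimistic estimate rather than a description of the authors' analysis.

```python
def fraction_with_error(error_rate: float, length_bp: int, replications: int) -> float:
    """Estimate the fraction of molecules with at least one error.

    Assumes independent errors and that each molecule's ancestry includes
    every replication, which overstates the true fraction.
    """
    # Probability that a single base survives all replications unchanged.
    per_base_clean = (1.0 - error_rate) ** replications
    # Probability that at least one base in the amplicon was altered.
    return 1.0 - per_base_clean ** length_bp


PHUSION_ERROR_RATE = 4e-6   # per nucleotide per replication (from the abstract)
AMPLICON_LENGTH_BP = 700    # hypothetical ITS amplicon length
REPLICATIONS = 30           # hypothetical number of PCR cycles

frac = fraction_with_error(PHUSION_ERROR_RATE, AMPLICON_LENGTH_BP, REPLICATIONS)
print(f"Estimated fraction of copies with >= 1 error: {frac:.1%}")
```

Under these assumed values, several percent of the copies would carry an error, which is why even a high-fidelity polymerase matters when sequencing individual cloned molecules.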
Affiliation(s)
- Jun-Liang Zhou
- Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing Engineering Research Center for Edible Mushroom, Beijing, China.,International Exchange and Cooperation Department, Kunming University, Kunming, China
| | - Jianping Xu
- Department of Biology, McMaster University, Hamilton, ON, Canada.,Laboratory for Conservation and Utilization of Bio-Resources and Key Laboratory for Microbial Resources of the Ministry of Education, Yunnan University, Kunming, China
| | - An-Guo Jiao
- Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing Engineering Research Center for Edible Mushroom, Beijing, China
| | - Li Yang
- Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing Engineering Research Center for Edible Mushroom, Beijing, China
| | - Jie Chen
- Instituto de Ecología, Veracruz, Mexico
| | | | - Yu Liu
- Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing Engineering Research Center for Edible Mushroom, Beijing, China
| | - Shou-Xian Wang
- Institute of Plant and Environment Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing Engineering Research Center for Edible Mushroom, Beijing, China
| |
|
14
|
Fagan-Jeffries EP, Cooper SJB, Bradford TM, Austin AD. Intragenomic internal transcribed spacer 2 variation in a genus of parasitoid wasps (Hymenoptera: Braconidae): implications for accurate species delimitation and phylogenetic analysis. INSECT MOLECULAR BIOLOGY 2019; 28:485-498. [PMID: 30632223 DOI: 10.1111/imb.12564] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
A recent DNA barcoding study of Australian microgastrines (Hymenoptera: Braconidae) sought to use next-generation sequencing of the cytochrome c oxidase subunit 1 (COI) barcoding gene region, the wingless (WG) gene and the internal transcribed spacer 2 (ITS2) to delimit molecular species in a highly diverse group of parasitic wasps. Large intragenomic distances between ITS2 variants, often larger than the average interspecific variation, caused difficulties in using ITS2 for species delimitation in both threshold and tree-based approaches, and the gene was not included in the reported results of the previous DNA barcoding study. We here report on the intragenomic, and the intra- and interspecies, variation in ITS2 in the microgastrine genus Diolcogaster to further investigate the value of ITS2 as a marker for species delimitation and phylogenetics of the Microgastrinae. Distinctive intragenomic variant patterns were found in different species of Diolcogaster, with some species possessing a single major variant, and others possessing many divergent variants. Characterizing intragenomic variation of ITS2 is critical as it is a widely used marker in hymenopteran phylogenetics and species delimitation, and large intragenomic distances such as those found in this study may obscure phylogenetic signal.
Affiliation(s)
- E P Fagan-Jeffries
- Australian Centre for Evolutionary Biology and Biodiversity, School of Biological Sciences, University of Adelaide, Adelaide, Australia
| | - S J B Cooper
- Australian Centre for Evolutionary Biology and Biodiversity, School of Biological Sciences, University of Adelaide, Adelaide, Australia
- Evolutionary Biology Unit, South Australian Museum, Adelaide, Australia
| | - T M Bradford
- Australian Centre for Evolutionary Biology and Biodiversity, School of Biological Sciences, University of Adelaide, Adelaide, Australia
- Evolutionary Biology Unit, South Australian Museum, Adelaide, Australia
| | - A D Austin
- Australian Centre for Evolutionary Biology and Biodiversity, School of Biological Sciences, University of Adelaide, Adelaide, Australia
| |
|
15
|
Marín de Evsikova C, Raplee ID, Lockhart J, Jaimes G, Evsikov AV. The Transcriptomic Toolbox: Resources for Interpreting Large Gene Expression Data within a Precision Medicine Context for Metabolic Disease Atherosclerosis. J Pers Med 2019; 9:E21. [PMID: 31032818 PMCID: PMC6617151 DOI: 10.3390/jpm9020021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/30/2019] [Revised: 04/20/2019] [Accepted: 04/25/2019] [Indexed: 11/16/2022] Open
Abstract
As one of the most widespread metabolic diseases, atherosclerosis affects nearly everyone as they age; arteries gradually narrow from plaque accumulation over time, reducing oxygenated blood flow to central and peripheral tissues and causing heart disease, stroke, kidney problems, and even pulmonary disease. Personalized medicine promises to bring treatments based on individual genome sequencing that precisely target the molecular pathways underlying atherosclerosis and its symptoms, but to date only a few genotypes have been identified. A promising alternative to this genetic approach is the identification of pathways altered in atherosclerosis by transcriptome analysis of atherosclerotic tissues to target specific aspects of disease. Transcriptomics is a potentially useful tool for both diagnostics and discovery science, exposing novel cellular and molecular mechanisms in clinical and translational models and, depending on experimental design, identifying and testing novel therapeutics. The cost and time required for transcriptome analysis have been greatly reduced by the development of next generation sequencing. The goal of this resource article is to provide background and a guide to appropriate technologies and downstream analyses in transcriptomics experiments generating ever-increasing amounts of gene expression data.
Affiliation(s)
- Caralina Marín de Evsikova
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- Epigenetics & Functional Genomics Laboratories, Department of Research and Development, Bay Pines Veteran Administration Healthcare System, Bay Pines, FL 33744, USA.
| | - Isaac D Raplee
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
| | - John Lockhart
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
| | - Gilberto Jaimes
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
| | - Alexei V Evsikov
- Epigenetics & Functional Genomics Laboratories, Department of Research and Development, Bay Pines Veteran Administration Healthcare System, Bay Pines, FL 33744, USA.
| |
|
16
|
Kim J, Kim D, Lim JS, Maeng JH, Son H, Kang HC, Nam H, Lee JH, Kim S. The use of technical replication for detection of low-level somatic mutations in next-generation sequencing. Nat Commun 2019; 10:1047. [PMID: 30837471 PMCID: PMC6400950 DOI: 10.1038/s41467-019-09026-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 07/23/2018] [Accepted: 02/07/2019] [Indexed: 01/16/2023] Open
Abstract
Accurate genome-wide detection of somatic mutations with low variant allele frequency (VAF, <1%) has proven difficult, for which generalized, scalable methods are lacking. Herein, we describe a new computational method, called RePlow, that we developed to detect low-VAF somatic mutations based on simple, library-level replicates for next-generation sequencing on any platform. Through joint analysis of replicates, RePlow is able to remove prevailing background errors in next-generation sequencing analysis, facilitating remarkable improvement in the detection accuracy for low-VAF somatic mutations (up to ~99% reduction in false positives). The method is validated in independent cancer panel and brain tissue sequencing data. Our study suggests a new paradigm with which to exploit an overwhelming abundance of sequencing data for accurate variant detection. Somatic mutations of low allele frequencies are often difficult to detect. Here, the authors develop RePlow, a computational method that leverages technical replication for detecting low-level somatic mutations using next-generation sequencing.
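The intuition behind using replicates to suppress background errors can be sketched with simple arithmetic. The example below is not RePlow's statistical model, which the abstract does not describe. It is an idealized calculation that assumes false-positive calls arise independently in each library-level replicate, and the site count and per-site false-call probability are hypothetical values chosen for illustration.

```python
def expected_false_positives(sites: int, p_false_call: float, replicates: int) -> float:
    """Expected number of sites falsely called in every replicate.

    Idealized: assumes false calls are independent between replicates,
    so a site is falsely called in all of them with probability p ** k.
    """
    return sites * p_false_call ** replicates


SITES = 1_000_000       # hypothetical number of interrogated sites
P_FALSE_CALL = 1e-3     # hypothetical per-site false-call probability

single = expected_false_positives(SITES, P_FALSE_CALL, replicates=1)
paired = expected_false_positives(SITES, P_FALSE_CALL, replicates=2)
print(f"one library: {single:.0f} expected false calls")
print(f"two libraries, call required in both: {paired:.0f} expected false calls")
```

Real background errors are not fully independent between replicates, for example when damage is present in the input DNA itself, so the benefit seen in practice is smaller than this idealized calculation suggests.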
Affiliation(s)
- Junho Kim
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Dachan Kim
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Jae Seok Lim
- Graduate School of Medical Science and Engineering, KAIST, Daejeon, 34141, South Korea
| | - Ju Heon Maeng
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Hyeonju Son
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Hoon-Chul Kang
- Department of Pediatrics, Division of Pediatric Neurology, Pediatric Epilepsy Clinics, Severance Children's Hospital, Epilepsy Research Institute, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, South Korea
| | - Jeong Ho Lee
- Graduate School of Medical Science and Engineering, KAIST, Daejeon, 34141, South Korea.
| | - Sangwoo Kim
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea.
| |
|
17
|
Summerer A, Schäfer E, Mautner VF, Messiaen L, Cooper DN, Kehrer-Sawatzki H. Ultra-deep amplicon sequencing indicates absence of low-grade mosaicism with normal cells in patients with type-1 NF1 deletions. Hum Genet 2018; 138:73-81. [PMID: 30478644 DOI: 10.1007/s00439-018-1961-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/21/2018] [Accepted: 11/20/2018] [Indexed: 11/26/2022]
Abstract
Different types of large NF1 deletion are distinguishable by breakpoint location and potentially also by the frequency of mosaicism with normal cells lacking the deletion. However, low-grade mosaicism with fewer than 10% normal cells has not yet been excluded for all NF1 deletion types since it is impossible to assess by the standard techniques used to identify such deletions, including MLPA and array analysis. Here, we used ultra-deep amplicon sequencing to investigate the presence of normal cells in the blood of 20 patients with type-1 NF1 deletions lacking mosaicism according to MLPA. The ultra-deep sequencing entailed the screening of 96 amplicons for heterozygous SNVs located within the NF1 deletion region. DNA samples from three previously identified patients with type-2 NF1 deletions and low-grade mosaicism with normal cells as determined by FISH or microsatellite marker analysis were used to validate our methodology. In these type-2 NF1 deletion samples, proportions of 5.3%, 6.6% and 15.0% normal cells, respectively, were detected by ultra-deep amplicon sequencing. However, using this highly sensitive method, none of the 20 patients with type-1 NF1 deletions included in our analysis exhibited low-grade mosaicism with normal cells in blood, thereby supporting the view that the vast majority of type-1 deletions are germline deletions.
Affiliation(s)
- Anna Summerer
- Institute of Human Genetics, University of Ulm, Albert-Einstein-Allee 11, 89081, Ulm, Germany
| | - Eleonora Schäfer
- Institute of Human Genetics, University of Ulm, Albert-Einstein-Allee 11, 89081, Ulm, Germany
| | - Victor-Felix Mautner
- Department of Neurology, University Hospital Hamburg Eppendorf, 20246, Hamburg, Germany
| | - Ludwine Messiaen
- Department of Genetics, University of Alabama at Birmingham, Birmingham, USA
| | - David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN, UK
| | | |
|
18
|
Masachis S, Tourasse NJ, Chabas S, Bouchez O, Darfeuille F. FASTBAC-Seq: Functional Analysis of Toxin-Antitoxin Systems in Bacteria by Deep Sequencing. Methods Enzymol 2018; 612:67-100. [PMID: 30502958 DOI: 10.1016/bs.mie.2018.08.033] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/01/2023]
Abstract
As the number of bacterial genomes and transcriptomes increases, so does the number of newly identified toxin-antitoxin (TA) systems. However, their functional characterization remains challenging, often requiring the use of overexpression vectors that can lead to misinterpretations of in vivo results. To fill this gap, we developed a systematic approach called FASTBAC-Seq (Functional AnalysiS of Toxin-Antitoxin Systems in BACteria by Deep Sequencing). Combining life/death phenotypic selection with next-generation sequencing, FASTBAC-Seq allows the rapid identification of loss-of-function (toxicity) mutations in toxin-encoding genes belonging to TA loci with nucleotide resolution. Here, we present the setup used in the first application of FASTBAC-Seq to characterize a member of the aapA/IsoA family of type I TA systems hosted on the chromosome of the major human gastric pathogen Helicobacter pylori. We propose FASTBAC-Seq as a powerful tool for the functional characterization of TA systems that can in addition uncover key elements for the understanding of gene expression regulation in bacteria.
Affiliation(s)
- Sara Masachis
- ARNA Laboratory, INSERM U1212, CNRS UMR 5320, University of Bordeaux, Bordeaux, France
| | - Nicolas J Tourasse
- ARNA Laboratory, INSERM U1212, CNRS UMR 5320, University of Bordeaux, Bordeaux, France
| | - Sandrine Chabas
- ARNA Laboratory, INSERM U1212, CNRS UMR 5320, University of Bordeaux, Bordeaux, France
| | - Olivier Bouchez
- Plateforme GeT-PlaGe-Genotoul, INRA Auzeville, Castanet-Tolosan, France
| | - Fabien Darfeuille
- ARNA Laboratory, INSERM U1212, CNRS UMR 5320, University of Bordeaux, Bordeaux, France.
| |
|
19
|
Raghwani J, Redd AD, Longosz AF, Wu CH, Serwadda D, Martens C, Kagaayi J, Sewankambo N, Porcella SF, Grabowski MK, Quinn TC, Eller MA, Eller LA, Wabwire-Mangen F, Robb ML, Fraser C, Lythgoe KA. Evolution of HIV-1 within untreated individuals and at the population scale in Uganda. PLoS Pathog 2018; 14:e1007167. [PMID: 30052678 PMCID: PMC6082572 DOI: 10.1371/journal.ppat.1007167] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/14/2017] [Revised: 08/08/2018] [Accepted: 06/20/2018] [Indexed: 12/15/2022] Open
Abstract
HIV-1 undergoes multiple rounds of error-prone replication between transmission events, resulting in diverse viral populations within and among individuals. In addition, the virus experiences different selective pressures at multiple levels: during the course of infection, at transmission, and among individuals. Disentangling how these evolutionary forces shape the evolution of the virus at the population scale is important for understanding pathogenesis, how drug- and immune-escape variants are likely to spread in populations, and the development of preventive vaccines. To address this, we deep-sequenced two regions of the HIV-1 genome (p24 and gp41) from 34 longitudinally-sampled untreated individuals from Rakai District in Uganda, infected with subtypes A, D, and inter-subtype recombinants. This dataset substantially increases the availability of HIV-1 sequence data that spans multiple years of untreated infection, in particular for different geographical regions and viral subtypes. In line with previous studies, we estimated an approximately five-fold faster rate of evolution at the within-host compared to the population scale for both synonymous and nonsynonymous substitutions, and for all subtypes. We determined the extent to which this mismatch in evolutionary rates can be explained by the evolution of the virus towards population-level consensus, or the transmission of viruses similar to those that establish infection within individuals. Our findings indicate that both processes are likely to be important.
Affiliation(s)
- Jayna Raghwani
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Department of Zoology, Peter Medawar Building, University of Oxford, Oxford, United Kingdom
| | - Andrew D. Redd
- Laboratory of Immunoregulation, Division of Intramural Research, NIAID, NIH, Baltimore MD, United States of America
- Department of Medicine, Johns Hopkins Medical Institute, Johns Hopkins University, Baltimore MD, United States of America
| | - Andrew F. Longosz
- Laboratory of Immunoregulation, Division of Intramural Research, NIAID, NIH, Baltimore MD, United States of America
| | - Chieh-Hsi Wu
- Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - David Serwadda
- Rakai Health Sciences Program, Kalisizo, Uganda
- School of Public Health, Makerere University, Kampala, Uganda
| | - Craig Martens
- Genomics Unit, RTS, RTB, Rocky Mountain Laboratories, Division of Intramural Research, NIAID, NIH, Hamilton MT, United States of America
| | | | - Nelson Sewankambo
- Rakai Health Sciences Program, Kalisizo, Uganda
- School of Medicine, Makerere University, Kampala, Uganda
| | - Stephen F. Porcella
- Genomics Unit, RTS, RTB, Rocky Mountain Laboratories, Division of Intramural Research, NIAID, NIH, Hamilton MT, United States of America
| | - Mary K. Grabowski
- Department of Pathology, Johns Hopkins Medical Institute, Johns Hopkins University, Baltimore, MD, United States of America
| | - Thomas C. Quinn
- Laboratory of Immunoregulation, Division of Intramural Research, NIAID, NIH, Baltimore MD, United States of America
- Department of Medicine, Johns Hopkins Medical Institute, Johns Hopkins University, Baltimore MD, United States of America
| | - Michael A. Eller
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, United States of America
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, United States of America
| | - Leigh Anne Eller
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, United States of America
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, United States of America
| | - Fred Wabwire-Mangen
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, United States of America
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, United States of America
| | - Merlin L. Robb
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, United States of America
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, United States of America
| | - Christophe Fraser
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | - Katrina A. Lythgoe
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Department of Zoology, Peter Medawar Building, University of Oxford, Oxford, United Kingdom
| |
|
20
|
Dou Y, Gold HD, Luquette LJ, Park PJ. Detecting Somatic Mutations in Normal Cells. Trends Genet 2018; 34:545-557. [PMID: 29731376 PMCID: PMC6029698 DOI: 10.1016/j.tig.2018.04.003] [Citation(s) in RCA: 85] [Impact Index Per Article: 12.1] [Received: 02/06/2018] [Revised: 04/03/2018] [Accepted: 04/05/2018] [Indexed: 01/12/2023]
Abstract
Somatic mutations have been studied extensively in the context of cancer. Recent studies have demonstrated that high-throughput sequencing data can be used to detect somatic mutations in non-tumor cells. Analysis of such mutations allows us to better understand the mutational processes in normal cells, explore cell lineages in development, and examine potential associations with age-related disease. We describe here approaches for characterizing somatic mutations in normal and non-tumor disease tissues. We discuss several experimental designs and common pitfalls in somatic mutation detection, as well as more recent developments such as phasing and linked-read technology. With the dramatically increasing numbers of samples undergoing genome sequencing, bioinformatic analysis will enable the characterization of somatic mutations and their impact on non-cancer tissues.
Affiliation(s)
- Yanmei Dou
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Equal contributions
| | - Heather D Gold
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Bioinformatics and Integrative Genomics PhD Program, Harvard Medical School, Boston, MA, USA; Equal contributions
| | - Lovelace J Luquette
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Bioinformatics and Integrative Genomics PhD Program, Harvard Medical School, Boston, MA, USA; Equal contributions
| | - Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA.
| |
|
21
|
Hadigol M, Khiabanian H. MERIT reveals the impact of genomic context on sequencing error rate in ultra-deep applications. BMC Bioinformatics 2018; 19:219. [PMID: 29884116 PMCID: PMC5994075 DOI: 10.1186/s12859-018-2223-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/11/2017] [Accepted: 05/29/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Rapid progress in high-throughput sequencing (HTS) and the development of novel library preparation methods have improved the sensitivity of detecting mutations in heterogeneous samples, specifically in high-depth (>500×) clinical applications. However, HTS methods are bounded by their technical and theoretical limitations and sequencing errors cannot be completely eliminated. Comprehensive quantification of the background noise can highlight both the efficiency and the limitations of any HTS methodology, and help differentiate true mutations at low abundance from artifacts. RESULTS We introduce MERIT (Mutation Error Rate Inference Toolkit), designed for in-depth quantification of erroneous substitutions and small insertions and deletions. MERIT incorporates an all-inclusive variant caller and considers genomic context, including the nucleotides immediately at 5′ and 3′, thereby establishing error rates for 96 possible substitutions as well as four single-base and 16 double-base indels. We applied MERIT to ultra-deep sequencing data (1,300,000×) obtained from the amplification of multiple clinically relevant loci, and showed a significant relationship between error rates and genomic contexts. In addition to observing significant difference between transversion and transition rates, we identified variations of more than 100-fold within each error type at high sequencing depths. For instance, T>G transversions in trinucleotide GTCs occurred 133.5 ± 65.9 more often than those in ATAs. Similarly, C>T transitions in GCGs were observed at 73.8 ± 10.5 higher rate than those in TCTs. We also devised an in silico approach to determine the optimal sequencing depth, where errors occur at rates similar to those of expected true mutations. Our analyses showed that increasing sequencing depth might improve sensitivity for detecting some mutations based on their genomic context. For example, T>G rate of error in GTCs did not change when sequenced beyond 10,000×; in contrast, T>G rate in TTAs consistently improved even at above 500,000×. CONCLUSIONS Our results demonstrate significant variation in nucleotide misincorporation rates, and suggest that genomic context should be considered for comprehensive profiling of specimen-specific and sequencing artifacts in high-depth assays. These data provide strong evidence against assigning a single allele frequency threshold to call mutations, for it can result in substantial false positive as well as false negative variants, with important clinical consequences.
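The counts quoted in the abstract (96 substitution contexts, four single-base indels and 16 double-base indels) can be reproduced by simple enumeration. The sketch below is not MERIT's code. It assumes the common convention of reporting substitutions relative to a pyrimidine reference base (C or T), an assumption chosen because it yields 96 contexts and fits the abstract's examples (T>G in GTC, C>T in GCG). The label format and function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"

# Assumed convention: six substitution types with a pyrimidine reference base.
SUBSTITUTIONS = [("C", "A"), ("C", "G"), ("C", "T"),
                 ("T", "A"), ("T", "C"), ("T", "G")]


def substitution_contexts() -> list[str]:
    """List every substitution type with each 5' and 3' flanking base."""
    return [
        f"{five}[{ref}>{alt}]{three}"
        for ref, alt in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def indel_units(length: int) -> list[str]:
    """List every possible inserted or deleted sequence of a given length."""
    return ["".join(unit) for unit in product(BASES, repeat=length)]


contexts = substitution_contexts()
print(len(contexts), len(indel_units(1)), len(indel_units(2)))  # 96 4 16
```

Keeping a separate error rate for each of these categories, rather than one pooled rate, is what allows context-specific thresholds of the kind the abstract argues for.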
Affiliation(s)
- Mohammad Hadigol
- Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ USA
| | - Hossein Khiabanian
- Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ USA
- Department of Pathology and Laboratory Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ USA
| |
|
22
|
Salk JJ, Schmitt MW, Loeb LA. Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations. Nat Rev Genet 2018; 19:269-285. [PMID: 29576615 PMCID: PMC6485430 DOI: 10.1038/nrg.2017.117] [Citation(s) in RCA: 335] [Impact Index Per Article: 47.9] [Indexed: 01/07/2023]
Abstract
Mutations, the fuel of evolution, are first manifested as rare DNA changes within a population of cells. Although next-generation sequencing (NGS) technologies have revolutionized the study of genomic variation between species and individual organisms, most have limited ability to accurately detect and quantify rare variants among the different genome copies in heterogeneous mixtures of cells or molecules. We describe the technical challenges in characterizing subclonal variants using conventional NGS protocols and the recent development of error correction strategies, both computational and experimental, including consensus sequencing of single DNA molecules. We also highlight major applications for low-frequency mutation detection in science and medicine, describe emerging methodologies and provide our vision for the future of DNA sequencing.
Affiliation(s)
- Jesse J Salk
- Department of Pathology, University of Washington School of Medicine, Seattle, WA, USA
- Department of Medicine, Divisions of Hematology and Medical Oncology, University of Washington School of Medicine, Seattle, WA, USA
- Fred Hutchinson Cancer Research Center, Clinical Research Division, Seattle, WA, USA
| | - Michael W Schmitt
- Department of Pathology, University of Washington School of Medicine, Seattle, WA, USA
- Department of Medicine, Divisions of Hematology and Medical Oncology, University of Washington School of Medicine, Seattle, WA, USA
- Fred Hutchinson Cancer Research Center, Clinical Research Division, Seattle, WA, USA
| | - Lawrence A Loeb
- Department of Pathology, University of Washington School of Medicine, Seattle, WA, USA
- Department of Biochemistry, University of Washington School of Medicine, Seattle, WA, USA
| |
|
23
|
Cartwright JF, Anderson K, Longworth J, Lobb P, James DC. Highly sensitive detection of mutations in CHO cell recombinant DNA using multi-parallel single molecule real-time DNA sequencing. Biotechnol Bioeng 2018; 115:1485-1498. [DOI: 10.1002/bit.26561] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/03/2017] [Revised: 12/01/2017] [Accepted: 02/04/2018] [Indexed: 12/13/2022]
Affiliation(s)
- Joseph F. Cartwright
- Department of Chemical and Biological Engineering; University of Sheffield; Sheffield UK
| | - Karin Anderson
- Cell Line Development; BioTherapeutic Pharmaceutical Sciences; Pfizer Inc; Andover Massachusetts
| | - Joseph Longworth
- Department of Chemical and Biological Engineering; University of Sheffield; Sheffield UK
| | | | - David C. James
- Department of Chemical and Biological Engineering; University of Sheffield; Sheffield UK
|
24
|
Marciano MA, Panicker SX, Liddil GD, Lindgren D, Sweder KS. Development of a Method to Extract Opium Poppy (Papaver somniferum L.) DNA from Heroin. Sci Rep 2018; 8:2590. [PMID: 29416103 PMCID: PMC5803222 DOI: 10.1038/s41598-018-20996-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/24/2017] [Accepted: 01/29/2018] [Indexed: 01/03/2023] Open
Abstract
This study is the first to report the successful development of a method to extract opium poppy (Papaver somniferum L.) DNA from heroin samples. Determining the source of an unknown heroin sample (forensic geosourcing) is vital to informing domestic and foreign policy related to counter-narcoterrorism. Current profiling methods focus on identifying process-related chemical impurities found in heroin samples. Changes to the geographically distinct processing methods may lead to difficulties in classifying and attributing heroin samples to a region/country. This study focuses on methods to optimize the DNA extraction and amplification of samples with low levels of degraded DNA and inhibiting compounds such as heroin. We compared modified commercial-off-the-shelf extraction methods such as the Qiagen Plant, Stool and the Promega Maxwell-16 RNA-LEV tissue kits for the ability to extract opium poppy DNA from latex, raw and cooked opium, white and brown powder heroin and black tar heroin. Opium poppy DNA was successfully detected in all poppy-derived samples, including heroin. The modified Qiagen stool method with post-extraction purification and a two-stage, dual DNA polymerase amplification procedure resulted in the highest DNA yield and minimized inhibition. This paper describes the initial phase in establishing a DNA-based signature method to characterize heroin.
Affiliation(s)
- Michael A Marciano
- Forensic & National Security Sciences Institute, Syracuse University, Syracuse, New York, 13244, USA.
| | - Sini X Panicker
- U.S. Drug Enforcement Administration, Special Testing and Research Laboratory, Dulles, VA, 20166, USA
| | - Garrett D Liddil
- Forensic & National Security Sciences Institute, Syracuse University, Syracuse, New York, 13244, USA
| | - Danielle Lindgren
- Forensic & National Security Sciences Institute, Syracuse University, Syracuse, New York, 13244, USA
| | - Kevin S Sweder
- Forensic & National Security Sciences Institute, Syracuse University, Syracuse, New York, 13244, USA
|
25
|
Stasik S, Schuster C, Ortlepp C, Platzbecker U, Bornhäuser M, Schetelig J, Ehninger G, Folprecht G, Thiede C. An optimized targeted Next-Generation Sequencing approach for sensitive detection of single nucleotide variants. Biomol Detect Quantif 2018; 15:6-12. [PMID: 29349042 PMCID: PMC5766748 DOI: 10.1016/j.bdq.2017.12.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 08/25/2017] [Revised: 11/27/2017] [Accepted: 12/18/2017] [Indexed: 01/06/2023]
Abstract
NGS-based detection of low-level SNVs is feasible with sensitivities up to 10⁻⁴. PCR-induced bias could be significantly reduced by the choice of adequate enzymes. The prevalent transition vs. transversion bias affects site-specific detection limits. Results from clinical data validated the feasibility of NGS-based MRD detection. Results help to select suitable biomarkers for MRD quantification.
Monitoring of minimal residual disease (MRD) has become an important clinical aspect for early relapse detection during follow-up care after cancer treatment. Still, the sensitive detection of single base pair point mutations via Next-Generation Sequencing (NGS) is hampered mainly due to high substitution error rates. We evaluated the use of NGS for the detection of low-level variants on an Ion Torrent PGM system. As a model case we used the c.1849G > T (p.Val617Phe) mutation of the JAK2-gene. Several reaction parameters (e.g. choice of DNA-polymerase) were evaluated and a comprehensive analysis of substitution errors was performed. Using optimized conditions, we reliably detected JAK2 c.1849G > T VAFs in the range of 0.01–0.0015% which, in combination with results obtained from clinical data, validated the feasibility of NGS-based MRD detection. Particularly, PCR-induced transitions (mainly G > A and C > T) were the major source of error, which could be significantly reduced by the application of proofreading enzymes. The integration of NGS results for several common point mutations in various oncogenes (i.e. IDH1 and 2, c-KIT, DNMT3A, NRAS, KRAS, BRAF) revealed that the prevalent transition vs. transversion bias (3.57:1) has an impact on site-specific detection limits of low-level mutations. These results may help to select suitable markers for MRD detection and to identify individual cut-offs for detection and quantification.
Affiliation(s)
- S. Stasik
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
- National Center for Tumor Diseases (NCT), Heidelberg, Partner Site Dresden, Germany
| | | | | | - U. Platzbecker
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
| | - M. Bornhäuser
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
- National Center for Tumor Diseases (NCT), Heidelberg, Partner Site Dresden, Germany
| | - J. Schetelig
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
| | - G. Ehninger
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
| | - G. Folprecht
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
| | - C. Thiede
- Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Dresden, Germany
- Corresponding author: Universitätsklinikum Carl Gustav Carus, Medizinische Klinik und Poliklinik I, Fetscherstraße 74, 01307 Dresden, Germany.
|
26
|
Fun A, Leitner T, Vandekerckhove L, Däumer M, Thielen A, Buchholz B, Hoepelman AIM, Gisolf EH, Schipper PJ, Wensing AMJ, Nijhuis M. Impact of the HIV-1 genetic background and HIV-1 population size on the evolution of raltegravir resistance. Retrovirology 2018; 15:1. [PMID: 29304821 PMCID: PMC5755036 DOI: 10.1186/s12977-017-0384-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/18/2017] [Accepted: 12/23/2017] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Emergence of resistance against integrase inhibitor raltegravir in human immunodeficiency virus type 1 (HIV-1) patients is generally associated with selection of one of three signature mutations: Y143C/R, Q148K/H/R or N155H, representing three distinct resistance pathways. The mechanisms that drive selection of a specific pathway are still poorly understood. We investigated the impact of the HIV-1 genetic background and population dynamics on the emergence of raltegravir resistance. Using deep sequencing we analyzed the integrase coding sequence (CDS) in longitudinal samples from five patients who initiated raltegravir plus optimized background therapy at viral loads > 5000 copies/ml. To investigate the role of the HIV-1 genetic background we created recombinant viruses containing the viral integrase coding region from pre-raltegravir samples from two patients in whom raltegravir resistance developed through different pathways. The in vitro selections performed with these recombinant viruses were designed to mimic natural population bottlenecks. RESULTS Deep sequencing analysis of the viral integrase CDS revealed that the virological response to raltegravir containing therapy inversely correlated with the relative amount of unique sequence variants that emerged suggesting diversifying selection during drug pressure. In 4/5 patients multiple signature mutations representing different resistance pathways were observed. Interestingly, the resistant population can consist of a single resistant variant that completely dominates the population but also of multiple variants from different resistance pathways that coexist in the viral population. We also found evidence for increased diversification after stronger bottlenecks. 
In vitro selections with low viral titers, mimicking population bottlenecks, revealed that both recombinant viruses and HXB2 reference virus were able to select mutations from different resistance pathways, although typically only one resistance pathway emerged in each individual culture. CONCLUSIONS The generation of a specific raltegravir resistant variant is not predisposed in the genetic background of the viral integrase CDS. Typically, in the early phases of therapy failure the sequence space is explored and multiple resistance pathways emerge and then compete for dominance which frequently results in a switch of the dominant population over time towards the fittest variant or even multiple variants of similar fitness that can coexist in the viral population.
Affiliation(s)
- Axel Fun
- Department of Medical Microbiology, Virology, University Medical Center Utrecht, Heidelberglaan 100, HP G04.614, 3584 CX, Utrecht, The Netherlands
| | - Thomas Leitner
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM, USA
| | - Linos Vandekerckhove
- Department of General Internal Medicine and Infectious Diseases, Ghent University Hospital, Ghent, Belgium
| | - Martin Däumer
- Institute of Immunology and Genetics, Kaiserslautern, Germany
| | | | - Bernd Buchholz
- Pediatric Clinic, University Medical Center Mannheim, Mannheim, Germany
| | - Andy I M Hoepelman
- Department of Internal Medicine and Infectious Diseases, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Elizabeth H Gisolf
- Department of Internal Medicine, Rijnstate Hospital, Arnhem, The Netherlands
| | - Pauline J Schipper
- Department of Medical Microbiology, Virology, University Medical Center Utrecht, Heidelberglaan 100, HP G04.614, 3584 CX, Utrecht, The Netherlands
| | - Annemarie M J Wensing
- Department of Medical Microbiology, Virology, University Medical Center Utrecht, Heidelberglaan 100, HP G04.614, 3584 CX, Utrecht, The Netherlands.,Department of Internal Medicine and Infectious Diseases, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Monique Nijhuis
- Department of Medical Microbiology, Virology, University Medical Center Utrecht, Heidelberglaan 100, HP G04.614, 3584 CX, Utrecht, The Netherlands.
|
27
|
Guan Y, Mayba O, Sandmann T, Lu S, Choi Y, Darbonne WC, Leveque V, Ryner L, Humke E, Tam NW, Sujathasarma S, Cheung A, Bourgon R, Lackner MR, Wang Y. High-Throughput and Sensitive Quantification of Circulating Tumor DNA by Microfluidic-Based Multiplex PCR and Next-Generation Sequencing. J Mol Diagn 2017; 19:921-932. [DOI: 10.1016/j.jmoldx.2017.08.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 04/09/2017] [Revised: 07/11/2017] [Accepted: 08/08/2017] [Indexed: 02/05/2023] Open
|
28
|
Moscona R, Ram D, Wax M, Bucris E, Levy I, Mendelson E, Mor O. Comparison between next-generation and Sanger-based sequencing for the detection of transmitted drug-resistance mutations among recently infected HIV-1 patients in Israel, 2000-2014. J Int AIDS Soc 2017; 20:21846. [PMID: 28799325 PMCID: PMC5577736 DOI: 10.7448/ias.20.1.21846] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 02/13/2017] [Accepted: 07/22/2017] [Indexed: 02/07/2023] Open
Abstract
INTRODUCTION Transmitted drug-resistance mutations (TDRM) may hamper successful anti-HIV-1 therapy and impact future control of the HIV-1 epidemic. Recently infected, therapy-naïve individuals are best suited for surveillance of such TDRM. In this study, TDRM detected by next-generation sequencing (NGS) were compared to those identified by Sanger-based population sequencing (SBS) in recently infected HIV-1 patients. METHODS Historical samples from 80 recently infected HIV-1 patients, diagnosed between 2000 and 2014, were analysed by MiSeq (NGS) and ABI (SBS). DeepChek-HIV (ABL) was used for interpretation of the results. RESULTS Most patients were males (80%); men who have sex with men (MSM) was the major transmission group (58.8%). Overall, TDRM were detected in 31.3% of patients by NGS and 8.8% by SBS, with SBS TDRM restricted to persons infected with subtype B. All SBS-detected TDRM were identified by NGS. The prevalence of TDRM impacting protease inhibitors (PI), nucleoside reverse transcriptase inhibitors (NRTI) and non-nucleoside reverse transcriptase inhibitors (NNRTI) was 11.3, 26.2 and 7.5%, respectively, in NGS analyses and 0, 3.8 and 5%, respectively, in SBS analyses. More patients with NGS and SBS TDRM were identified in 2008-2014 (37.2% or 13.9%, respectively) compared to 2000-2007 (24.3% or 2.7%, respectively), and a significantly greater number of these patients had multiple NGS TDRM. The most abundant, albeit minor-frequency, RT TDRM were K65R and D67N, while K103N, M184V and T215S were high-frequency mutations. Minor TDRM did not become a major variant in later samples and did not hinder successful treatment. CONCLUSIONS NGS can replace SBS for mutation detection and allows for the detection of low-frequency TDRM not identified by SBS. Although rates of TDRM in Israel continued to increase from 2000 to 2014, minor TDRM did not become major species. The need for ongoing surveillance of low-frequency TDRM should be revisited in a larger study.
Affiliation(s)
- Roy Moscona
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
| | - Daniela Ram
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
| | - Marina Wax
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
| | - Efrat Bucris
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
| | - Itzchak Levy
- Infectious Disease Unit, Sheba Medical Center, Ramat-Gan, Israel
| | - Ella Mendelson
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
- School of Public Health, Tel Aviv University, Ramat-Aviv, Israel
| | - Orna Mor
- Central Virology Laboratory, Ministry of Health, Sheba Medical Center, Ramat-Gan, Israel
|
29
|
Kinoti WM, Constable FE, Nancarrow N, Plummer KM, Rodoni B. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing. PLoS One 2017; 12:e0179284. [PMID: 28632759 PMCID: PMC5478126 DOI: 10.1371/journal.pone.0179284] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 04/02/2017] [Accepted: 05/08/2017] [Indexed: 12/28/2022] Open
Abstract
PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
Affiliation(s)
- Wycliff M. Kinoti
- Agriculture Victoria, AgriBio, La Trobe University, Melbourne, VIC, Australia
- School of Applied Systems Biology, AgriBio, La Trobe University, Melbourne, VIC, Australia
| | - Fiona E. Constable
- Agriculture Victoria, AgriBio, La Trobe University, Melbourne, VIC, Australia
| | - Narelle Nancarrow
- Agriculture Victoria, AgriBio, La Trobe University, Melbourne, VIC, Australia
| | - Kim M. Plummer
- Department of Animal, Plant and Soil Sciences, AgriBio, La Trobe University, Melbourne, VIC, Australia
| | - Brendan Rodoni
- Agriculture Victoria, AgriBio, La Trobe University, Melbourne, VIC, Australia
- School of Applied Systems Biology, AgriBio, La Trobe University, Melbourne, VIC, Australia
|
30
|
Shagin DA, Shagina IA, Zaretsky AR, Barsova EV, Kelmanson IV, Lukyanov S, Chudakov DM, Shugay M. A high-throughput assay for quantitative measurement of PCR errors. Sci Rep 2017; 7:2718. [PMID: 28578414 PMCID: PMC5457411 DOI: 10.1038/s41598-017-02727-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/24/2017] [Accepted: 04/18/2017] [Indexed: 01/01/2023] Open
Abstract
The accuracy with which DNA polymerase can replicate a template DNA sequence is an extremely important property that can vary by an order of magnitude from one enzyme to another. The rate of nucleotide misincorporation is shaped by multiple factors, including PCR conditions and proofreading capabilities, and proper assessment of polymerase error rate is essential for a wide range of sensitive PCR-based assays. In this paper, we describe a method for studying polymerase errors with exceptional resolution, which combines unique molecular identifier tagging and high-throughput sequencing. Our protocol is less laborious than commonly-used methods, and is also scalable, robust and accurate. In a series of nine PCR assays, we have measured a range of polymerase accuracies that is in line with previous observations. However, we were also able to comprehensively describe individual errors introduced by each polymerase after either 20 PCR cycles or a linear amplification, revealing specific substitution preferences and the diversity of PCR error frequency profiles. We also demonstrate that the detected high-frequency PCR errors are highly recurrent and that the position in the template sequence and polymerase-specific substitution preferences are among the major factors influencing the observed PCR error rate.
Affiliation(s)
- Dmitriy A Shagin
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.,Pirogov Russian National Research Medical University, Moscow, Russia.,Evrogen JSC, Moscow, Russia
| | - Irina A Shagina
- Pirogov Russian National Research Medical University, Moscow, Russia.,Evrogen JSC, Moscow, Russia
| | - Andrew R Zaretsky
- Pirogov Russian National Research Medical University, Moscow, Russia.,Evrogen JSC, Moscow, Russia
| | - Ekaterina V Barsova
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.,Evrogen JSC, Moscow, Russia
| | - Ilya V Kelmanson
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.,Evrogen JSC, Moscow, Russia
| | - Sergey Lukyanov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.,Pirogov Russian National Research Medical University, Moscow, Russia
| | - Dmitriy M Chudakov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.,Pirogov Russian National Research Medical University, Moscow, Russia.,Skolkovo Institute of Science and Technology, Moscow, Russia.,Central European Institute of Technology, Masaryk University, Brno, Czech Republic.
| | - Mikhail Shugay
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.,Pirogov Russian National Research Medical University, Moscow, Russia.,Central European Institute of Technology, Masaryk University, Brno, Czech Republic.
|
31
|
Saarinen L, Nummela P, Thiel A, Lehtonen R, Järvinen P, Järvinen H, Aaltonen LA, Lepistö A, Hautaniemi S, Ristimäki A. Multiple components of PKA and TGF-β pathways are mutated in pseudomyxoma peritonei. PLoS One 2017; 12:e0174898. [PMID: 28426742 PMCID: PMC5398530 DOI: 10.1371/journal.pone.0174898] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 12/22/2016] [Accepted: 03/16/2017] [Indexed: 12/12/2022] Open
Abstract
Pseudomyxoma peritonei (PMP) is a subtype of mucinous adenocarcinoma mainly restricted to the peritoneal cavity and most commonly originating from the appendix. The genetic background of PMP is poorly understood and no targeted treatments are currently available for this fatal disease. While the RAS signaling pathway is affected in most if not all PMP cases and over half of them also have a mutation in the GNAS gene, other genetic alterations and affected pathways are, to a large degree, poorly known. In this study, we sequenced the whole coding genome of nine PMP tumors and paired normal tissues in order to identify additional, commonly mutated genes and signaling pathways affected in PMP. These exome sequencing results were validated with an ultra-deep amplicon sequencing method, leading to 14 validated variants. The validated results contain seven genes that contribute to the protein kinase A (PKA) pathway. The PKA pathway, which also contains GNAS, is a major player in the overproduction of mucin, which is the characteristic feature of PMP. In addition to the PKA pathway, we identified mutations in six genes that belong to the transforming growth factor beta (TGF-β) pathway, which is a key regulator of cell proliferation. Since either a GNAS mutation or an alternative mutation in the PKA pathway was identified in 8/9 patients, inhibition of the PKA pathway might reduce mucin production in most of the PMP patients and potentially suppress disease progression.
Affiliation(s)
- Lilli Saarinen
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
| | - Pirjo Nummela
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
| | - Alexandra Thiel
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
| | - Rainer Lehtonen
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
| | - Petrus Järvinen
- Department of Surgery, Helsinki University Hospital, Helsinki, Finland
- Department of Urology, Helsinki University Hospital, Helsinki, Finland
| | - Heikki Järvinen
- Department of Surgery, Helsinki University Hospital, Helsinki, Finland
| | - Lauri A. Aaltonen
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Department of Medical Genetics, University of Helsinki, Helsinki, Finland
| | - Anna Lepistö
- Department of Surgery, Helsinki University Hospital, Helsinki, Finland
| | - Sampsa Hautaniemi
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
| | - Ari Ristimäki
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Department of Pathology, HUSLAB, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
|
32
|
Kelton W, Waindok AC, Pesch T, Pogson M, Ford K, Parola C, Reddy ST. Reprogramming MHC specificity by CRISPR-Cas9-assisted cassette exchange. Sci Rep 2017; 7:45775. [PMID: 28374766 PMCID: PMC5379551 DOI: 10.1038/srep45775] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 11/24/2016] [Accepted: 03/02/2017] [Indexed: 11/20/2022] Open
Abstract
The development of programmable nucleases has enabled the application of new genome engineering strategies for cellular immunotherapy. While targeted nucleases have mostly been used to knock-out or knock-in genes in immune cells, the scarless exchange of entire immunogenomic alleles would be of great interest. In particular, reprogramming the polymorphic MHC locus could enable the creation of matched donors for allogeneic cellular transplantation. Here we show a proof-of-concept for reprogramming MHC-specificity by performing CRISPR-Cas9-assisted cassette exchange. Using murine antigen presenting cell lines (RAW264.7 macrophages), we demonstrate that the generation of Cas9-induced double-stranded breaks flanking the native MHC-I H2-Kd locus led to exchange of an orthogonal H2-Kb allele. MHC surface expression allowed for easy selection of reprogrammed cells by flow cytometry, thus obviating the need for additional selection markers. MHC-reprogrammed cells were fully functional as they could present H2-Kd-restricted peptide and activate cognate T cells. Finally, we investigated the role of various donor template formats on exchange efficiency, discovering that templates that underwent in situ linearization resulted in the highest MHC-reprogramming efficiency. These findings highlight a potential new approach for correcting MHC mismatches in cellular transplantation.
Affiliation(s)
- William Kelton
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Ann Cathrin Waindok
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Theresa Pesch
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Mark Pogson
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Kyle Ford
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Cristina Parola
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Sai T. Reddy
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
|
33
|
Pereira RPA, Peplies J, Brettar I, Höfle MG. Development of a genus-specific next generation sequencing approach for sensitive and quantitative determination of the Legionella microbiome in freshwater systems. BMC Microbiol 2017; 17:79. [PMID: 28359254 PMCID: PMC5374610 DOI: 10.1186/s12866-017-0987-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 01/11/2017] [Accepted: 03/21/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Next Generation Sequencing (NGS) has revolutionized the analysis of natural and man-made microbial communities by using universal primers for bacteria in a PCR based approach targeting the 16S rRNA gene. In our study we narrowed primer specificity to a single, monophyletic genus because for many questions in microbiology only a specific part of the whole microbiome is of interest. We have chosen the genus Legionella, comprising more than 20 pathogenic species, due to its high relevance for water-based respiratory infections. METHODS A new NGS-based approach was designed by sequencing 16S rRNA gene amplicons specific for the genus Legionella using the Illumina MiSeq technology. This approach was validated and applied to a set of representative freshwater samples. RESULTS Our results revealed that the generated libraries presented a low average raw error rate per base (<0.5%); and substantiated the use of high-fidelity enzymes, such as KAPA HiFi, for increased sequence accuracy and quality. The approach also showed high in situ specificity (>95%) and very good repeatability. Only in samples in which the gammabacterial clade SAR86 was present more than 1% non-Legionella sequences were observed. Next-generation sequencing read counts did not reveal considerable amplification/sequencing biases and showed a sensitive as well as precise quantification of L. pneumophila along a dilution range using a spiked-in, certified genome standard. The genome standard and a mock community consisting of six different Legionella species demonstrated that the developed NGS approach was quantitative and specific at the level of individual species, including L. pneumophila. The sensitivity of our genus-specific approach was at least one order of magnitude higher compared to the universal NGS approach. Comparison of quantification by real-time PCR showed consistency with the NGS data. Overall, our NGS approach can determine the quantitative abundances of Legionella species, i.e. the complete Legionella microbiome, without the need for species-specific primers. CONCLUSIONS The developed NGS approach provides a new molecular surveillance tool to monitor all Legionella species in qualitative and quantitative terms if a spiked-in genome standard is used to calibrate the method. Overall, the genus-specific NGS approach opens up a new avenue to massive parallel diagnostics in a quantitative, specific and sensitive way.
Affiliation(s)
- Rui P A Pereira
- Department of Vaccinology and Applied Microbiology, RG Microbial Diagnostics, Helmholtz Centre for Infection Research (HZI), Inhoffenstr. 7, 38124, Braunschweig, Germany.,Present address: School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
| | - Jörg Peplies
- Ribocon GmbH, Fahrenheitstraße 1, 28359, Bremen, Germany
| | - Ingrid Brettar
- Department of Vaccinology and Applied Microbiology, RG Microbial Diagnostics, Helmholtz Centre for Infection Research (HZI), Inhoffenstr. 7, 38124, Braunschweig, Germany
| | - Manfred G Höfle
- Department of Vaccinology and Applied Microbiology, RG Microbial Diagnostics, Helmholtz Centre for Infection Research (HZI), Inhoffenstr. 7, 38124, Braunschweig, Germany.
| |
|
34
|
Glanville J, D'Angelo S, Khan TA, Reddy ST, Naranjo L, Ferrara F, Bradbury ARM. Deep sequencing in library selection projects: what insight does it bring? Curr Opin Struct Biol 2016; 33:146-60. [PMID: 26451649 DOI: 10.1016/j.sbi.2015.09.001] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 05/19/2015] [Revised: 08/19/2015] [Accepted: 09/17/2015] [Indexed: 11/17/2022]
Abstract
High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology.
Affiliation(s)
- J Glanville
- Program in Computational and Systems Immunology, Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA, USA
| | - S D'Angelo
- University of New Mexico Comprehensive Cancer Center, and Division of Molecular Medicine, University of New Mexico School of Medicine, Albuquerque, NM, USA
| | - T A Khan
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
| | - S T Reddy
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
| | - L Naranjo
- Bioscience division, Los Alamos National Laboratory, Los Alamos, NM, USA
| | - F Ferrara
- University of New Mexico Comprehensive Cancer Center, and Division of Molecular Medicine, University of New Mexico School of Medicine, Albuquerque, NM, USA
| | - A R M Bradbury
- Bioscience division, Los Alamos National Laboratory, Los Alamos, NM, USA.
| |
|
35
|
Turchaninova MA, Davydov A, Britanova OV, Shugay M, Bikos V, Egorov ES, Kirgizova VI, Merzlyak EM, Staroverov DB, Bolotin DA, Mamedov IZ, Izraelson M, Logacheva MD, Kladova O, Plevova K, Pospisilova S, Chudakov DM. High-quality full-length immunoglobulin profiling with unique molecular barcoding. Nat Protoc 2016; 11:1599-616. [DOI: 10.1038/nprot.2016.093] [Citation(s) in RCA: 134] [Impact Index Per Article: 14.9] [Indexed: 01/19/2023]
|
36
|
Bálint M, Bahram M, Eren AM, Faust K, Fuhrman JA, Lindahl B, O'Hara RB, Öpik M, Sogin ML, Unterseher M, Tedersoo L. Millions of reads, thousands of taxa: microbial community structure and associations analyzed via marker genes. FEMS Microbiol Rev 2016; 40:686-700. [DOI: 10.1093/femsre/fuw017] [Citation(s) in RCA: 136] [Impact Index Per Article: 15.1] [Accepted: 05/23/2016] [Indexed: 11/13/2022]
|
37
|
Herrera VLM, Steffen M, Moran AM, Tan GA, Pasion KA, Rivera K, Pappin DJ, Ruiz-Opazo N. Confirmation of translatability and functionality certifies the dual endothelin1/VEGFsp receptor (DEspR) protein. BMC Mol Biol 2016; 17:15. [PMID: 27301377 PMCID: PMC4906906 DOI: 10.1186/s12867-016-0066-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/28/2016] [Accepted: 05/20/2016] [Indexed: 01/16/2023]
Abstract
Background In contrast to rat and mouse databases, the NCBI gene database lists the human dual-endothelin1/VEGFsp receptor (DEspR, formerly Dear) as a unitary transcribed pseudogene due to a stop [TGA]-codon at codon#14 in automated DNA and RNA sequences. However, re-analysis is needed given prior single gene studies detected a tryptophan [TGG]-codon#14 by manual Sanger sequencing, demonstrated DEspR translatability and functionality, and since the demonstration of actual non-translatability through expression studies, the standard-of-excellence for pseudogene designation, has not been performed. Re-analysis must meet UNIPROT criteria for demonstration of a protein’s existence at the highest (protein) level, which a priori, would override DNA- or RNA-based deductions. Methods To dissect the nucleotide sequence discrepancy, we performed Maxam–Gilbert sequencing and reviewed 727 RNA-seq entries. To comply with the highest level multiple UNIPROT criteria for determining DEspR’s existence, we performed various experiments using multiple anti-DEspR monoclonal antibodies (mAbs) targeting distinct DEspR epitopes with one spanning the contested tryptophan [TGG]-codon#14, assessing: (a) DEspR protein expression, (b) predicted full-length protein size, (c) sequence-predicted protein-specific properties beyond codon#14: receptor glycosylation and internalization, (d) protein-partner interactions, and (e) DEspR functionality via DEspR-inhibition effects. Results Maxam–Gilbert sequencing and some RNA-seq entries demonstrate two guanines, hence a tryptophan [TGG]-codon#14 within a compression site spanning an error-prone compression sequence motif. Western blot analysis using anti-DEspR mAbs targeting distinct DEspR epitopes detect the identical glycosylated 17.5 kDa pull-down protein. Decrease in DEspR-protein size after PNGase-F digest demonstrates post-translational glycosylation, concordant with the consensus-glycosylation site beyond codon#14. 
Like other small single-transmembrane proteins, mass spectrometry analysis of anti-DEspR mAb pull-down proteins do not detect DEspR, but detect DEspR-protein interactions with proteins implicated in intracellular trafficking and cancer. FACS analyses also detect DEspR-protein in different human cancer stem-like cells (CSCs). DEspR-inhibition studies identify DEspR-roles in CSC survival and growth. Live cell imaging detects fluorescently-labeled anti-DEspR mAb targeted-receptor internalization, concordant with the single internalization-recognition sequence also located beyond codon#14. Conclusions Data confirm translatability of DEspR, the full-length DEspR protein beyond codon#14, and elucidate DEspR-specific functionality. Along with detection of the tryptophan [TGG]-codon#14 within an error-prone compression site, cumulative data demonstrating DEspR protein existence fulfill multiple UNIPROT criteria, thus refuting its pseudogene designation. Electronic supplementary material The online version of this article (doi:10.1186/s12867-016-0066-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Victoria L M Herrera
- Whitaker Cardiovascular Institute, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA; Department of Medicine, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA
| | - Martin Steffen
- Department of Pathology and Biomedical Engineering, Boston University, Boston, USA
| | - Ann Marie Moran
- Whitaker Cardiovascular Institute, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA; Department of Medicine, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA
| | - Glaiza A Tan
- Whitaker Cardiovascular Institute, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA; Department of Medicine, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA
| | - Khristine A Pasion
- Whitaker Cardiovascular Institute, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA; Department of Medicine, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA
| | - Keith Rivera
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724, USA
| | - Darryl J Pappin
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724, USA
| | - Nelson Ruiz-Opazo
- Whitaker Cardiovascular Institute, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA; Department of Medicine, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA.
| |
|
38
|
Preston JL, Royall AE, Randel MA, Sikkink KL, Phillips PC, Johnson EA. High-specificity detection of rare alleles with Paired-End Low Error Sequencing (PELE-Seq). BMC Genomics 2016; 17:464. [PMID: 27301885 PMCID: PMC4908710 DOI: 10.1186/s12864-016-2669-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/31/2015] [Accepted: 04/25/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND Polymorphic loci exist throughout the genomes of a population and provide the raw genetic material needed for a species to adapt to changes in the environment. The minor allele frequencies of rare Single Nucleotide Polymorphisms (SNPs) within a population have been difficult to track with Next-Generation Sequencing (NGS), due to the high error rate of standard methods such as Illumina sequencing. RESULTS We have developed a wet-lab protocol and variant-calling method that identifies both sequencing and PCR errors, called Paired-End Low Error Sequencing (PELE-Seq). To test the specificity and sensitivity of the PELE-Seq method, we sequenced control E. coli DNA libraries containing known rare alleles present at frequencies ranging from 0.2-0.4 % of the total reads. PELE-Seq had higher specificity and sensitivity than standard libraries. We then used PELE-Seq to characterize rare alleles in a Caenorhabditis remanei nematode worm population before and after laboratory adaptation, and found that minor and rare alleles can undergo large changes in frequency during lab-adaptation. CONCLUSION We have developed a method of rare allele detection that mitigates both sequencing and PCR errors, called PELE-Seq. PELE-Seq was evaluated using control E. coli populations and was then used to compare a wild C. remanei population to a lab-adapted population. The PELE-Seq method is ideal for investigating the dynamics of rare alleles in a broad range of reduced-representation sequencing methods, including targeted amplicon sequencing, RAD-Seq, ddRAD, and GBS. PELE-Seq is also well-suited for whole genome sequencing of mitochondria and viruses, and for high-throughput rare mutation screens.
Affiliation(s)
- Jessica L Preston
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA.
| | - Ariel E Royall
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
| | - Melissa A Randel
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
| | - Kristin L Sikkink
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, USA
| | - Patrick C Phillips
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, USA
| | - Eric A Johnson
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
| |
|
39
|
Alanio A, Gits-Muselli M, Mercier-Delarue S, Dromer F, Bretagne S. Diversity of Pneumocystis jirovecii during Infection Revealed by Ultra-Deep Pyrosequencing. Front Microbiol 2016; 7:733. [PMID: 27252684 PMCID: PMC4877386 DOI: 10.3389/fmicb.2016.00733] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 02/01/2016] [Accepted: 05/02/2016] [Indexed: 01/05/2023]
Abstract
Pneumocystis jirovecii is an uncultivable fungal pathogen responsible for Pneumocystis pneumonia (PCP) in immunocompromised patients, the physiopathology of which is only partially understood. The diversity of the Pneumocystis strains associated with acute infection has mainly been studied by Sanger sequencing techniques, precluding any identification of rare genetic events (< 20% frequency). We used next-generation sequencing to detect minority variants causing infection, and analyzed the complexity of the genomes of infection-causing P. jirovecii. Ultra-deep pyrosequencing (UDPS) of PCR amplicons of two nuclear target regions [internal transcribed spacer 2 (ITS2) and dihydrofolate reductase (DHFR)] and one mitochondrial DNA target region [the mitochondrial ribosomal RNA large subunit gene (mtLSU)] was performed on 31 samples from 25 patients. UDPS revealed that almost all patients (n = 23/25, 92%) were infected with mixtures of strains. An analysis of repeated samples from six patients showed that the proportion of each variant changed significantly (by up to 30%) over time on treatment in three of these patients. A comparison of mitochondrial and nuclear UDPS data revealed heteroplasmy in P. jirovecii. The recognition site for the homing endonuclease I-SceI was recovered from the mtLSU gene, whereas the two conserved motifs of the enzyme were not. This suggests that heteroplasmy may result from recombination induced by unidentified homing endonucleases. This study sheds new light on the biology of P. jirovecii during infection. PCP results from infection not with a single microorganism, but with a complex mixture of different genotypes, the proportions of which change over time due to intricate selection and reinfection mechanisms that may differ between patients, treatments, and predisposing diseases.
Affiliation(s)
- Alexandre Alanio
- Laboratoire de Parasitologie-Mycologie, Groupe Hospitalier Saint-Louis-Lariboisière-Fernand-Widal, Assistance Publique Hôpitaux de Paris, Hôpital Saint-Louis, Paris, France; Université Paris Diderot, Sorbonne Paris Cité, Paris, France; Unité de Mycologie Moléculaire, Département de Mycologie, Centre National de Référence Mycoses Invasives et Antifongiques, Institut Pasteur, Paris, France; Centre National de la Recherche Scientifique CNRS URA3012, Paris, France
| | - Maud Gits-Muselli
- Laboratoire de Parasitologie-Mycologie, Groupe Hospitalier Saint-Louis-Lariboisière-Fernand-Widal, Assistance Publique Hôpitaux de Paris, Hôpital Saint-Louis, Paris, France; Université Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Séverine Mercier-Delarue
- Laboratoire de Microbiologie, Groupe Hospitalier Saint-Louis-Lariboisière-Fernand-Widal, Assistance Publique Hôpitaux de Paris, Hôpital Saint-Louis, Paris, France
| | - Françoise Dromer
- Unité de Mycologie Moléculaire, Département de Mycologie, Centre National de Référence Mycoses Invasives et Antifongiques, Institut Pasteur, Paris, France; Centre National de la Recherche Scientifique CNRS URA3012, Paris, France
| | - Stéphane Bretagne
- Laboratoire de Parasitologie-Mycologie, Groupe Hospitalier Saint-Louis-Lariboisière-Fernand-Widal, Assistance Publique Hôpitaux de Paris, Hôpital Saint-Louis, Paris, France; Université Paris Diderot, Sorbonne Paris Cité, Paris, France; Unité de Mycologie Moléculaire, Département de Mycologie, Centre National de Référence Mycoses Invasives et Antifongiques, Institut Pasteur, Paris, France; Centre National de la Recherche Scientifique CNRS URA3012, Paris, France
| |
|
40
|
Deep Sequencing of HIV-1 RNA and DNA in Newly Diagnosed Patients with Baseline Drug Resistance Showed No Indications for Hidden Resistance and Is Biased by Strong Interference of Hypermutation. J Clin Microbiol 2016; 54:1605-1615. [PMID: 27076656 DOI: 10.1128/jcm.00030-16] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 01/08/2016] [Accepted: 04/01/2016] [Indexed: 12/29/2022]
Abstract
Deep sequencing of plasma RNA or proviral DNA may be an interesting alternative to population sequencing for the detection of baseline transmitted HIV-1 drug resistance. Using a Roche 454 GS Junior HIV-1 prototype kit, we performed deep sequencing of the HIV-1 protease and reverse transcriptase genes on paired plasma and buffy coat samples from newly diagnosed HIV-1-positive individuals. Selection was based on the outcome of population sequencing and included 12 patients with either a revertant amino acid at codon 215 of the reverse transcriptase or a singleton resistance mutation, 4 patients with multiple resistance mutations, and 4 patients with wild-type virus. Deep sequencing of RNA and DNA detected 6 and 43 mutations, respectively, that were not identified by population sequencing. A subsequently performed hypermutation analysis, however, revealed hypermutation in 61.19% of 3,188 DNA reads with a resistance mutation. The removal of hypermutated reads dropped the number of additional mutations in DNA from 43 to 17. No hypermutation evidence was found in the RNA reads. Five of the 6 additional RNA mutations and all additional DNA mutations, after full exclusion of hypermutation bias, were observed in the 3 individuals with multiple resistance mutations detected by population sequencing. Despite focused selection of patients with T215 revertants or singleton mutations, deep sequencing failed to identify the resistant T215Y/F or M184V or any other resistance mutation, indicating that in most of these cases there is no hidden resistance and that the virus detected at diagnosis by population sequencing is the original infecting variant.
|
41
|
Bellecave P, Recordon-Pinson P, Fleury H. Evaluation of Automatic Analysis of Ultradeep Pyrosequencing Raw Data to Determine Percentages of HIV Resistance Mutations in Patients Followed-Up in Hospital. AIDS Res Hum Retroviruses 2016; 32:85-92. [PMID: 26529549 DOI: 10.1089/aid.2015.0201] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/06/2023]
Abstract
A major obstacle to using next generation sequencing (NGS) technology in clinical routine practice is reliable data analysis. Thousands of sequences need to be aligned and validated, to exclude sequencing artifacts and generate accurate results. We compared two analysis pipelines for Roche 454 ultradeep pyrosequencing (UDPS) raw data generated from HIV-1 clinical samples: a commercial and fully automated Web-based software NGS HIV-1 Module (SmartGene, Zug, Switzerland) vs. the Amplicon Variant Analyzer software (AVA, 454 Life Sciences; Roche). Results were also compared to those obtained with Sanger sequencing. HIV-1 reverse transcriptase and protease genes from 34 plasma samples were submitted to Sanger sequencing and GS Junior UDPS. Raw UDPS data (sff files) from all samples were analyzed with AVA 2.7 software plus manual review of the alignments and the fully automated SmartGene NGS HIV-1 Module prototype (SMG). Results obtained with both analysis pipelines showed good correlation (85.0%). Divergent results were mainly observed at homopolymer positions, such as K101, where the frame-aware alignment and error corrections of the automated approach were more efficient and more accurate, both in terms of detecting and quantifying drug resistance mutations. Our study shows that NGS data can easily be analyzed via a fully automated analysis pipeline, here the SmartGene NGS HIV-1 Module, thus minimizing the need for manual review of alignments by the user, otherwise essential to ensure accurate results. Such automated analysis pipelines may facilitate the adoption of NGS platforms in the routine clinical laboratory.
Affiliation(s)
- Pantxika Bellecave
- CNRS-UMR 5234, Microbiologie Fondamentale et Pathogénicité, Université Bordeaux Segalen, Bordeaux, France
- Centre Hospitalier Universitaire de Bordeaux (CHU), Laboratoire de Virologie, Bordeaux, France
| | - Patricia Recordon-Pinson
- CNRS-UMR 5234, Microbiologie Fondamentale et Pathogénicité, Université Bordeaux Segalen, Bordeaux, France
- Centre Hospitalier Universitaire de Bordeaux (CHU), Laboratoire de Virologie, Bordeaux, France
| | - Hervé Fleury
- CNRS-UMR 5234, Microbiologie Fondamentale et Pathogénicité, Université Bordeaux Segalen, Bordeaux, France
- Centre Hospitalier Universitaire de Bordeaux (CHU), Laboratoire de Virologie, Bordeaux, France
| |
|
42
|
Position-specific automated processing of V3 env ultra-deep pyrosequencing data for predicting HIV-1 tropism. Sci Rep 2015; 5:16944. [PMID: 26585833 PMCID: PMC4653658 DOI: 10.1038/srep16944] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/17/2015] [Accepted: 10/22/2015] [Indexed: 11/11/2022]
Abstract
HIV-1 coreceptor usage must be accurately determined before starting CCR5 antagonist-based treatment, as the presence of undetected minor CXCR4-using variants can cause subsequent virological failure. Ultra-deep pyrosequencing of HIV-1 V3 env allows detection of low levels of CXCR4-using variants that current genotypic approaches miss. However, processing the mass of sequence data and identifying true minor variants while excluding artifactual sequences generated during amplification and ultra-deep pyrosequencing are rate-limiting. Arbitrary fixed cut-offs below which minor variants are discarded are currently used, but the errors generated during ultra-deep pyrosequencing are sequence-dependent rather than random. We have developed an automated processing of HIV-1 V3 env ultra-deep pyrosequencing data that uses biological filters to discard artifactual or non-functional V3 sequences, followed by statistical filters to determine position-specific sensitivity thresholds rather than arbitrary fixed cut-offs. It retains authentic sequences with point mutations at V3 positions of interest and discards artifactual ones with accurate sensitivity thresholds.
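The contrast between a fixed cut-off and a position-specific threshold can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a simple Poisson model of sequencing errors, and the per-position error rates, the read depth, the significance level `alpha`, and the function names `poisson_sf` and `min_variant_count` are all hypothetical choices made for illustration.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the lower tail iteratively."""
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def min_variant_count(depth, error_rate, alpha=0.001):
    """Smallest read count unlikely (probability <= alpha) to arise from errors alone.

    Assumption: errors at a position follow Poisson(depth * error_rate).
    """
    lam = depth * error_rate
    k = 1
    while poisson_sf(k, lam) > alpha:
        k += 1
    return k


# Hypothetical per-position error rates (e.g. higher near a homopolymer).
error_rates = {"pos_11": 0.0005, "pos_25": 0.004}
depth = 5000  # hypothetical read depth

for position, rate in error_rates.items():
    k = min_variant_count(depth, rate)
    print(position, "min reads:", k, "threshold (%):", round(100 * k / depth, 3))
```

Under these assumptions, a position with a higher background error rate receives a higher threshold, whereas a single fixed cut-off would be either too permissive at error-prone positions or too strict at clean ones.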
|
43
|
Identification of minority resistance mutations in the HIV-1 integrase coding region using next generation sequencing. J Clin Virol 2015; 73:95-100. [PMID: 26587787 DOI: 10.1016/j.jcv.2015.11.009] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 08/18/2015] [Revised: 10/31/2015] [Accepted: 11/03/2015] [Indexed: 11/20/2022]
Abstract
BACKGROUND The current widely applied standard method to screen for HIV-1 genotypic resistance is based on Sanger population sequencing (Sseq), which does not allow for the identification of minority variants (MVs) below the limit of detection for the Sseq method in patients receiving integrase strand-transfer inhibitors (INSTI). Next generation sequencing (NGS) has facilitated the detection of MVs at a much deeper level than Sseq. OBJECTIVES Here, we compared Illumina MiSeq and Sseq approaches to evaluate the detection of MVs involved in resistance to the three commonly used INSTI: raltegravir (RAL), elvitegravir (EVG) and dolutegravir (DTG). STUDY DESIGN NGS and Sseq were used to analyze RT-PCR products of the HIV-1 integrase coding region from six patients and in serial samples from two patients. NGS sequences were assembled and analyzed using the low frequency variant detection (LFVDT) tool in CLC Genomics Workbench. RESULTS Sseq detected INSTI resistance and accessory mutations in three of the patients (called INSTI Res+), while no resistance or accessory mutations were detected in the remaining three patients (called INSTI Res-). Additional INSTI resistance and/or accessory mutations were detected by NGS analysis of integrase sequences from all three INSTI Res+ and one INSTI Res- patient. CONCLUSION Our observations suggested that NGS demonstrated a higher sensitivity than Sseq in the identification of INSTI-relevant MVs both in patients at treatment baseline and in patients receiving INSTI therapy. Thus NGS can be a valuable tool in monitoring of antiretroviral minority resistance in patients receiving INSTI therapy.
|
44
|
Daca-Roszak P, Pfeifer A, Żebracka-Gala J, Jarząb B, Witt M, Ziętkiewicz E. EurEAs_Gplex--A new SNaPshot assay for continental population discrimination and gender identification. Forensic Sci Int Genet 2015; 20:89-100. [PMID: 26520215 DOI: 10.1016/j.fsigen.2015.10.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/01/2015] [Accepted: 10/15/2015] [Indexed: 12/21/2022]
Abstract
Assays that allow analysis of the biogeographic origin of biological samples in a standard forensic laboratory have to target a small number of highly differentiating markers. Such markers should be easy to multiplex and the assay must perform well in the degraded and scarce biological material. SNPs localized in the genome regions, which in the past were subjected to differential selective pressure in various populations, are the most widely used markers in the studies of biogeographic affiliation. SNPs reflecting biogeographic differences not related to any phenotypic traits are not sufficiently explored. The goal of our study was to identify a small set of SNPs not related to any known pigmentation/phenotype-specific genes, which would allow efficient discrimination between populations of Europe and East Asia. The selection of SNPs was based on the comparative analysis of representative European and Chinese/Japanese samples (B-lymphocyte cell lines), genotyped using the Infinium HumanOmniExpressExome microarray (Illumina). The classifier, consisting of 24 unlinked SNPs (24-SNP classifier), was selected. The performance of a 14-SNP subset of this classifier (14-SNP subclassifier) was tested using genotype data from several populations. The 14-SNP subclassifier differentiated East Asians, Europeans and Africans with ∼100% accuracy; Palestinians, representative of the Middle East, clustered with Europeans, while Amerindians and Pakistani were placed between East Asian and European populations. Based on these results, we have developed a SNaPshot assay (EurEAs_Gplex) for genotyping SNPs from the 14-SNP subclassifier, combined with an additional marker for gender identification. Forensic utility of the EurEAs_Gplex was verified using degraded and low quantity DNA samples. 
The performance of the EurEAs_Gplex was satisfactory when using degraded DNA; tests using low quantity DNA samples revealed a previously not described source of genotyping errors, potentially important for any SNaPshot-based assays.
Affiliation(s)
- P Daca-Roszak
- Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479 Poznan, Poland
| | - A Pfeifer
- Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Gliwice Branch, Poland; Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
| | - J Żebracka-Gala
- Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Gliwice Branch, Poland
| | - B Jarząb
- Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Gliwice Branch, Poland
| | - M Witt
- Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479 Poznan, Poland; International Institute of Molecular and Cell Biology, Warsaw, Poland
| | - E Ziętkiewicz
- Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479 Poznan, Poland.
| |
|
45
|
Payne BAI, Gardner K, Coxhead J, Chinnery PF. Deep resequencing of mitochondrial DNA. Methods Mol Biol 2015; 1264:59-66. [PMID: 25631003 DOI: 10.1007/978-1-4939-2257-4_6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 02/10/2023]
Abstract
Detecting and quantifying low-level variants in mitochondrial DNA (mtDNA) by deep resequencing can lead to important insights into the biology of mtDNA in health and disease. Massively parallel ("next-generation") sequencing is an attractive tool owing to the great depth and breadth of coverage. However, there are several important challenges to be considered when using this method, in particular: the avoidance of false discovery due to the unintended amplification of nuclear pseudogenes and the approach to delineating signal from noise at very great depths of coverage. Here we present methods for whole mtDNA genome deep sequencing (Illumina MiSeq) and short amplicon deep sequencing (Roche 454 GS-FLX).
Affiliation(s)
- Brendan A I Payne
- Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, International Centre for Life, Central Parkway, Newcastle-upon-Tyne, NE1 3BZ, UK
| |
|
46
|
Abstract
Tumors are typically sequenced to depths of 75-100× (exome) or 30-50× (whole genome). We demonstrate that current sequencing paradigms are inadequate for tumors that are impure, aneuploid or clonally heterogeneous. To reassess optimal sequencing strategies, we performed ultra-deep (up to ~312×) whole genome sequencing (WGS) and exome capture (up to ~433×) of a primary acute myeloid leukemia, its subsequent relapse, and a matched normal skin sample. We tested multiple alignment and variant calling algorithms and validated ~200,000 putative SNVs by sequencing them to depths of ~1,000×. Additional targeted sequencing provided over 10,000× coverage and ddPCR assays provided up to ~250,000× sampling of selected sites. We evaluated the effects of different library generation approaches, depth of sequencing, and analysis strategies on the ability to effectively characterize a complex tumor. This dataset, representing the most comprehensively sequenced tumor described to date, will serve as an invaluable community resource (dbGaP accession id phs000159).
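Why conventional depths can fall short for impure or subclonal tumors can be illustrated with a short worked calculation. The sketch below is not taken from the study: it assumes an idealized binomial sampling model with no sequencing error, a hypothetical variant allele fraction (VAF) of 5%, and a hypothetical calling rule that requires at least three variant-supporting reads. The function name `detection_probability` is illustrative.

```python
import math


def detection_probability(depth, vaf, min_reads=3):
    """P(at least min_reads variant-supporting reads) under a binomial model.

    Assumptions: reads are independent, each carries the variant with
    probability vaf, and sequencing errors are ignored.
    """
    p_fewer = sum(
        math.comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


# Depths chosen to span the ranges mentioned in the abstract.
for depth in (30, 100, 312, 1000):
    print(depth, round(detection_probability(depth, vaf=0.05), 3))
```

Under these assumptions, a 5% VAF variant has only a modest chance of reaching three supporting reads at 30× depth, while the chance approaches certainty at several hundred ×. Real callers also have to separate such reads from sequencing errors, which this sketch ignores.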
|
47
|
Iyer S, Casey E, Bouzek H, Kim M, Deng W, Larsen BB, Zhao H, Bumgarner RE, Rolland M, Mullins JI. Comparison of Major and Minor Viral SNPs Identified through Single Template Sequencing and Pyrosequencing in Acute HIV-1 Infection. PLoS One 2015; 10:e0135903. [PMID: 26317928 PMCID: PMC4552882 DOI: 10.1371/journal.pone.0135903] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 10/15/2014] [Accepted: 07/27/2015] [Indexed: 01/03/2023]
Abstract
Massively parallel sequencing (MPS) technologies, such as 454-pyrosequencing, allow for the identification of variants in sequence populations at lower levels than consensus sequencing and most single-template Sanger sequencing experiments. We sought to determine if the greater depth of population sampling attainable using MPS technology would allow detection of minor variants in HIV founder virus populations very early in infection in instances where Sanger sequencing detects only a single variant. We compared single nucleotide polymorphisms (SNPs) during acute HIV-1 infection from 32 subjects using both single template Sanger and 454-pyrosequencing. Pyrosequences from a median of 2400 viral templates per subject and encompassing 40% of the HIV-1 genome, were compared to a median of five individually amplified near full-length viral genomes sequenced using Sanger technology. There was no difference in the consensus nucleotide sequences over the 3.6kb compared in 84% of the subjects infected with single founders and 33% of subjects infected with multiple founder variants: among the subjects with disagreements, mismatches were found in less than 1% of the sites evaluated (of a total of nearly 117,000 sites across all subjects). The majority of the SNPs observed only in pyrosequences were present at less than 2% of the subject’s viral sequence population. These results demonstrate the utility of the Sanger approach for study of early HIV infection and provide guidance regarding the design, utility and limitations of population sequencing from variable template sources, and emphasize parameters for improving the interpretation of massively parallel sequencing data to address important questions regarding target sequence evolution.
Affiliation(s)
- Shyamala Iyer
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Eleanor Casey
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Heather Bouzek
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Moon Kim
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Wenjie Deng
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Brendan B. Larsen
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Hong Zhao
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Roger E. Bumgarner
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
- Morgane Rolland
  - US Military HIV Research Program, WRAIR, Silver Spring, MD, 20910, United States of America
  - Henry Jackson Foundation for the Advancement of Military Medicine, Inc., Bethesda, MD, 20817, United States of America
- James I. Mullins
  - Department of Microbiology, University of Washington, Seattle, WA, 98195, United States of America
  - Department of Medicine, University of Washington, Seattle, WA, 98195, United States of America
  - Department of Laboratory Medicine, Seattle, WA, 98195, United States of America
48
Ruiz-Estévez M, Ruiz-Ruano FJ, Cabrero J, Bakkali M, Perfectti F, López-León MD, Camacho JPM. Non-random expression of ribosomal DNA units in a grasshopper showing high intragenomic variation for the ITS2 region. INSECT MOLECULAR BIOLOGY 2015; 24:319-330. [PMID: 25565136 DOI: 10.1111/imb.12158] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/04/2023]
Abstract
We analyse intragenomic variation of the ITS2 internal transcribed spacer of ribosomal DNA (rDNA) in the grasshopper Eyprepocnemis plorans, by means of tagged PCR 454 amplicon sequencing performed on both genomic DNA (gDNA) and RNA-derived complementary DNA (cDNA), using part of the ITS2 flanking coding regions (5.8S and 28S rDNA) as an internal control for sequencing errors. Six different ITS2 haplotypes (i.e. variants for at least one nucleotide in the complete ITS2 sequence) were found in a single population, one of them (Hap4) being specific to a supernumerary (B) chromosome. The analysis of both gDNA and cDNA from the same individuals provided an estimate of the expression efficiency of the different haplotypes. We found random expression (i.e. about similar recovery in gDNA and cDNA) for three haplotypes (Hap1, Hap2 and Hap5), but significant underexpression for three others (Hap3, Hap4 and Hap6). Hap4 was the most extremely underexpressed and, remarkably, it showed the lowest sequence conservation for the flanking 5.8-28S coding regions in the gDNA reads but the highest conservation (100%) in the cDNA ones, suggesting the preferential expression of mutation-free rDNA units carrying this ITS2 haplotype. These results indicate that the ITS2 region of rDNA is far from complete homogenization in this species, and that the different rDNA units are not expressed at random, with some of them being severely downregulated.
Affiliation(s)
- M Ruiz-Estévez
  - Departamento de Genética, Universidad de Granada, Granada, Spain
49
Oulas A, Pavloudi C, Polymenakou P, Pavlopoulos GA, Papanikolaou N, Kotoulas G, Arvanitidis C, Iliopoulos I. Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies. Bioinform Biol Insights 2015; 9:75-88. [PMID: 25983555 PMCID: PMC4426941 DOI: 10.4137/bbi.s12462] [Citation(s) in RCA: 177] [Impact Index Per Article: 17.7] [Received: 12/05/2014] [Revised: 03/09/2015] [Accepted: 03/13/2015] [Indexed: 12/14/2022]
Abstract
Advances in next-generation sequencing (NGS) have allowed significant breakthroughs in microbial ecology studies. This has led to the rapid expansion of research in the field and the establishment of "metagenomics", often defined as the analysis of DNA from microbial communities in environmental samples without prior need for culturing. Many metagenomics statistical/computational tools and databases have been developed in order to allow the exploitation of the huge influx of data. In this review article, we provide an overview of the sequencing technologies and how they are uniquely suited to various types of metagenomic studies. We focus on the currently available bioinformatics techniques, tools, and methodologies for performing each individual step of a typical metagenomic dataset analysis. We also provide future trends in the field with respect to tools and technologies currently under development. Moreover, we discuss data management, distribution, and integration tools that are capable of performing comparative metagenomic analyses of multiple datasets using well-established databases, as well as commonly used annotation standards.
Affiliation(s)
- Anastasis Oulas
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
- Christina Pavloudi
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
  - Department of Biology, University of Ghent, Ghent, Belgium
  - Department of Microbial Ecophysiology, University of Bremen, Bremen, Germany
- Paraskevi Polymenakou
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
- Georgios A Pavlopoulos
  - Division of Basic Sciences, University of Crete, Medical School, Heraklion, Crete, Greece
- Nikolas Papanikolaou
  - Division of Basic Sciences, University of Crete, Medical School, Heraklion, Crete, Greece
- Georgios Kotoulas
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
- Christos Arvanitidis
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
- Ioannis Iliopoulos
  - Division of Basic Sciences, University of Crete, Medical School, Heraklion, Crete, Greece
50
Brodin J, Hedskog C, Heddini A, Benard E, Neher RA, Mild M, Albert J. Challenges with using primer IDs to improve accuracy of next generation sequencing. PLoS One 2015; 10:e0119123. [PMID: 25741706 PMCID: PMC4351057 DOI: 10.1371/journal.pone.0119123] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 06/29/2014] [Accepted: 12/23/2014] [Indexed: 01/09/2023]
Abstract
Next generation sequencing technologies, like ultra-deep pyrosequencing (UDPS), allow detailed investigation of complex populations, like RNA viruses, but their utility is limited by errors introduced during sample preparation and sequencing. By tagging each individual cDNA molecule with barcodes, referred to as Primer IDs, before PCR and sequencing these errors could theoretically be removed. Here we evaluated the Primer ID methodology on 257,846 UDPS reads generated from a HIV-1 SG3Δenv plasmid clone and plasma samples from three HIV-infected patients. The Primer ID consisted of 11 randomized nucleotides, 4,194,304 combinations, in the primer for cDNA synthesis that introduced a unique sequence tag into each cDNA molecule. Consensus template sequences were constructed for reads with Primer IDs that were observed three or more times. Despite high numbers of input template molecules, the number of consensus template sequences was low. With 10,000 input molecules for the clone as few as 97 consensus template sequences were obtained due to highly skewed frequency of resampling. Furthermore, the number of sequenced templates was overestimated due to PCR errors in the Primer IDs. Finally, some consensus template sequences were erroneous due to hotspots for UDPS errors. The Primer ID methodology has the potential to provide highly accurate deep sequencing. However, it is important to be aware that there are remaining challenges with the methodology. In particular it is important to find ways to obtain a more even frequency of resampling of template molecules as well as to identify and remove artefactual consensus template sequences that have been generated by PCR errors in the Primer IDs.
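
The consensus step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `(primer_id, sequence)` input layout, the assumption that reads sharing a Primer ID are already aligned to equal length, and the per-position majority vote are all assumptions made for illustration. Only the tag length (11 nucleotides) and the threshold of three or more reads come from the abstract.

```python
from collections import Counter, defaultdict

PRIMER_ID_LENGTH = 11  # from the abstract: 11 randomized nucleotides
MIN_READS = 3          # from the abstract: Primer IDs observed three or more times


def primer_id_space(length=PRIMER_ID_LENGTH):
    """Number of distinct tags for a random A/C/G/T barcode of this length."""
    return 4 ** length


def consensus_templates(reads, min_reads=MIN_READS):
    """Build one consensus sequence per Primer ID (hypothetical helper).

    `reads` is an iterable of (primer_id, sequence) pairs. Sequences that
    share a Primer ID are assumed to be pre-aligned and of equal length.
    """
    groups = defaultdict(list)
    for primer_id, sequence in reads:
        groups[primer_id].append(sequence)

    consensus = {}
    for primer_id, sequences in groups.items():
        if len(sequences) < min_reads:
            continue  # too few resampled reads to call a consensus
        # Majority vote per alignment column; ties go to the base seen first.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus
```

With an 11-nucleotide tag, `primer_id_space()` reproduces the 4,194,304 combinations quoted in the abstract. A grouping of this kind also shows why the abstract reports inflated template counts: a PCR error inside the tag creates a new key, so one template is counted as two.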
Affiliation(s)
- Johanna Brodin
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Charlotte Hedskog
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Alexander Heddini
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Emmanuel Benard
  - Max Planck Institute for Developmental Biology, Tuebingen, Germany
- Richard A. Neher
  - Max Planck Institute for Developmental Biology, Tuebingen, Germany
- Mattias Mild
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
  - Unit for Support, Swedish Institute for Communicable Disease Control, Stockholm, Sweden
- Jan Albert
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
  - Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden