51
Effect of the enzyme and PCR conditions on the quality of high-throughput DNA sequencing results. Sci Rep 2015; 5:8056. [PMID: 25623996 PMCID: PMC4306961 DOI: 10.1038/srep08056] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 08/22/2014] [Accepted: 01/02/2015] [Indexed: 11/25/2022] Open
Abstract
Library preparation protocols for high-throughput DNA sequencing (HTS) include amplification steps in which errors can build up. In order to have confidence in the sequencing data, it is important to understand the effects of different Taq polymerases and PCR amplification protocols on the DNA molecules sequenced. We compared thirteen enzymes in three different marker systems: simple, single copy nuclear gene and complex multi-gene family. We also tested a modified PCR protocol, which has been suggested to reduce errors associated with amplification steps. We find that enzyme choice has a large impact on the proportion of correct sequences recovered. The most complex marker systems yielded fewer correct reads, and the proportion of correct reads was greatly affected by the enzyme used. Modified cycling conditions did reduce the number of incorrect sequences obtained in some cases, but enzyme had a much greater impact on the number of correct reads. Thus, the coverage required for the safe identification of genotypes using one of the low quality enzymes could be seven times larger than with more efficient enzymes in a biallelic system with equal amplification of the two alleles. Consequently, enzyme selection for downstream HTS has important consequences, especially in complex genetic systems.
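How the proportion of correct reads drives the coverage needed for genotyping can be illustrated with a simple binomial model. This is only a sketch under stated assumptions, not the calculation used in the paper: the function names, the requirement of 10 correct reads per allele, and the 95% per-allele confidence are all hypothetical choices.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), assuming 0 < p < 1."""
    total = 0.0
    for i in range(0, min(k, n) + 1):
        # Log-space PMF avoids overflow for large n.
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def min_coverage(correct_fraction, min_reads_per_allele=10,
                 confidence=0.95, max_reads=100_000):
    """Smallest read count n such that one allele of a biallelic locus
    gets at least `min_reads_per_allele` correct reads with probability
    >= `confidence`.

    Assumption: both alleles amplify equally, so a read is a correct
    copy of a given allele with probability correct_fraction / 2.
    The confidence is per allele, not joint over both alleles.
    """
    p = correct_fraction / 2
    for n in range(min_reads_per_allele, max_reads + 1):
        if 1 - binom_cdf(min_reads_per_allele - 1, n, p) >= confidence:
            return n
    raise ValueError("max_reads too small for the requested confidence")


# Hypothetical enzymes: 90% versus 15% correct reads.
high_quality = min_coverage(0.90)
low_quality = min_coverage(0.15)
print(high_quality, low_quality, round(low_quality / high_quality, 1))
```

The ratio printed at the end depends entirely on the assumed correct-read fractions and thresholds; it shows the direction of the effect rather than reproducing the sevenfold figure in the abstract.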
52
Exome-wide somatic microsatellite variation is altered in cells with DNA repair deficiencies. PLoS One 2014; 9:e110263. [PMID: 25402475 PMCID: PMC4234249 DOI: 10.1371/journal.pone.0110263] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/09/2014] [Accepted: 09/18/2014] [Indexed: 11/19/2022] Open
Abstract
Microsatellites (MST), tandem repeats of 1–6 nucleotide motifs, are mutational hot-spots with a bias for insertions and deletions (INDELs) rather than single nucleotide polymorphisms (SNPs). The majority of MST instability studies are limited to a small number of loci, the Bethesda markers, which are only informative for a subset of colorectal cancers. In this paper we use non-haplotype alleles present within next-gen sequencing data to evaluate somatic MST variation (SMV) within DNA repair proficient and DNA repair defective cell lines. We confirm that alleles present within next-gen data that do not contribute to the haplotype can be reliably quantified and utilized to evaluate the SMV without requiring comparisons of matched samples. We observed that SMV patterns in the DNA repair proficient cell lines MCF10A, HEK293 and PD20 RV:D2 were consistent among samples. Further, we were able to confirm that changes in SMV patterns in cell lines lacking functional BRCA2, FANCD2 and mismatch repair were consistent with the different pathways perturbed. Using this new exome sequencing analysis approach we show that DNA instability can be identified in a sample, that patterns of instability vary depending on the impaired DNA repair mechanism, and that genes harboring minor alleles are strongly associated with cancer pathways. The MST Minor Allele Caller used for this study is available at https://github.com/zalmanv/MST_minor_allele_caller.
53
Cunha MV, Inácio J, Freimanis G, Fusaro A, Granberg F, Höper D, King DP, Monne I, Orton R, Rosseel T. Next-generation sequencing in veterinary medicine: how can the massive amount of information arising from high-throughput technologies improve diagnosis, control, and management of infectious diseases? Methods Mol Biol 2014; 1247:415-36. [PMID: 25399113 PMCID: PMC7123048 DOI: 10.1007/978-1-4939-2004-4_30] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 02/06/2023]
Abstract
The development of high-throughput molecular technologies and associated bioinformatics has dramatically changed the capacities of scientists to produce, handle, and analyze large amounts of genomic, transcriptomic, and proteomic data. A clear example of this step-change is represented by the amount of DNA sequence data that can be now produced using next-generation sequencing (NGS) platforms. Similarly, recent improvements in protein and peptide separation efficiencies and highly accurate mass spectrometry have promoted the identification and quantification of proteins in a given sample. These advancements in biotechnology have increasingly been applied to the study of animal infectious diseases and are beginning to revolutionize the way that biological and evolutionary processes can be studied at the molecular level. Studies have demonstrated the value of NGS technologies for molecular characterization, ranging from metagenomic characterization of unknown pathogens or microbial communities to molecular epidemiology and evolution of viral quasispecies. Moreover, high-throughput technologies now allow detailed studies of host-pathogen interactions at the level of their genomes (genomics), transcriptomes (transcriptomics), or proteomes (proteomics). Ultimately, the interaction between pathogen and host biological networks can be questioned by analytically integrating these levels (integrative OMICS and systems biology). The application of high-throughput biotechnology platforms in these fields and their typical low-cost per information content has revolutionized the resolution with which these processes can now be studied. The aim of this chapter is to provide a current and prospective view on the opportunities and challenges associated with the application of massive parallel sequencing technologies to veterinary medicine, with particular focus on applications that have a potential impact on disease control and management.
Affiliation(s)
- Mónica V. Cunha
- Instituto Nacional de Investigação Agrária e Veterinária, IP and Centro de Biologia Ambiental, Faculdade de Ciências, Universidade de Lisboa, Lisbon, Portugal
- João Inácio
- Instituto Nacional de Investigação Agrária e Veterinária, IP, Lisboa, Portugal and School of Pharmacy and Biomolecular Sciences, University of Brighton, Brighton, United Kingdom
54
Gardner K, Payne BAI, Horvath R, Chinnery PF. Use of stereotypical mutational motifs to define resolution limits for the ultra-deep resequencing of mitochondrial DNA. Eur J Hum Genet 2014; 23:413-5. [PMID: 24896153 PMCID: PMC4326723 DOI: 10.1038/ejhg.2014.96] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/06/2014] [Revised: 04/07/2014] [Accepted: 04/24/2014] [Indexed: 11/22/2022] Open
Abstract
Massively parallel resequencing of mitochondrial DNA (mtDNA) has led to significant advances in the study of heteroplasmic mtDNA variants in health and disease, but confident resolution of very low-level variants (<2% heteroplasmy) remains challenging due to the difficulty in distinguishing signal from noise at this depth. However, it is likely that such variants are precisely those of greatest interest in the study of somatic (acquired) mtDNA mutations. Previous approaches to this issue have included the use of controls such as phage DNA and mtDNA clones, neither of which may accurately recapitulate natural mtDNA. We have therefore explored a novel approach, taking advantage of mtDNA with a known stereotyped mutational motif (nAT>C, from a patient with MNGIE, mitochondrial neurogastrointestinal encephalomyopathy) and comparing mutational pattern distribution with healthy mtDNA by ligation-mediated deep resequencing (Applied Biosystems SOLiD). We empirically derived mtDNA-mutant heteroplasmy detection limits, demonstrating that the presence of the stereotypical mutational motif could be statistically validated for heteroplasmy thresholds ≥0.22% (P=0.034). We therefore provide empirical evidence from biological samples that very low-level mtDNA mutants can be meaningfully resolved by massively parallel resequencing, confirming the utility of the approach for studying somatic mtDNA mutation in health and disease. Our approach could also usefully be employed in other settings to derive platform-specific deep resequencing resolution limits.
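The general problem of separating a low-level variant from sequencing noise can be sketched with a binomial noise model. This is not the authors' method (they derived limits empirically from a known mutational motif); the depth, the per-base error rate, and the function names below are hypothetical.

```python
from math import exp, lgamma, log


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), assuming 0 < p < 1."""
    below = 0.0
    for i in range(0, min(k, n + 1)):
        # Log-space PMF avoids overflow at high read depth.
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        below += exp(log_pmf)
    return max(0.0, 1.0 - below)


def min_variant_reads(depth, error_rate, alpha=0.05):
    """Smallest number of variant reads that sequencing error alone
    would reach with probability <= alpha at the given depth."""
    k = 0
    while binom_sf(k, depth, error_rate) > alpha:
        k += 1
    return k


# Hypothetical run: 10,000x depth and a 0.1% per-base error rate.
depth, error_rate = 10_000, 0.001
k = min_variant_reads(depth, error_rate)
print(f"call a variant at >= {k} reads, i.e. >= {100 * k / depth:.2f}% heteroplasmy")
```

Under this model the detection limit falls as depth rises or the error rate drops, which is why such limits are platform-specific.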
Affiliation(s)
- Kristian Gardner
- Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
- Brendan A I Payne
- Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
- Rita Horvath
- Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
- Patrick F Chinnery
- Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
55
Towards error-free profiling of immune repertoires. Nat Methods 2014; 11:653-5. [DOI: 10.1038/nmeth.2960] [Citation(s) in RCA: 317] [Impact Index Per Article: 31.7] [Received: 12/02/2013] [Accepted: 04/09/2014] [Indexed: 01/17/2023]
56
Validation of an oligonucleotide ligation assay for quantification of human immunodeficiency virus type 1 drug-resistant mutants by use of massively parallel sequencing. J Clin Microbiol 2014; 52:2320-7. [PMID: 24740080 DOI: 10.1128/jcm.00306-14] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022] Open
Abstract
Global HIV treatment programs need sensitive and affordable tests to monitor HIV drug resistance. We compared mutant detection by the oligonucleotide ligation assay (OLA), an economical and simple test, to massively parallel sequencing. Nonnucleoside reverse transcriptase inhibitor (K103N, V106M, Y181C, and G190A) and lamivudine (M184V) resistance mutations were quantified in blood-derived plasma RNA and cell DNA specimens by OLA and 454 pyrosequencing. A median of 1,000 HIV DNA or RNA templates (range, 163 to 1,874 templates) from blood specimens collected in Mozambique (n = 60) and Kenya (n = 51) were analyzed at 4 codons in each sample (n = 441 codons assessed). Mutations were detected at 75 (17%) codons by OLA sensitive to 2.0%, at 71 codons (16%; P = 0.78) by pyrosequencing using a cutoff value of ≥2.0%, and at 125 codons (28%; P < 0.0001) by pyrosequencing sensitive to 0.1%. Discrepancies between the assays included 15 codons with mutant concentrations of ∼2%, one at 8.8% by pyrosequencing and not detected by OLA, and one at 69% by OLA and not detected by pyrosequencing. The latter two cases were associated with genetic polymorphisms in the regions critical for ligation of the OLA probes and pyrosequencing primers, respectively. Overall, mutant concentrations quantified by the two methods correlated well across the codons tested (R² > 0.8). Repeat pyrosequencing of 13 specimens showed reproducible detection of 5/24 mutations at <2% and 6/6 at ≥2%. In conclusion, the OLA and pyrosequencing performed similarly in the quantification of nonnucleoside reverse transcriptase inhibitor and lamivudine mutations present at >2% of the viral population in clinical specimens. While pyrosequencing was more sensitive, detection of mutants below 2% was not reproducible.
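The detection percentages quoted in the abstract follow directly from the codon counts. A quick arithmetic check, with illustrative variable names that are not from the paper:

```python
# Counts taken from the abstract above.
codons_assessed = 441
detections = {
    "OLA, sensitive to 2.0%": 75,
    "pyrosequencing, cutoff >= 2.0%": 71,
    "pyrosequencing, sensitive to 0.1%": 125,
}

# Percentage of assessed codons with a mutation detected, per assay.
percentages = {
    assay: round(100 * count / codons_assessed)
    for assay, count in detections.items()
}

# Reproducibility on repeat pyrosequencing: mutations re-detected / total.
reproducible_below_2pct = 5 / 24
reproducible_at_or_above_2pct = 6 / 6

for assay, pct in percentages.items():
    print(f"{assay}: {pct}%")  # 17%, 16% and 28%, matching the abstract
print(f"re-detected below 2%: {reproducible_below_2pct:.0%}")
print(f"re-detected at or above 2%: {reproducible_at_or_above_2pct:.0%}")
```

Only about one in five of the sub-2% mutations was re-detected, which is the basis for the conclusion that detection below 2% was not reproducible.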
57
Yousif M, Bell TG, Mudawi H, Glebe D, Kramvis A. Analysis of ultra-deep pyrosequencing and cloning based sequencing of the basic core promoter/precore/core region of hepatitis B virus using newly developed bioinformatics tools. PLoS One 2014; 9:e95377. [PMID: 24740330 PMCID: PMC3989311 DOI: 10.1371/journal.pone.0095377] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/21/2013] [Accepted: 03/26/2014] [Indexed: 12/18/2022] Open
Abstract
Aims: The aims of this study were to develop bioinformatics tools to explore ultra-deep pyrosequencing (UDPS) data, to test these tools, to use them to determine the optimum error threshold, and to compare results from UDPS and cloning based sequencing (CBS). Methods: Four serum samples, infected with either genotype D or E, from HBeAg-positive and HBeAg-negative patients were randomly selected. UDPS and CBS were used to sequence the basic core promoter/precore region of HBV. Two online bioinformatics tools, the “Deep Threshold Tool” and the “Rosetta Tool” (http://hvdr.bioinf.wits.ac.za/tools/), were built to test and analyze the generated data. Results: A total of 10952 reads were generated by UDPS on the 454 GS Junior platform. In the four samples, substitutions detected at a threshold of 0.5% or above were identified at 39 unique positions, 25 of which were non-synonymous mutations. Sample #2 (HBeAg-negative, genotype D) had substitutions in 26 positions, followed by sample #1 (HBeAg-negative, genotype E) in 12 positions, sample #3 (HBeAg-positive, genotype D) in 7 positions and sample #4 (HBeAg-positive, genotype E) in only 4 positions. The ratio of nucleotide substitutions between isolates from HBeAg-negative and HBeAg-positive patients was 3.5:1. Compared to genotype E isolates, genotype D isolates showed greater variation in the X, basic core promoter/precore and core regions. Only 18 of the 39 positions identified by UDPS were detected by CBS, which detected 14 of the 25 non-synonymous mutations detected by UDPS. Conclusion: UDPS data should be approached with caution. Appropriate curation of read data is required prior to analysis, in order to clean the data and eliminate artefacts. CBS detected fewer than 50% of the substitutions detected by UDPS. Furthermore, it is important that the appropriate consensus (reference) sequence is used in order to identify variants correctly.
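Threshold-based substitution calling of the kind described above can be sketched as follows. This is a minimal illustration, not the logic of the “Deep Threshold Tool”: the function name and the pileup counts are hypothetical, and only the 0.5% threshold and the per-sample position counts come from the abstract.

```python
def call_substitutions(base_counts, reference_base, threshold=0.005):
    """Return non-reference bases whose read fraction is >= threshold."""
    depth = sum(base_counts.values())
    if depth == 0:
        return {}
    return {
        base: count / depth
        for base, count in base_counts.items()
        if base != reference_base and count / depth >= threshold
    }


# Hypothetical read counts at one position (reference base A).
pileup = {"A": 2700, "G": 30, "C": 8, "T": 0}
calls = call_substitutions(pileup, reference_base="A")
print(calls)  # only G passes: 30/2738 is about 1.1%, while C is about 0.3%

# Arithmetic check of the figures reported in the abstract.
hbeag_negative_positions = 26 + 12  # samples #2 and #1
hbeag_positive_positions = 7 + 4    # samples #3 and #4
ratio = hbeag_negative_positions / hbeag_positive_positions
cbs_fraction = 18 / 39              # UDPS positions also detected by CBS
print(round(ratio, 1), f"{cbs_fraction:.0%}")
```

The per-sample counts sum to more than the 39 unique positions, which is consistent with some positions being shared between samples.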
Affiliation(s)
- Mukhlid Yousif
- Hepatitis Virus Diversity Research Programme, Department of Internal Medicine, University of the Witwatersrand, Johannesburg, Gauteng, South Africa
- Trevor G. Bell
- Hepatitis Virus Diversity Research Programme, Department of Internal Medicine, University of the Witwatersrand, Johannesburg, Gauteng, South Africa
- Hatim Mudawi
- Department of Medicine, Faculty of Medicine, University of Khartoum, Khartoum, Khartoum State, Sudan
- Dieter Glebe
- Institute of Medical Virology, National Reference Centre of Hepatitis B and D, Justus-Liebig-University of Giessen, Giessen, Hesse, Germany
- Anna Kramvis
- Hepatitis Virus Diversity Research Programme, Department of Internal Medicine, University of the Witwatersrand, Johannesburg, Gauteng, South Africa
58
Abstract
TP53 mutations are strong predictors of poor survival and refractoriness in chronic lymphocytic leukemia (CLL) and have direct implications for disease management. Clinical information on TP53 mutations is limited to lesions represented in >20% leukemic cells. Here, we tested the clinical impact and prediction of chemorefractoriness of very small TP53 mutated subclones. The TP53 gene underwent ultra-deep-next generation sequencing (NGS) in 309 newly diagnosed CLL. A robust bioinformatic algorithm was established for the highly sensitive detection of few TP53 mutated cells (down to 3 out of ∼1000 wild-type cells). Minor subclones were validated by independent approaches. Ultra-deep-NGS identified small TP53 mutated subclones in 28/309 (9%) untreated CLL that, due to their very low abundance (median allele frequency: 2.1%), were missed by Sanger sequencing. Patients harboring small TP53 mutated subclones showed the same clinical phenotype and poor survival (hazard ratio = 2.01; P = .0250) as those of patients carrying clonal TP53 lesions. By longitudinal analysis, small TP53 mutated subclones identified before treatment became the predominant population at the time of CLL relapse and anticipated the development of chemorefractoriness. This study provides a proof-of-principle that very minor leukemia subclones detected at diagnosis are an important driver of the subsequent disease course.
59
Sensitive deep-sequencing-based HIV-1 genotyping assay to simultaneously determine susceptibility to protease, reverse transcriptase, integrase, and maturation inhibitors, as well as HIV-1 coreceptor tropism. Antimicrob Agents Chemother 2014; 58:2167-85. [PMID: 24468782 DOI: 10.1128/aac.02710-13] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 12/24/2022] Open
Abstract
With 29 individual antiretroviral drugs available from six classes that are approved for the treatment of HIV-1 infection, a combination of different phenotypic and genotypic tests is currently needed to monitor HIV-infected individuals. In this study, we developed a novel HIV-1 genotypic assay based on deep sequencing (DeepGen HIV) to simultaneously assess HIV-1 susceptibilities to all drugs targeting the three viral enzymes and to predict HIV-1 coreceptor tropism. Patient-derived gag-p2/NCp7/p1/p6/pol-PR/RT/IN- and env-C2V3 PCR products were sequenced using the Ion Torrent Personal Genome Machine. Reads spanning the 3' end of the Gag, protease (PR), reverse transcriptase (RT), integrase (IN), and V3 regions were extracted, truncated, translated, and assembled for genotype and HIV-1 coreceptor tropism determination. DeepGen HIV consistently detected both minority drug-resistant viruses and non-R5 HIV-1 variants from clinical specimens with viral loads of ≥1,000 copies/ml and from B and non-B subtypes. Additional mutations associated with resistance to PR, RT, and IN inhibitors, previously undetected by standard (Sanger) population sequencing, were reliably identified at frequencies as low as 1%. DeepGen HIV results correlated with phenotypic (original Trofile, 92%; enhanced-sensitivity Trofile assay [ESTA], 80%; TROCAI, 81%; and VeriTrop, 80%) and genotypic (population sequencing/Geno2Pheno with a 10% false-positive rate [FPR], 84%) HIV-1 tropism test results. DeepGen HIV (83%) and Trofile (85%) showed similar concordances with the clinical response following an 8-day course of maraviroc monotherapy (MCT). In summary, this novel all-inclusive HIV-1 genotypic and coreceptor tropism assay, based on deep sequencing of the PR, RT, IN, and V3 regions, permits simultaneous multiplex detection of low-level drug-resistant and/or non-R5 viruses in up to 96 clinical samples. 
This comprehensive test, the first of its class, will be instrumental in the development of new antiretroviral drugs and, more importantly, will aid in the treatment and management of HIV-infected individuals.
60
Wang X, Li Y, Ni T, Xie X, Zhu J, Zheng ZM. Genome sequencing accuracy by RCA-seq versus long PCR template cloning and sequencing in identification of human papillomavirus type 58. Cell Biosci 2014; 4:5. [PMID: 24410913 PMCID: PMC3903022 DOI: 10.1186/2045-3701-4-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/02/2013] [Accepted: 12/16/2013] [Indexed: 11/20/2022] Open
Abstract
Background: Genome variations in human papillomaviruses (HPVs) are common and have been widely investigated in the past two decades. HPV genotyping depends on finding viral genome variations in the L1 ORF. Variations in other parts of the viral genome have also been implicated as possible genetic factors in viral pathogenesis and/or oncogenicity. Results: In this study, the HPV58 genome in cervical lesions was completely sequenced both by rolling-circle amplification of total cell DNA and deep sequencing (RCA-seq) and by long PCR template cloning and sequencing. By comparing three HPV58 genome sequences decoded from three clinical samples to reference HPV58, we demonstrated that RCA-seq is much more accurate than long PCR template cloning and sequencing in decoding the HPV58 genome. The three HPV58 genomes decoded by RCA-seq displayed a total of 52 nucleotide substitutions from reference HPV58, which could be verified by long PCR template cloning and sequencing. However, long PCR template cloning and sequencing led to additional nucleotide substitutions, insertions, and deletions relative to the authentic HPV58 genome in a clinical sample, and these varied from one cloned sequence to another. Because of the inherently error-prone nature of the Tgo DNA polymerase used to prepare the long PCR templates of the HPV58 genome from the clinical samples, the measurable error rate of nucleotide incorporation into an elongating DNA template was about 0.149% ± 0.038% in our studies. Conclusions: Since PCR template cloning and sequencing is widely used to identify single nucleotide polymorphisms (SNPs), our data indicate that serious caution should be taken in identifying true SNPs in various genetic studies.
Affiliation(s)
- Zhi-Ming Zheng
- Tumor Virus RNA Biology Section, Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.