1
Lu M, Sun X, Zhao Y, Zheng L, Lin J, Tang C, Chao K, Chen Y, Li K, Zhou Y, Xiao J. Low cycle number multiplex PCR: A novel strategy for the construction of amplicon libraries for next-generation sequencing. Electrophoresis 2024; 45:1398-1407. [PMID: 38533931 DOI: 10.1002/elps.202300160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 02/11/2024] [Accepted: 02/15/2024] [Indexed: 03/28/2024]
Abstract
Multiplex PCR is a critical step when preparing an amplicon library for next-generation sequencing. However, there are several challenges related to multiplex PCR, including poor uniformity, nonspecific amplification, and primer-dimers. To address these issues, we propose a novel strategy that involves using a low cycle number (<10 cycles) in multiplex PCR and then employing carrier DNAs and magnetic beads for the selection of targeted products. This technique improves amplicon uniformity while also reducing primer-dimers and PCR artifacts. To evaluate our technique, we initially utilized 120 DNA fragments from the mouse genome containing single nucleotide polymorphism (SNP) sites. Sequencing results demonstrated that with only 7 cycles of multiplex PCR, 95.8% of the targeted SNP sites were mapped, with a coverage of at least 1×. The average sequencing depth of all amplicons was 1705.79 ± 1205.30×; 87% of them reached a coverage depth that exceeded 0.2-fold of the average sequencing depth. Our method showed greater uniformity (87%) than Hi-Plex PCR (53.3%). Furthermore, we validated our strategy by randomly selecting 90 primer pairs twice from the initial set of 120 primer pairs. Next, we used the same protocol to prepare amplicon libraries. The two groups had an average sequencing depth of 1013.30 ± 585.57× and 219.10 ± 158.27×, respectively; over 84% of the amplicons had a sequencing depth that exceeded 0.2-fold of the average depth. These results suggest that the use of a low cycle number in multiplex PCR is a cost-effective and efficient approach for the preparation of amplicon libraries.
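The uniformity figure quoted in this abstract (the share of amplicons whose depth exceeds 0.2-fold of the mean depth) can be illustrated with a minimal Python sketch. The function names and the depth values below are hypothetical and made up for illustration; they are not the authors' code or data.

```python
from statistics import mean


def uniformity(depths, fold=0.2):
    """Fraction of amplicons whose depth exceeds `fold` times the mean depth."""
    if not depths:
        raise ValueError("depths must be non-empty")
    threshold = fold * mean(depths)
    return sum(d > threshold for d in depths) / len(depths)


def fraction_covered(depths, min_depth=1):
    """Fraction of amplicons sequenced to at least `min_depth` reads."""
    if not depths:
        raise ValueError("depths must be non-empty")
    return sum(d >= min_depth for d in depths) / len(depths)


# Made-up per-amplicon depths (illustrative only, not from the paper).
example_depths = [1200, 950, 30, 2100, 0, 1800, 400, 1500]

print(uniformity(example_depths))        # mean is 997.5, threshold 199.5 -> 0.75
print(fraction_covered(example_depths))  # 7 of 8 amplicons have >= 1 read -> 0.875
```

With real data, the per-amplicon depths would come from the aligned reads; the 0.2-fold threshold is the one named in the abstract, while the strict "greater than" comparison is an assumption.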
Affiliation(s)
- Meng Lu
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Xiuxiu Sun
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Yuxin Zhao
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Linlin Zheng
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Junjie Lin
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Chen Tang
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Kaiyue Chao
  - Shanghai Biowing Biotechnology Application Co., Ltd, Shanghai, P. R. China
- Ye Chen
  - Shanghai Biowing Biotechnology Application Co., Ltd, Shanghai, P. R. China
- Kai Li
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Yuxun Zhou
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
- Junhua Xiao
  - College of Biological Science and Medical Engineering, Donghua University, Shanghai, P. R. China
2
Atkins A, Chung CH, Allen AG, Dampier W, Gurrola TE, Sariyer IK, Nonnemacher MR, Wigdahl B. Off-Target Analysis in Gene Editing and Applications for Clinical Translation of CRISPR/Cas9 in HIV-1 Therapy. Front Genome Ed 2021; 3:673022. [PMID: 34713260 PMCID: PMC8525399 DOI: 10.3389/fgeed.2021.673022] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 02/26/2021] [Accepted: 06/21/2021] [Indexed: 12/26/2022] [Open Access]
Abstract
As genome-editing nucleases move toward broader clinical applications, the need to define the limits of their specificity and efficiency increases. A variety of approaches for nuclease cleavage detection have been developed, allowing a full-genome survey of the targeting landscape and the detection of a variety of repair outcomes for nuclease-induced double-strand breaks. Each approach has advantages and disadvantages relating to the means of target-site capture, target enrichment mechanism, cellular environment, false discovery, and validation of bona fide off-target cleavage sites in cells. This review examines the strengths, limitations, and origins of the different classes of off-target cleavage detection systems including anchored primer enrichment (GUIDE-seq), in situ detection (BLISS), in vitro selection libraries (CIRCLE-seq), chromatin immunoprecipitation (ChIP) (DISCOVER-Seq), translocation sequencing (LAM PCR HTGTS), and in vitro genomic DNA digestion (Digenome-seq and SITE-Seq). Emphasis is placed on the specific modifications that give rise to the enhanced performance of contemporary techniques over their predecessors and the comparative performance of techniques for different applications. The clinical relevance of these techniques is discussed in the context of assessing the safety of novel CRISPR/Cas9 HIV-1 curative strategies. With the recent success of HIV-1 and SIV-1 viral suppression in humanized mice and non-human primates, respectively, using CRISPR/Cas9, rigorous exploration of potential off-target effects is of critical importance. Such analyses would benefit from the application of the techniques discussed in this review.
Affiliation(s)
- Andrew Atkins
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
- Cheng-Han Chung
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
- Alexander G. Allen
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
- Will Dampier
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
- Theodore E. Gurrola
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
- Ilker K. Sariyer
  - Department of Neuroscience and Center for Neurovirology, Temple University Lewis Katz School of Medicine, Philadelphia, PA, United States
- Michael R. Nonnemacher
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
- Brian Wigdahl
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States
  - Center for Molecular Virology and Translational Neuroscience, Institute for Molecular Medicine and Infectious Disease, Drexel University College of Medicine, Philadelphia, PA, United States
  - Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, United States
  - *Correspondence: Brian Wigdahl
3
Wallace AD, Sasani TA, Swanier J, Gates BL, Greenland J, Pedersen BS, Varley KE, Quinlan AR. CaBagE: A Cas9-based Background Elimination strategy for targeted, long-read DNA sequencing. PLoS One 2021; 16:e0241253. [PMID: 33830997 PMCID: PMC8031414 DOI: 10.1371/journal.pone.0241253] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/05/2020] [Accepted: 01/19/2021] [Indexed: 11/29/2022] [Open Access]
Abstract
A substantial fraction of the human genome is difficult to interrogate with short-read DNA sequencing technologies due to paralogy, complex haplotype structures, or tandem repeats. Long-read sequencing technologies, such as Oxford Nanopore's MinION, enable direct measurement of complex loci without introducing many of the biases inherent to short-read methods, though they suffer from relatively lower throughput. This limitation has motivated recent efforts to develop amplification-free strategies to target and enrich loci of interest for subsequent sequencing with long reads. Here, we present CaBagE, a method for target enrichment that is efficient and useful for sequencing large, structurally complex targets. The CaBagE method leverages the stable binding of Cas9 to its DNA target to protect desired fragments from digestion with exonuclease. Enriched DNA fragments are then sequenced with Oxford Nanopore's MinION long-read sequencing technology. Enrichment with CaBagE resulted in a median of 116X coverage (range 39-416) of target loci when tested on five genomic targets ranging from 4-20kb in length using healthy donor DNA. Four cancer gene targets were enriched in a single reaction and multiplexed on a single MinION flow cell. We further demonstrate the utility of CaBagE in two ALS patients with C9orf72 short tandem repeat expansions to produce genotype estimates commensurate with genotypes derived from repeat-primed PCR for each individual. With CaBagE there is a physical enrichment of on-target DNA in a given sample prior to sequencing. This feature allows adaptability across sequencing platforms and potential use as an enrichment strategy for applications beyond sequencing. CaBagE is a rapid enrichment method that can illuminate regions of the 'hidden genome' underlying human disease.
Affiliation(s)
- Amelia D. Wallace
  - Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
  - Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Thomas A. Sasani
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Jordan Swanier
  - Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Brooke L. Gates
  - Department of Oncological Sciences, Huntsman Cancer Institute, Salt Lake City, Utah, United States of America
- Jeff Greenland
  - Department of Oncological Sciences, Huntsman Cancer Institute, Salt Lake City, Utah, United States of America
- Brent S. Pedersen
  - Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
  - Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Katherine E. Varley
  - Department of Oncological Sciences, Huntsman Cancer Institute, Salt Lake City, Utah, United States of America
- Aaron R. Quinlan
  - Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
  - Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
  - Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
4
Identification of SNPs in crucial starch biosynthesis genes in rice. J Genet 2021. [DOI: 10.1007/s12041-020-01251-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
5
Khanbo S, Tangphatsornruang S, Piriyapongsa J, Wirojsirasak W, Punpee P, Klomsa-Ard P, Ukoskit K. Candidate gene association of gene expression data in sugarcane contrasting for sucrose content. Genomics 2020; 113:229-237. [PMID: 33321201 DOI: 10.1016/j.ygeno.2020.12.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/30/2020] [Revised: 12/03/2020] [Accepted: 12/10/2020] [Indexed: 11/19/2022]
Abstract
Association mapping of gene expression data, generated from transcriptome and proteome studies, provides a means of understanding the functional significance and trait association potential of candidate genes. In this study, we applied candidate gene association mapping to validate sugarcane genes, using data from the starch and sucrose metabolism pathway, transcriptome, and proteome. We performed multiplex PCR targeted amplicon sequencing of 109 candidate genes, using NGS technology. A range of statistical models, both single-locus and multi-locus, were compared for minimization of false positives in association mapping of four sugar-related traits with different heritability. The Fixed and random model Circulating Probability Unification model effectively suppressed false positives for both low- and high-heritability traits. We identified favorable alleles of the candidate genes involved in signalling and transcriptional regulation. The results will support genetic improvement of sugarcane and may help clarify the genetic architecture of sugar-related traits.
Affiliation(s)
- Supaporn Khanbo
  - Department of Biotechnology, Faculty of Science and Technology, Thammasat University, Rangsit Campus, Klong Luang, Pathumtani 12121, Thailand
- Sithichoke Tangphatsornruang
  - National Science and Technology Development Agency, 113 Thailand Science Park, Khlong Luang, Pathum Thani 12120, Thailand
- Jittima Piriyapongsa
  - National Science and Technology Development Agency, 113 Thailand Science Park, Khlong Luang, Pathum Thani 12120, Thailand
- Warodom Wirojsirasak
  - Mitr Phol Innovation and Research Centre, 399 Moo 1, Chumphae-Phukiao Rd. Khoksa-at, Phu Khiao, Chaiyaphum 36110, Thailand
- Prapat Punpee
  - Mitr Phol Innovation and Research Centre, 399 Moo 1, Chumphae-Phukiao Rd. Khoksa-at, Phu Khiao, Chaiyaphum 36110, Thailand
- Peeraya Klomsa-Ard
  - Mitr Phol Innovation and Research Centre, 399 Moo 1, Chumphae-Phukiao Rd. Khoksa-at, Phu Khiao, Chaiyaphum 36110, Thailand
- Kittipat Ukoskit
  - Department of Biotechnology, Faculty of Science and Technology, Thammasat University, Rangsit Campus, Klong Luang, Pathumtani 12121, Thailand
6
Ossandon MR, Agrawal L, Bernhard EJ, Conley BA, Dey SM, Divi RL, Guan P, Lively TG, McKee TC, Sorg BS, Tricoli JV. Circulating Tumor DNA Assays in Clinical Cancer Research. J Natl Cancer Inst 2018; 110:929-934. [PMID: 29931312 PMCID: PMC6136923 DOI: 10.1093/jnci/djy105] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 01/10/2018] [Revised: 04/13/2018] [Accepted: 05/11/2018] [Indexed: 01/01/2023] [Open Access]
Abstract
The importance of circulating free DNA (cfDNA) in cancer clinical research was recognized in 1994 when a mutated RAS gene fragment was detected in a patient's blood sample. Up to 1% of the total circulating DNA in patients with cancer is circulating tumor DNA (ctDNA) that originates from tumor cells. As ctDNA is rapidly cleared from the blood stream and can be obtained by minimally invasive methods, it can be used as a dynamic cancer biomarker for cancer early detection, diagnosis, and treatment monitoring. Despite the potential for clinical use, few ctDNA assays have been cleared or approved by the US Food and Drug Administration. As tools for clinical and translational research, current ctDNA assays face some challenges, and more research is needed to advance use of these assays. On September 29-30, 2016, the Division of Cancer Treatment and Diagnosis at the National Cancer Institute convened a workshop entitled "Circulating Tumor DNA Assays in Clinical Cancer Research" to garner input from industry experts, academia, and government research and regulatory agencies to understand and promote the translation of ctDNA assays to clinical research, with potential to advance to use in clinical practice. This Commentary presents the topics of the workshop covered in the presentations and points made in the discussions that followed: 1) background on ctDNA, 2) potential clinical utility of ctDNA assays, 3) assay technology, 4) assay clinical and analytical validation, and 5) industry perspectives. Additional relevant information that has come to light since the workshop has been included.
Affiliation(s)
- Miguel R Ossandon
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Lokesh Agrawal
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Eric J Bernhard
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Barbara A Conley
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Sumana M Dey
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Rao L Divi
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Ping Guan
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Tracy G Lively
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Tawnya C McKee
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Brian S Sorg
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
- James V Tricoli
  - National Cancer Institute, National Institutes of Health, Bethesda, MD
7
8
Emerman AB, Bowman SK, Barry A, Henig N, Patel KM, Gardner AF, Hendrickson CL. NEBNext Direct: A Novel, Rapid, Hybridization-Based Approach for the Capture and Library Conversion of Genomic Regions of Interest. Curr Protoc Mol Biol 2017; 119:7.30.1-7.30.24. [PMID: 28678441 DOI: 10.1002/cpmb.39] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/30/2023]
Abstract
Next-generation sequencing (NGS) is a powerful tool for genomic studies, translational research, and clinical diagnostics that enables the detection of single nucleotide polymorphisms, insertions and deletions, copy number variations, and other genetic variations. Target enrichment technologies improve the efficiency of NGS by only sequencing regions of interest, which reduces sequencing costs while increasing coverage of the selected targets. Here we present NEBNext Direct®, a hybridization-based, target-enrichment approach that addresses many of the shortcomings of traditional target-enrichment methods. This approach features a simple, 7-hr workflow that uses enzymatic removal of off-target sequences to achieve a high specificity for regions of interest. Additionally, unique molecular identifiers are incorporated for the identification and filtering of PCR duplicates. The same protocol can be used across a wide range of input amounts, input types, and panel sizes, enabling NEBNext Direct to be broadly applicable across a wide variety of research and diagnostic needs. © 2017 by John Wiley & Sons, Inc.
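The use of unique molecular identifiers (UMIs) to filter PCR duplicates, mentioned in this abstract, can be sketched in Python as follows. This is a simplified illustration and not the NEBNext Direct pipeline: it assumes reads are plain dictionaries with hypothetical keys (`chrom`, `pos`, `strand`, `umi`, `mapq`) and treats two reads as duplicates only when the UMI matches exactly, whereas production tools typically also tolerate sequencing errors within the UMI.

```python
def dedup_by_umi(reads):
    """Keep one read per (chrom, pos, strand, umi), preferring the highest mapq.

    Exact UMI matching only; this is an assumption made for brevity.
    """
    best = {}
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        kept = best.get(key)
        if kept is None or read["mapq"] > kept["mapq"]:
            best[key] = read
    return list(best.values())


# Made-up reads: r1 and r2 share position and UMI, so they count as PCR duplicates.
reads = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "ACGT", "mapq": 30},
    {"name": "r2", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "ACGT", "mapq": 60},
    {"name": "r3", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "TTGA", "mapq": 60},
    {"name": "r4", "chrom": "chr1", "pos": 250, "strand": "-", "umi": "ACGT", "mapq": 20},
]

unique_reads = dedup_by_umi(reads)
print([r["name"] for r in unique_reads])  # ['r2', 'r3', 'r4']
```

Reads r1 and r3 start at the same position but carry different UMIs, so both source molecules are retained; without UMIs they would be indistinguishable from PCR duplicates.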
Affiliation(s)
- Noa Henig
  - Directed Genomics, Inc, Ipswich, Massachusetts
9
Abstract
Conventional microbiological methods have been readily taken over by newer molecular techniques due to the ease of use, reproducibility, sensitivity and speed of working with nucleic acids. These tools allow high throughput analysis of complex and diverse microbial communities, such as those in soil, freshwater, saltwater, or the microbiota living in collaboration with a host organism (plant, mouse, human, etc). For instance, these methods have been robustly used for characterizing the plant (rhizosphere), animal and human microbiome specifically the complex intestinal microbiota. The human body has been referred to as the Superorganism since microbial genes are more numerous than the number of human genes and are essential to the health of the host. In this review we provide an overview of the Next Generation tools currently available to study microbial ecology, along with their limitations and advantages.
Affiliation(s)
- Lisa A Boughner
  - Center for Microbial Ecology, Michigan State University, E. Lansing MI 48823
- Pallavi Singh
  - Department of Microbiology and Molecular Genetics, Michigan State University, E. Lansing MI 48823
10
Chen K, Zhou YX, Li K, Qi LX, Zhang QF, Wang MC, Xiao JH. A novel three-round multiplex PCR for SNP genotyping with next generation sequencing. Anal Bioanal Chem 2016; 408:4371-7. [PMID: 27113460 DOI: 10.1007/s00216-016-9536-6] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 11/30/2015] [Revised: 03/27/2016] [Accepted: 03/31/2016] [Indexed: 11/28/2022]
Abstract
Owing to its high throughput and low cost, next generation sequencing has attracted much attention from researchers for SNP genotyping. Here, we introduce a new method based on three-round multiplex PCR to precisely genotype SNPs with next generation sequencing. The method consumes, as far as possible, an equivalent amount of each pair of specific primers, which largely eliminates the amplification discrepancy between different loci. After PCR amplification, the products can be sequenced directly on a next generation sequencing platform. We simultaneously amplified 37 SNP loci in 757 samples and sequenced all amplicons on the Ion Torrent PGM platform; 90.5% of the target SNP loci were accurately genotyped (at least 15×) and 90.4% of amplicons had uniform coverage with a variation of less than 50-fold. Ligase detection reaction (LDR) was performed to genotype 19 of the 37 SNP loci in 91 samples randomly selected from the 757 samples, and 99.5% of the genotyping data were consistent with the next generation sequencing results. Our results demonstrate that three-round PCR coupled with next generation sequencing is an efficient and economical genotyping approach. [Graphical abstract: schematic diagram of three-round PCR.]
Affiliation(s)
- Ke Chen
  - College of Environmental Science and Engineering, Donghua University, Shanghai, 05003365, China
- Yu-Xun Zhou
  - Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai, 05003365, China
- Kai Li
  - Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai, 05003365, China
- Li-Xin Qi
  - Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai, 05003365, China
- Qi-Fei Zhang
  - Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai, 05003365, China
- Mao-Chun Wang
  - Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai, 05003365, China
- Jun-Hua Xiao
  - Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai, 05003365, China
11
Kozarewa I, Armisen J, Gardner AF, Slatko BE, Hendrickson C. Overview of Target Enrichment Strategies. Curr Protoc Mol Biol 2015; 112:7.21.1-7.21.23. [DOI: 10.1002/0471142727.mb0721s112] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Indexed: 11/11/2022]
Affiliation(s)
- Iwanka Kozarewa
  - Oncology Translational Science, Innovative Medicines & Early Development, AstraZeneca, Cambridge, United Kingdom
12
Murray DC, Coghlan ML, Bunce M. From benchtop to desktop: important considerations when designing amplicon sequencing workflows. PLoS One 2015; 10:e0124671. [PMID: 25902146 PMCID: PMC4406758 DOI: 10.1371/journal.pone.0124671] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Received: 12/01/2014] [Accepted: 03/16/2015] [Indexed: 02/08/2023] [Open Access]
Abstract
Amplicon sequencing has been the method of choice in many high-throughput DNA sequencing (HTS) applications. To date there has been a heavy focus on the means by which to analyse the burgeoning amount of data afforded by HTS. In contrast, there has been a distinct lack of attention paid to considerations surrounding the importance of sample preparation and the fidelity of library generation. No amount of high-end bioinformatics can compensate for poorly prepared samples and it is therefore imperative that careful attention is given to sample preparation and library generation within workflows, especially those involving multiple PCR steps. This paper redresses this imbalance by focusing on aspects pertaining to the benchtop within typical amplicon workflows: sample screening, the target region, and library generation. Empirical data is provided to illustrate the scope of the problem. Lastly, the impact of various data analysis parameters is also investigated in the context of how the data was initially generated. It is hoped this paper may serve to highlight the importance of pre-analysis workflows in achieving meaningful, future-proof data that can be analysed appropriately. As amplicon sequencing gains traction in a variety of diagnostic applications from forensics to environmental DNA (eDNA) it is paramount workflows and analytics are both fit for purpose.
Affiliation(s)
- Dáithí C. Murray
  - Trace and Environmental DNA Laboratory, Department of Environment and Agriculture, Curtin University, Perth, Western Australia, Australia
- Megan L. Coghlan
  - Trace and Environmental DNA Laboratory, Department of Environment and Agriculture, Curtin University, Perth, Western Australia, Australia
- Michael Bunce
  - Trace and Environmental DNA Laboratory, Department of Environment and Agriculture, Curtin University, Perth, Western Australia, Australia
13
Edge effects in calling variants from targeted amplicon sequencing. BMC Genomics 2014; 15:1073. [PMID: 25480444 PMCID: PMC4302139 DOI: 10.1186/1471-2164-15-1073] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/24/2014] [Accepted: 11/26/2014] [Indexed: 12/30/2022] [Open Access]
Abstract
BACKGROUND: Analysis of targeted amplicon sequencing data presents some unique challenges in comparison to the analysis of random fragment sequencing data. Whereas reads from randomly fragmented DNA have arbitrary start positions, the reads from amplicon sequencing have fixed start positions that coincide with the amplicon boundaries. As a result, any variants near the amplicon boundaries can cause misalignments of multiple reads that can ultimately lead to false-positive or false-negative variant calls. RESULTS: We show that amplicon boundaries are variant calling blind spots where the variant calls are highly inaccurate. We propose that an effective strategy to avoid these blind spots is to incorporate the primer bases in obtaining read alignments and post-processing of the alignments, thereby effectively moving these blind spots into the primer binding regions (which are not used for variant calling). Targeted sequencing data analysis pipelines can provide better variant calling accuracy when primer bases are retained and sequenced. CONCLUSIONS: Read bases beyond the variant site are necessary for analysis of amplicon sequencing data. Enzymatic primer digestion, if used in the target enrichment process, should leave at least a few primer bases to ensure that these bases are available during data analysis. The primer bases should only be removed immediately before the variant calling step to ensure that the variants can be called irrespective of where they occur within the amplicon insert region.
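The recommendation to align with primer bases retained and remove them only immediately before variant calling can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the authors' pipeline: alignments are treated as ungapped, coordinates are 0-based and half-open, and the function name `trim_primers` is hypothetical. Real pipelines operate on CIGAR strings and usually soft-clip rather than cut the sequence.

```python
def trim_primers(read_start, seq, insert_start, insert_end):
    """Return (new_start, trimmed_seq) restricted to the amplicon insert.

    Assumes an ungapped alignment: base i of `seq` sits at reference
    position read_start + i. Coordinates are 0-based, half-open.
    """
    read_end = read_start + len(seq)
    lo = max(read_start, insert_start)
    hi = min(read_end, insert_end)
    if lo >= hi:
        # The read lies entirely within primer sequence or outside the insert.
        return lo, ""
    return lo, seq[lo - read_start:hi - read_start]


# Made-up example: a 10-base read aligned at position 100, with 3-base and
# 2-base primers flanking an insert that spans positions 103..108.
new_start, insert_seq = trim_primers(100, "ACGTACGTAC", 103, 108)
print(new_start, insert_seq)  # 103 TACGT
```

The point of doing this late is that the aligner still sees the primer bases as anchoring context, so a variant at the first or last base of the insert is flanked by matching sequence during alignment.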
14
Zheng Z, Liebers M, Zhelyazkova B, Cao Y, Panditi D, Lynch KD, Chen J, Robinson HE, Shim HS, Chmielecki J, Pao W, Engelman JA, Iafrate AJ, Le LP. Anchored multiplex PCR for targeted next-generation sequencing. Nat Med 2014; 20:1479-84. [PMID: 25384085 DOI: 10.1038/nm.3729] [Citation(s) in RCA: 668] [Impact Index Per Article: 66.8] [Received: 07/31/2013] [Accepted: 07/29/2014] [Indexed: 12/13/2022]
Abstract
We describe a rapid target enrichment method for next-generation sequencing, termed anchored multiplex PCR (AMP), that is compatible with low nucleic acid input from formalin-fixed paraffin-embedded (FFPE) specimens. AMP is effective in detecting gene rearrangements (without prior knowledge of the fusion partners), single nucleotide variants, insertions, deletions and copy number changes. Validation of a gene rearrangement panel using 319 FFPE samples showed 100% sensitivity (95% confidence limit: 96.5-100%) and 100% specificity (95% confidence limit: 99.3-100%) compared with reference assays. On the basis of our experience with performing AMP on 986 clinical FFPE samples, we show its potential as both a robust clinical assay and a powerful discovery tool, which we used to identify new therapeutically important gene fusions: ARHGEF2-NTRK1 and CHTOP-NTRK1 in glioblastoma, MSN-ROS1, TRIM4-BRAF, VAMP2-NRG1, TPM3-NTRK1 and RUFY2-RET in lung cancer, FGFR2-CREB5 in cholangiocarcinoma and PPL-NTRK1 in thyroid carcinoma. AMP is a scalable and efficient next-generation sequencing target enrichment method for research and clinical applications.
Affiliation(s)
- Zongli Zheng
  - [1] Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
  - [2] Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Matthew Liebers
  - Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Boryana Zhelyazkova
  - Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Yi Cao
  - Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Divya Panditi
  - Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Kerry D Lynch
  - Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Juxiang Chen
  - [1] Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
  - [2] Department of Neurosurgery, Shanghai Changzheng Hospital, Shanghai, China
- Hayley E Robinson
  - Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Hyo Sup Shim
  - [1] Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
  - [2] Department of Pathology, Yonsei University College of Medicine, Seoul, Korea
- William Pao
  - Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jeffrey A Engelman
  - Cancer Center, Massachusetts General Hospital, Boston, Massachusetts, USA
- A John Iafrate
  - [1] Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
  - [2] Cancer Center, Massachusetts General Hospital, Boston, Massachusetts, USA
- Long Phi Le
  - [1] Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
  - [2] Cancer Center, Massachusetts General Hospital, Boston, Massachusetts, USA
15
|
Day K, Song J, Absher D. Targeted sequencing of large genomic regions with CATCH-Seq. PLoS One 2014; 9:e111756. [PMID: 25357200 PMCID: PMC4214737 DOI: 10.1371/journal.pone.0111756] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 03/20/2014] [Accepted: 09/30/2014] [Indexed: 01/06/2023]
Abstract
Current target enrichment systems for large-scale next-generation sequencing typically require synthetic oligonucleotides used as capture reagents to isolate sequences of interest. The majority of target enrichment reagents are focused on gene coding regions or promoters en masse. Here we introduce development of a customizable targeted capture system using biotinylated RNA probe baits transcribed from sheared bacterial artificial chromosome clone templates that enables capture of large, contiguous blocks of the genome for sequencing applications. This clone adapted template capture hybridization sequencing (CATCH-Seq) procedure can be used to capture both coding and non-coding regions of a gene, and resolve the boundaries of copy number variations within a genomic target site. Furthermore, libraries constructed with methylated adapters prior to solution hybridization also enable targeted bisulfite sequencing. We applied CATCH-Seq to diverse targets ranging in size from 125 kb to 3.5 Mb. Our approach provides a simple and cost effective alternative to other capture platforms because of template-based, enzymatic probe synthesis and the lack of oligonucleotide design costs. Given its similarity in procedure, CATCH-Seq can also be performed in parallel with commercial systems.
Affiliation(s)
- Kenneth Day
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, United States of America
- Jun Song
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, United States of America
- Devin Absher
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, United States of America
|
16
|
Haas J, Barb I, Katus HA, Meder B. Targeted next-generation sequencing: the clinician's stethoscope for genetic disorders. Per Med 2014; 11:581-592. [PMID: 29758803 DOI: 10.2217/pme.14.40] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/30/2022]
Abstract
Genetic biomarkers are crucial for diagnosis, guiding of treatments and estimation of prognosis. In the past, clinical genetic diagnostics was limited by the sequencing information gained from selected exons and single genes. For genetically heterogeneous diseases, such as cardiomyopathies, where underlying mutations in more than 1000 exons are known, a Sanger-based comprehensive test would have been extremely expensive and labor intensive. Next-generation sequencing has overcome these problems in terms of costs, speed and throughput. In this review we discuss available methods for targeted next-generation sequencing that ease the introduction of this technology into routine clinical application. We further provide results of a study we have performed to compare two state-of-the-art methods for their enrichment efficiency and detection accuracy of variants in a clinical setting.
Affiliation(s)
- Jan Haas
- Department of Internal Medicine III, University of Heidelberg, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Germany
- Ioana Barb
- Department of Internal Medicine III, University of Heidelberg, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Germany
- Hugo A Katus
- Department of Internal Medicine III, University of Heidelberg, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Germany
- Benjamin Meder
- Department of Internal Medicine III, University of Heidelberg, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Germany
|
17
|
Shen P, Wang W, Chi AK, Fan Y, Davis RW, Scharfe C. Multiplex target capture with double-stranded DNA probes. Genome Med 2013; 5:50. [PMID: 23718862 PMCID: PMC3706973 DOI: 10.1186/gm454] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 02/16/2013] [Revised: 04/25/2013] [Accepted: 05/29/2013] [Indexed: 11/10/2022]
Abstract
Target enrichment technologies utilize single-stranded oligonucleotide probes to capture candidate genomic regions from a DNA sample before sequencing. We describe target capture using double-stranded probes, which consist of single-stranded, complementary long padlock probes (cLPPs), each selectively capturing one strand of a genomic target through circularization. Using two probes per target increases sensitivity for variant detection and cLPPs are easily produced by PCR at low cost. Additionally, we introduce an approach for generating capture libraries with uniformly randomized template orientations. This facilitates bidirectional sequencing of both the sense and antisense template strands during one paired-end read, which maximizes target coverage.
Affiliation(s)
- Peidong Shen
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304, USA
- Wenyi Wang
- Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, TX 77030, USA
- Aung-Kyaw Chi
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304, USA
- Yu Fan
- Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, TX 77030, USA
- Ronald W Davis
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304, USA
- Curt Scharfe
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304, USA
|
18
|
Abstract
Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole genome sequencing. Examples of each of these approaches and their potential to generate molecular markers for specific applications have been described. Sequencing the whole genome of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can be used to define the available variation in these genes that might be exploited in a population or germplasm collection. Sequencing of the transcriptomes of genotypes varying for the trait of interest may identify genes with patterns of expression that could explain the phenotypic variation. Sequencing genomic DNA enriched for genes by hybridization with probes for all or some of the known genes simplifies sequencing and analysis of differences in gene sequences between large numbers of genotypes and genes especially when working with complex genomes. Examples of application of the above-mentioned techniques have been described.
|
19
|
Dias MDS, Hernan I, Pascual B, Borràs E, Mañé B, Gamundi MJ, Carballo M. Detection of novel mutations that cause autosomal dominant retinitis pigmentosa in candidate genes by long-range PCR amplification and next-generation sequencing. Mol Vis 2013; 19:654-64. [PMID: 23559859 PMCID: PMC3611935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2012] [Accepted: 03/19/2013] [Indexed: 10/27/2022]
Abstract
PURPOSE To devise an effective method for detecting mutations in 12 genes (CA4, CRX, IMPDH1, NR2E3, RP9, PRPF3, PRPF8, PRPF31, PRPH2, RHO, RP1, and TOPORS) commonly associated with autosomal dominant retinitis pigmentosa (adRP) that account for more than 95% of known mutations. METHODS We used long-range PCR (LR-PCR) amplification and next-generation sequencing (NGS) performed in a GS Junior 454 benchtop sequencing platform. Twenty LR-PCR fragments, between 3,000 and 10,000 bp, containing all coding exons and flanking regions of the 12 genes, were obtained from DNA samples of patients with adRP. Sequencing libraries were prepared with an enzymatic (Fragmentase technology) method. RESULTS Complete coverage of the coding and flanking sequences of the 12 genes assayed was obtained with NGS, with an average sequence depth of 380× (ranging from 128× to 1,077×). Five previously known mutations in the adRP genes were detected with a sequence variation percentage between 35% and 65%. We also performed a parallel sequence analysis of four samples, three of them new patients with index adRP, in which two novel mutations were detected in RHO (p.Asn73del) and PRPF31 (p.Ile109del). CONCLUSIONS The results demonstrate that genomic LR-PCR amplification together with NGS is an effective method for analyzing individual patient samples for mutations in a monogenic heterogeneous disease such as adRP. This approach proved effective for the parallel analysis of adRP and has been introduced as routine. Additionally, this approach could be extended to other heterogeneous genetic diseases.
|
20
|
Multiplex PCR in Molecular Differential Diagnosis of Microbial Infections: Methods, Utility, and Platforms. Advanced Techniques in Diagnostic Microbiology 2013. [PMCID: PMC7121114 DOI: 10.1007/978-1-4614-3970-7_34] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
We are entering the age of personalized medicine where treatments are designed to target specific causes, rather than a group of patients with similar symptoms. However, personalized medicine is impossible without a personalized diagnosis that considers all the possible causes of a person’s disease. Traditional molecular diagnostic methods, such as PCR and qPCR, cannot provide the necessary information to practice personalized medicine, because they cannot be multiplexed, allowing the detection of only one or a few (no more than 3) targets at a time in one sample. Practicing personalized medicine, therefore, requires multiplex PCR (mPCR), which can evaluate many molecular targets at once, in one reaction, from one sample.
|
21
|
Elsharawy A, Forster M, Schracke N, Keller A, Thomsen I, Petersen BS, Stade B, Stähler P, Schreiber S, Rosenstiel P, Franke A. Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing. BMC Genomics 2012; 13:417. [PMID: 22913592 PMCID: PMC3563481 DOI: 10.1186/1471-2164-13-417] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/10/2011] [Accepted: 08/10/2012] [Indexed: 11/10/2022]
Abstract
Background Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions). Results We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent ‘read-backmapping’ to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per HiSeq2000 exome sample and detected ~5% more SNPs than the conventional whole-genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach.
Conclusions We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results.
Affiliation(s)
- Abdou Elsharawy
- Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
|
22
|
Hernan I, Borràs E, de Sousa Dias M, Gamundi MJ, Mañé B, Llort G, Agúndez JAG, Blanca M, Carballo M. Detection of genomic variations in BRCA1 and BRCA2 genes by long-range PCR and next-generation sequencing. J Mol Diagn 2012; 14:286-93. [PMID: 22426013 DOI: 10.1016/j.jmoldx.2012.01.013] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 06/03/2011] [Revised: 01/10/2012] [Accepted: 01/24/2012] [Indexed: 02/09/2023]
Abstract
Advances in sequencing technologies, such as next-generation sequencing (NGS), represent an opportunity to perform genetic testing in a clinical scenario. In this study, we developed and tested a method for the detection of mutations in the large BRCA1 and BRCA2 tumor suppressor genes, using long-range PCR (LR-PCR) and NGS, in samples from individuals with a personal and/or family history of breast and/or ovarian cancer. Eleven LR-PCR fragments, between 3000 and 15,300 bp, containing all coding exons and flanking splice junctions of BRCA1 and BRCA2, were obtained from DNA samples of five individuals carrying mutations in either BRCA1 or BRCA2. Libraries for NGS were prepared using an enzymatic (Nextera technology) method. We analyzed five individual samples in parallel by NGS and obtained complete coverage of all LR-PCR fragments, with an average coding sequence depth for each nucleotide of >30 reads, running from ×7 (in exon 22 of BRCA1) to >×150. We detected and confirmed 100% of the mutations that predispose to the risk of cancer, together with other genomic variations in BRCA1 and BRCA2. Our approach demonstrates that genomic LR-PCR, together with NGS, using the GS Junior 454 System platform, is an effective method for patient sample analysis of BRCA1 and BRCA2 genes. In addition, this method could be performed in regular molecular genetics laboratories.
Affiliation(s)
- Imma Hernan
- Molecular Genetics Unit, Hospital of Terrassa, Terrassa, Spain
|
23
|
Cronn R, Knaus BJ, Liston A, Maughan PJ, Parks M, Syring JV, Udall J. Targeted enrichment strategies for next-generation plant biology. American Journal of Botany 2012; 99:291-311. [PMID: 22312117 DOI: 10.3732/ajb.1100356] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.0] [Indexed: 05/18/2023]
Abstract
PREMISE OF THE STUDY The dramatic advances offered by modern DNA sequencers continue to redefine the limits of what can be accomplished in comparative plant biology. Even with recent achievements, however, plant genomes present obstacles that can make it difficult to execute large-scale population and phylogenetic studies on next-generation sequencing platforms. Factors like large genome size, extensive variation in the proportion of organellar DNA in total DNA, polyploidy, and gene number/redundancy contribute to these challenges, and they demand flexible targeted enrichment strategies to achieve the desired goals. METHODS In this article, we summarize the many available targeted enrichment strategies that can be used to target partial-to-complete organellar genomes, as well as known and anonymous nuclear targets. These methods fall under four categories: PCR-based enrichment, hybridization-based enrichment, restriction enzyme-based enrichment, and enrichment of expressed gene sequences. KEY RESULTS Examples of plant-specific applications exist for nearly all methods described. While some methods are well established (e.g., transcriptome sequencing), other promising methods are in their infancy (hybridization enrichment). A direct comparison of methods shows that PCR-based enrichment may be a reasonable strategy for accessing small genomic targets (e.g., ≤50 kbp), but that hybridization and transcriptome sequencing scale more efficiently if larger targets are desired. CONCLUSIONS While the benefits of targeted sequencing are greatest in plants with large genomes, nearly all comparative projects can benefit from the improved throughput offered by targeted multiplex DNA sequencing, particularly as the amount of data produced from a single instrument approaches a trillion bases per run.
Affiliation(s)
- Richard Cronn
- Pacific Northwest Research Station, USDA Forest Service, Corvallis, Oregon 97331, USA.
|
24
|
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis. BMC Genomics 2012; 13:43. [PMID: 22276739 PMCID: PMC3284879 DOI: 10.1186/1471-2164-13-43] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 04/26/2011] [Accepted: 01/25/2012] [Indexed: 01/29/2023]
Abstract
BACKGROUND Multiplexing has become a major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. RESULTS Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously identified by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. CONCLUSIONS By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel, so that high-throughput sequencing can economically meet the needs of samples with low sequencing-throughput demand.
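The combinatorial idea behind pair-barcoding can be illustrated with a short Python sketch: 4 forward and 8 reverse barcodes give 4 × 8 = 32 distinguishable libraries while only 12 barcode oligos are needed. The barcode sequences, function names, and lookup scheme below are hypothetical placeholders chosen for illustration; they are not taken from the paper, and a real demultiplexer would also need to tolerate sequencing errors.

```python
from itertools import product

# Hypothetical barcode sequences, for illustration only (not from the paper).
FORWARD = ["ACGT", "CGTA", "GTAC", "TACG"]                      # 4 forward barcodes
REVERSE = ["AACC", "CCGG", "GGTT", "TTAA",
           "ACAC", "CACA", "GTGT", "TGTG"]                      # 8 reverse barcodes


def build_pair_index(forward, reverse):
    """Map every (forward, reverse) barcode pair to a library number."""
    return {pair: i for i, pair in enumerate(product(forward, reverse))}


def assign_library(fwd_tag, rev_tag, index):
    """Return the library number for a read, or None if either barcode is unknown."""
    return index.get((fwd_tag, rev_tag))


index = build_pair_index(FORWARD, REVERSE)
print(len(index))                              # number of distinct libraries
print(assign_library("ACGT", "AACC", index))   # first pair
print(assign_library("NNNN", "AACC", index))   # unrecognized forward barcode
```

Reads for which only one of the two barcodes is recognized return `None` in this sketch, which corresponds to the abstract's observation that only a fraction of reads (about 64%) could be assigned to both barcodes.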
|
25
|
Good JM. Reduced representation methods for subgenomic enrichment and next-generation sequencing. Methods Mol Biol 2012; 772:85-103. [PMID: 22065433 DOI: 10.1007/978-1-61779-228-1_5] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 12/22/2022]
Abstract
Several methods have been developed to enrich DNA for subsets of the genome prior to next-generation sequencing. These front-end enrichment strategies provide powerful and cost-effective tools for researchers interested in collecting large-scale genomic sequence data. In this review, I provide an overview of both general and targeted reduced representation enrichment strategies that are commonly used in tandem with next-generation sequencing. I focus on several key issues that are likely to be important when deciding which enrichment strategy is most appropriate for a given experiment. Overall, these techniques can enable the collection of large-scale genomic data in diverse species, providing a powerful tool for the study of evolutionary biology.
Affiliation(s)
- Jeffrey M Good
- Division of Biological Sciences, University of Montana, Missoula, MT, USA.
|
26
|
Myllykangas S, Natsoulis G, Bell JM, Ji HP. Targeted sequencing library preparation by genomic DNA circularization. BMC Biotechnol 2011; 11:122. [PMID: 22168766 PMCID: PMC3280942 DOI: 10.1186/1472-6750-11-122] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 08/02/2011] [Accepted: 12/14/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND For next generation DNA sequencing, we have developed a rapid and simple approach for preparing DNA libraries of targeted DNA content. Current protocols for preparing DNA for next-generation targeted sequencing are labor-intensive, require large amounts of starting material, and are prone to artifacts that result from necessary PCR amplification of sequencing libraries. Typically, sample preparation for targeted NGS is a two-step process where (1) the desired regions are selectively captured and (2) the ends of the DNA molecules are modified to render them compatible with any given NGS sequencing platform. RESULTS In this proof-of-concept study, we present an integrated approach that combines these two separate steps into one. Our method involves circularization of a specific genomic DNA molecule that directly incorporates the necessary components for conducting sequencing in a single assay and requires only one PCR amplification step. We also show that specific regions of the genome can be targeted and sequenced without any PCR amplification. CONCLUSION We anticipate that these rapid targeted libraries will be useful for validation of variants and may have diagnostic application.
Affiliation(s)
- Samuel Myllykangas
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
|
27
|
Kharabian-Masouleh A, Waters DLE, Reinke RF, Henry RJ. Discovery of polymorphisms in starch-related genes in rice germplasm by amplification of pooled DNA and deeply parallel sequencing. Plant Biotechnology Journal 2011; 9:1074-85. [PMID: 21645201 DOI: 10.1111/j.1467-7652.2011.00629.x] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 05/11/2023]
Abstract
High-throughput sequencing of pooled DNA was applied to polymorphism discovery in candidate genes involved in starch synthesis. This approach employed semi- to long-range PCR (LR-PCR) followed by next-generation sequencing technology. A total of 17 rice starch synthesis genes encoding seven classes of enzymes, including ADP-glucose pyrophosphorylase (AGPase), granule-bound starch synthase (GBSS), soluble starch synthase (SS), starch branching enzyme (BE), starch debranching enzyme (DBE), starch phosphorylase (SPHOL) and phosphate translocator (GPT1), from 233 genotypes were PCR amplified using semi- to long-range PCR. The amplification products were equimolarly pooled and sequenced using massively parallel sequencing technology (MPS). By detecting single nucleotide polymorphism (SNP)/Indels in both coding and noncoding areas of the genes, we identified genetic differences and characterized the SNP/Indel variation and distribution patterns among individual starch candidate genes. Approximately 60.9 million reads were generated, of which 54.8 million (90%) mapped to the reference sequences. The average coverage rate ranged from 12,708 to 38,300 times for SSIIa and SSIIIb, respectively. SNPs and single/multiple-base Indels were analysed in a total assembled length of 116,403 bp. In total, 501 SNPs and 113 Indels were detected across the 17 starch-related loci. The ratio of nonsynonymous to synonymous SNPs (Ka/Ks) test indicated GBSSI and isoamylase 1 (ISA1) as the least diversified (most purified) and conservative genes as the studied populations have been through cycles of selection. This report demonstrates a useful strategy for screening germplasm by MPS to discover variants in a specific target group of genes.
Affiliation(s)
- Ardashir Kharabian-Masouleh
- Southern Cross Plant Science, Centre for Plant Conservation Genetics, Southern Cross University, Lismore, NSW 2480, Australia.
|
28
|
Moorthie S, Mattocks CJ, Wright CF. Review of massively parallel DNA sequencing technologies. The HUGO Journal 2011. [PMID: 23205160 DOI: 10.1007/s11568-011-9156-3] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Indexed: 12/21/2022]
Abstract
Since the development of technologies that can determine the base-pair sequence of DNA, the ability to sequence genes has contributed much to science and medicine. However, it has remained a relatively costly and laborious process, hindering its use as a routine biomedical tool. Recent times are seeing rapid developments in this field, both in the availability of novel sequencing platforms, as well as supporting technologies involved in processes such as targeting and data analysis. This is leading to significant reductions in the cost of sequencing a human genome and the potential for its use as a routine biomedical tool. This review is a snapshot of this rapidly moving field examining the current state of the art, forthcoming developments and some of the issues still to be resolved prior to the use of new sequencing technologies in routine clinical diagnosis.
|
29
|
Hirani R, Connolly AR, Putral L, Dobrovic A, Trau M. Sensitive quantification of somatic mutations using molecular inversion probes. Anal Chem 2011; 83:8215-21. [PMID: 21942816 DOI: 10.1021/ac2019409] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Somatic mutations in DNA can serve as cancer specific biomarkers and are increasingly being used to direct treatment. However, they can be difficult to detect in tissue biopsies because there is often only a minimal amount of sample and the mutations are often masked by the presence of wild type alleles from nontumor material in the sample. To facilitate the sensitive and specific analysis of DNA mutations in tissues, a multiplex assay capable of detecting nucleotide changes in less than 150 cells was developed. The assay extends the application of molecular inversion probes to enable sensitive discrimination and quantification of nucleotide mutations that are present in less than 0.1% of a cell population. The assay was characterized by detecting selected mutations in the KRAS gene, which has been implicated in up to 25% of all cancers. These mutations were detected in a single multiplex assay by incorporating the rapid flow cytometric readout of multiplexable DNA biosensors.
Affiliation(s)
- Rena Hirani
- Centre for Biomarker Research and Development, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD, Australia
|
30
|
Abstract
With the rapid development of next-generation sequencing (NGS) technologies, it is becoming increasingly feasible to sequence entire genomes of various organisms from virus to human. However, on many occasions, it is still more practical to sequence and analyze only small regions of the entire genome that are informative for the purpose of the experiment. Although many target-enrichment or target capture methods exist, each method has its own strengths and weaknesses in terms of the number of enriched targets, specificity, drop-off rate, and uniformity in capturing target DNA sequences. Many applications require a consistently low drop-off rate and high uniformity of enriched targets for routine collection of meaningful data. Here, we describe a simple and robust PCR-based protocol that can allow simultaneous amplification of numerous target regions. This method employs target-specific hairpin selectors to create DNA templates that contain target regions flanked by common universal priming sequences. We demonstrated the utility of this method by applying it for simultaneous amplification of 21 targets in the range of 191-604 bp from 41 different Salmonella strains using bar-coded universal primers. Analysis of 454 FLX pyrosequencing data demonstrated the promising performance of this method in terms of specificity and uniformity. This method, with great potential for robust amplification of hundreds of targets, should find broad applications for efficient analysis of multiple genomic targets for various experimental goals.
|
31
|
MassCode liquid arrays as a tool for multiplexed high-throughput genetic profiling. PLoS One 2011; 6:e18967. [PMID: 21544191 PMCID: PMC3081317 DOI: 10.1371/journal.pone.0018967] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/22/2010] [Accepted: 03/18/2011] [Indexed: 02/04/2023]
Abstract
Multiplexed detection assays that analyze a modest number of nucleic acid targets over large sample sets are emerging as the preferred testing approach in such applications as routine pathogen typing, outbreak monitoring, and diagnostics. However, very few DNA testing platforms have proven to offer a solution for mid-plexed analysis that is high-throughput, sensitive, and with a low cost per test. In this work, an enhanced genotyping method based on MassCode technology was devised and integrated as part of a high-throughput mid-plexing analytical system that facilitates robust qualitative differential detection of DNA targets. Samples are first analyzed using MassCode PCR (MC-PCR) performed with an array of primer sets encoded with unique mass tags. Lambda exonuclease and an array of MassCode probes are then contacted with MC-PCR products for further interrogation and target sequences are specifically identified. Primer and probe hybridizations occur in homogeneous solution, a clear advantage over micro- or nanoparticle suspension arrays. The two cognate tags coupled to resultant MassCode hybrids are detected in an automated process using a benchtop single quadrupole mass spectrometer. The prospective value of using MassCode probe arrays for multiplexed bioanalysis was demonstrated after developing a 14plex proof of concept assay designed to subtype a select panel of Salmonella enterica serogroups and serovars. This MassCode system is very flexible and test panels can be customized to include more, less, or different markers.
|
32
|
Myllykangas S, Ji HP. Targeted deep resequencing of the human cancer genome using next-generation technologies. Biotechnol Genet Eng Rev 2011; 27:135-58. [PMID: 21415896 DOI: 10.1080/02648725.2010.10648148] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 10/27/2022]
Abstract
Next-generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enables massive data throughput and for the first time, large-scale, nucleotide resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex with numerous low frequency bystander somatic variations, and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed in resolving clinical and biological relevance of the mutations as well as in detecting somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with the next-generation sequencing technologies, the overwhelming size of the human genome and need for very high fold coverage represents a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations.
Affiliation(s)
- Samuel Myllykangas
- Stanford Genome Technology Center and Division of Oncology, Department of Medicine, Stanford University School of Medicine, CCSR 3215, Stanford, California 94305, USA
|
33
|
De Leeneer K, Hellemans J, De Schrijver J, Baetens M, Poppe B, Van Criekinge W, De Paepe A, Coucke P, Claes K. Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations. Hum Mutat 2011; 32:335-44. [PMID: 21305653 DOI: 10.1002/humu.21428] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Received: 06/25/2010] [Accepted: 12/01/2010] [Indexed: 12/11/2022]
Abstract
This study describes how the new massive parallel sequencing technology can be implemented in a diagnostic setting for the breast cancer susceptibility genes (BRCA1 and BRCA2). The throughput was maximized by increasing uniformity in coverage, obtained by a multiplex approach, which outperformed pooling of singleplex PCRs. We evaluated the sensitivity by analysis of 133 distinct sequence variants; three (2%) deletions or duplications in homopolymers of greater than or equal to seven nucleotides remained undetected, illustrating a limitation of pyrosequencing. Furthermore, other limitations like nonrandom sequencing errors, pseudogene amplification, and failure to detect multiexon deletions are thoroughly described. Our workflow illustrates the potential of massive parallel sequencing of large genes in a diagnostic setting, which is of great importance to meet the increasing expectations of genetic testing. Implementation of this approach will hopefully lead to a strong reduction in turnaround times. As a consequence, a wider spectrum of at-risk women will be able to benefit from therapeutic and prophylactic interventions.
Affiliation(s)
- Kim De Leeneer
- Center for Medical Genetics, Ghent University Hospital, De Pintelaan, Belgium
|
34
|
Otto EA, Ramaswami G, Janssen S, Chaki M, Allen SJ, Zhou W, Airik R, Hurd TW, Ghosh AK, Wolf MT, Hoppe B, Neuhaus TJ, Bockenhauer D, Milford DV, Soliman NA, Saunier S, Johnson CA, Hildebrandt F. Mutation analysis of 18 nephronophthisis associated ciliopathy disease genes using a DNA pooling and next generation sequencing strategy. J Med Genet 2011; 48:105-16. [PMID: 21068128 PMCID: PMC3913043 DOI: 10.1136/jmg.2010.082552] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.2] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Nephronophthisis associated ciliopathies (NPHP-AC) comprise a group of autosomal recessive cystic kidney diseases that includes nephronophthisis (NPHP), Senior-Loken syndrome (SLS), Joubert syndrome (JBTS), and Meckel-Gruber syndrome (MKS). To date, causative mutations in NPHP-AC have been described for 18 different genes, rendering mutation analysis tedious and expensive. To overcome the broad genetic locus heterogeneity, a strategy of DNA pooling with consecutive massively parallel resequencing (MPR) was devised.
METHODS: In 120 patients with severe NPHP-AC phenotypes, five pools of genomic DNA with 24 patients each were prepared which were used as templates in order to PCR amplify all 376 exons of 18 NPHP-AC genes (NPHP1, INVS, NPHP3, NPHP4, IQCB1, CEP290, GLIS2, RPGRIP1L, NEK8, TMEM67, INPP5E, TMEM216, AHI1, ARL13B, CC2D2A, TTC21B, MKS1, and XPNPEP3). PCR products were then subjected to MPR on an Illumina Genome-Analyser and mutations were subsequently assigned to their respective mutation carrier via CEL I endonuclease based heteroduplex screening and confirmed by Sanger sequencing.
RESULTS: For proof of principle, DNA from patients with known mutations was used and detection of 22 out of 24 different alleles (92% sensitivity) was demonstrated. MPR led to the molecular diagnosis in 30/120 patients (25%) and 54 pathogenic mutations (27 novel) were identified in seven different NPHP-AC genes. Additionally, in 24 patients only single heterozygous variants of unknown significance were found.
CONCLUSIONS: The combined approach of DNA pooling followed by MPR strongly facilitates mutation analysis in broadly heterogeneous single gene disorders. The lack of mutations in 75% of patients in this cohort indicates further extensive heterogeneity in NPHP-AC.
Affiliation(s)
- Edgar A. Otto
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Gokul Ramaswami
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Sabine Janssen
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Moumita Chaki
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Susan J. Allen
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Weibin Zhou
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Rannar Airik
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Toby W. Hurd
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Amiya K. Ghosh
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Matthias T. Wolf
- Pediatric Nephrology, Children’s Medical Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Bernd Hoppe
- Department of Pediatrics, Division of Pediatric Nephrology, University Hospital Cologne, Germany
- Thomas J. Neuhaus
- Department of Pediatrics, Children’s Hospital Lucerne, Lucerne, Switzerland
- Detlef Bockenhauer
- Department of Nephrology, Great Ormond Street Hospital for Children NHS Trust, London, UK
- David V. Milford
- Department of Pediatric Nephrology, Birmingham Children’s Hospital, Birmingham, UK
- Neveen A. Soliman
- Center of Pediatric Nephrology & Transplantation, Cairo University, Cairo, Egypt
- Egyptian Group for Orphan Renal Diseases (EGORD), Cairo, Egypt
- the GPN Study Group, Corinne Antignac
- Department of Genetics, Hopital Necker-Enfants Malades, Assistance Publique–Hopitaux de Paris, Paris, France
- INSERM U-983, Hopital Necker-Enfants Malades, Universite Paris Descartes, Paris, France
- Sophie Saunier
- Department of Genetics, Hopital Necker-Enfants Malades, Assistance Publique–Hopitaux de Paris, Paris, France
- Colin A. Johnson
- Division of Molecular & Translational Medicine, Leeds Institute of Molecular Medicine, University of Leeds, Leeds, United Kingdom
- Friedhelm Hildebrandt
- Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Howard Hughes Medical Institute
|
35
|
Kenny EM, Cormican P, Gilks WP, Gates AS, O'Dushlaine CT, Pinto C, Corvin AP, Gill M, Morris DW. Multiplex target enrichment using DNA indexing for ultra-high throughput SNP detection. DNA Res 2010; 18:31-8. [PMID: 21163834 PMCID: PMC3041504 DOI: 10.1093/dnares/dsq029] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022]
Abstract
Screening large numbers of target regions in multiple DNA samples for sequence variation is an important application of next-generation sequencing but an efficient method to enrich the samples in parallel has yet to be reported. We describe an advanced method that combines DNA samples using indexes or barcodes prior to target enrichment to facilitate this type of experiment. Sequencing libraries for multiple individual DNA samples, each incorporating a unique 6-bp index, are combined in equal quantities, enriched using a single in-solution target enrichment assay and sequenced in a single reaction. Sequence reads are parsed based on the index, allowing sequence analysis of individual samples. We show that the use of indexed samples does not impact on the efficiency of the enrichment reaction. For three- and nine-indexed HapMap DNA samples, the method was found to be highly accurate for SNP identification. Even with sequence coverage as low as 8x, 99% of sequence SNP calls were concordant with known genotypes. Within a single experiment, this method can sequence the exonic regions of hundreds of genes in tens of samples for sequence and structural variation using as little as 1 μg of input DNA per sample.
Affiliation(s)
- Elaine M Kenny
- Trinity Genome Sequencing Laboratory, Neuropsychiatric Genetics Research Group, Department of Psychiatry, Institute of Molecular Medicine, Trinity College Dublin, Ireland.
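The index-based read parsing described in the abstract above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the 6-bp index sits at the start of each read and must match exactly; the index sequences, sample names, and the `demultiplex` helper are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

INDEX_LENGTH = 6  # the abstract states a unique 6-bp index per sample


def demultiplex(reads, index_to_sample):
    """Group reads by their leading index (assumed position: read start).

    Reads whose index is not in the lookup table are placed in 'unassigned'.
    The index is trimmed off before the read is stored.
    """
    bins = defaultdict(list)
    for read in reads:
        index, insert = read[:INDEX_LENGTH], read[INDEX_LENGTH:]
        sample = index_to_sample.get(index, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Hypothetical indexes and reads, for illustration only.
index_to_sample = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
reads = ["ACGTACGGATTC", "TGCATGCCTAAG", "ACGTACTTTGGA", "NNNNNNAAAA"]
bins = demultiplex(reads, index_to_sample)
```

A production pipeline would typically also tolerate index sequencing errors; exact matching is used here only to keep the sketch short.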
|
36
|
Johansson H, Isaksson M, Sörqvist EF, Roos F, Stenberg J, Sjöblom T, Botling J, Micke P, Edlund K, Fredriksson S, Kultima HG, Ericsson O, Nilsson M. Targeted resequencing of candidate genes using selector probes. Nucleic Acids Res 2010; 39:e8. [PMID: 21059679 PMCID: PMC3025563 DOI: 10.1093/nar/gkq1005] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 11/20/2022]
Abstract
Targeted genome enrichment is a powerful tool for making use of the massive throughput of novel DNA-sequencing instruments. We herein present a simple and scalable protocol for multiplex amplification of target regions based on the Selector technique. The updated version exhibits improved coverage and compatibility with next-generation-sequencing (NGS) library-construction procedures for shotgun sequencing with NGS platforms. To demonstrate the performance of the technique, all 501 exons from 28 genes frequently involved in cancer were enriched for and sequenced in specimens derived from cell lines and tumor biopsies. DNA from both fresh frozen and formalin-fixed paraffin-embedded biopsies was analyzed and 94% specificity and 98% coverage of the targeted region was achieved. Reproducibility between replicates was high (R² = 0.98) and readily enabled detection of copy-number variations. The procedure can be carried out in <24 h and does not require any dedicated instrumentation.
Affiliation(s)
- H Johansson
- Department of Genetics and Pathology, Uppsala University, Rudbeck Laboratory, Uppsala, Sweden
|
37
|
Varley KE, Mitra RD. Bisulfite Patch PCR enables multiplexed sequencing of promoter methylation across cancer samples. Genome Res 2010; 20:1279-87. [PMID: 20627893 DOI: 10.1101/gr.101212.109] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Indexed: 01/26/2023]
Abstract
Aberrant DNA methylation frequently occurs at gene promoters during cancer progression. It is important to identify these loci because they are often misregulated and drive tumorigenesis. Bisulfite sequencing is the most direct and highest resolution assay for identifying aberrant promoter methylation. Recently, genomic capture methods have been combined with next-generation sequencing to enable genome-scale surveys of methylation in individual samples. However, it is challenging to validate candidate loci identified by these approaches because an efficient method to bisulfite sequence more than 50 differentially methylated loci across a large number of samples does not exist. To address this problem, we developed Bisulfite Patch PCR, which enables highly multiplexed bisulfite PCR and sequencing across many samples. Using this method, we successfully amplified 100% of 94 targeted gene promoters simultaneously in the same reaction. By incorporating sample-specific DNA barcodes into the amplicons, we analyzed 48 samples in a single run of the 454 Life Sciences (Roche) FLX sequencer. The method requires small amounts of starting DNA (250 ng) and does not require a shotgun library construction. The method was highly specific; 90% of sequencing reads aligned to targeted loci. The targeted promoters were from genes that are frequently mutated in breast and colon cancer, and the samples included breast and colon tumor and adjacent normal tissue. This approach allowed us to identify nine gene promoters that exhibit tumor-specific DNA methylation defects that occur frequently in colon and breast cancer. We also analyzed single nucleotide polymorphisms to observe DNA methylation that accumulated on specific alleles during tumor development. This method is broadly applicable for studying DNA methylation across large numbers of patient samples using next-generation sequencing.
Affiliation(s)
- Katherine Elena Varley
- Department of Genetics, Center for Genome Sciences, Washington University School of Medicine, St. Louis, Missouri 63108, USA
|
38
|
Anderson MW, Schrijver I. Next generation DNA sequencing and the future of genomic medicine. Genes (Basel) 2010; 1:38-69. [PMID: 24710010 PMCID: PMC3960862 DOI: 10.3390/genes1010038] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Received: 04/15/2010] [Revised: 05/20/2010] [Accepted: 05/21/2010] [Indexed: 12/20/2022]
Abstract
In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpretation, laboratory workflow, data storage, and ethical considerations. This review describes the current high-throughput sequencing platforms commercially available, and compares the inherent advantages and disadvantages of each. The potential applications for clinical diagnostics are considered, as well as the need for software and analysis tools to interpret the vast amount of data generated. Finally, we discuss the clinical and ethical implications of the wealth of genetic information generated by these methods. Despite the challenges, we anticipate that the evolution and refinement of high-throughput DNA sequencing technologies will catalyze a new era of personalized medicine based on individualized genomic analysis.
Affiliation(s)
- Matthew W Anderson
- Department of Pathology, Stanford University Medical Center, 300 Pasteur Drive, Room L235, Stanford, CA 94305-5627, USA.
- Iris Schrijver
- Department of Pathology, Stanford University Medical Center, 300 Pasteur Drive, Room L235, Stanford, CA 94305-5627, USA.
|
39
|
Hu H, Wrogemann K, Kalscheuer V, Tzschach A, Richard H, Haas SA, Menzel C, Bienek M, Froyen G, Raynaud M, Van Bokhoven H, Chelly J, Ropers H, Chen W. Mutation screening in 86 known X-linked mental retardation genes by droplet-based multiplex PCR and massive parallel sequencing. The HUGO Journal 2010; 3:41-9. [PMID: 21836662 PMCID: PMC2882650 DOI: 10.1007/s11568-010-9137-y] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 01/13/2010] [Revised: 02/24/2010] [Accepted: 03/12/2010] [Indexed: 12/25/2022]
Abstract
Massive parallel sequencing has revolutionized the search for pathogenic variants in the human genome, but for routine diagnosis, re-sequencing of the complete human genome in a large cohort of patients is still far too expensive. Recently, novel genome partitioning methods have been developed that allow re-sequencing to be targeted to specific genomic compartments, but practical experience with these methods is still limited. In this study, we have combined a novel droplet-based multiplex PCR method and next generation sequencing to screen patients with X-linked mental retardation (XLMR) for mutations in 86 previously identified XLMR genes. In total, affected males from 24 large XLMR families were analyzed, including three in whom the mutations were already known. Amplicons corresponding to functionally relevant regions of these genes were sequenced on an Illumina/Solexa Genome Analyzer II platform. Highly specific and uniform enrichment was achieved: on average, 67.9% unambiguously mapped reads were derived from amplicons, and for 88.5% of the targeted bases, the sequencing depth was sufficient to reliably detect variations. Potentially disease-causing sequence variants were identified in 10 out of 24 patients, including the three mutations that were already known, and all of these could be confirmed by Sanger sequencing. The robust performance of this approach demonstrates the general utility of droplet-based multiplex PCR for parallel mutation screening in hundreds of genes, which is a prerequisite for the diagnosis of mental retardation and other disorders that may be due to defects of a wide variety of genes.
Affiliation(s)
- Hao Hu
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Klaus Wrogemann
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Department of Biochemistry & Medical Genetics, University of Manitoba, Winnipeg, MB Canada
- Vera Kalscheuer
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Hugues Richard
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Stefan A. Haas
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Corinna Menzel
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Melanie Bienek
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Guy Froyen
- Human Genome Laboratory, Centre for Human Genetics, VIB, K.U.Leuven, Leuven, Belgium
- Martine Raynaud
- INSERM, U930; Centre Hospitalier Régional Universitaire de Tours, Service de Genetique, 37044 Tours, France
- Hans Van Bokhoven
- Department of Human Genetics, Radboud University Medical Centre, Nijmegen, The Netherlands
- Jamel Chelly
- Faculté de Médecine Cochin, INSERM 129-ICGM, Paris, France
- Hilger Ropers
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Wei Chen
- Max-Planck-Institute for Molecular Genetics, Berlin, Germany
- Max-Delbrück-Centrum für Molekulare Medizin, Berlin Institute for Medical Systems Biology, Berlin, Germany
|
40
|
Roberts CH, Mayor NP, Madrigal JA, Marsh SGE. Short template amplicon and multiplex megaprimer-enabled relay (STAMMER) sequencing, a simultaneous approach to higher throughput sequence-based typing of polymorphic genes. Immunogenetics 2010; 62:253-60. [PMID: 20204613 DOI: 10.1007/s00251-010-0432-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2009] [Accepted: 02/05/2010] [Indexed: 11/28/2022]
Abstract
Sequence-based typing (SBT) is a powerful method of genotyping in highly polymorphic gene systems. In standard SBT methods, both strands of a double-stranded template amplicon are sequenced in separate reactions in order to achieve high quality data across the region of interest. The amount of informative data that is obtained from the second strand sequence is often low, whilst the impact of performing second strand sequencing on costs and throughput is significant. Here we present short template amplicon and multiplex megaprimer-enabled relay (STAMMER) sequencing, a novel simultaneous sequence-based typing methodology that allows the detection of any practical amount of useful sequence from a plurality of distinct polymerase chain reaction products in a single sequencing reaction. In addition to simultaneous bidirectional sequencing, we show how the STAMMER approach can be used to simultaneously sequence a number of regions of interest that are not physically linked within the range of a single sequencing reaction. The efficiencies of this method could impact significantly on the output of SBT laboratories.
|
41
|
Mamanova L, Coffey AJ, Scott CE, Kozarewa I, Turner EH, Kumar A, Howard E, Shendure J, Turner DJ. Target-enrichment strategies for next-generation sequencing. Nat Methods 2010; 7:111-8. [PMID: 20111037 DOI: 10.1038/nmeth.1419] [Citation(s) in RCA: 761] [Impact Index Per Article: 54.4] [Indexed: 11/09/2022]
Abstract
We have not yet reached a point at which routine sequencing of large numbers of whole eukaryotic genomes is feasible, and so it is often necessary to select genomic regions of interest and to enrich these regions before sequencing. There are several enrichment approaches, each with unique advantages and disadvantages. Here we describe our experiences with the leading target-enrichment technologies, the optimizations that we have performed and typical results that can be obtained using each. We also provide detailed protocols for each technology so that end users can find the best compromise between sensitivity, specificity and uniformity for their particular project.
Affiliation(s)
- Lira Mamanova
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
42
|
Mir KU. Sequencing genomes: from individuals to populations. Briefings in Functional Genomics and Proteomics 2010; 8:367-78. [PMID: 19808932 DOI: 10.1093/bfgp/elp040] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
The whole genome sequences of Jim Watson and Craig Venter are early examples of personalized genomics, which promises to change how we approach healthcare in the future. Before personal sequencing can have practical medical benefits, however, and before it should be advocated for implementation at the population-scale, there needs to be a better understanding of which genetic variants influence which traits and how their effects are modified by epigenetic factors. Nonetheless, for forging links between DNA sequence and phenotype, efforts to sequence the genomes of individuals need to continue; this includes sequencing sub-populations for association studies which analyse the difference in sequence between disease affected and unaffected individuals. Such studies can only be applied on a large enough scale to be effective if the massive strides in sequencing technology that have recently occurred also continue.
Affiliation(s)
- Kalim U Mir
- The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
|
43
|
Out AA, van Minderhout IJHM, Goeman JJ, Ariyurek Y, Ossowski S, Schneeberger K, Weigel D, van Galen M, Taschner PEM, Tops CMJ, Breuning MH, van Ommen GJB, den Dunnen JT, Devilee P, Hes FJ. Deep sequencing to reveal new variants in pooled DNA samples. Hum Mutat 2010; 30:1703-12. [PMID: 19842214 DOI: 10.1002/humu.21122] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.3] [Indexed: 11/06/2022]
Abstract
We evaluated massive parallel sequencing and long-range PCR (LRP) for rare variant detection and allele frequency estimation in pooled DNA samples. Exons 2 to 16 of the MUTYH gene were analyzed in breast cancer patients with Illumina's (Solexa) technology. From a pool of 287 genomic DNA samples we generated a single LRP product, while the same LRP was performed on 88 individual samples and the resulting products then pooled. Concentrations of constituent samples were measured with fluorimetry for genomic DNA and high-resolution melting curve analysis (HR-MCA) for LRP products. Illumina sequencing results were compared to Sanger sequencing data of individual samples. Correlation between allele frequencies detected by both methods was poor in the first pool, presumably because the genomic samples amplified unequally in the LRP, due to DNA quality variability. In contrast, allele frequencies correlated well in the second pool, in which all expected alleles at a frequency of 1% and higher were reliably detected, plus the majority of singletons (0.6% allele frequency). We describe custom bioinformatics and statistics to optimize detection of rare variants and to estimate required sequencing depth. Our results provide directions for designing high-throughput analyses of candidate genes.
Affiliation(s)
- Astrid A Out
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.
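The singleton allele frequency quoted in the abstract above follows from simple arithmetic: one variant allele among the 2 × 88 chromosomes of 88 diploid samples is 1/176, or about 0.57%, which rounds to the reported 0.6%. The Python sketch below reproduces that figure and adds a plain binomial estimate of how likely a variant is to be seen at a given depth. The binomial model (equal sample representation, no sequencing error) and both function names are assumptions made for illustration; this is not the custom statistics the authors describe.

```python
from math import comb


def singleton_allele_frequency(n_diploid_samples):
    """Frequency of an allele present once among 2 * n chromosomes in the pool."""
    return 1 / (2 * n_diploid_samples)


def prob_at_least_k_variant_reads(depth, allele_freq, k):
    """Binomial probability of observing at least k variant reads at this depth.

    Assumes every read samples the pool independently and ignores sequencing
    error, which a real variant caller would have to model.
    """
    p_fewer = sum(
        comb(depth, i) * allele_freq**i * (1 - allele_freq) ** (depth - i)
        for i in range(k)
    )
    return 1 - p_fewer


freq = singleton_allele_frequency(88)  # second pool in the abstract: 88 samples
p_seen = prob_at_least_k_variant_reads(depth=1000, allele_freq=freq, k=1)
```

Requiring more than one supporting read (a larger `k`) lowers the detection probability at a fixed depth, which is one reason rare-variant detection in pools needs deep coverage.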
|
44
|
Zhang Y, Italia MJ, Auger KR, Halsey WS, Van Horn SF, Sathe GM, Magid-Slav M, Brown JR, Holbrook JD. Molecular evolutionary analysis of cancer cell lines. Mol Cancer Ther 2010; 9:279-91. [PMID: 20124449 DOI: 10.1158/1535-7163.mct-09-0508] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
With genome-wide cancer studies producing large DNA sequence data sets, novel computational approaches toward better understanding the role of mutations in tumor survival and proliferation are greatly needed. Tumors are widely viewed to be influenced by Darwinian processes, yet molecular evolutionary analysis, invaluable in other DNA sequence studies, has seen little application in cancer biology. Here, we describe the phylogenetic analysis of 353 cancer cell lines based on multiple sequence alignments of 3,252 nucleotides and 1,170 amino acids built from the concatenation of variant codons and residues across 494 and 523 genes, respectively. Reconstructed phylogenetic trees cluster cell lines by shared DNA variant patterns rather than cancer tissue type, suggesting that tumors originating from diverse histologies have similar oncogenic pathways. A well-supported clade of 91 cancer cell lines representing multiple tumor types also had significantly different gene expression profiles from the remaining cell lines according to statistical analyses of mRNA microarray data. This suggests that phylogenetic clustering of tumor cell lines based on DNA variants might reflect functional similarities in cellular pathways. Positive selection analysis revealed specific DNA variants that might be potential driver mutations. Our study shows the potential role of molecular evolutionary analyses in tumor classification and the development of novel anticancer strategies.
Affiliation(s)
- Yan Zhang
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA
|
45
|
Abstract
While pharmacogenetics - the correlation of genotype and response to medicines - currently has a small but measurable impact on the prescribing practice of clinicians, the advent of the 'personal genome' is likely to change this significantly. Advances in high-throughput technologies aimed at characterizing human genetic variation, including chip-based genotyping and next-generation sequencing, are poised to provide a flood of information that will affect both pharmacogenetic discovery and pharmacogenetic application in clinical practice. In order for this flood of information to not overwhelm both researchers and clinicians alike, a variety of new and expanded information management tools will be needed, including electronic medical records, bioinformatic algorithms for analyzing sequence data, information management systems for storing, retrieving and interpreting whole-genome sequence data, and pharmacogenetic decision tools for prescribers.
Affiliation(s)
- Michael J Wagner
- Institute for Pharmacogenomics and Individualized Therapy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-27361, USA
|
46
|
Abstract
The emergence of massively parallel DNA sequencing platforms has made resequencing an affordable approach to study genetic variation. However, the cost of whole genome resequencing remains too high to apply to large numbers of human samples. Genomic partitioning methods allow enrichment for regions of interest at a scale that is matched to the throughput of the new sequencing platforms. We review general categories of methods for genomic partitioning including multiplex PCR, capture-by-circularization, and capture-by-hybridization. Parameters that are relevant to the performance of any given method include multiplexity, specificity, uniformity, input requirements, scalability, and cost. The successful development of genomic partitioning strategies will be key to taking full advantage of massively parallel sequencing, at least until resequencing of complete mammalian genomes becomes widely affordable.
Affiliation(s)
- Emily H Turner
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195-5065, USA.
|
47
|
Tucker T, Marra M, Friedman JM. Massively parallel sequencing: the next big thing in genetic medicine. Am J Hum Genet 2009; 85:142-54. [PMID: 19679224 DOI: 10.1016/j.ajhg.2009.06.022] [Citation(s) in RCA: 214] [Impact Index Per Article: 14.3] [Received: 04/19/2009] [Revised: 06/24/2009] [Accepted: 06/29/2009] [Indexed: 01/24/2023]
Abstract
Massively parallel sequencing has reduced the cost and increased the throughput of genomic sequencing by more than three orders of magnitude, and it seems likely that costs will fall and throughput improve even more in the next few years. Clinical use of massively parallel sequencing will provide a way to identify the cause of many diseases of unknown etiology through simultaneous screening of thousands of loci for pathogenic mutations and by sequencing biological specimens for the genomic signatures of novel infectious agents. In addition to providing these entirely new diagnostic capabilities, massively parallel sequencing may also replace arrays and Sanger sequencing in clinical applications where they are currently being used. Routine clinical use of massively parallel sequencing will require higher accuracy, better ways to select genomic subsets of interest, and improvements in the functionality, speed, and ease of use of data analysis software. In addition, substantial enhancements in laboratory computer infrastructure, data storage, and data transfer capacity will be needed to handle the extremely large data sets produced. Clinicians and laboratory personnel will require training to use the sequence data effectively, and appropriate methods will need to be developed to deal with the incidental discovery of pathogenic mutations and variants of uncertain clinical significance. Massively parallel sequencing has the potential to transform the practice of medical genetics and related fields, but the vast amount of personal genomic data produced will increase the responsibility of geneticists to ensure that the information obtained is used in a medically and socially responsible manner.
|
48
|
Stiller M, Knapp M, Stenzel U, Hofreiter M, Meyer M. Direct multiplex sequencing (DMPS)--a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA. Genome Res 2009; 19:1843-8. [PMID: 19635845 DOI: 10.1101/gr.095760.109] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Indexed: 11/24/2022]
Abstract
Although the emergence of high-throughput sequencing technologies has enabled whole-genome sequencing from extinct organisms, little progress has been made in accelerating targeted sequencing from highly degraded DNA. Here, we present a novel and highly sensitive method for targeted sequencing of ancient and degraded DNA, which couples multiplex PCR directly with sample barcoding and high-throughput sequencing. Using this approach, we obtained a 96% complete mitochondrial genome data set from 31 cave bear (Ursus spelaeus) samples using only two 454 Life Sciences (Roche) GS FLX runs. In contrast to previous studies relying only on short sequence fragments, the overlapping portion of our data comprises almost 10 kb of replicated mitochondrial genome sequence, allowing for the unambiguous differentiation of three major cave bear clades. Our method opens up the opportunity to simultaneously generate many kilobases of overlapping sequence data from large sets of difficult samples, such as museum specimens, medical collections, or forensic samples. Embedded in our approach, we present a new protocol for the construction of barcoded sequencing libraries, which is compatible with all current high-throughput technologies and can be performed entirely in plate setup.
Affiliation(s)
- Mathias Stiller
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
|
49
|
Varley KE, Mitra RD. Nested Patch PCR for highly multiplexed amplification of genomic loci. Cold Spring Harb Protoc 2009; 2009:pdb.prot5252. [PMID: 20147217 DOI: 10.1101/pdb.prot5252] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
Abstract
Nested Patch polymerase chain reaction (PCR) amplifies a large number (greater than 90) of targeted loci from genomic DNA simultaneously in the same reaction. These amplified loci can then be sequenced on a second-generation sequencing machine to detect single nucleotide polymorphisms (SNPs) and mutations. The reaction is highly specific: 90% of sequencing reads match targeted loci. Nested Patch PCR can be performed on many samples in parallel, and by using sample-specific DNA barcodes, these can be pooled and sequenced in a single reaction. Thus, the Nested Patch PCR protocol that is described here provides an easy workflow to identify SNPs and mutations across many targeted loci for many samples in parallel.
Affiliation(s)
- Katherine E Varley
- Department of Genetics, Center for Genome Sciences, Washington University School of Medicine, St. Louis, MO 63108, USA
|