1
González Silos R, Fischer C, Lorenzo Bermejo J. NGS allele counts versus called genotypes for testing genetic association. Comput Struct Biotechnol J 2022; 20:3729-3733. [PMID: 35891781 PMCID: PMC9294184 DOI: 10.1016/j.csbj.2022.07.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/13/2022] [Revised: 07/07/2022] [Accepted: 07/07/2022] [Indexed: 11/28/2022] Open
Abstract
RNA sequence data are commonly summarized as read counts. By contrast, so far there is no alternative to genotype calling for investigating the relationship between genetic variants determined by next-generation sequencing (NGS) and a phenotype of interest. Here we propose and evaluate the direct analysis of allele counts for genetic association tests. Specifically, we assess the potential advantage of the ratio of alternative allele counts to the total number of reads aligned at a specific position of the genome (coverage) over called genotypes. We simulated association studies based on NGS data from HapMap individuals. Genotype quality scores and allele counts were simulated using NGS data from the Personal Genome Project. Real data from the 1000 Genomes Project was also used to compare the two competing approaches. The average proportions of probability values lower or equal to 0.05 amounted to 0.0496 for called genotypes and 0.0485 for the ratio of alternative allele counts to coverage in the null scenario, and to 0.69 for called genotypes and 0.75 for the ratio of alternative allele counts to coverage in the alternative scenario (9% power increase). The advantage in statistical power of the novel approach increased with decreasing coverage, with decreasing genotype quality and with decreasing allele frequency – 124% power increase for variants with a minor allele frequency lower than 0.05. We provide computer code in R to implement the novel approach, which does not preclude the use of complementary data quality filters before or after identification of the most promising association signals.
Author summary
Genetic association tests usually rely on called genotypes. We postulate here that the direct analysis of allele counts from sequence data improves the quality of statistical inference. To evaluate this hypothesis, we investigate simulated and real data using distinct statistical approaches. We demonstrate that association tests based on allele counts rather than called genotypes achieve higher statistical power with controlled type I error rates.
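The quantity at the centre of this abstract, the ratio of alternative allele counts to coverage, can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, the example read counts, and the use of a permutation test on the difference in mean ratios are all assumptions made for illustration, and the abstract does not say which association tests the authors applied.

```python
import random
from statistics import mean


def allele_ratio(alt_count, coverage):
    """Ratio of alternative-allele reads to coverage; None when no reads align."""
    if coverage <= 0:
        return None
    return alt_count / coverage


def permutation_p_value(case_values, control_values, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    A stand-in association test chosen for simplicity (assumption);
    it is not taken from the paper.
    """
    rng = random.Random(seed)
    observed = abs(mean(case_values) - mean(control_values))
    pooled = list(case_values) + list(control_values)
    n_cases = len(case_values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical (alt_count, coverage) pairs at a single variant position.
cases = [(3, 10), (5, 12), (0, 8), (6, 11)]
controls = [(0, 9), (1, 10), (0, 12), (2, 9)]

case_ratios = [allele_ratio(a, c) for a, c in cases]
control_ratios = [allele_ratio(a, c) for a, c in controls]
p_value = permutation_p_value(case_ratios, control_ratios)
print(f"permutation p-value: {p_value:.4f}")
```

The ratio stays a continuous value per individual instead of being rounded to a genotype of 0, 1 or 2 alternative alleles, which is the contrast the abstract draws with called genotypes.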
Affiliation(s)
- Christine Fischer
- Institute of Human Genetics, University of Heidelberg, 69120, Germany
2
Liu S, Punthambaker S, Iyer EPR, Ferrante T, Goodwin D, Fürth D, Pawlowski AC, Jindal K, Tam JM, Mifflin L, Alon S, Sinha A, Wassie AT, Chen F, Cheng A, Willocq V, Meyer K, Ling KH, Camplisson CK, Kohman RE, Aach J, Lee JH, Yankner BA, Boyden ES, Church GM. Barcoded oligonucleotides ligated on RNA amplified for multiplexed and parallel in situ analyses. Nucleic Acids Res 2021; 49:e58. [PMID: 33693773 PMCID: PMC8191787 DOI: 10.1093/nar/gkab120] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/06/2020] [Revised: 01/29/2021] [Accepted: 03/02/2021] [Indexed: 12/13/2022] Open
Abstract
We present barcoded oligonucleotides ligated on RNA amplified for multiplexed and parallel in situ analyses (BOLORAMIS), a reverse transcription-free method for spatially-resolved, targeted, in situ RNA identification of single or multiple targets. BOLORAMIS was demonstrated on a range of cell types and human cerebral organoids. Singleplex experiments to detect coding and non-coding RNAs in human iPSCs showed a stem-cell signature pattern. Specificity of BOLORAMIS was found to be 92% as illustrated by a clear distinction between human and mouse housekeeping genes in a co-culture system, as well as by recapitulation of subcellular localization of lncRNA MALAT1. Sensitivity of BOLORAMIS was quantified by comparing with single molecule FISH experiments and found to be 11%, 12% and 35% for GAPDH, TFRC and POLR2A, respectively. To demonstrate BOLORAMIS for multiplexed gene analysis, we targeted 96 mRNAs within a co-culture of iNGN neurons and HMC3 human microglial cells. We used fluorescence in situ sequencing to detect error-robust 8-base barcodes associated with each of these genes. We then used these data to uncover the spatial relationship among cells and transcripts by performing single-cell clustering and gene–gene proximity analyses. We anticipate the BOLORAMIS technology for in situ RNA detection to find applications in basic and translational research.
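One common way to make short barcodes "error-robust" is to space them far enough apart in Hamming distance that a read with a single miscalled base still maps to exactly one gene. The Python sketch below illustrates that general idea only. The abstract does not describe the BOLORAMIS barcode design or decoding rule, so the codebook sequences, the function names, and the one-mismatch threshold are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def decode_barcode(read, codebook, max_mismatches=1):
    """Return the gene whose barcode lies within max_mismatches of the read.

    Returns None when no barcode, or more than one barcode, qualifies.
    """
    matches = [
        gene
        for gene, barcode in codebook.items()
        if hamming(read, barcode) <= max_mismatches
    ]
    return matches[0] if len(matches) == 1 else None


# Hypothetical 8-base barcodes; the gene names come from the abstract,
# the sequences are invented and differ pairwise in at least 6 positions.
CODEBOOK = {
    "GAPDH": "AACCGGTT",
    "TFRC": "ACGTACGT",
    "POLR2A": "TTGGCCAA",
}

print(decode_barcode("AACCGGTT", CODEBOOK))  # exact match
print(decode_barcode("AACCGGTA", CODEBOOK))  # one miscalled base
print(decode_barcode("GGGGGGGG", CODEBOOK))  # too far from every barcode
```

With a minimum pairwise distance of at least 3, any single-base error leaves the read closer to its true barcode than to any other, which is why the one-mismatch threshold is unambiguous for this toy codebook.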
Affiliation(s)
- Songlei Liu
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Sukanya Punthambaker
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Eswar P R Iyer
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Thomas Ferrante
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Daniel Goodwin
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Daniel Fürth
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Andrew C Pawlowski
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Kunal Jindal
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Jenny M Tam
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Lauren Mifflin
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Shahar Alon
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Anubhav Sinha
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Harvard-MIT Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Asmamaw T Wassie
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Fei Chen
- Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Anne Cheng
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Valerie Willocq
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Katharina Meyer
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- King-Hwa Ling
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, 43400 Serdang, Selangor, Malaysia
- Conor K Camplisson
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Richie E Kohman
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- John Aach
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Je Hyuk Lee
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Bruce A Yankner
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Edward S Boyden
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- George M Church
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
3
Liang N, Li B, Jia Z, Wang C, Wu P, Zheng T, Wang Y, Qiu F, Wu Y, Su J, Xu J, Xu F, Chu H, Fang S, Yang X, Wu C, Cao Z, Cao L, Bing Z, Liu H, Li L, Huang C, Qin Y, Cui Y, Han-Zhang H, Xiang J, Liu H, Guo X, Li S, Zhao H, Zhang Z. Ultrasensitive detection of circulating tumour DNA via deep methylation sequencing aided by machine learning. Nat Biomed Eng 2021; 5:586-599. [PMID: 34131323 DOI: 10.1038/s41551-021-00746-5] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 10/07/2019] [Accepted: 05/13/2021] [Indexed: 01/30/2023]
Abstract
The low abundance of circulating tumour DNA (ctDNA) in plasma samples makes the analysis of ctDNA biomarkers for the detection or monitoring of early-stage cancers challenging. Here we show that deep methylation sequencing aided by a machine-learning classifier of methylation patterns enables the detection of tumour-derived signals at dilution factors as low as 1 in 10,000. For a total of 308 patients with surgery-resectable lung cancer and 261 age- and sex-matched non-cancer control individuals recruited from two hospitals, the assay detected 52-81% of the patients at disease stages IA to III with a specificity of 96% (95% confidence interval (CI) 93-98%). In a subgroup of 115 individuals, the assay identified, at 100% specificity (95% CI 91-100%), nearly twice as many patients with cancer as those identified by ultradeep mutation sequencing analysis. The low amounts of ctDNA permitted by machine-learning-aided deep methylation sequencing could provide advantages in cancer screening and the assessment of treatment efficacy.
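Specificity figures such as those quoted above are proportions of control individuals correctly classified as negative, reported with a confidence interval. As a minimal Python sketch of how such a figure can be computed, the code below uses the Wilson score interval. The abstract does not say which interval method the authors used, and the counts in the example are hypothetical rather than taken from the study.

```python
from math import sqrt


def specificity(true_negatives, false_positives):
    """Proportion of non-cancer controls correctly classified as negative."""
    return true_negatives / (true_negatives + false_positives)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: 96 of 100 controls classified as negative.
tn, fp = 96, 4
spec = specificity(tn, fp)
low, high = wilson_interval(tn, tn + fp)
print(f"specificity {spec:.2f}, 95% CI {low:.3f} to {high:.3f}")
```

The Wilson interval is used here because it behaves sensibly for proportions close to 1; an exact (Clopper-Pearson) interval is another common choice for small subgroups.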
Affiliation(s)
- Naixin Liang
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Bingsi Li
- Burning Rock Biotech, Guangzhou, China
- Ziqi Jia
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Pancheng Wu
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Tao Zheng
- Burning Rock Biotech, Guangzhou, China
- Yanyu Wang
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Fujun Qiu
- Burning Rock Biotech, Guangzhou, China
- Yijun Wu
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Jing Su
- Burning Rock Biotech, Guangzhou, China
- Jiayue Xu
- Burning Rock Biotech, Guangzhou, China
- Feng Xu
- Burning Rock Biotech, Guangzhou, China
- Chengju Wu
- Department of Industrial Engineering & Operations Research, University of California, Berkeley, Berkeley, CA, USA
- Zhili Cao
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Lei Cao
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Zhongxing Bing
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Hongsheng Liu
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Li Li
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Cheng Huang
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Yingzhi Qin
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Yushang Cui
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Hao Liu
- Burning Rock Biotech, Guangzhou, China
- Xin Guo
- Department of Industrial Engineering & Operations Research, University of California, Berkeley, Berkeley, CA, USA
- Shanqing Li
- Department of Thoracic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Heng Zhao
- Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai, China
4
Almomani R, Marchi M, Sopacua M, Lindsey P, Salvi E, de Koning B, Santoro S, Magri S, Smeets HJM, Martinelli Boneschi F, Malik RR, Ziegler D, Hoeijmakers JGJ, Bönhof G, Dib-Hajj S, Waxman SG, Merkies ISJ, Lauria G, Faber CG, Gerrits MM. Evaluation of molecular inversion probe versus TruSeq® custom methods for targeted next-generation sequencing. PLoS One 2020; 15:e0238467. [PMID: 32877464 PMCID: PMC7467307 DOI: 10.1371/journal.pone.0238467] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/20/2020] [Accepted: 08/16/2020] [Indexed: 11/18/2022] Open
Abstract
Resolving the genetic architecture of painful neuropathy will lead to better disease management strategies. We aimed to develop a reliable method to re-sequence multiple genes in a large cohort of painful neuropathy patients at low cost. In this study, we compared sensitivity, specificity, targeting efficiency, performance and cost effectiveness of Molecular Inversion Probes-Next generation sequencing (MIPs-NGS) and TruSeq® Custom Amplicon-Next generation sequencing (TSCA-NGS). Capture probes were designed to target nine sodium channel genes (SCN3A, SCN8A-SCN11A, and SCN1B-SCN4B). One hundred sixty-six patients with diabetic and idiopathic neuropathy were tested by both methods; 70 patients were validated by Sanger sequencing. Sensitivity, specificity and performance of both techniques were comparable, and in agreement with Sanger sequencing. The average coverage of the targeted regions was 97.3% for MIPs-NGS versus 93.9% for TSCA-NGS. MIPs-NGS has a more versatile assay design and is more flexible than TSCA-NGS. The cost of MIPs-NGS is more than 5 times lower than that of TSCA-NGS when 500 or more samples are tested. In conclusion, MIPs-NGS is a reliable, flexible, and relatively inexpensive method to detect genetic variations in a large cohort of patients. In our centers, MIPs-NGS is currently implemented as a routine diagnostic tool for screening of sodium channel genes in painful neuropathy patients.
Affiliation(s)
- Rowida Almomani
- Department of Genetics and Cell Biology, Clinical Genomics Unit, Maastricht University, Maastricht, The Netherlands
- MHeNs school of Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Department of Medical Laboratory Sciences, Jordan University of Science and Technology, Irbid, Jordan
- Margherita Marchi
- Neuroalgology Units, Fondazione IRCCS Istituto Neurologico “Carlo Besta” Milan, Milan, Italy
- Maurice Sopacua
- MHeNs school of Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Department of Neurology, Maastricht University Medical Center+, Maastricht, the Netherlands
- Patrick Lindsey
- Department of Genetics and Cell Biology, Clinical Genomics Unit, Maastricht University, Maastricht, The Netherlands
- Erika Salvi
- Neuroalgology Units, Fondazione IRCCS Istituto Neurologico “Carlo Besta” Milan, Milan, Italy
- Bart de Koning
- Department of Clinical Genetics, Maastricht University Medical Center+, Maastricht, the Netherlands
- Silvia Santoro
- Laboratory of Human Genetics of Neurological Disorders, Institute of Experimental Neurology (INSPE), Division of Neuroscience, San Raffaele Scientific Institute, Milan, Italy
- Stefania Magri
- Neuroalgology Units, Fondazione IRCCS Istituto Neurologico “Carlo Besta” Milan, Milan, Italy
- Hubert J. M. Smeets
- Department of Genetics and Cell Biology, Clinical Genomics Unit, Maastricht University, Maastricht, The Netherlands
- MHeNs school of Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Filippo Martinelli Boneschi
- Laboratory of Human Genetics of Neurological Disorders, Institute of Experimental Neurology (INSPE), Division of Neuroscience, San Raffaele Scientific Institute, Milan, Italy
- Rayaz R. Malik
- Institute of Human Development, Centre for Endocrinology and Diabetes, University of Manchester and Central Manchester NHS Foundation Trust, Manchester Academic Health Science Center, Manchester, United Kingdom
- Department of Medicine, Weill Cornell Medicine, Doha, Qatar
- Dan Ziegler
- Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research, Düsseldorf, Germany
- Department of Endocrinology and Diabetology, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Janneke G. J. Hoeijmakers
- MHeNs school of Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Department of Neurology, Maastricht University Medical Center+, Maastricht, the Netherlands
- Gidon Bönhof
- Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research, Düsseldorf, Germany
- Sulayman Dib-Hajj
- Department of Neurology, Yale University School of Medicine, Yale, New Haven, United States of America
- Center for Neuroscience and Regeneration Research, Yale University School of Medicine, Yale, New Haven, United States of America
- Center for Neuroscience and Regeneration Research, Veterans Affairs Medical Center, West Haven, Connecticut, United States of America
- Stephen G. Waxman
- Department of Neurology, Yale University School of Medicine, Yale, New Haven, United States of America
- Center for Neuroscience and Regeneration Research, Yale University School of Medicine, Yale, New Haven, United States of America
- Center for Neuroscience and Regeneration Research, Veterans Affairs Medical Center, West Haven, Connecticut, United States of America
- Ingemar S. J. Merkies
- Department of Neurology, Maastricht University Medical Center+, Maastricht, the Netherlands
- Department of Neurology, St Elisabeth Hospital, Willemstad, Curaçao
- Giuseppe Lauria
- Neuroalgology Units, Fondazione IRCCS Istituto Neurologico “Carlo Besta” Milan, Milan, Italy
- Department of Biomedical and Clinical Sciences "Luigi Sacco", University of Milan, Milan, Italy
- Catharina G. Faber
- MHeNs school of Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Department of Neurology, Maastricht University Medical Center+, Maastricht, the Netherlands
- Monique M. Gerrits
- Department of Clinical Genetics, Maastricht University Medical Center+, Maastricht, the Netherlands
5
Diels S, Vanden Berghe W, Van Hul W. Insights into the multifactorial causation of obesity by integrated genetic and epigenetic analysis. Obes Rev 2020; 21:e13019. [PMID: 32170999 DOI: 10.1111/obr.13019] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/13/2020] [Revised: 02/24/2020] [Accepted: 03/04/2020] [Indexed: 12/11/2022]
Abstract
Obesity is a highly heritable multifactorial disease that places an enormous burden on human health. Its increasing prevalence and the concomitant reduction in life expectancy have intensified the search for new analytical methods that can reduce the knowledge gap between genetic susceptibility and functional consequences of the disease pathology. Although the influence of genetics and epigenetics has been studied independently in the past, there is increasing evidence that genetic variants interact with environmental factors through epigenetic regulation. This suggests that a combined analysis of genetic and epigenetic variation may be more effective in characterizing the obesity phenotype. To date, limited genome-wide integrative analyses have been performed. In this review, we provide an overview of the latest findings, advantages, and challenges and discuss future perspectives.
Affiliation(s)
- Sara Diels
- Department of Medical Genetics, University of Antwerp, Antwerp, Belgium
- Wim Vanden Berghe
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Wim Van Hul
- Department of Medical Genetics, University of Antwerp, Antwerp, Belgium
6
Chen X, Sun YC, Church GM, Lee JH, Zador AM. Efficient in situ barcode sequencing using padlock probe-based BaristaSeq. Nucleic Acids Res 2019; 46:e22. [PMID: 29190363 PMCID: PMC5829746 DOI: 10.1093/nar/gkx1206] [Citation(s) in RCA: 101] [Impact Index Per Article: 20.2] [Received: 08/24/2017] [Accepted: 11/26/2017] [Indexed: 12/15/2022] Open
Abstract
Cellular DNA/RNA tags (barcodes) allow for multiplexed cell lineage tracing and neuronal projection mapping with cellular resolution. Conventional approaches to reading out cellular barcodes trade off spatial resolution with throughput. Bulk sequencing achieves high throughput but sacrifices spatial resolution, whereas manual cell picking has low throughput. In situ sequencing could potentially achieve both high spatial resolution and high throughput, but current in situ sequencing techniques are inefficient at reading out cellular barcodes. Here we describe BaristaSeq, an optimization of a targeted, padlock probe-based technique for in situ barcode sequencing compatible with Illumina sequencing chemistry. BaristaSeq results in a five-fold increase in amplification efficiency, with a sequencing accuracy of at least 97%. BaristaSeq could be used for barcode-assisted lineage tracing, and to map long-range neuronal projections.
Affiliation(s)
- Xiaoyin Chen
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Yu-Chi Sun
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- George M Church
- Wyss Institute, Harvard Medical School, Boston, MA, USA; Department of Genetics, Harvard Medical School, Boston, MA, USA
- Je Hyuk Lee
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Anthony M Zador
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
7
Biezuner T, Spiro A, Raz O, Amir S, Milo L, Adar R, Chapal-Ilani N, Berman V, Fried Y, Ainbinder E, Cohen G, Barr HM, Halaban R, Shapiro E. A generic, cost-effective, and scalable cell lineage analysis platform. Genome Res 2016; 26:1588-1599. [PMID: 27558250 PMCID: PMC5088600 DOI: 10.1101/gr.202903.115] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 12/06/2015] [Accepted: 08/11/2016] [Indexed: 02/05/2023]
Abstract
Advances in single-cell genomics enable commensurate improvements in methods for uncovering lineage relations among individual cells. Current sequencing-based methods for cell lineage analysis depend on low-resolution bulk analysis or rely on extensive single-cell sequencing, which is not scalable and could be biased by functional dependencies. Here we show an integrated biochemical-computational platform for generic single-cell lineage analysis that is retrospective, cost-effective, and scalable. It consists of a biochemical-computational pipeline that inputs individual cells, produces targeted single-cell sequencing data, and uses it to generate a lineage tree of the input cells. We validated the platform by applying it to cells sampled from an ex vivo grown tree and analyzed its feasibility landscape by computer simulations. We conclude that the platform may serve as a generic tool for lineage analysis and thus pave the way toward large-scale human cell lineage discovery.
Affiliation(s)
- Tamir Biezuner
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Adam Spiro
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Ofir Raz
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Shiran Amir
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Lilach Milo
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Rivka Adar
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Noa Chapal-Ilani
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Veronika Berman
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
- Yael Fried
- Department of Biological Services, Weizmann Institute of Science, Rehovot 761001, Israel
- Elena Ainbinder
- Department of Biological Services, Weizmann Institute of Science, Rehovot 761001, Israel
- Galit Cohen
- Maurice and Vivienne Wohl Institute for Drug Discovery, G-INCPM, Weizmann Institute of Science, Rehovot 761001, Israel
- Haim M Barr
- Maurice and Vivienne Wohl Institute for Drug Discovery, G-INCPM, Weizmann Institute of Science, Rehovot 761001, Israel
- Ruth Halaban
- Department of Dermatology, Yale University School of Medicine, New Haven, Connecticut 06520-8059, USA
- Ehud Shapiro
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 761001, Israel
8
Scalable amplification of strand subsets from chip-synthesized oligonucleotide libraries. Nat Commun 2015; 6:8634. [PMID: 26567534 PMCID: PMC4660042 DOI: 10.1038/ncomms9634] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 02/24/2015] [Accepted: 09/14/2015] [Indexed: 02/01/2023] Open
Abstract
Synthetic oligonucleotides are the main cost factor for studies in DNA nanotechnology, genetics and synthetic biology, which all require thousands of these at high quality. Inexpensive chip-synthesized oligonucleotide libraries can contain hundreds of thousands of distinct sequences, however only at sub-femtomole quantities per strand. Here we present a selective oligonucleotide amplification method, based on three rounds of rolling-circle amplification, that produces nanomole amounts of single-stranded oligonucleotides per millilitre reaction. In a multistep one-pot procedure, subsets of hundreds or thousands of single-stranded DNAs with different lengths can selectively be amplified and purified together. These oligonucleotides are used to fold several DNA nanostructures and as primary fluorescence in situ hybridization probes. The amplification cost is lower than other reported methods (typically around US$ 20 per nanomole total oligonucleotides produced) and is dominated by the use of commercial enzymes.
9
Nussbacher JK, Batra R, Lagier-Tourenne C, Yeo GW. RNA-binding proteins in neurodegeneration: Seq and you shall receive. Trends Neurosci 2015; 38:226-36. [PMID: 25765321 PMCID: PMC4403644 DOI: 10.1016/j.tins.2015.02.003] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 11/06/2014] [Revised: 02/02/2015] [Accepted: 02/09/2015] [Indexed: 12/13/2022]
Abstract
As critical players in gene regulation, RNA binding proteins (RBPs) are taking center stage in our understanding of cellular function and disease. In our era of bench-top sequencers and unprecedented computational power, biological questions can be addressed in a systematic, genome-wide manner. Development of high-throughput sequencing (Seq) methodologies provides unparalleled potential to discover new mechanisms of disease-associated perturbations of RNA homeostasis. Complementary to candidate single-gene studies, these innovative technologies may elicit the discovery of unexpected mechanisms, and enable us to determine the widespread influence of the multifunctional RBPs on their targets. Given that the disruption of RNA processing is increasingly implicated in neurological diseases, these approaches will continue to provide insights into the roles of RBPs in disease pathogenesis.
Affiliation(s)
- Julia K Nussbacher
  - Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
- Ranjan Batra
  - Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
- Clotilde Lagier-Tourenne
  - Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA; Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA.
- Gene W Yeo
  - Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Department of Physiology, National University of Singapore, Singapore.
|
10
|
Role of Analytics in Viral Safety. Vaccine Analysis: Strategies, Principles, and Control 2015. [PMCID: PMC7122056 DOI: 10.1007/978-3-662-45024-6_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
In summary, this chapter reviews the principles of how the current and routine tests detect adventitious agents, and reviews how novel and emerging methods differ in their detection principles. These facets may permit novel methods to emerge to supplement, refine, or replace the routine methods. We have suggested a framework for risk assessment to assure biosafety in vaccines and suggested quantitative modeling to help crystallize thinking about the place of testing, either routine or novel, in this assurance. We assert that testing for adventitious agents should not be the sole basis on which product biosafety is assured. Appropriate sourcing and quality control of raw and starting materials, adherence to principles of Good Manufacturing Practices, including environmental and personnel monitoring and process validation, and finally, testing as verification are the package needed for maximal assurance of biosafety. Thus, a pathway forward to a new paradigm for adventitious agent testing exists in which detection of a broader array of potential adventitious agents might be included in the testing, with adequate sensitivity to provide the needed assurance of verification that there has been no catastrophic breach, in the context of the overall process, design, and adherence to cGMP. Furthermore, it is our hope that we may be able to implement the 3 Rs policy to reduce, replace, and/or refine the use of animals in product safety testing, at the same time that we provide greater assurance of the biosafety of vaccines.
|
11
|
Yoon JK, Ahn J, Kim HS, Han SM, Jang H, Lee MG, Lee JH, Bang D. microDuMIP: target-enrichment technique for microarray-based duplex molecular inversion probes. Nucleic Acids Res 2014; 43:e28. [PMID: 25414325 PMCID: PMC4357688 DOI: 10.1093/nar/gku1188] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022] Open
Abstract
Molecular inversion probe (MIP)-based capture is a scalable and effective target-enrichment technology that can use synthetic single-stranded oligonucleotides as probes. Unlike the straightforward use of synthetic oligonucleotides for low-throughput target capture, high-throughput MIP capture has required laborious protocols to generate thousands of single-stranded probes from DNA microarray because of multiple enzymatic steps, gel purifications and extensive PCR amplifications. Here, we developed a simple and efficient microarray-based MIP preparation protocol using only one enzyme with double-stranded probes and improved target capture yields by designing probes with overlapping targets and unique barcodes. To test our strategy, we produced 11 510 microarray-based duplex MIPs (microDuMIPs) and captured 3554 exons of 228 genes in a HapMap genomic DNA sample (NA12878). Under our protocol, capture performance and precision of calling were comparable to conventional MIP capture methods, yet overlapping targets and unique barcodes allowed us to precisely genotype with as little as 50 ng of input genomic DNA without library preparation. The microDuMIP method is simpler and cheaper, allowing broader applications and accurate target sequencing with a scalable number of targets.
Affiliation(s)
- Jung-Ki Yoon
  - College of Medicine, Seoul National University, Seoul 110-799, Korea
- Jinwoo Ahn
  - Department of Chemistry, Yonsei University, Seoul 120-752, Korea
- Han Sang Kim
  - Department of Pharmacology, Pharmacogenomic Research Center for Membrane Transporters, Brain Korea 21 PLUS Project for Medical Sciences, Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul 120-752, Korea; Yonsei Cancer Center, Division of Medical Oncology, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 120-752, Korea
- Soo Min Han
  - Department of Pharmacology, Pharmacogenomic Research Center for Membrane Transporters, Brain Korea 21 PLUS Project for Medical Sciences, Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul 120-752, Korea
- Hoon Jang
  - Department of Chemistry, Yonsei University, Seoul 120-752, Korea
- Min Goo Lee
  - Department of Pharmacology, Pharmacogenomic Research Center for Membrane Transporters, Brain Korea 21 PLUS Project for Medical Sciences, Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul 120-752, Korea
- Ji Hyun Lee
  - Department of Oral Biology, Yonsei University College of Dentistry, Seoul 120-752, Korea
- Duhee Bang
  - Department of Chemistry, Yonsei University, Seoul 120-752, Korea
|
12
|
Lau HY, Palanisamy R, Trau M, Botella JR. Molecular inversion probe: a new tool for highly specific detection of plant pathogens. PLoS One 2014; 9:e111182. [PMID: 25343255 PMCID: PMC4208852 DOI: 10.1371/journal.pone.0111182] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/04/2014] [Accepted: 09/28/2014] [Indexed: 11/18/2022] Open
Abstract
Highly specific detection methods, capable of reliably identifying plant pathogens are crucial in plant disease management strategies to reduce losses in agriculture by preventing the spread of diseases. We describe a novel molecular inversion probe (MIP) assay that can be potentially developed into a robust multiplex platform to detect and identify plant pathogens. A MIP has been designed for the plant pathogenic fungus Fusarium oxysporum f.sp. conglutinans and the proof of concept for the efficiency of this technology is provided. We demonstrate that this methodology can detect as little as 2.5 ng of pathogen DNA and is highly specific, being able to accurately differentiate Fusarium oxysporum f.sp. conglutinans from other fungal pathogens such as Botrytis cinerea and even pathogens of the same species such as Fusarium oxysporum f.sp. lycopersici. The MIP assay was able to detect the presence of the pathogen in infected Arabidopsis thaliana plants as soon as the tissues contained minimal amounts of pathogen. MIP methods are intrinsically highly multiplexable and future development of specific MIPs could lead to the establishment of a diagnostic method that could potentially screen infected plants for hundreds of pathogens in a single assay.
Affiliation(s)
- Han Yih Lau
  - Plant Genetic Engineering Laboratory, School of Agriculture and Food Sciences, University of Queensland, Brisbane, Queensland, Australia
  - Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, Queensland, Australia
- Ramkumar Palanisamy
  - Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, Queensland, Australia
- Matt Trau
  - Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, Queensland, Australia
- Jose R. Botella
  - Plant Genetic Engineering Laboratory, School of Agriculture and Food Sciences, University of Queensland, Brisbane, Queensland, Australia
|
13
|
Kosuri S, Church GM. Large-scale de novo DNA synthesis: technologies and applications. Nat Methods 2014; 11:499-507. [PMID: 24781323 PMCID: PMC7098426 DOI: 10.1038/nmeth.2918] [Citation(s) in RCA: 486] [Impact Index Per Article: 48.6] [Received: 12/10/2013] [Accepted: 03/10/2014] [Indexed: 12/23/2022]
Abstract
For over 60 years, the synthetic production of new DNA sequences has helped researchers understand and engineer biology. Here we summarize methods and caveats for the de novo synthesis of DNA, with particular emphasis on recent technologies that allow for large-scale and low-cost production. In addition, we discuss emerging applications enabled by large-scale de novo DNA constructs, as well as the challenges and opportunities that lie ahead.
Affiliation(s)
- Sriram Kosuri
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, USA
- George M Church
  - [1] Wyss Institute for Biologically Inspired Engineering, Boston, Massachusetts, USA. [2] Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
|
14
|
Boyle EA, O'Roak BJ, Martin BK, Kumar A, Shendure J. MIPgen: optimized modeling and design of molecular inversion probes for targeted resequencing. Bioinformatics 2014; 30:2670-2. [PMID: 24867941 DOI: 10.1093/bioinformatics/btu353] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Indexed: 01/06/2023]
Abstract
Molecular inversion probes (MIPs) enable cost-effective multiplex targeted gene resequencing in large cohorts. However, the design of individual MIPs is a critical parameter governing the performance of this technology with respect to capture uniformity and specificity. MIPgen is a user-friendly package that simplifies the process of designing custom MIP assays to arbitrary targets. New logistic and SVM-derived models enable in silico predictions of assay success, and assay redesign exhibits improved coverage uniformity relative to previous methods, which in turn improves the utility of MIPs for cost-effective targeted sequencing for candidate gene validation and for diagnostic sequencing in a clinical setting. Availability and implementation: MIPgen is implemented in C++. Source code and accompanying Python scripts are available at http://shendurelab.github.io/MIPGEN/.
Affiliation(s)
- Evan A Boyle
  - Department of Genome Sciences, University of Washington, Seattle, WA 98105 and Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA
- Brian J O'Roak
  - Department of Genome Sciences, University of Washington, Seattle, WA 98105 and Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA
- Beth K Martin
  - Department of Genome Sciences, University of Washington, Seattle, WA 98105 and Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA
- Akash Kumar
  - Department of Genome Sciences, University of Washington, Seattle, WA 98105 and Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA 98105 and Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA
|
15
|
|
16
|
Shapiro E, Biezuner T, Linnarsson S. Single-cell sequencing-based technologies will revolutionize whole-organism science. Nat Rev Genet 2013; 14:618-30. [PMID: 23897237 DOI: 10.1038/nrg3542] [Citation(s) in RCA: 774] [Impact Index Per Article: 70.4] [Indexed: 02/08/2023]
Abstract
The unabated progress in next-generation sequencing technologies is fostering a wave of new genomics, epigenomics, transcriptomics and proteomics technologies. These sequencing-based technologies are increasingly being targeted to individual cells, which will allow many new and longstanding questions to be addressed. For example, single-cell genomics will help to uncover cell lineage relationships; single-cell transcriptomics will supplant the coarse notion of marker-based cell types; and single-cell epigenomics and proteomics will allow the functional states of individual cells to be analysed. These technologies will become integrated within a decade or so, enabling high-throughput, multi-dimensional analyses of individual cells that will produce detailed knowledge of the cell lineage trees of higher organisms, including humans. Such studies will have important implications for both basic biological research and medicine.
Affiliation(s)
- Ehud Shapiro
  - [1] Department of Computer Science and Applied Math, Weizmann Institute of Science, Rehovot 76100, Israel. [2] Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
|
17
|
Palanisamy R, Connolly AR, Trau M. Accurate detection of methylated cytosine in complex methylation landscapes. Anal Chem 2013; 85:6575-9. [PMID: 23768008 DOI: 10.1021/ac3031948] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
Abstract
Monitoring DNA methylation can be a useful biomarker for disease diagnosis and prognosis. However, monitoring the methylation status of a specific cytosine biomarker is often confounded by heterogeneous peripheral DNA methylation. To address this issue, molecular inversion probes were designed with inosine strategically positioned to complement suspected DNA methylation sites. This enabled the methylation status of a specific cytosine to be accurately measured with a high level of specificity, irrespective of adjacent epigenetic modifications.
Affiliation(s)
- Ramkumar Palanisamy
  - Centre for Biomarker Research and Development, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia
|
18
|
Schweiger MR, Barmeyer C, Timmermann B. Genomics and epigenomics: new promises of personalized medicine for cancer patients. Brief Funct Genomics 2013; 12:411-21. [PMID: 23814132 DOI: 10.1093/bfgp/elt024] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/05/2023] Open
Abstract
Recent years have brought about a marked extension of our understanding of the somatic basis of cancer. Parallel to the large-scale investigation of diverse tumor genomes the knowledge arose that cancer pathologies are most often not restricted to single genomic events. In contrast, a large number of different alterations in the genomes and epigenomes come together and promote the malignant transformation. The combination of mutations, structural variations and epigenetic alterations differs between each tumor, making individual diagnosis and treatment strategies necessary. This view is summarized in the new discipline of personalized medicine. To satisfy the ideas of this approach each tumor needs to be fully characterized and individual diagnostic and therapeutic strategies designed. Here, we will discuss the power of high-throughput sequencing technologies for genomic and epigenomic analyses. We will provide insight into the current status and how these technologies can be transferred to routine clinical usage.
Affiliation(s)
- Michal-Ruth Schweiger
  - Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany. Tel.: +49 30 84131339; Fax: +49 30 84131380
|
19
|
McGraw S, Shojaei Saadi HA, Robert C. Meeting the methodological challenges in molecular mapping of the embryonic epigenome. Mol Hum Reprod 2013; 19:809-27. [PMID: 23783346 DOI: 10.1093/molehr/gat046] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/27/2022] Open
Abstract
The past decade of life sciences research has been driven by progress in genomics. Many voices are already proclaiming the post-genomics era, in which phenomena other than sequence polymorphism influence gene expression and also explain complex phenotypes. One of these burgeoning fields is the study of the epigenome. Although the mechanisms by which chromatin structure and reorganization as well as cytosine methylation influence gene expression are not fully understood, they are being invoked to explain the now-accepted long-term impact of the environment on gene expression, which appears to be a factor in the development of numerous diseases. Such studies are particularly relevant in early embryonic development, during which waves of epigenetic reprogramming are known to have profound impacts. Since gametes and zygotes are in the process of resetting the genome in order to create embryonic stem cells that will each differentiate to create one of many specific tissue types, this phase of life is now viewed as a window of susceptibility to epigenetic reprogramming errors. Epigenetics could explain the influence of factors such as the nutritional/metabolic status of the mother or the artificial environment of assisted reproductive technologies. However, the peculiar nature of early embryos in addition to their scarcity poses numerous technological challenges that are slowly being overcome. The principal aim of this article is to review the suitability of various current and emerging technological platforms for studying the oocyte and early embryonic epigenome, with more emphasis on DNA methylation. Furthermore, the constraint of sample size, inherent to the study of preimplantation embryo development, is put in perspective with the various molecular platforms described.
Affiliation(s)
- Serge McGraw
  - Department of Human Genetics, Montreal Children's Hospital Research Institute, McGill University, Montréal, QC H3Z 2Z3, Canada
|
20
|
Abstract
The functional impact of aberrant DNA methylation and the widespread alterations in DNA methylation in cancer development have led to the development of a variety of methods to characterize the DNA methylation patterns. This chapter critiques and describes the major approaches to analyzing DNA methylation.
|
21
|
Liu X, Wang J, Chen L. Whole-exome sequencing reveals recurrent somatic mutation networks in cancer. Cancer Lett 2012; 340:270-6. [PMID: 23153794 DOI: 10.1016/j.canlet.2012.11.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 08/30/2012] [Revised: 10/30/2012] [Accepted: 11/02/2012] [Indexed: 11/26/2022]
Abstract
The second-generation sequencing technologies have been extensively used to reveal the mechanism of tumorigenesis and find critical genes in cancer progression that can be potential targets of clinic treatment. Exome is a part of genome formed by exons which are the protein-coding portions of genes. The whole-exome sequencing information can reflect the mutations of the protein-coding region in the genome and depict the causal relationship between the mutations and phenotypes. Now, many network-based methods have been developed to identify cancer driver modules or pathways, which not only provide new insights into molecular mechanism of disease progression at network level but also can avoid low coverage or lowly recurrent on disease samples in contrast to individual driver genes. In this review, we focus on the recent advances on network-based methods for identifying cancer driver modules or pathways, including methods of whole-exome sequencing, somatic mutation detection, driver mutation identification, and mutation network reconstruction.
Affiliation(s)
- Xiaoping Liu
  - Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
|
22
|
Abstract
Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board-approved "open consent" process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain; we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research.
|
23
|
Ritari J, Koskinen K, Hultman J, Kurola JM, Kymäläinen M, Romantschuk M, Paulin L, Auvinen P. Molecular analysis of meso- and thermophilic microbiota associated with anaerobic biowaste degradation. BMC Microbiol 2012; 12:121. [PMID: 22727142 PMCID: PMC3408363 DOI: 10.1186/1471-2180-12-121] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 11/04/2011] [Accepted: 06/22/2012] [Indexed: 12/25/2022] Open
Abstract
Background: Microbial anaerobic digestion (AD) is used as a waste treatment process to degrade complex organic compounds into methane. The archaeal and bacterial taxa involved in AD are well known, whereas composition of the fungal community in the process has been less studied. The present study aimed to reveal the composition of archaeal, bacterial and fungal communities in response to increasing organic loading in mesophilic and thermophilic AD processes by applying 454 amplicon sequencing technology. Furthermore, a DNA microarray method was evaluated in order to develop a tool for monitoring the microbiological status of AD. Results: The 454 sequencing showed that the diversity and number of bacterial taxa decreased with increasing organic load, while archaeal, i.e. methanogenic, taxa remained more constant. The number and diversity of fungal taxa increased during the process and varied less in composition with process temperature than bacterial and archaeal taxa, even though the fungal diversity increased with temperature as well. Evaluation of the microarray using AD sample DNA showed correlation of signal intensities with sequence read numbers of corresponding target groups. The sensitivity of the test was found to be about 1%. Conclusions: The fungal community survives in anoxic conditions and grows with increasing organic loading, suggesting that Fungi may contribute to the digestion by metabolising organic nutrients for bacterial and methanogenic groups. The microarray proof of principle tests suggest that the method has the potential for semiquantitative detection of target microbial groups given that comprehensive sequence data is available for probe design.
Affiliation(s)
- Jarmo Ritari
  - Institute of Biotechnology, University of Helsinki, Viikinkaari 4, 00790, Helsinki, Finland.
|
24
|
Jiang Y, Guo Y, Wang P, Dong Q, Opriessnig T, Cheng J, Xu H, Ding X, Guo J. A novel diagnostic platform based on multiplex ligase detection–PCR and microarray for simultaneous detection of swine viruses. J Virol Methods 2011; 178:171-8. [DOI: 10.1016/j.jviromet.2011.09.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/31/2010] [Revised: 08/30/2011] [Accepted: 09/12/2011] [Indexed: 10/17/2022]
|
25
|
Jiménez-Gómez JM. Next generation quantitative genetics in plants. Frontiers in Plant Science 2011; 2:77. [PMID: 22645550 PMCID: PMC3355736 DOI: 10.3389/fpls.2011.00077] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/29/2011] [Accepted: 10/23/2011] [Indexed: 05/31/2023]
Abstract
Most characteristics in living organisms show continuous variation, which suggests that they are controlled by multiple genes. Quantitative trait loci (QTL) analysis can identify the genes underlying continuous traits by establishing associations between genetic markers and observed phenotypic variation in a segregating population. The new high-throughput sequencing (HTS) technologies greatly facilitate QTL analysis by providing genetic markers at genome-wide resolution in any species without previous knowledge of its genome. In addition HTS serves to quantify molecular phenotypes, which aids to identify the loci responsible for QTLs and to understand the mechanisms underlying diversity. The constant improvements in price, experimental protocols, computational pipelines, and statistical frameworks are making feasible the use of HTS for any research group interested in quantitative genetics. In this review I discuss the application of HTS for molecular marker discovery, population genotyping, and expression profiling in QTL analysis.
Affiliation(s)
- José M. Jiménez-Gómez
  - Department of Plant Breeding and Genetics, Max Planck Institute for Plant Breeding Research, Köln, Germany
|
26
|
Moorthie S, Mattocks CJ, Wright CF. Review of massively parallel DNA sequencing technologies. The HUGO Journal 2011. [PMID: 23205160 DOI: 10.1007/s11568-011-9156-3] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Indexed: 12/21/2022]
Abstract
Since the development of technologies that can determine the base-pair sequence of DNA, the ability to sequence genes has contributed much to science and medicine. However, it has remained a relatively costly and laborious process, hindering its use as a routine biomedical tool. Recent times are seeing rapid developments in this field, both in the availability of novel sequencing platforms, as well as supporting technologies involved in processes such as targeting and data analysis. This is leading to significant reductions in the cost of sequencing a human genome and the potential for its use as a routine biomedical tool. This review is a snapshot of this rapidly moving field examining the current state of the art, forthcoming developments and some of the issues still to be resolved prior to the use of new sequencing technologies in routine clinical diagnosis.
|
27
|
Cooper DN, Bacolla A, Férec C, Vasquez KM, Kehrer-Sawatzki H, Chen JM. On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease. Hum Mutat 2011; 32:1075-99. [PMID: 21853507 PMCID: PMC3177966 DOI: 10.1002/humu.21557] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.9] [Received: 03/27/2011] [Accepted: 06/17/2011] [Indexed: 12/21/2022]
Abstract
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease.
Affiliation(s)
- David N Cooper
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom.
|
28
|
Nielsen R, Paul JS, Albrechtsen A, Song YS. Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet 2011; 12:443-51. [PMID: 21587300 PMCID: PMC3593722 DOI: 10.1038/nrg2986] [Citation(s) in RCA: 877] [Impact Index Per Article: 67.5] [Indexed: 12/14/2022]
Abstract
Meaningful analysis of next-generation sequencing (NGS) data, which are produced extensively by genetics and genomics studies, relies crucially on the accurate calling of SNPs and genotypes. Recently developed statistical methods both improve and quantify the considerable uncertainty associated with genotype calling, and will especially benefit the growing number of studies using low- to medium-coverage data. We review these methods and provide a guide for their use in NGS studies.
Affiliation(s)
- Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA.
29
Niedringhaus TP, Milanova D, Kerby MB, Snyder MP, Barron AE. Landscape of next-generation sequencing technologies. Anal Chem 2011; 83:4327-41. [PMID: 21612267 DOI: 10.1021/ac2010857] [Citation(s) in RCA: 180] [Impact Index Per Article: 13.8] [Indexed: 01/19/2023]
30
Methylation-mediated deamination of 5-methylcytosine appears to give rise to mutations causing human inherited disease in CpNpG trinucleotides, as well as in CpG dinucleotides. Hum Genomics 2011; 4:406-10. [PMID: 20846930 PMCID: PMC3525222 DOI: 10.1186/1479-7364-4-6-406] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Indexed: 11/10/2022] Open
Abstract
The cytosine-guanine (CpG) dinucleotide has long been known to be a hotspot for pathological mutation in the human genome. This hypermutability is related to its role as the major site of cytosine methylation with the attendant risk of spontaneous deamination of 5-methylcytosine (5mC) to yield thymine. Cytosine methylation, however, also occurs in the context of CpNpG sites in the human genome, an unsurprising finding since the intrinsic symmetry of CpNpG renders it capable of supporting a semi-conservative model of replication of the methylation pattern. Recently, it has become clear that significant DNA methylation occurs in a CpHpG context (where H = A, C or T) in a variety of human somatic tissues. If we assume that CpHpG methylation also occurs in the germline, and that 5mC deamination can occur within a CpHpG context, then we might surmise that methylated CpHpG sites could also constitute mutation hotspots causing human genetic disease. To test this postulate, 54,625 missense and nonsense mutations from 2,113 genes causing inherited disease were retrieved from the Human Gene Mutation Database (http://www.hgmd.org). Some 18.2 per cent of these pathological lesions were found to be C → T and G → A transitions located in CpG dinucleotides (compatible with a model of methylation-mediated deamination of 5mC), an approximately ten-fold higher proportion than would have been expected by chance alone. The corresponding proportion for the CpHpG trinucleotide was 9.9 per cent, an approximately two-fold higher proportion than would have been expected by chance. We therefore estimate that ∼5 per cent of missense/nonsense mutations causing human inherited disease may be attributable to methylation-mediated deamination of 5mC within a CpHpG context.
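As a rough consistency check on the final estimate (an illustrative reading of the figures quoted above, not a calculation taken from the paper itself): if the observed CpHpG proportion is about twice the chance expectation, the expected share is about half of 9.9 per cent, and the excess over expectation is what remains.

```latex
\[
p_{\text{exp}} \approx \frac{9.9\%}{2} \approx 5\%,
\qquad
p_{\text{obs}} - p_{\text{exp}} \approx 9.9\% - 5\% \approx 5\%
\]
```

Under this reading, the excess of roughly 5 per cent matches the share of missense/nonsense mutations that the abstract attributes to 5mC deamination in a CpHpG context.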
31
Kiialainen A, Karlberg O, Ahlford A, Sigurdsson S, Lindblad-Toh K, Syvänen AC. Performance of microarray and liquid based capture methods for target enrichment for massively parallel sequencing and SNP discovery. PLoS One 2011; 6:e16486. [PMID: 21347407 PMCID: PMC3036585 DOI: 10.1371/journal.pone.0016486] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 09/28/2010] [Accepted: 12/21/2010] [Indexed: 11/18/2022] Open
Abstract
Targeted sequencing is a cost-efficient way to obtain answers to biological questions in many projects, but the choice of the enrichment method to use can be difficult. In this study we compared two hybridization methods for target enrichment for massively parallel sequencing and single nucleotide polymorphism (SNP) discovery, namely Nimblegen sequence capture arrays and the SureSelect liquid-based hybrid capture system. We prepared sequencing libraries from three HapMap samples using both methods, sequenced the libraries on the Illumina Genome Analyzer, mapped the sequencing reads back to the genome, and called variants in the sequences. 74-75% of the sequence reads originated from the targeted region in the SureSelect libraries and 41-67% in the Nimblegen libraries. We could sequence up to 99.9% and 99.5% of the regions targeted by capture probes from the SureSelect libraries and from the Nimblegen libraries, respectively. The Nimblegen probes covered 0.6 Mb more of the original 3.1 Mb target region than the SureSelect probes. In each sample, we called more SNPs and detected more novel SNPs from the libraries that were prepared using the Nimblegen method. Thus the Nimblegen method gave better results when judged by the number of SNPs called, but this came at the cost of more over-sampling.
Affiliation(s)
- Anna Kiialainen
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden.
32
Abstract
What are the key considerations to take into account when large-scale epigenomics projects are being implemented?
33
Voelkerding KV, Dames S, Durtschi JD. Next generation sequencing for clinical diagnostics-principles and application to targeted resequencing for hypertrophic cardiomyopathy: a paper from the 2009 William Beaumont Hospital Symposium on Molecular Pathology. J Mol Diagn 2010; 12:539-51. [PMID: 20805560 DOI: 10.2353/jmoldx.2010.100043] [Citation(s) in RCA: 96] [Impact Index Per Article: 7.4] [Indexed: 12/26/2022] Open
Abstract
During the past five years, new high-throughput DNA sequencing technologies have emerged; these technologies are collectively referred to as next generation sequencing (NGS). By virtue of sequencing clonally amplified DNA templates or single DNA molecules in a massively parallel fashion in a flow cell, NGS provides both qualitative and quantitative sequence data. This combination of information has made NGS the technology of choice for complex genetic analyses that were previously either technically infeasible or cost prohibitive. As a result, NGS has had a fundamental and broad impact on many facets of biomedical research. In contrast, the dissemination of NGS into the clinical diagnostic realm is in its early stages. Though NGS is powerful and can be envisioned to have multiple applications in clinical diagnostics, the technology is currently complex. Successful adoption of NGS into the clinical laboratory will require expertise in both molecular biology techniques and bioinformatics. The current report presents principles that underlie NGS including sequencing library preparation, sequencing chemistries, and an introduction to NGS data analysis. These concepts are subsequently further illustrated by showing representative results from a case study using NGS for targeted resequencing of genes implicated in hypertrophic cardiomyopathy.
34
Bioinformatic Approaches for Identification of A-to-I Editing Sites. Curr Top Microbiol Immunol 2011; 353:145-62. [DOI: 10.1007/82_2011_147] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023]
35
Profiling epigenetic alterations in disease. Advances in Experimental Medicine and Biology 2011; 711:162-77. [PMID: 21627049 DOI: 10.1007/978-1-4419-8216-2_12] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 12/27/2022]
Abstract
Nowadays, epigenetics is one of the fastest growing research areas in biomedicine. Studies have demonstrated that changes in the epigenome are not only common in cancer, but are also involved in the pathogenesis of noncancerous diseases like immunological, cardiovascular, developmental and neurological/psychiatric disorders. At the same time, during the last years, a technological revolution has taken place in the field of epigenomics, which is defined as the study of epigenetic changes throughout the whole genome. Microarray technologies and more recently, the development of next generation sequencing devices are now providing researchers with tools to draw high-resolution maps of DNA methylation and histone modifications in normal tissues and diseases. This chapter will review the currently available high-throughput techniques for studying the epigenome and their applications for characterizing human diseases.
36
Wong E, Wei CL. Genome-wide distribution of DNA methylation at single-nucleotide resolution. Progress in Molecular Biology and Translational Science 2011; 101:459-77. [PMID: 21507362 DOI: 10.1016/b978-0-12-387685-0.00015-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/07/2023]
Abstract
DNA methylation, a well-known epigenetic modification in mammalian genomes, is important for development and health. Dysregulation of DNA methylation can cause abnormal gene regulation, leading to anomalous development and diseases. Until recently, the ability to understand the functions and dynamics of DNA methylation was limited by the availability of technologies for comprehensively characterizing methylation on a genome-wide scale. Rapid advances in high-throughput approaches (particularly next-generation sequencing), coupled with molecular techniques, have enabled unbiased genome-wide profiling of DNA modifications at single-base resolution and helped to elucidate their impact on gene regulation. Here, we discuss the development of genomic approaches to decipher the global methylome at single-base resolution, the challenges faced, and the emerging new insights. Our ability to decipher this important epigenetic modification and how it impacts gene expression will provide a framework for understanding numerous disease mechanisms, and suggest means to treat or prevent them in the future.
Affiliation(s)
- Eleanor Wong
- Genome Technology and Biology, Genome Institute of Singapore, Singapore
37
Kenny EM, Cormican P, Gilks WP, Gates AS, O'Dushlaine CT, Pinto C, Corvin AP, Gill M, Morris DW. Multiplex target enrichment using DNA indexing for ultra-high throughput SNP detection. DNA Res 2010; 18:31-8. [PMID: 21163834 PMCID: PMC3041504 DOI: 10.1093/dnares/dsq029] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022] Open
Abstract
Screening large numbers of target regions in multiple DNA samples for sequence variation is an important application of next-generation sequencing but an efficient method to enrich the samples in parallel has yet to be reported. We describe an advanced method that combines DNA samples using indexes or barcodes prior to target enrichment to facilitate this type of experiment. Sequencing libraries for multiple individual DNA samples, each incorporating a unique 6-bp index, are combined in equal quantities, enriched using a single in-solution target enrichment assay and sequenced in a single reaction. Sequence reads are parsed based on the index, allowing sequence analysis of individual samples. We show that the use of indexed samples does not impact on the efficiency of the enrichment reaction. For three- and nine-indexed HapMap DNA samples, the method was found to be highly accurate for SNP identification. Even with sequence coverage as low as 8x, 99% of sequence SNP calls were concordant with known genotypes. Within a single experiment, this method can sequence the exonic regions of hundreds of genes in tens of samples for sequence and structural variation using as little as 1 μg of input DNA per sample.
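The read-parsing step described above (assigning each read to a sample by its 6-bp index) might be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes, hypothetically, that the index occupies the first six bases of each read and that unrecognized indexes are set aside. The function name `demultiplex`, the `"unassigned"` label, and the sample names are all invented for the example.

```python
from collections import defaultdict

INDEX_LENGTH = 6  # the abstract describes a unique 6-bp index per sample


def demultiplex(reads, index_to_sample):
    """Group reads by sample using the index at the start of each read.

    reads: iterable of read sequences (strings); the index is ASSUMED to be
           the first INDEX_LENGTH bases (a simplification for illustration).
    index_to_sample: dict mapping index sequence -> sample name.
    Returns a dict mapping sample name -> list of reads with the index removed.
    """
    by_sample = defaultdict(list)
    for read in reads:
        index, insert = read[:INDEX_LENGTH], read[INDEX_LENGTH:]
        # Reads whose index matches no known sample go to a separate bin.
        sample = index_to_sample.get(index, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


if __name__ == "__main__":
    # Hypothetical indexes and reads, for demonstration only.
    samples = {"ACGTAC": "sample_1", "TTGCAA": "sample_2"}
    reads = ["ACGTACTTGCA", "TTGCAAGGGCC", "NNNNNNAAAA"]
    print(demultiplex(reads, samples))
```

Exact matching is the simplest choice; a production demultiplexer would typically also tolerate a sequencing error in the index, which this sketch deliberately leaves out.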
Affiliation(s)
- Elaine M Kenny
- Trinity Genome Sequencing Laboratory, Neuropsychiatric Genetics Research Group, Department of Psychiatry, Institute of Molecular Medicine, Trinity College Dublin, Ireland.
38
Scalable gene synthesis by selective amplification of DNA pools from high-fidelity microchips. Nat Biotechnol 2010; 28:1295-9. [PMID: 21113165 PMCID: PMC3139991 DOI: 10.1038/nbt.1716] [Citation(s) in RCA: 187] [Impact Index Per Article: 13.4] [Received: 08/03/2010] [Accepted: 10/25/2010] [Indexed: 12/14/2022]
Abstract
Development of cheap, high-throughput, and reliable gene synthesis methods will broadly stimulate progress in biology and biotechnology [1]. Currently, the reliance on column-synthesized oligonucleotides as a source of DNA limits further cost reductions in gene synthesis [2]. Oligonucleotides from DNA microchips can reduce costs by at least an order of magnitude [3,4,5], yet efforts to scale their use have been largely unsuccessful due to the high error rates and complexity of the oligonucleotide mixtures. Here we use high-fidelity DNA microchips, selective oligonucleotide pool amplification, optimized gene assembly protocols, and enzymatic error correction to develop a highly parallel gene synthesis platform. We tested our platform by assembling 47 genes, including 42 challenging therapeutic antibody sequences, encoding a total of ~35 kilo-basepairs of DNA. These assemblies were performed from a complex background containing 13,000 oligonucleotides encoding ~2.5 megabases of DNA, which is at least 50 times larger than previously published attempts.
39
Gowrisankar S, Lerner-Ellis JP, Cox S, White ET, Manion M, LeVan K, Liu J, Farwell LM, Iartchouk O, Rehm HL, Funke BH. Evaluation of second-generation sequencing of 19 dilated cardiomyopathy genes for clinical applications. J Mol Diagn 2010; 12:818-27. [PMID: 20864638 DOI: 10.2353/jmoldx.2010.100014] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 01/01/2023] Open
Abstract
Medical sequencing for diseases with locus and allelic heterogeneities has been limited by the high cost and low throughput of traditional sequencing technologies. "Second-generation" sequencing (SGS) technologies allow the parallel processing of a large number of genes and, therefore, offer great promise for medical sequencing; however, their use in clinical laboratories is still in its infancy. Our laboratory offers clinical resequencing for dilated cardiomyopathy (DCM) using an array-based platform that interrogates 19 of more than 30 genes known to cause DCM. We explored both the feasibility and cost effectiveness of using PCR amplification followed by SGS technology for sequencing these 19 genes in a set of five samples enriched for known sequence alterations (109 unique substitutions and 27 insertions and deletions). While the analytical sensitivity for substitutions was comparable to that of the DCM array (98%), SGS technology performed better than the DCM array for insertions and deletions (90.6% versus 58%). Overall, SGS performed substantially better than did the current array-based testing platform; however, the operational cost and projected turnaround time do not meet our current standards. Therefore, efficient capture methods and/or sample pooling strategies that shorten the turnaround time and decrease reagent and labor costs are needed before implementing this platform into routine clinical applications.
Affiliation(s)
- Sivakumar Gowrisankar
- Laboratory for Molecular Medicine, Partners HealthCare Center for Personalized Genetic Medicine, 65 Landsdowne St., Cambridge, MA 02139, USA
40
Martin ER, Kinnamon DD, Schmidt MA, Powell EH, Zuchner S, Morris RW. SeqEM: an adaptive genotype-calling approach for next-generation sequencing studies. Bioinformatics 2010; 26:2803-10. [PMID: 20861027 PMCID: PMC2971572 DOI: 10.1093/bioinformatics/btq526] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 11/22/2022]
Abstract
Motivation: Next-generation sequencing presents several statistical challenges, with one of the most fundamental being determining an individual's genotype from multiple aligned short read sequences at a position. Some simple approaches for genotype calling apply fixed filters, such as calling a heterozygote if more than a specified percentage of the reads have variant nucleotide calls. Other genotype-calling methods, such as MAQ and SOAPsnp, are implementations of Bayes classifiers in that they classify genotypes using posterior genotype probabilities. Results: Here, we propose a novel genotype-calling algorithm that, in contrast to the other methods, estimates parameters underlying the posterior probabilities in an adaptive way rather than arbitrarily specifying them a priori. The algorithm, which we call SeqEM, applies the well-known Expectation-Maximization algorithm to an appropriate likelihood for a sample of unrelated individuals with next-generation sequence data, leveraging information from the sample to estimate genotype probabilities and the nucleotide-read error rate. We demonstrate using analytic calculations and simulations that SeqEM results in genotype-call error rates as small as or smaller than filtering approaches and MAQ. We also apply SeqEM to exome sequence data in eight related individuals and compare the results to genotypes from an Illumina SNP array, showing that SeqEM behaves well in real data that deviates from idealized assumptions. Conclusion: SeqEM offers an improved, robust and flexible genotype-calling approach that can be widely applied in next-generation sequencing studies. Availability and implementation: Software for SeqEM is freely available from our website: www.hihg.org under Software Download. Contact: emartin1@med.miami.edu Supplementary information: Supplementary data are available at Bioinformatics online.
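The idea of estimating genotype probabilities and the read error rate jointly by Expectation-Maximization can be illustrated with a small sketch. This is not the SeqEM software and makes simplifying assumptions that the abstract does not state: one biallelic site, a binomial model for variant read counts, a symmetric error rate, and three unconstrained genotype frequencies. All function and variable names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k variant reads out of n when each is variant with prob p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def em_genotype_calls(counts, n_iter=50, error=0.01):
    """Toy EM for genotype calling at a single site (illustrative only).

    counts: list of (variant_reads, total_reads), one pair per individual.
    Genotype g in {0, 1, 2} is the number of variant alleles carried.
    Returns (calls, genotype_freqs, error_rate).
    """
    freqs = [1 / 3, 1 / 3, 1 / 3]  # starting genotype frequencies
    post = []
    for _ in range(n_iter):
        # Probability that a single read shows the variant, per genotype.
        probs = [error, 0.5, 1 - error]
        # E-step: posterior genotype probabilities for each individual.
        post = []
        for k, n in counts:
            w = [freqs[g] * binom_pmf(k, n, probs[g]) for g in range(3)]
            total = sum(w)
            post.append([x / total for x in w])
        # M-step: genotype frequencies are the mean posterior weights.
        freqs = [sum(p[g] for p in post) / len(post) for g in range(3)]
        # M-step: error rate is the expected share of mismatching reads
        # among reads from homozygous individuals.
        mismatches = sum(p[0] * k + p[2] * (n - k) for p, (k, n) in zip(post, counts))
        hom_reads = sum((p[0] + p[2]) * n for p, (k, n) in zip(post, counts))
        if hom_reads > 0:
            # Keep the estimate strictly inside (0, 0.5) for numerical safety.
            error = min(max(mismatches / hom_reads, 1e-6), 0.5 - 1e-6)
    calls = [max(range(3), key=lambda g: p[g]) for p in post]
    return calls, freqs, error


if __name__ == "__main__":
    # Hypothetical (variant_reads, total_reads) pairs for six individuals.
    data = [(0, 20), (1, 25), (9, 20), (11, 22), (19, 20), (30, 30)]
    calls, freqs, err = em_genotype_calls(data)
    print(calls, [round(f, 3) for f in freqs], round(err, 4))
```

In contrast to a fixed filter, the error rate here is learned from the sample itself, which is the adaptive aspect the abstract emphasizes. A faithful implementation would follow the likelihood defined in the paper.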
Affiliation(s)
- E R Martin
- John P. Hussman Institute for Human Genomics and the Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, Florida, USA.
41
Teer JK, Bonnycastle LL, Chines PS, Hansen NF, Aoyama N, Swift AJ, Abaan HO, Albert TJ, Margulies EH, Green ED, Collins FS, Mullikin JC, Biesecker LG. Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing. Genome Res 2010; 20:1420-31. [PMID: 20810667 DOI: 10.1101/gr.106716.110] [Citation(s) in RCA: 187] [Impact Index Per Article: 13.4] [Indexed: 01/08/2023]
Abstract
Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data.
Affiliation(s)
- Jamie K Teer
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
42
Abstract
The development of massively parallel sequencing technologies, coupled with new massively parallel DNA enrichment technologies (genomic capture), has allowed the sequencing of targeted regions of the human genome in rapidly increasing numbers of samples. Genomic capture can target specific areas in the genome, including genes of interest and linkage regions, but this limits the study to what is already known. Exome capture allows an unbiased investigation of the complete protein-coding regions in the genome. Researchers can use exome capture to focus on a critical part of the human genome, allowing larger numbers of samples than are currently practical with whole-genome sequencing. In this review, we briefly describe some of the methodologies currently used for genomic and exome capture and highlight recent applications of this technology.
Affiliation(s)
- Jamie K Teer
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, 5625 Fishers Lane, Bethesda, MD 20892, USA
43
Abstract
Aberrant DNA methylation in the genome is found in almost all types of cancer and contributes to malignant transformation by silencing multiple tumour-suppressor genes, sometimes simultaneously. Therefore, deciphering the signature of DNA methylation in each tumour is required to better understand tumour behaviour and might be of benefit for clinical diagnostics and therapy. Recent technologies for high-throughput genome-wide DNA methylation analyses are promising and potent tools for epigenetic profiling. Since epigenetic therapy is now in clinical use or trials for several types of cancers, efficient epigenetic profiling is required. In this review, the current key technologies available to assess genome-wide DNA methylation are introduced and the implications of DNA methylation profiling in human cancers are discussed.
44
Hellman A, Chess A. Extensive sequence-influenced DNA methylation polymorphism in the human genome. Epigenetics Chromatin 2010; 3:11. [PMID: 20497546 PMCID: PMC2893533 DOI: 10.1186/1756-8935-3-11] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Received: 04/24/2010] [Accepted: 05/24/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also interesting patterns of CpG methylation found outside of CpG islands. RESULTS We compared DNA methylation patterns on both alleles between many pairs (and larger groups) of related and unrelated individuals. Direct observation and simulation experiments revealed that around 10% of common single nucleotide polymorphisms (SNPs) reside in regions with differences in the propensity for local DNA methylation between the two alleles. We further showed that for the most common form of SNP, a polymorphism at a CpG dinucleotide, the presence of the CpG at the SNP positively affected local DNA methylation in cis. CONCLUSIONS Taken together with the known effect of DNA methylation on mutation rate, our results suggest an interesting interdependence between genetics and epigenetics underlying diversity in the human genome.
Affiliation(s)
- Asaf Hellman
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, The Hebrew University-Hadassah Medical School, Jerusalem 91120, Israel.
45
Wang H, Chattopadhyay A, Li Z, Daines B, Li Y, Gao C, Gibbs R, Zhang K, Chen R. Rapid identification of heterozygous mutations in Drosophila melanogaster using genomic capture sequencing. Genome Res 2010; 20:981-8. [PMID: 20472684 DOI: 10.1101/gr.102921.109] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 01/19/2023]
Abstract
One of the key advantages of using Drosophila melanogaster as a genetic model organism is the ability to conduct saturation mutagenesis screens to identify genes and pathways underlying a given phenotype. Despite the large number of genetic tools developed to facilitate downstream cloning of mutations obtained from such screens, the current procedure remains labor intensive, time consuming, and costly. To address this issue, we designed an efficient strategy for rapid identification of heterozygous mutations in the fly genome by combining rough genetic mapping, targeted DNA capture, and second generation sequencing technology. We first tested this method on heterozygous flies carrying either a previously characterized dac(5) or sens(E2) mutation. Targeted amplification of genomic regions near these two loci was used to enrich DNA for sequencing, and both point mutations were successfully identified. When this method was applied to uncharacterized twr mutant flies, the underlying mutation was identified as a single-base mutation in the gene Spase18-21. This targeted-genome-sequencing method reduces time and effort required for mutation cloning by up to 80% compared with the current approach and lowers the cost to <$1000 for each mutant. Introduction of this and other sequencing-based methods for mutation cloning will enable broader usage of forward genetics screens and have significant impacts in the field of model organisms such as Drosophila.
Affiliation(s)
- Hui Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
46
Allele-specific methylation is prevalent and is contributed by CpG-SNPs in the human genome. Genome Res 2010; 20:883-9. [PMID: 20418490 DOI: 10.1101/gr.104695.109] [Citation(s) in RCA: 269] [Impact Index Per Article: 19.2] [Indexed: 11/25/2022]
Abstract
In diploid mammalian genomes, parental alleles can exhibit different methylation patterns (allele-specific DNA methylation, ASM), which have been documented in a small number of cases except for the imprinted regions and X chromosomes in females. We carried out a chromosome-wide survey of ASM across 16 human pluripotent and adult cell lines using Illumina bisulfite sequencing. We applied the principle of linkage disequilibrium (LD) analysis to characterize the correlation of methylation between adjacent CpG sites on single DNA molecules, and also investigated the correlation between CpG methylation and single nucleotide polymorphisms (SNPs). We observed ASM on 23%-37% of heterozygous SNPs in any given cell line. ASM is often cell-type-specific. Furthermore, we found that a significant fraction (38%-88%) of ASM regions is dependent on the presence of heterozygous SNPs in CpG dinucleotides that disrupt their methylation potential. This study identified distinct types of ASM across many cell types and suggests a potential role for CpG-SNP in connecting genetic variation with the epigenome.
47
Boerno ST, Grimm C, Lehrach H, Schweiger MR. Next-generation sequencing technologies for DNA methylation analyses in cancer genomics. Epigenomics 2010; 2:199-207. [DOI: 10.2217/epi.09.50] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023] Open
Abstract
For the first time, the development of next-generation sequencing technologies has brought about tools to investigate epigenetic alterations in an unbiased, yet genome-wide approach. The importance of this innovative technology is undeniable since it has already been established that changes in DNA methylation play an important role in cancer initiation and progression. The first methylation maps have already been created, and it is only a matter of time until the complete epigenetic maps of healthy and diseased human genomes are available. In this review, we summarize the use of next-generation sequencing for diverse epigenetic technologies, give an overview of the status quo and outline future perspectives for its application in oncology and basic research.
Affiliation(s)
- Stefan T Boerno
- Max Planck Institute for Molecular Genetics, Ihnestraße 63–73, 14195 Berlin, Germany
- Christina Grimm
- Max Planck Institute for Molecular Genetics, Ihnestraße 63–73, 14195 Berlin, Germany
- Hans Lehrach
- Max Planck Institute for Molecular Genetics, Ihnestraße 63–73, 14195 Berlin, Germany
- Michal-Ruth Schweiger
- Max Planck Institute for Molecular Genetics, Ihnestraße 63–73, 14195 Berlin, Germany
48
LeProust EM, Peck BJ, Spirin K, McCuen HB, Moore B, Namsaraev E, Caruthers MH. Synthesis of high-quality libraries of long (150mer) oligonucleotides by a novel depurination controlled process. Nucleic Acids Res 2010; 38:2522-40. [PMID: 20308161 PMCID: PMC2860131 DOI: 10.1093/nar/gkq163] [Citation(s) in RCA: 202] [Impact Index Per Article: 14.4] [Indexed: 11/15/2022] Open
Abstract
We have achieved the ability to synthesize thousands of unique, long oligonucleotides (150mers) in fmol amounts using parallel synthesis of DNA on microarrays. The sequence accuracy of the oligonucleotides in such large-scale syntheses has been limited by the yields and side reactions of the DNA synthesis process used. While there has been significant demand for libraries of long oligos (150mer and more), the yields in conventional DNA synthesis and the associated side reactions have previously limited the availability of oligonucleotide pools to lengths <100 nt. Using novel array based depurination assays, we show that the depurination side reaction is the limiting factor for the synthesis of libraries of long oligonucleotides on Agilent Technologies’ SurePrint® DNA microarray platform. We also demonstrate how depurination can be controlled and reduced by a novel detritylation process to enable the synthesis of high quality, long (150mer) oligonucleotide libraries and we report the characterization of synthesis efficiency for such libraries. Oligonucleotide libraries prepared with this method have changed the economics and availability of several existing applications (e.g. targeted resequencing, preparation of shRNA libraries, site-directed mutagenesis), and have the potential to enable even more novel applications (e.g. high-complexity synthetic biology).
Affiliation(s)
- Emily M LeProust
- Agilent Technologies Inc., LSSU - Genomics, 5301 Stevens Creek Blvd, Santa Clara, CA 95051, USA.
|
49
|
Laird PW. Principles and challenges of genome-wide DNA methylation analysis. Nat Rev Genet 2010; 11:191-203. [DOI: 10.1038/nrg2732]
|
50
|
Hedges D, Burges D, Powell E, Almonte C, Huang J, Young S, Boese B, Schmidt M, Pericak-Vance MA, Martin E, Zhang X, Harkins TT, Züchner S. Exome sequencing of a multigenerational human pedigree. PLoS One 2009; 4:e8232. [PMID: 20011588 PMCID: PMC2788131 DOI: 10.1371/journal.pone.0008232] [Received: 10/07/2009] [Accepted: 11/11/2009]
Abstract
Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily upon the effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb or ∼180,000 coding exons across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs aiming to quickly identify potential disease-associated genetic variants, especially in family-based studies. We have evaluated a 2.1 M feature human exome capture array on eight individuals from a three-generation family pedigree. We were able to cover up to 98% of the targeted bases at a long-read sequence read depth of ≥3, 86% at a read depth of ≥10, and over 50% of all targets were covered with ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. Applying the conservative genotype calling approach HCDiff, the average rate of detection of a variant allele based on Illumina 1 M BeadChips genotypes was 95.2% at ≥10x sequence. Further, we propose an advantageous genotype calling strategy for low covered targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. Application of this method was able to detect >99% of SNPs covered ≥8x. Our results offer guidance for “real-world” applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.
Affiliation(s)
- Dale Hedges
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Dan Burges
- Roche Diagnostics Corporation Inc., Indianapolis, Indiana, United States of America
- Eric Powell
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Cherylyn Almonte
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Jia Huang
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Stuart Young
- Center for Computational Sciences, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Benjamin Boese
- Roche Diagnostics Corporation Inc., Indianapolis, Indiana, United States of America
- Mike Schmidt
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Margaret A. Pericak-Vance
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Eden Martin
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Xinmin Zhang
- Roche Diagnostics Corporation Inc., Indianapolis, Indiana, United States of America
- Timothy T. Harkins
- Roche Diagnostics Corporation Inc., Indianapolis, Indiana, United States of America
- Stephan Züchner
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
|