251
The mutational landscape of adenoid cystic carcinoma. Nat Genet 2013; 45:791-8. [PMID: 23685749 PMCID: PMC3708595 DOI: 10.1038/ng.2643] [Citation(s) in RCA: 339] [Impact Index Per Article: 30.8] [Received: 02/26/2013] [Accepted: 04/25/2013] [Indexed: 12/14/2022]
Abstract
Adenoid cystic carcinomas (ACCs) are among the most enigmatic of human malignancies. These aggressive salivary cancers frequently recur and metastasize despite definitive treatment, with no known effective chemotherapy regimen. Here, we determined the ACC mutational landscape and report the exome or whole genome sequences of 60 ACC tumor/normal pairs. These analyses revealed a low exonic somatic mutation rate (0.31 non-silent events/megabase) and wide mutational diversity. Interestingly, mutations selectively involved chromatin state regulators, such as SMARCA2, CREBBP, and KDM6A, suggesting aberrant epigenetic regulation in ACC oncogenesis. Mutations in genes central to DNA damage and protein kinase A signaling also implicate these processes. We observed MYB-NFIB translocations and somatic mutations in MYB-associated genes, solidifying these aberrations as critical events. Lastly, we identified recurrent mutations in the FGF/IGF/PI3K pathway (30%) that may offer new avenues for therapy. Collectively, our observations establish a molecular foundation for understanding and exploring new treatments for ACC.
252
Coin LJM, Cao D, Ren J, Zuo X, Sun L, Yang S, Zhang X, Cui Y, Li Y, Jin X, Wang J. An exome sequencing pipeline for identifying and genotyping common CNVs associated with disease with application to psoriasis. Bioinformatics 2013; 28:i370-i374. [PMID: 22962454 PMCID: PMC3436806 DOI: 10.1093/bioinformatics/bts379] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/09/2023]
Abstract
Motivation: Despite the prevalence of copy number variation (CNV) in the human genome, only a handful of confirmed associations have been reported between common CNVs and complex disease. This may be partially attributed to the difficulty in accurately genotyping CNVs in large cohorts using array-based technologies. Exome sequencing is now being widely applied to case–control cohorts and presents an exciting opportunity to look for common CNVs associated with disease. Results: We developed ExoCNVTest: an exome sequencing analysis pipeline to identify disease-associated CNVs and to generate absolute copy number genotypes at putatively associated loci. Our method re-discovered the LCE3B_LCE3C CNV association with psoriasis (P-value = 5 × 10⁻⁶) while controlling inflation of test statistics (λ < 1). ExoCNVTest-derived absolute CNV genotypes were 97.4% concordant with PCR-derived genotypes at this locus. Availability and implementation: ExoCNVTest has been implemented in Java and R and is freely available from www1.imperial.ac.uk/medicine/people/l.coin/. Contact: wangj@genomics.org.cn or Lachlan.J.M.Coin@genomics.org.cn
254
Chang VY, Federman N, Martinez-Agosto J, Tatishchev SF, Nelson SF. Whole exome sequencing of pediatric gastric adenocarcinoma reveals an atypical presentation of Li-Fraumeni syndrome. Pediatr Blood Cancer 2013; 60:570-4. [PMID: 23015295 PMCID: PMC4170733 DOI: 10.1002/pbc.24316] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 05/27/2012] [Accepted: 08/14/2012] [Indexed: 12/14/2022]
Abstract
BACKGROUND Gastric adenocarcinoma is a rare diagnosis in childhood. A 14-year-old male patient presented with metastatic gastric adenocarcinoma and a strong family history of colon cancer. Clinical sequencing of CDH1 and APC was negative. Whole exome sequencing was therefore applied to capture the majority of protein-coding regions for the identification of single-nucleotide variants, small insertions/deletions, and copy number abnormalities in the patient's germline as well as primary tumor. MATERIALS AND METHODS DNA was extracted from the patient's blood, primary tumor, and the unaffected mother's blood. DNA libraries were constructed and sequenced on Illumina HiSeq2000. Data were post-processed using Picard and Samtools, then analyzed with the Genome Analysis Toolkit. Variants were annotated using an in-house Ensembl-based program. Copy number was assessed using ExomeCNV. RESULTS Each sample was sequenced to a mean depth of coverage of greater than 120×. A rare non-synonymous coding single-nucleotide variant (SNV) in TP53 was identified in the germline. There were 10 somatic cancer protein-damaging variants that were not observed in the unaffected mother's genome. ExomeCNV, comparing tumor to the patient's germline, identified abnormal copy number spanning 6,946 genes. CONCLUSION We present an unusual case of Li-Fraumeni syndrome detected by whole exome sequencing. There were also likely driver somatic mutations in the gastric adenocarcinoma. These results highlight the need for more thorough and broad-scale germline and cancer analyses to accurately inform patients of inherited risk of cancer and to identify somatic mutations.
Affiliation(s)
- Vivian Y Chang
- Department of Pediatrics, Division of Hematology/Oncology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California 90095, USA.
255
Shi Y, Majewski J. FishingCNV: a graphical software package for detecting rare copy number variations in exome-sequencing data. Bioinformatics 2013; 29:1461-2. [PMID: 23539306 DOI: 10.1093/bioinformatics/btt151] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]
Abstract
SUMMARY Rare copy number variations (CNVs) are frequent causes of genetic diseases. We developed a graphical software package based on a novel approach that can consistently identify CNVs of all types (homozygous deletions, heterozygous deletions, heterozygous duplications) from exome-sequencing data without the need for a paired control. The algorithm compares coverage depth in a test sample against a background distribution of control samples and uses principal component analysis to remove batch effects. It is user friendly and can be run on a personal computer. AVAILABILITY AND IMPLEMENTATION The main scripts are implemented in R (2.15), and the GUI is created using Java 1.6. It can be run on all major operating systems. A non-GUI version for pipeline implementation is also available. The program is freely available online: https://sourceforge.net/projects/fishingcnv/ SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
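The idea of comparing a test sample's coverage against a background of control samples can be illustrated with a minimal Python sketch. This is not FishingCNV's implementation: the function names, the median-based normalization, and the z-score threshold of 3 are assumptions made for illustration, and the principal component step that the abstract mentions for batch-effect removal is left out.

```python
from statistics import mean, median, stdev


def normalize(depths):
    """Scale per-exon depths by the sample median so samples are comparable.

    Median scaling is an assumption of this sketch, not the published method.
    """
    centre = median(depths)
    if centre == 0:
        raise ValueError("sample has zero median depth")
    return [d / centre for d in depths]


def exon_z_scores(test_depths, control_depths):
    """Z-score of each exon in the test sample against the control background."""
    test = normalize(test_depths)
    controls = [normalize(c) for c in control_depths]
    scores = []
    for i, value in enumerate(test):
        background = [c[i] for c in controls]
        mu, sd = mean(background), stdev(background)
        # An exon with no spread in the controls cannot be scored; report 0.
        scores.append((value - mu) / sd if sd > 0 else 0.0)
    return scores


def call_exons(scores, threshold=3.0):
    """Label exons whose z-score exceeds the (hypothetical) threshold."""
    calls = []
    for z in scores:
        if z <= -threshold:
            calls.append("loss")
        elif z >= threshold:
            calls.append("gain")
        else:
            calls.append("normal")
    return calls


if __name__ == "__main__":
    # Toy read depths for five exons; the second exon is halved in the test sample.
    controls = [
        [100, 200, 150, 50, 100],
        [104, 196, 152, 48, 100],
        [98, 202, 148, 52, 100],
    ]
    test_sample = [100, 100, 150, 50, 100]
    print(call_exons(exon_z_scores(test_sample, controls)))
```

With only three controls the background estimate is very noisy; a real analysis would need many more control samples and a correction for batch effects before any threshold is meaningful.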
Affiliation(s)
- Yuhao Shi
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
256
Mayrhofer M, DiLorenzo S, Isaksson A. Patchwork: allele-specific copy number analysis of whole-genome sequenced tumor tissue. Genome Biol 2013; 14:R24. [PMID: 23531354 PMCID: PMC4053982 DOI: 10.1186/gb-2013-14-3-r24] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 03/27/2012] [Revised: 02/27/2013] [Accepted: 03/25/2013] [Indexed: 12/28/2022]
Abstract
Whole-genome sequencing of tumor tissue has the potential to provide comprehensive characterization of genomic alterations in tumor samples. We present Patchwork, a new bioinformatic tool for allele-specific copy number analysis using whole-genome sequencing data. Patchwork can be used to determine the copy number of homologous sequences throughout the genome, even in aneuploid samples with moderate sequence coverage and tumor cell content. No prior knowledge of average ploidy or tumor cell content is required. Patchwork is freely available as an R package, installable via R-Forge (http://patchwork.r-forge.r-project.org/).
Affiliation(s)
- Markus Mayrhofer
- Science for Life Laboratory, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden
- Sebastian DiLorenzo
- Science for Life Laboratory, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden
- Anders Isaksson
- Science for Life Laboratory, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden
257
Zhang J, Shi Y, Lalonde E, Li L, Cavallone L, Ferenczy A, Gotlieb WH, Foulkes WD, Majewski J. Exome profiling of primary, metastatic and recurrent ovarian carcinomas in a BRCA1-positive patient. BMC Cancer 2013; 13:146. [PMID: 23522120 PMCID: PMC3614563 DOI: 10.1186/1471-2407-13-146] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 10/17/2012] [Accepted: 03/13/2013] [Indexed: 02/03/2023]
Abstract
BACKGROUND Ovarian carcinoma is a common, and often deadly, gynecological cancer. Mutations in BRCA1 and BRCA2 genes are present in at least a fifth of patients. Uncovering other genes that become mutated subsequent to BRCA1/BRCA2 inactivation during cancer development will be helpful for more effective treatments. METHODS We performed exome sequencing on the blood, primary tumor, omental metastasis and recurrence following therapy with carboplatin and paclitaxel, from a patient carrying a BRCA1 S1841R mutation. RESULTS We observed loss of heterozygosity in the BRCA1 mutation in the primary and subsequent tumors, and somatic mutations in the TP53 and NF1 genes were identified, suggesting their role along with BRCA1 driving the tumor development. Notably, we show that exome sequencing is effective in detecting large chromosomal rearrangements such as deletions and amplifications in cancer. We found that a large deletion was present in the three tumors in the regions containing BRCA1, TP53, and NF1 mutations, and an amplification in the regions containing MYC. We did not observe the emergence of any new mutations among tumors from diagnosis to relapse after chemotherapy, suggesting that mutations already present in the primary tumor contributed to metastases and chemotherapy resistance. CONCLUSIONS Our findings suggest that exome sequencing of matched samples from one patient is a powerful method of detecting somatic mutations and prioritizing their potential role in the development of the disease.
Affiliation(s)
- Jian Zhang
- Department of Human Genetics, McGill University, Montreal, QC, Canada
259
Xi R, Lee S, Park PJ. A survey of copy-number variation detection tools based on high-throughput sequencing data. Curr Protoc Hum Genet 2013; Chapter 7:Unit7.19. [PMID: 23074071 DOI: 10.1002/0471142905.hg0719s75] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/11/2022]
Abstract
Copy-number variation (CNV) is a major class of genomic variation with potentially important functional consequences in both normal and diseased populations. Remarkable advances in development of next-generation sequencing (NGS) platforms provide an unprecedented opportunity for accurate, high-resolution characterization of CNVs. In this unit, we give an overview of available computational tools for detection of CNVs and discuss comparative advantages and disadvantages of different approaches.
Affiliation(s)
- Ruibin Xi
- Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
260
Yost SE, Pastorino S, Rozenzhak S, Smith EN, Chao YS, Jiang P, Kesari S, Frazer KA, Harismendy O. High-resolution mutational profiling suggests the genetic validity of glioblastoma patient-derived pre-clinical models. PLoS One 2013; 8:e56185. [PMID: 23441165 PMCID: PMC3575368 DOI: 10.1371/journal.pone.0056185] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 10/12/2012] [Accepted: 01/07/2013] [Indexed: 11/19/2022]
Abstract
Recent advances in the ability to efficiently characterize tumor genomes are enabling targeted drug development, which requires rigorous biomarker-based patient selection to increase effectiveness. Consequently, representative DNA biomarkers become equally important in pre-clinical studies. However, it is still unclear how well these markers are maintained between the primary tumor and the patient-derived tumor models. Here, we report the comprehensive identification of somatic coding mutations and copy number aberrations in four glioblastoma (GBM) primary tumors and their matched pre-clinical models: serum-free neurospheres, adherent cell cultures, and mouse xenografts. We developed innovative methods to improve the data quality and allow a strict comparison of matched tumor samples. Our analysis identifies known GBM mutations altering PTEN and TP53 genes, and new actionable mutations such as the loss of PIK3R1, and reveals clear patient-to-patient differences. In contrast, for each patient, we do not observe any significant remodeling of the mutational profile between primary and model tumors, and the few discrepancies can be attributed to stochastic errors or differences in sample purity. Similarly, we observe ∼96% primary-to-model concordance in copy number calls in the high-cellularity samples. In contrast to previous reports based on gene expression profiles, we do not observe significant differences at the DNA level between in vitro and in vivo models. This study suggests, at a remarkable resolution, the genome-wide conservation of a patient’s tumor genetics in various pre-clinical models, and therefore supports their use for the development and testing of personalized targeted therapies.
Affiliation(s)
- Shawn E. Yost
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, California, United States of America
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Sandra Pastorino
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
- Sophie Rozenzhak
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Erin N. Smith
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Ying S. Chao
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
- Pengfei Jiang
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
- Santosh Kesari
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
- Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, United States of America
- * E-mail: (OH); (SK)
- Kelly A. Frazer
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, United States of America
- Clinical and Translational Research Institute, University of California San Diego, La Jolla, California, United States of America
- Institute for Genomic Medicine, University of California San Diego, La Jolla, California, United States of America
- Olivier Harismendy
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, United States of America
- Clinical and Translational Research Institute, University of California San Diego, La Jolla, California, United States of America
- * E-mail: (OH); (SK)
261
Kim S, Jeong K, Bafna V. Wessim: a whole-exome sequencing simulator based on in silico exome capture. Bioinformatics 2013; 29:1076-7. [PMID: 23413434 DOI: 10.1093/bioinformatics/btt074] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022]
Abstract
SUMMARY We propose a targeted re-sequencing simulator, Wessim, that generates synthetic exome sequencing reads from a given sample genome. Wessim emulates conventional exome capture technologies, including Agilent's SureSelect and NimbleGen's SeqCap, to generate DNA fragments from genomic target regions. The target regions can be either specified by genomic coordinates or inferred from in silico probe hybridization. Coupled with existing next-generation sequencing simulators, Wessim generates realistic artificial exome sequencing data, which are essential for developing and evaluating exome-targeted variant callers. AVAILABILITY Source code and the packaged version of Wessim with manuals are available at http://sak042.github.com/Wessim/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sangwoo Kim
- Department of Computer Science and Engineering and Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA 92093, USA.
262
Ku CS, Polychronakos C, Tan EK, Naidoo N, Pawitan Y, Roukos DH, Mort M, Cooper DN. A new paradigm emerges from the study of de novo mutations in the context of neurodevelopmental disease. Mol Psychiatry 2013; 18:141-53. [PMID: 22641181 DOI: 10.1038/mp.2012.58] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Indexed: 12/19/2022]
Abstract
The study of de novo point mutations (new germline mutations arising from the gametes of the parents) remained largely static until the arrival of next-generation sequencing technologies, which made both whole-exome sequencing (WES) and whole-genome sequencing (WGS) feasible in practical terms. Single nucleotide polymorphism genotyping arrays have been used to identify de novo copy-number variants in a number of common neurodevelopmental conditions such as schizophrenia and autism. By contrast, as point mutations and microlesions occurring de novo are refractory to analysis by these microarray-based methods, little was known about either their frequency or impact upon neurodevelopmental disease, until the advent of WES. De novo point mutations have recently been implicated in schizophrenia, autism and mental retardation through the WES of case-parent trios. Taken together, these findings strengthen the hypothesis that the occurrence of de novo mutations could account for the high prevalence of such diseases that are associated with a marked reduction in fecundity. De novo point mutations are also known to be responsible for many sporadic cases of rare dominant mendelian disorders such as Kabuki syndrome, Schinzel-Giedion syndrome and Bohring-Opitz syndrome. These disorders share a common feature in that they are all characterized by intellectual disability. In summary, recent WES studies of neurodevelopmental and neuropsychiatric disease have provided new insights into the role of de novo mutations in these disorders. Our knowledge of de novo mutations is likely to be further accelerated by WGS. However, the collection of case-parent trios will be a prerequisite for such studies. This review aims to discuss recent developments in the study of de novo mutations made possible by technological advances in DNA sequencing.
Affiliation(s)
- C S Ku
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore.
263
Amarasinghe KC, Li J, Halgamuge SK. CoNVEX: copy number variation estimation in exome sequencing data using HMM. BMC Bioinformatics 2013; 14 Suppl 2:S2. [PMID: 23368785 PMCID: PMC3549847 DOI: 10.1186/1471-2105-14-s2-s2] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 11/24/2022]
Abstract
Background Copy number variation (CNV) is one of the main types of genetic variation in cancer. Whole exome sequencing (WES) is a popular alternative to whole genome sequencing (WGS) to study disease-specific genomic variations. However, finding CNV in cancer samples using WES data has not been fully explored. Results We present a new method, called CoNVEX, to estimate copy number variation in whole exome sequencing data. It uses the ratio of tumour and matched normal average read depths at each exonic region to predict copy gain or loss. The useful signal produced by WES data is hindered by the intrinsic noise present in the data itself. This limits its capacity to be used as a highly reliable CNV detection source. Here, we propose a method that uses a discrete wavelet transform (DWT) to reduce noise. The identification of copy number gains/losses of each targeted region is performed by a Hidden Markov Model (HMM). Conclusion HMM is frequently used to identify CNV in data produced by various technologies including Array Comparative Genomic Hybridization (aCGH) and WGS. Here, we propose an HMM to detect CNV in cancer exome data. We used modified data from the 1000 Genomes Project to evaluate the performance of the proposed method. Using these data we have shown that CoNVEX outperforms the existing methods significantly in terms of precision. Overall, CoNVEX achieved a sensitivity of more than 92% and a precision of more than 50%.
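The two core steps described in this abstract, a per-exon tumour/normal depth ratio followed by HMM segmentation into loss, neutral and gain, can be sketched in Python as follows. This is an illustrative sketch, not the CoNVEX code: the state means, emission standard deviation, transition probability and pseudocount are all assumed values, and the wavelet denoising step is omitted.

```python
import math

STATES = ("loss", "neutral", "gain")
# Assumed log2-ratio means for one copy lost, no change, one copy gained.
MEANS = {"loss": -1.0, "neutral": 0.0, "gain": 0.58}
SIGMA = 0.3   # assumed emission standard deviation
STAY = 0.9    # assumed probability of remaining in the same state


def log2_ratios(tumour, normal, pseudo=1.0):
    """Per-exon log2 ratio of tumour to matched-normal average read depth."""
    return [math.log2((t + pseudo) / (n + pseudo)) for t, n in zip(tumour, normal)]


def _log_emit(x, state):
    """Log density of a Gaussian emission centred on the state's mean."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2 * math.pi))


def _log_trans(a, b):
    """Log transition probability with a uniform split over state changes."""
    return math.log(STAY if a == b else (1 - STAY) / (len(STATES) - 1))


def viterbi(ratios):
    """Most likely state sequence for a list of log2 ratios."""
    if not ratios:
        return []
    score = {s: math.log(1 / len(STATES)) + _log_emit(ratios[0], s) for s in STATES}
    back = []
    for x in ratios[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + _log_trans(p, s))
            new_score[s] = score[prev] + _log_trans(prev, s) + _log_emit(x, s)
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    tumour = [100, 100, 100, 50, 50, 50, 100, 100, 100, 150, 150, 150]
    normal = [100] * 12
    print(viterbi(log2_ratios(tumour, normal)))
```

The HMM favours runs of identical states, so a single noisy exon is less likely to be called on its own than it would be with a per-exon threshold. In practice the parameters would be estimated from data rather than fixed by hand as they are here.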
Affiliation(s)
- Kaushalya C Amarasinghe
- Department of Mechanical Engineering, University of Melbourne, Parkville, VIC 3010, Australia.
264
Pabinger S, Dander A, Fischer M, Snajder R, Sperk M, Efremova M, Krabichler B, Speicher MR, Zschocke J, Trajanoski Z. A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform 2013; 15:256-78. [PMID: 23341494 PMCID: PMC3956068 DOI: 10.1093/bib/bbs086] [Citation(s) in RCA: 335] [Impact Index Per Article: 30.5] [Indexed: 01/11/2023]
Abstract
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and has led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers.
Affiliation(s)
- Stephan Pabinger
- Division for Bioinformatics, Innsbruck Medical University, Innrain 80, 6020 Innsbruck, Austria. Tel.: +43-512-9003-71401; Fax: +43-512-9003-73100;
265
Bioinformatic perspectives in the neuronal ceroid lipofuscinoses. Biochim Biophys Acta Mol Basis Dis 2012; 1832:1831-41. [PMID: 23274885 DOI: 10.1016/j.bbadis.2012.12.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/07/2012] [Revised: 12/16/2012] [Accepted: 12/19/2012] [Indexed: 02/06/2023]
Abstract
The neuronal ceroid lipofuscinoses (NCLs) are a group of rare genetic diseases characterised clinically by the progressive deterioration of mental, motor and visual functions and histopathologically by the intracellular accumulation of autofluorescent lipopigment - ceroid - in affected tissues. The NCLs are clinically and genetically heterogeneous and more than 14 genetically distinct NCL subtypes have been described to date (CLN1-CLN14) (Haltia and Goebel, 2012 [1]). In this review we will chronologically summarise work which has led over the years to identification of NCL genes, and outline the potential of novel genomic techniques and related bioinformatic approaches for further genetic dissection and diagnosis of NCLs. This article is part of a Special Issue entitled: The Neuronal Ceroid Lipofuscinoses or Batten Disease.
266
Valdés-Mas R, Bea S, Puente DA, López-Otín C, Puente XS. Estimation of copy number alterations from exome sequencing data. PLoS One 2012; 7:e51422. [PMID: 23284693 PMCID: PMC3526607 DOI: 10.1371/journal.pone.0051422] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 06/14/2012] [Accepted: 10/31/2012] [Indexed: 11/18/2022]
Abstract
Exome sequencing constitutes an important technology for the study of human hereditary diseases and cancer. However, the ability of this approach to identify copy number alterations in primary tumor samples has not been fully addressed. Here we show that somatic copy number alterations can be reliably estimated using exome sequencing data through a strategy that we have termed exome2cnv. Using data from 86 paired normal and primary tumor samples, we identified losses and gains of complete chromosomes or large genomic regions, as well as smaller regions affecting a minimum of one gene. Comparison with high-resolution comparative genomic hybridization (CGH) arrays revealed a high sensitivity and a low number of false positives in the copy number estimation between both approaches. We explore the main factors affecting sensitivity and false positives with real data, and provide a side by side comparison with CGH arrays. Together, these results underscore the utility of exome sequencing to study cancer samples by allowing not only the identification of substitutions and indels, but also the accurate estimation of copy number alterations.
Affiliation(s)
- Rafael Valdés-Mas
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
- Silvia Bea
- Hematopathology Unit, Hospital Clinic, IDIBAPS, Barcelona, Spain
- Diana A. Puente
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
- Carlos López-Otín
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
- Xose S. Puente
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
267
Copy Number Variation detection from 1000 Genomes Project exon capture sequencing data. BMC Bioinformatics 2012; 13:305. [PMID: 23157288 PMCID: PMC3563612 DOI: 10.1186/1471-2105-13-305] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 06/14/2012] [Accepted: 11/07/2012] [Indexed: 01/17/2023]
Abstract
BACKGROUND DNA capture technologies combined with high-throughput sequencing now enable cost-effective, deep-coverage, targeted sequencing of complete exomes. This is well suited for SNP discovery and genotyping. However, little attention has been devoted to Copy Number Variation (CNV) detection from exome capture datasets despite the potentially high impact of CNVs in exonic regions on protein function. RESULTS As members of the 1000 Genomes Project analysis effort, we investigated 697 samples in which 931 genes were targeted and sampled with 454 or Illumina paired-end sequencing. We developed a rigorous Bayesian method to detect CNVs in the genes, based on read depth within target regions. Despite substantial variability in read coverage across samples and targeted exons, we were able to identify 107 heterozygous deletions in the dataset. The experimentally determined false discovery rate (FDR) of the cleanest dataset from the Wellcome Trust Sanger Institute is 12.5%. We were able to substantially improve the FDR in a subset of gene deletion candidates that were adjacent to another gene deletion call (17 calls). The estimated sensitivity of our call-set was 45%. CONCLUSIONS This study demonstrates that exonic sequencing datasets, collected both in population-based and medical sequencing projects, will be a useful substrate for detecting genic CNV events, particularly deletions. Based on the number of events we found and the sensitivity of the methods in the present dataset, we estimate on average 16 genic heterozygous deletions per individual genome. Our power analysis informs ongoing and future projects about sequencing depth and uniformity of read coverage required for efficient detection.
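As a rough illustration of what a Bayesian read-depth approach can look like, the Python sketch below computes a posterior over copy-number states for one gene from its observed read count. It is not the method of this paper: the Poisson likelihood, the prior values, the `eps` floor for homozygous deletions and the function name are all assumptions chosen for the example.

```python
import math


def copy_number_posterior(observed, expected_diploid, priors=None, eps=0.01):
    """Posterior probability of each copy number given an observed read count.

    observed          reads counted in the target region
    expected_diploid  reads expected there for a normal two-copy genome
    priors            prior probability per copy number (assumed values)
    eps               floor on the depth fraction so copy number 0 has rate > 0
    """
    if priors is None:
        priors = {0: 0.001, 1: 0.01, 2: 0.979, 3: 0.01}
    log_post = {}
    for cn, prior in priors.items():
        # Expected depth scales linearly with copy number relative to two copies.
        rate = expected_diploid * max(cn / 2.0, eps)
        log_like = observed * math.log(rate) - rate - math.lgamma(observed + 1)
        log_post[cn] = math.log(prior) + log_like
    # Normalize in log space to avoid underflow.
    top = max(log_post.values())
    weights = {cn: math.exp(v - top) for cn, v in log_post.items()}
    total = sum(weights.values())
    return {cn: w / total for cn, w in weights.items()}


if __name__ == "__main__":
    # Half the expected depth points towards a heterozygous deletion.
    posterior = copy_number_posterior(observed=50, expected_diploid=100)
    print(max(posterior, key=posterior.get))
```

A plain Poisson model understates the coverage variability that the abstract highlights, so real data would typically call for an overdispersed likelihood and per-sample, per-exon normalization before a posterior like this is trustworthy.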
268
|
Li W, Olivier M. Current analysis platforms and methods for detecting copy number variation. Physiol Genomics 2012; 45:1-16. [PMID: 23132758 DOI: 10.1152/physiolgenomics.00082.2012] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Indexed: 01/09/2023]
Abstract
Copy number variation (CNV), generated through duplication or deletion events that affect one or more loci, is widespread in human genomes and is often associated with functional consequences that may include changes in gene expression levels or fusion of genes. Genome-wide association studies indicate that some disease phenotypes and physiological pathways might be impacted by CNV in a small number of characterized genomic regions. However, the pervasiveness and full impact of such variation remain unclear. Suitable analytic methods are needed to thoroughly mine human genomes for genomic structural variation, and to explore the interplay between observed CNV and disease phenotypes, but many medical researchers are unfamiliar with the features and nuances of recently developed technologies for detecting CNV. In this article, we evaluate a suite of commonly used and recently developed approaches to uncovering genome-wide CNVs and discuss the relative merits of each.
Affiliation(s)
- Wenli Li
- Biotechnology and Bioengineering Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
|
269
|
Lee H, Nelson SF. Rethinking clinical practice: clinical implementation of exome sequencing. Per Med 2012; 9:785-787. [PMID: 29776228 DOI: 10.2217/pme.12.101] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Affiliation(s)
- Hane Lee
- Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA.
- Stanley F Nelson
- Department of Pathology & Laboratory Medicine & Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
|
270
|
COPS: a sensitive and accurate tool for detecting somatic Copy Number Alterations using short-read sequence data from paired samples. PLoS One 2012; 7:e47812. [PMID: 23110103 PMCID: PMC3478291 DOI: 10.1371/journal.pone.0047812] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 05/07/2012] [Accepted: 09/18/2012] [Indexed: 01/05/2023]
Abstract
Copy Number Alterations (CNAs), such as deletions and duplications, compose a larger percentage of genetic variations than single nucleotide polymorphisms or other structural variations in cancer genomes that undergo major chromosomal re-arrangements. It is, therefore, imperative to identify cancer-specific somatic copy number alterations (SCNAs), with respect to matched normal tissue, in order to understand their association with the disease. We have devised an accurate, sensitive, and easy-to-use tool, COPS, COpy number using Paired Samples, for detecting SCNAs. We rigorously tested the performance of COPS using short sequence simulated reads at various sizes and coverage of SCNAs, read depths, read lengths and also with real tumor:normal paired samples. We found COPS to perform better in comparison to other known SCNA detection tools for all evaluated parameters, namely, sensitivity (detection of true positives), specificity (detection of false positives) and size accuracy. COPS performed well for sequencing reads of all lengths when used with most upstream read alignment tools. Additionally, by incorporating a downstream boundary segmentation detection tool, the accuracy of SCNA boundaries was further improved. Here, we report an accurate, sensitive and easy-to-use tool for detecting cancer-specific SCNAs using short-read sequence data. In addition to cancer, COPS can be used for any disease as long as sequence reads from both disease and normal samples from the same individual are available. An added boundary segmentation detection module makes COPS-detected SCNA boundaries more specific for the samples studied. COPS is available at ftp://115.119.160.213 with username “cops” and password “cops”.
|
271
|
Fromer M, Moran J, Chambert K, Banks E, Bergen S, Ruderfer D, Handsaker R, McCarroll S, O’Donovan M, Owen M, Kirov G, Sullivan P, Hultman C, Sklar P, Purcell S. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet 2012; 91:597-607. [PMID: 23040492 DOI: 10.1016/j.ajhg.2012.08.005] [Citation(s) in RCA: 437] [Impact Index Per Article: 36.4] [Received: 04/12/2012] [Revised: 06/23/2012] [Accepted: 08/09/2012] [Indexed: 12/20/2022]
Abstract
Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene.
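The HMM step described in the abstract above can be sketched as Viterbi decoding over per-exon depth z-scores with three hidden states (deletion, diploid, duplication). This is a minimal illustration, not XHMM itself: the PCA normalization is assumed to have already produced the z-scores, and the emission means, transition probabilities, and function names below are illustrative assumptions rather than the tool's actual parameters.

```python
import math

STATES = ("DEL", "DIP", "DUP")
# Assumed emission means (in z-score units) for each copy-number state.
MEANS = {"DEL": -3.0, "DIP": 0.0, "DUP": 3.0}


def log_gauss(x, mean, sd=1.0):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_trans(prev, cur, p_start=1e-3, p_stay=0.9):
    """Log transition probability; each row of the matrix sums to 1."""
    if prev == "DIP":
        return math.log(1 - 2 * p_start) if cur == "DIP" else math.log(p_start)
    if cur == prev:
        return math.log(p_stay)
    if cur == "DIP":
        return math.log(1 - p_stay - p_start)
    return math.log(p_start)  # direct DEL <-> DUP switch


def viterbi(z_scores):
    """Return the most likely state path for a sequence of exon z-scores."""
    if not z_scores:
        return []
    # Treat the position before the first exon as diploid.
    score = {s: log_trans("DIP", s) + log_gauss(z_scores[0], MEANS[s]) for s in STATES}
    backpointers = []
    for z in z_scores[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (
                score[best_prev] + log_trans(best_prev, cur) + log_gauss(z, MEANS[cur])
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi([0.1, -0.2, -3.1, -2.8, -3.3, 0.2, 0.0]))
```

With these assumed parameters, a run of three strongly negative exons is called as a deletion, while a single outlier exon is absorbed into the diploid state, which shows how the transition penalties trade sensitivity for specificity.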
|
272
|
Newey PJ, Nesbit MA, Rimmer AJ, Attar M, Head RT, Christie PT, Gorvin CM, Stechman M, Gregory L, Mihai R, Sadler G, McVean G, Buck D, Thakker RV. Whole-exome sequencing studies of nonhereditary (sporadic) parathyroid adenomas. J Clin Endocrinol Metab 2012; 97:E1995-2005. [PMID: 22855342 PMCID: PMC4446457 DOI: 10.1210/jc.2012-2303] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.9] [Indexed: 11/19/2022]
Abstract
CONTEXT Genetic abnormalities, such as those of multiple endocrine neoplasia type 1 (MEN1) and Cyclin D1 (CCND1) genes, occur in <50% of nonhereditary (sporadic) parathyroid adenomas. OBJECTIVE To identify genetic abnormalities in nonhereditary parathyroid adenomas by whole-exome sequence analysis. DESIGN Whole-exome sequence analysis was performed on parathyroid adenomas and leukocyte DNA samples from 16 postmenopausal women without a family history of parathyroid tumors or MEN1 and in whom primary hyperparathyroidism due to single-gland disease was cured by surgery. Somatic variants confirmed in this discovery set were assessed in 24 other parathyroid adenomas. RESULTS Over 90% of targeted exons were captured and represented by more than 10 base reads. Analysis identified 212 somatic variants (median eight per tumor; range, 2-110), with the majority being heterozygous nonsynonymous single-nucleotide variants that predicted missense amino acid substitutions. Somatic MEN1 mutations occurred in six of 16 (∼35%) parathyroid adenomas, in association with loss of heterozygosity on chromosome 11. However, no other gene was mutated in more than one tumor. Mutations in several genes that may represent low-frequency driver mutations were identified, including a protection of telomeres 1 (POT1) mutation that resulted in exon skipping and disruption to the single-stranded DNA-binding domain, which may contribute to increased genomic instability and the observed high mutation rate in one tumor. CONCLUSIONS Parathyroid adenomas typically harbor few somatic variants, consistent with their low proliferation rates. MEN1 mutation represents the major driver in sporadic parathyroid tumorigenesis although multiple low-frequency driver mutations likely account for tumors not harboring somatic MEN1 mutations.
Affiliation(s)
- Paul J Newey
- Academic Endocrine Unit, Nuffield Department of Clinical Medicine, University of Oxford, and Department of Surgery, John Radcliffe Hospital, Oxford University Hospitals Trust, Oxford, UK
|
273
|
Improving indel detection specificity of the Ion Torrent PGM benchtop sequencer. PLoS One 2012; 7:e45798. [PMID: 23029247 PMCID: PMC3446914 DOI: 10.1371/journal.pone.0045798] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 06/15/2012] [Accepted: 08/24/2012] [Indexed: 01/27/2023]
Abstract
The emergence of benchtop sequencers has made clinical genetic testing using next-generation sequencing more feasible. Ion Torrent's PGM™ is one such benchtop sequencer that shows clinical promise in detecting single nucleotide variations (SNVs) and microindel variations (indels). However, the large number of false positive indels caused by the high frequency of homopolymer sequencing errors has impeded PGM™'s usage for clinical genetic testing. An extensive analysis of PGM™ data from the sequencing reads of the well-characterized genome of the Escherichia coli DH10B strain and sequences of the BRCA1 and BRCA2 genes from six germline samples was done. Three commonly used variant detection tools, SAMtools, Dindel, and GATK's Unified Genotyper, all had substantial false positive rates for indels. By incorporating filters on two major measures we could dramatically improve false positive rates without sacrificing sensitivity. The two measures were: B-Allele Frequency (BAF) and VARiation of the Width of gaps and inserts (VARW) per indel position. A BAF threshold applied to indels detected by UnifiedGenotyper removed ∼99% of the indel errors detected in both the DH10B and BRCA sequences. The optimum BAF threshold for BRCA sequences was determined by requiring 100% detection sensitivity and minimum false discovery rate, using variants detected from Sanger sequencing as reference. This resulted in 15 indel errors remaining, of which 7 indel errors were removed by selecting a VARW threshold of zero. VARW specific errors increased in frequency with higher read depth in the BRCA datasets, suggesting that homopolymer-associated indel errors cannot be reduced by increasing the depth of coverage. Thus, using a VARW threshold is likely to be important in reducing indel errors from data with higher coverage. In conclusion, BAF and VARW thresholds provide simple and effective filtering criteria that can improve the specificity of indel detection in PGM™ data without compromising sensitivity.
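The two-threshold filter described in the abstract above might look roughly like the following sketch. The abstract does not give formulas, so two assumptions are made here and should be read as such: BAF is taken as the fraction of reads at the position that support the indel, and VARW is taken as the population variance of the gap or insert widths observed in those reads. The function name and the default BAF cutoff of 0.3 are hypothetical; only the VARW threshold of zero comes from the text.

```python
from statistics import pvariance


def passes_indel_filters(indel_reads, total_reads, gap_widths, baf_min=0.3, varw_max=0.0):
    """Keep an indel call only if it clears both the BAF and VARW thresholds.

    indel_reads: reads supporting the indel (assumed BAF numerator)
    total_reads: all reads covering the position
    gap_widths:  gap/insert width seen in each supporting read (assumed VARW input)
    """
    if total_reads <= 0 or not gap_widths:
        return False
    baf = indel_reads / total_reads
    # A variance of zero means every supporting read agrees on the indel width.
    varw = pvariance(gap_widths)
    return baf >= baf_min and varw <= varw_max


if __name__ == "__main__":
    print(passes_indel_filters(40, 100, [1, 1, 1]))  # consistent width, high BAF
    print(passes_indel_filters(5, 100, [1, 1]))      # low BAF
    print(passes_indel_filters(40, 100, [1, 2, 1]))  # inconsistent widths
```

In practice the BAF cutoff would be tuned against a trusted reference, as the abstract does with Sanger-confirmed variants.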
|
274
|
Ware JS, John S, Roberts AM, Buchan R, Gong S, Peters NS, Robinson DO, Lucassen A, Behr ER, Cook SA. Next generation diagnostics in inherited arrhythmia syndromes: a comparison of two approaches. J Cardiovasc Transl Res 2012; 6:94-103. [PMID: 22956155 PMCID: PMC3546298 DOI: 10.1007/s12265-012-9401-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 06/05/2012] [Accepted: 08/09/2012] [Indexed: 11/26/2022]
Abstract
Next-generation sequencing (NGS) provides an unprecedented opportunity to assess genetic variation underlying human disease. Here, we compared two NGS approaches for diagnostic sequencing in inherited arrhythmia syndromes. We compared PCR-based target enrichment and long-read sequencing (PCR-LR) with in-solution hybridization-based enrichment and short-read sequencing (Hyb-SR). The PCR-LR assay comprehensively assessed five long-QT genes routinely sequenced in diagnostic laboratories and “hot spots” in RYR2. The Hyb-SR assay targeted 49 genes, including those in the PCR-LR assay. The sensitivity for detection of control variants did not differ between approaches. In both assays, the major limitation was upstream target capture, particularly in regions of extreme GC content. These initial experiences with NGS cardiovascular diagnostics achieved up to 89% sensitivity at a fraction of current costs. In the next iteration of these assays we anticipate sensitivity above 97% for all LQT genes. NGS assays will soon replace conventional sequencing for LQT diagnostics and molecular pathology.
Affiliation(s)
- James S Ware
- MRC Clinical Sciences Centre, Imperial College London, London, UK.
|
275
|
Teo SM, Pawitan Y, Ku CS, Chia KS, Salim A. Statistical challenges associated with detecting copy number variations with next-generation sequencing. Bioinformatics 2012; 28:2711-8. [PMID: 22942022 DOI: 10.1093/bioinformatics/bts535] [Citation(s) in RCA: 170] [Impact Index Per Article: 14.2] [Indexed: 01/23/2023]
Abstract
MOTIVATION Analysing next-generation sequencing (NGS) data for copy number variation (CNV) detection is a relatively new and challenging field, with no accepted standard protocols or quality control measures so far. There are by now several algorithms developed for each of the four broad methods for CNV detection using NGS, namely the depth of coverage (DOC), read-pair, split-read and assembly-based methods. However, because of the complexity of the genome and the short read lengths from NGS technology, there are still many challenges associated with the analysis of NGS data for CNVs, no matter which method or algorithm is used. RESULTS In this review, we describe and discuss areas of potential biases in CNV detection for each of the four methods. In particular, we focus on issues pertaining to (i) mappability, (ii) GC-content bias, (iii) quality control measures of reads and (iv) difficulty in identifying duplications. To gain insights into some of the issues discussed, we also download real data from the 1000 Genomes Project and analyse its DOC data. We show examples of how reads in repeated regions can affect CNV detection, demonstrate current GC-correction algorithms, investigate sensitivity of DOC algorithm before and after quality control of reads and discuss reasons for which duplications are harder to detect than deletions.
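GC-content bias, one of the issues listed in the abstract above, is often handled by rescaling depth within GC strata. The sketch below shows one common median-based variant under stated assumptions: windows are grouped into GC bins of a hypothetical width of 0.05, and each window's depth is scaled by the ratio of the overall median depth to its bin's median depth. It is not taken from the review and the function name is illustrative.

```python
from collections import defaultdict
from statistics import median


def gc_correct(depths, gc_fractions, bin_width=0.05):
    """Rescale window depths so every GC bin has the same median depth.

    depths:       read depth per window
    gc_fractions: GC fraction (0 to 1) of each window, same order as depths
    """
    overall = median(depths)
    bins = defaultdict(list)
    for depth, gc in zip(depths, gc_fractions):
        bins[int(gc / bin_width)].append(depth)
    bin_median = {b: median(values) for b, values in bins.items()}
    corrected = []
    for depth, gc in zip(depths, gc_fractions):
        m = bin_median[int(gc / bin_width)]
        # Bins with zero median depth cannot be rescaled meaningfully.
        corrected.append(depth * overall / m if m > 0 else float("nan"))
    return corrected


if __name__ == "__main__":
    # High-GC windows are under-covered by half in this toy example.
    print(gc_correct([100, 100, 50, 50], [0.42, 0.43, 0.62, 0.63]))
```

A real deletion inside a sparsely populated GC bin would be partly normalized away by this scheme, so bins need enough windows for the median to reflect technical bias rather than biology.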
Affiliation(s)
- Shu Mei Teo
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597
|
276
|
Plagnol V, Curtis J, Epstein M, Mok KY, Stebbings E, Grigoriadou S, Wood NW, Hambleton S, Burns SO, Thrasher AJ, Kumararatne D, Doffinger R, Nejentsev S. A robust model for read count data in exome sequencing experiments and implications for copy number variant calling. Bioinformatics 2012; 28:2747-54. [PMID: 22942019 PMCID: PMC3476336 DOI: 10.1093/bioinformatics/bts526] [Citation(s) in RCA: 478] [Impact Index Per Article: 39.8] [Indexed: 01/17/2023]
Abstract
Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read depth strategy consists of using another sample (or a combination of samples) as a reference to control for the variability at the capture and sequencing steps. However, technical variability between samples complicates the analysis and can create spurious CNV calls. Results: Here, we introduce ExomeDepth, a new CNV calling algorithm designed to control for this technical variability. ExomeDepth uses a robust model for the read count data and uses this model to build an optimized reference set in order to maximize the power to detect CNVs. As a result, ExomeDepth is effective across a wider range of exome datasets than the previously existing tools, even for small (e.g. one to two exons) and heterozygous deletions. We used this new approach to analyse exome data from 24 patients with primary immunodeficiencies. Depending on data quality and the exact target region, we find between 170 and 250 exonic CNV calls per sample. Our analysis identified two novel causative deletions in the genes GATA2 and DOCK8. Availability: The code used in this analysis has been implemented into an R package called ExomeDepth and is available at the Comprehensive R Archive Network (CRAN). Contact: v.plagnol@ucl.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online.
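The idea of building a reference from the samples that best match the test sample can be sketched as follows. This is an assumption-laden simplification, not the ExomeDepth package or its API: candidates are ranked by Pearson correlation of per-exon read counts with the test sample, and a fixed number of top-ranked samples are summed, whereas the abstract describes choosing the reference set to maximize detection power under a robust count model. All names and the `n_ref` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def build_reference(test_counts, candidates, n_ref=2):
    """Pick the n_ref candidates most correlated with the test sample.

    candidates maps sample name -> per-exon read counts.
    Returns the chosen names and their summed per-exon counts.
    """
    ranked = sorted(
        candidates,
        key=lambda name: pearson(test_counts, candidates[name]),
        reverse=True,
    )
    chosen = ranked[:n_ref]
    reference = [
        sum(candidates[name][i] for name in chosen)
        for i in range(len(test_counts))
    ]
    return chosen, reference


if __name__ == "__main__":
    test = [10, 20, 30, 40]
    pool = {"a": [11, 19, 31, 41], "b": [40, 30, 20, 10], "c": [20, 40, 60, 80]}
    print(build_reference(test, pool))
```

Summing well-matched samples reduces sampling noise in the reference, while excluding poorly correlated ones limits the technical variability that the abstract identifies as a source of spurious calls.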
|
277
|
Abstract
PURPOSE OF REVIEW This review examines the application of next-generation sequencing (NGS) technologies in the identification of the causation of nonsyndromic genetic cardiomyopathies. RECENT FINDINGS NGS sequencing of the entire genetic coding sequence (the exome) has successfully identified five novel genes and causative variants for cardiomyopathies without previously known cause within the last 12 months. Continual rapidly decreasing costs of NGS will shortly allow cost-effective sequencing of the entire genomes of affected individuals and their relatives to include noncoding and regulatory variant discovery and epigenetic profiling. Despite this rapid technological progress with sequencing, analysis of these large data sets remains challenging, particularly for assigning causality to novel rare variants identified in DNA samples from patients with cardiomyopathy. SUMMARY NGS technologies are rapidly moving to identify novel rare variants in patients with cardiomyopathy, but assigning pathogenicity to these novel variants remains challenging.
|
278
|
Li Y, Xu X, Song L, Hou Y, Li Z, Tsang S, Li F, Im KM, Wu K, Wu H, Ye X, Li G, Wang L, Zhang B, Liang J, Xie W, Wu R, Jiang H, Liu X, Yu C, Zheng H, Jian M, Nie L, Wan L, Shi M, Sun X, Tang A, Guo G, Gui Y, Cai Z, Li J, Wang W, Lu Z, Zhang X, Bolund L, Kristiansen K, Wang J, Yang H, Dean M, Wang J. Single-cell sequencing analysis characterizes common and cell-lineage-specific mutations in a muscle-invasive bladder cancer. Gigascience 2012; 1:12. [PMID: 23587365 PMCID: PMC3626503 DOI: 10.1186/2047-217x-1-12] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.8] [Received: 04/09/2012] [Accepted: 08/02/2012] [Indexed: 01/04/2023]
Abstract
BACKGROUND Cancers arise through an evolutionary process in which cell populations are subjected to selection; however, to date, the process of bladder cancer, which is one of the most common cancers in the world, remains unknown at a single-cell level. RESULTS We carried out single-cell exome sequencing of 66 individual tumor cells from a muscle-invasive bladder transitional cell carcinoma (TCC). Analyses of the somatic mutant allele frequency spectrum and clonal structure revealed that the tumor cells were derived from a single ancestral cell, but that subsequent evolution occurred, leading to two distinct tumor cell subpopulations. By analyzing recurrently mutant genes in an additional cohort of 99 TCC tumors, we identified genes that might play roles in the maintenance of the ancestral clone and in the muscle-invasive capability of subclones of this bladder cancer, respectively. CONCLUSIONS This work provides a new approach to investigating the genetic details of bladder tumoral changes at the single-cell level and a new method for assessing bladder cancer evolution at a cell-population level.
Affiliation(s)
- Yingrui Li
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Xun Xu
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Luting Song
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- CAS-Max Planck Junior Research Group, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), 32# Jiao-chang Road, Kunming, Yunnan, 650223, People’s Republic of China
- Graduate University of the Chinese Academy of Sciences, 19A Yuquanlu, Beijing, 100049, People’s Republic of China
- College of Life Sciences, Wuhan University, Luojia Hill, Wuhan, 430072, People’s Republic of China
- Yong Hou
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- School of Biological Science and Medical Engineering, Southeast University, Sipailou 2#, Nanjing, 210096, People’s Republic of China
- State Key Laboratory of Bioelectronics, Southeast University, Sipailou 2#, Nanjing, 210096, People’s Republic of China
- Zesong Li
- Shenzhen Key Laboratory of Genitourinary Tumor, Shenzhen Second People’s Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen, 518035, People’s Republic of China
- Department of Urology, Shenzhen Second People’s Hospital, Shenzhen, 518035, People’s Republic of China
- The Institute of Urogenital Diseases, Shenzhen University, Shenzhen, 518060, People’s Republic of China
- Shirley Tsang
- BioMatrix, LLC, 3029 Windy Knoll Court, Rockville, MD, 20850, USA
- Fuqiang Li
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Kate McGee Im
- Cancer and Inflammation Program, National Cancer Institute at Frederick, Building 560, Frederick, MD, 21702, USA
- Kui Wu
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Hanjie Wu
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- School of Bioscience and Biotechnology, Guangzhou Higher Education Mega Centre, South China University of Technology, Panyu District, Guangzhou, 510006, People’s Republic of China
- Xiaofei Ye
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Guibo Li
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Linlin Wang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Bo Zhang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Jie Liang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Wei Xie
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- School of Biological Science and Medical Engineering, Southeast University, Sipailou 2#, Nanjing, 210096, People’s Republic of China
- State Key Laboratory of Bioelectronics, Southeast University, Sipailou 2#, Nanjing, 210096, People’s Republic of China
- Renhua Wu
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Hui Jiang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Xiao Liu
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Chang Yu
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Hancheng Zheng
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Min Jian
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Liping Nie
- Guangdong and Shenzhen Key Laboratory of Male Reproductive Medicine and Genetics, Institute of Urology, Shenzhen PKU-HKUST Medical Center, Peking University Shenzhen Hospital, 1120 Lian Hua Road, Futian District, Shenzhen, 518036, People’s Republic of China
- Lei Wan
- Department of Urology, Longgang Central Hospital, Shenhui Road, Longgang Town, Shenzhen, 518116, People’s Republic of China
- Min Shi
- Guangdong and Shenzhen Key Laboratory of Male Reproductive Medicine and Genetics, Institute of Urology, Shenzhen PKU-HKUST Medical Center, Peking University Shenzhen Hospital, 1120 Lian Hua Road, Futian District, Shenzhen, 518036, People’s Republic of China
- Xiaojuan Sun
- Shenzhen Key Laboratory of Genitourinary Tumor, Shenzhen Second People’s Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen, 518035, People’s Republic of China
- Department of Urology, Shenzhen Second People’s Hospital, Shenzhen, 518035, People’s Republic of China
- The Institute of Urogenital Diseases, Shenzhen University, Shenzhen, 518060, People’s Republic of China
- Aifa Tang
- Shenzhen Key Laboratory of Genitourinary Tumor, Shenzhen Second People’s Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen, 518035, People’s Republic of China
- Department of Urology, Shenzhen Second People’s Hospital, Shenzhen, 518035, People’s Republic of China
- The Institute of Urogenital Diseases, Shenzhen University, Shenzhen, 518060, People’s Republic of China
- Guangwu Guo
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Yaoting Gui
- Guangdong and Shenzhen Key Laboratory of Male Reproductive Medicine and Genetics, Institute of Urology, Shenzhen PKU-HKUST Medical Center, Peking University Shenzhen Hospital, 1120 Lian Hua Road, Futian District, Shenzhen, 518036, People’s Republic of China
- Zhiming Cai
- Department of Urology, Shenzhen Second People’s Hospital, Shenzhen, 518035, People’s Republic of China
- The Institute of Urogenital Diseases, Shenzhen University, Shenzhen, 518060, People’s Republic of China
- Guangdong and Shenzhen Key Laboratory of Male Reproductive Medicine and Genetics, Institute of Urology, Shenzhen PKU-HKUST Medical Center, Peking University Shenzhen Hospital, 1120 Lian Hua Road, Futian District, Shenzhen, 518036, People’s Republic of China
- Jingxiang Li
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Wen Wang
- CAS-Max Planck Junior Research Group, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), 32# Jiao-chang Road, Kunming, Yunnan, 650223, People’s Republic of China
- Zuhong Lu
- School of Biological Science and Medical Engineering, Southeast University, Sipailou 2#, Nanjing, 210096, People’s Republic of China
- State Key Laboratory of Bioelectronics, Southeast University, Sipailou 2#, Nanjing, 210096, People’s Republic of China
- Xiuqing Zhang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Lars Bolund
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Institute of Human Genetics, University of Aarhus, Aarhus, 8100, Denmark
- Karsten Kristiansen
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen, DK, 2200, Denmark
- Jian Wang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Huanming Yang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- Michael Dean
- Cancer and Inflammation Program, National Cancer Institute at Frederick, Building 560, Frederick, MD, 21702, USA
- Jun Wang
- BGI-Shenzhen, Beishan Industrial Zone, Beishan Road, Yantian, Shenzhen, 518083, People’s Republic of China
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen, DK, 2200, Denmark
- Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen, DK, 2200, Denmark
|
279
|
Ku CS, Cooper DN, Wu M, Roukos DH, Pawitan Y, Soong R, Iacopetta B. Gene discovery in familial cancer syndromes by exome sequencing: prospects for the elucidation of familial colorectal cancer type X. Mod Pathol 2012; 25:1055-68. [PMID: 22522846 DOI: 10.1038/modpathol.2012.62] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 12/30/2022]
Abstract
Recent advances in genotyping and sequencing technologies have provided powerful tools with which to explore the genetic basis of both Mendelian (monogenic) and sporadic (polygenic) diseases. Several hundred genome-wide association studies have so far been performed to explore the genetics of various polygenic or complex diseases including those cancers with a genetic predisposition. Exome sequencing has also proven very successful in elucidating the etiology of a range of hitherto poorly understood Mendelian disorders caused by high-penetrance mutations. Despite such progress, the genetic etiology of several familial cancers, such as familial colorectal cancer type X, has remained elusive. Familial colorectal cancer type X and Lynch syndrome are similar in terms of their fulfilling certain clinical criteria, but the former group is not characterized by germline mutations in DNA mismatch-repair genes. On the other hand, the genetics of sporadic colorectal cancer have been investigated by genome-wide association studies, leading to the identification of multiple new susceptibility loci. In addition, there is increasing evidence to suggest that familial and sporadic cancers exhibit similarities in terms of their genetic etiologies. In this review, we have summarized our current knowledge of familial colorectal cancer type X, discussed current approaches to probing its genetic etiology through the application of new sequencing technologies and the recruitment of the results of colorectal cancer genome-wide association studies, and explored the challenges that remain to be overcome given the uncertainty of the current genetic model (ie, monogenic vs polygenic) of familial colorectal cancer type X.
Affiliation(s)
- Chee-Seng Ku
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore.
|
280
|
Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma. Nat Genet 2012; 44:1006-14. [PMID: 22842228 PMCID: PMC3432702 DOI: 10.1038/ng.2359] [Citation(s) in RCA: 893] [Impact Index Per Article: 74.4] [Received: 03/13/2012] [Accepted: 06/28/2012] [Indexed: 02/06/2023]
Abstract
We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1 P29S) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1 P29S showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.
|
281
|
Solomon BD, Pineda-Alvarez DE, Bear KA, Mullikin JC, Evans JP. Applying Genomic Analysis to Newborn Screening. Mol Syndromol 2012; 3:59-67. [PMID: 23112750 DOI: 10.1159/000341253] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Accepted: 06/06/2012] [Indexed: 01/30/2023] Open
Abstract
Large-scale genomic analysis such as whole-exome and whole-genome sequencing is becoming increasingly prevalent in the research arena. Clinically, many potential uses of this technology have been proposed. One such application is the extension or augmentation of newborn screening. In order to explore this application, we examined data from 3 children with normal newborn screens who underwent whole-exome sequencing as part of research participation. We analyzed sequence information for 151 selected genes associated with conditions ascertained by newborn screening. We compared findings with publicly available databases and results from over 500 individuals who underwent whole-exome sequencing at the same facility. Novel variants were confirmed through bidirectional dideoxynucleotide sequencing. High-density microarrays (Illumina Omni1-Quad) were also performed to detect potential copy number variations affecting these genes. We detected an average of 87 genetic variants per individual. After excluding artifacts, 96% of the variants were found to be reported in public databases and have no evidence of pathogenicity. No variants were identified that would predict disease in the tested individuals, which is in accordance with their normal newborn screens. However, we identified 6 previously reported variants and 2 novel variants that, according to published literature, could result in affected offspring if the reproductive partner were also a mutation carrier; other specific molecular findings highlight additional means by which genomic testing could augment newborn screening.
Affiliation(s)
- B D Solomon
- Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Md., USA
|
282
|
Rigaill GJ, Cadot S, Kluin RJ, Xue Z, Bernards R, Majewski IJ, Wessels LF. A regression model for estimating DNA copy number applied to capture sequencing data. Bioinformatics 2012; 28:2357-65. [DOI: 10.1093/bioinformatics/bts448] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
|
283
|
Lo RS. Combinatorial therapies to overcome B-RAF inhibitor resistance in melanomas. Pharmacogenomics 2012; 13:125-8. [PMID: 22256862 DOI: 10.2217/pgs.11.166] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022] Open
|
284
|
Ku CS, Wu M, Cooper DN, Naidoo N, Pawitan Y, Pang B, Iacopetta B, Soong R. Technological advances in DNA sequence enrichment and sequencing for germline genetic diagnosis. Expert Rev Mol Diagn 2012; 12:159-73. [PMID: 22369376 DOI: 10.1586/erm.11.95] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022]
Abstract
The potential applications of next-generation sequencing technologies in diagnostic laboratories have become increasingly evident despite the various technical challenges that still need to be overcome to potentiate its widespread adoption in a clinical setting. Whole-genome sequencing is now both technically feasible and 'cost effective' using next-generation sequencing techniques. However, this approach is still considered to be 'expensive' for a diagnostic test. Although the goal of the US$1000 genome is fast approaching, neither the analytical hurdles nor the ethical issues involved are trivial. In addition, the cost of data analysis and storage has been much higher than initially expected. As a result, it is widely perceived that targeted sequencing and whole-exome sequencing are more likely to be adopted as diagnostic tools in the foreseeable future. However, the information-generating power of whole-exome sequencing has also sparked considerable debate in relation to its deployment in genetic diagnostics, particularly with reference to the revelation of incidental findings. In this review, we focus on the targeted sequencing approach and its potential as a genetic diagnostic tool.
Affiliation(s)
- Chee-Seng Ku
- Cancer Science Institute of Singapore, National University of Singapore, Singapore.
|
285
|
Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer. Genome Res 2012; 22:1995-2007. [PMID: 22637570 PMCID: PMC3460194 DOI: 10.1101/gr.137570.112] [Citation(s) in RCA: 190] [Impact Index Per Article: 15.8] [Indexed: 01/09/2023]
Abstract
Loss of heterozygosity (LOH) and copy number alteration (CNA) feature prominently in the somatic genomic landscape of tumors. As such, karyotypic aberrations in cancer genomes have been studied extensively to discover novel oncogenes and tumor-suppressor genes. Advances in sequencing technology have enabled the cost-effective detection of tumor genome and transcriptome mutation events at single-base-pair resolution; however, computational methods for predicting segmental regions of LOH in this context are not yet fully explored. Consequently, whole transcriptome, nucleotide-level resolution analysis of monoallelic expression patterns associated with LOH has not yet been undertaken in cancer. We developed a novel approach for inference of LOH from paired tumor/normal sequence data and applied it to a cohort of 23 triple-negative breast cancer (TNBC) genomes. Following extensive benchmarking experiments, we describe the nucleotide-resolution landscape of LOH in TNBC and assess the consequent effect of LOH on the transcriptomes of these tumors using RNA-seq-derived measurements of allele-specific expression. We show that the majority of monoallelic expression in the transcriptomes of triple-negative breast cancer can be explained by genomic regions of LOH and establish an upper bound for monoallelic expression that may be explained by other tumor-specific modifications such as epigenetics or mutations. Monoallelically expressed genes associated with LOH reveal that cell cycle, homologous recombination and actin-cytoskeletal functions are putatively disrupted by LOH in TNBC. Finally, we show how inference of LOH can be used to interpret allele frequencies of somatic mutations and postulate on temporal ordering of mutations in the evolutionary history of these tumors.
|
286
|
High-resolution melting analysis of 15 genes in 60 patients with cytochrome-c oxidase deficiency. J Hum Genet 2012; 57:442-8. [PMID: 22592081 DOI: 10.1038/jhg.2012.49] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
Cytochrome-c oxidase (COX) deficiency is one of the common childhood mitochondrial disorders. Mutations in genes for the assembly factors SURF1 and SCO2 are prevalent in children with COX deficiency in the Slavonic population. Molecular diagnosis is difficult because of the number of genes involved in COX biogenesis and assembly. The aim of this study was to screen for mutations in 15 nuclear genes that encode the 10 structural subunits, their isoforms and two assembly factors of COX in 60 unrelated Czech children with COX deficiency. Nine novel variants were identified in exons and adjacent intronic regions of COX4I2, COX6A1, COX6A2, COX7A1, COX7A2 and COX10 using high-resolution melting (HRM) analysis. Online bioinformatics servers were used to predict the importance of the newly identified amino-acid substitutions. The newly characterized variants updated the contemporary spectrum of known genetic sequence variations that are present in the Czech population, which will be important for further targeted mutation screening in Czech COX-deficient children. HRM and predictive bioinformatics methodologies are advantageous because they are low-cost screening tools that complement large-scale genomic studies and reduce the required time and effort.
|
287
|
Krumm N, Sudmant PH, Ko A, O'Roak BJ, Malig M, Coe BP, Quinlan AR, Nickerson DA, Eichler EE. Copy number variation detection and genotyping from exome sequence data. Genome Res 2012; 22:1525-32. [PMID: 22585873 PMCID: PMC3409265 DOI: 10.1101/gr.138115.112] [Citation(s) in RCA: 475] [Impact Index Per Article: 39.6] [Indexed: 01/08/2023]
Abstract
While exome sequencing is readily amenable to single-nucleotide variant discovery, the sparse and nonuniform nature of the exome capture reaction has hindered exome-based detection and characterization of genic copy number variation. We developed a novel method using singular value decomposition (SVD) normalization to discover rare genic copy number variants (CNVs) as well as genotype copy number polymorphic (CNP) loci with high sensitivity and specificity from exome sequencing data. We estimate the precision of our algorithm using 122 trios (366 exomes) and show that this method can be used to reliably predict (94% overall precision) both de novo and inherited rare CNVs involving three or more consecutive exons. We demonstrate that exome-based genotyping of CNPs strongly correlates with whole-genome data (median r² = 0.91), especially for loci with fewer than eight copies, and can estimate the absolute copy number of multi-allelic genes with high accuracy (78% call level). The resulting user-friendly computational pipeline, CoNIFER (copy number inference from exome reads), can reliably be used to discover disruptive genic CNVs missed by standard approaches and should have broad application in human genetic studies of disease.
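As a rough illustration of the SVD normalization idea in this abstract, the Python sketch below removes the strongest singular component from a samples-by-exons matrix of depth values, on the assumption that this component captures capture or batch noise shared across samples. It is not the CoNIFER implementation: the function names, the use of power iteration, and the choice to remove a single component are assumptions made for the example.

```python
import math


def top_singular_triplet(matrix, iterations=200):
    """Approximate the largest singular value and its vectors by power iteration."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Deterministic, non-constant start vector; a constant vector would be
    # orthogonal to rows that have been centered to zero mean.
    v = [float(j + 1) for j in range(n_cols)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # u = A v, then w = A^T u
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v


def remove_top_components(matrix, n_components=1):
    """Subtract the n strongest rank-1 components (hypothetical helper)."""
    residual = [[float(x) for x in row] for row in matrix]
    for _ in range(n_components):
        sigma, u, v = top_singular_triplet(residual)
        residual = [
            [residual[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
            for i in range(len(u))
        ]
    return residual
```

In this toy setting, a depth pattern shared by every sample is absorbed by the removed component, while a deviation confined to one sample and one exon largely survives in the residual, which is where a rare CNV signal would be looked for.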
Affiliation(s)
- Niklas Krumm
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
|
288
|
A genetic model for neurodevelopmental disease. Curr Opin Neurobiol 2012; 22:829-36. [PMID: 22560351 PMCID: PMC3437230 DOI: 10.1016/j.conb.2012.04.007] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 01/28/2012] [Revised: 02/16/2012] [Accepted: 04/05/2012] [Indexed: 12/20/2022]
Abstract
The genetic basis of neurodevelopmental and neuropsychiatric diseases has been advanced by the discovery of large and recurrent copy number variants significantly enriched in cases when compared to controls. The pattern of this variation strongly implies that rare variants contribute significantly to neurological disease; that different genes will be responsible for similar diseases in different families; and that the same 'primary' genetic lesions can result in a different disease outcome depending potentially on the genetic background. Next-generation sequencing technologies are beginning to broaden the spectrum of disease-causing variation and provide specificity by pinpointing both genes and pathways for future diagnostics and therapeutics.
|
289
|
Arboleda VA, Lee H, Sánchez FJ, Délot EC, Sandberg DE, Grody WW, Nelson SF, Vilain E. Targeted massively parallel sequencing provides comprehensive genetic diagnosis for patients with disorders of sex development. Clin Genet 2012; 83:35-43. [PMID: 22435390 DOI: 10.1111/j.1399-0004.2012.01879.x] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Indexed: 01/28/2023]
Abstract
Disorders of sex development (DSD) are rare disorders in which there is discordance between chromosomal, gonadal, and phenotypic sex. Only a minority of patients clinically diagnosed with DSD obtains a molecular diagnosis, leaving a large gap in our understanding of the prevalence, management, and outcomes in affected patients. We created a novel DSD-genetic diagnostic tool, in which sex development genes are captured using RNA probes and undergo massively parallel sequencing. In the pilot group of 14 patients, we determined sex chromosome dosage, copy number variation, and gene mutations. In the patients with a known genetic diagnosis (obtained either on a clinical or research basis), this test identified the molecular cause in 100% (7/7) of patients. In patients in whom no molecular diagnosis had been made, this tool identified a genetic diagnosis in two of seven patients. Targeted sequencing of genes representing a specific spectrum of disorders can result in a higher rate of genetic diagnoses than current diagnostic approaches. Our DSD diagnostic tool provides, for the first time and in a single blood test, a comprehensive genetic diagnosis in patients presenting with a wide range of urogenital anomalies.
Affiliation(s)
- V A Arboleda
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, USA
|
290
|
Xu X, Hou Y, Yin X, Bao L, Tang A, Song L, Li F, Tsang S, Wu K, Wu H, He W, Zeng L, Xing M, Wu R, Jiang H, Liu X, Cao D, Guo G, Hu X, Gui Y, Li Z, Xie W, Sun X, Shi M, Cai Z, Wang B, Zhong M, Li J, Lu Z, Gu N, Zhang X, Goodman L, Bolund L, Wang J, Yang H, Kristiansen K, Dean M, Li Y, Wang J. Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 2012; 148:886-95. [PMID: 22385958 DOI: 10.1016/j.cell.2012.02.025] [Citation(s) in RCA: 487] [Impact Index Per Article: 40.6] [Received: 10/17/2011] [Revised: 12/15/2011] [Accepted: 02/15/2012] [Indexed: 02/07/2023]
Abstract
Clear cell renal cell carcinoma (ccRCC) is the most common kidney cancer and has very few mutations that are shared between different patients. To better understand the intratumoral genetics underlying mutations of ccRCC, we carried out single-cell exome sequencing on a ccRCC tumor and its adjacent kidney tissue. Our data indicate that this tumor was unlikely to have resulted from mutations in VHL and PBRM1. Quantitative population genetic analysis indicates that the tumor did not contain any significant clonal subpopulations and also showed that mutations that had different allele frequencies within the population also had different mutation spectrums. Analyses of these data allowed us to delineate a detailed intratumoral genetic landscape at a single-cell level. Our pilot study demonstrates that ccRCC may be more genetically complex than previously thought and provides information that can lead to new ways to investigate individual tumors, with the aim of developing more effective cellular targeted therapies.
Affiliation(s)
- Xun Xu
- BGI-Shenzhen, Shenzhen, China
|
291
|
Li J, Lupat R, Amarasinghe KC, Thompson ER, Doyle MA, Ryland GL, Tothill RW, Halgamuge SK, Campbell IG, Gorringe KL. CONTRA: copy number analysis for targeted resequencing. Bioinformatics 2012; 28:1307-13. [PMID: 22474122 PMCID: PMC3348560 DOI: 10.1093/bioinformatics/bts146] [Citation(s) in RCA: 256] [Impact Index Per Article: 21.3] [Indexed: 12/22/2022]
Abstract
Motivation: In light of the increasing adoption of targeted resequencing (TR) as a cost-effective strategy to identify disease-causing variants, a robust method for copy number variation (CNV) analysis is needed to maximize the value of this promising technology. Results: We present a method for CNV detection for TR data, including whole-exome capture data. Our method calls copy number gains and losses for each target region based on normalized depth of coverage. Our key strategies include the use of base-level log-ratios to remove GC-content bias, correction for an imbalanced library size effect on log-ratios, and the estimation of log-ratio variations via binning and interpolation. Our methods are made available via CONTRA (COpy Number Targeted Resequencing Analysis), a software package that takes standard alignment formats (BAM/SAM) and outputs in variant call format (VCF4.0), for easy integration with other next-generation sequencing analysis packages. We assessed our methods using samples from seven different target enrichment assays, and evaluated our results using simulated data and real germline data with known CNV genotypes.
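The depth-based calling described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the CONTRA code: it works on per-region mean depths rather than base-level values, corrects library size with a single global scale factor, and uses a fixed log-ratio threshold in place of the binned variance estimate. The function names, the pseudocount and the threshold are hypothetical.

```python
import math


def region_log_ratios(test_depth, control_depth, pseudocount=0.5):
    """Per-region log2(test/control) after scaling the test sample
    to the control sample's total depth (library-size correction)."""
    scale = sum(control_depth.values()) / sum(test_depth.values())
    ratios = {}
    for region, t in test_depth.items():
        c = control_depth[region]
        # The pseudocount avoids log(0) for regions with no coverage.
        ratios[region] = math.log2((t * scale + pseudocount) / (c + pseudocount))
    return ratios


def call_regions(ratios, threshold=0.4):
    """Label each region as gain, loss or neutral by a fixed log-ratio cutoff."""
    calls = {}
    for region, log_ratio in ratios.items():
        if log_ratio >= threshold:
            calls[region] = "gain"
        elif log_ratio <= -threshold:
            calls[region] = "loss"
        else:
            calls[region] = "neutral"
    return calls
```

Without the scale factor, a test library sequenced twice as deeply as the control would show every region at a log-ratio near 1 and look like a genome-wide gain. After scaling, only regions that deviate from the sample-wide trend stand out.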
Affiliation(s)
- Jason Li
- Bioinformatics Core Facility, Peter MacCallum Cancer Centre, VIC 3002, Australia.
|
292
|
Ku CS, Cooper DN, Polychronakos C, Naidoo N, Wu M, Soong R. Exome sequencing: dual role as a discovery and diagnostic tool. Ann Neurol 2012; 71:5-14. [PMID: 22275248 DOI: 10.1002/ana.22647] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Indexed: 01/03/2023]
Abstract
Recent developments in high-throughput sequence capture methods and next-generation sequencing technologies have now made exome sequencing a viable approach to elucidate the genetic basis of Mendelian disorders with hitherto unknown etiology. In addition, exome sequencing is increasingly being employed as a diagnostic tool for specific genetic diseases, particularly in the context of those disorders characterized by significant genetic and phenotypic heterogeneity, for example, Charcot-Marie-Tooth disease and congenital disorders of glycosylation. Such disorders are challenging to interrogate with conventional polymerase chain reaction-Sanger sequencing methods, because of the inherent difficulty in prioritizing candidate genes for diagnostic testing. Here, we explore the value of exome sequencing as a diagnostic tool and discuss whether exome sequencing can come to serve a dual role in diagnosis and discovery. We summarize the current status of exome sequencing, the technical challenges facing it, and its adaptation to diagnostics, and make recommendations for the use of exome sequencing as a routine diagnostic tool. Finally, we discuss pertinent ethical concerns, such as the use of exome sequencing data, originally generated in a diagnostic context, in research investigations.
Affiliation(s)
- Chee-Seng Ku
- Cancer Science Institute of Singapore, National University of Singapore, Singapore.
|
293
|
Melanoma whole-exome sequencing identifies (V600E)B-RAF amplification-mediated acquired B-RAF inhibitor resistance. Nat Commun 2012; 3:724. [PMID: 22395615 PMCID: PMC3530385 DOI: 10.1038/ncomms1727] [Citation(s) in RCA: 503] [Impact Index Per Article: 41.9] [Received: 11/15/2011] [Accepted: 02/03/2012] [Indexed: 12/11/2022] Open
Abstract
The development of acquired drug resistance hampers the long-term success of B-RAF inhibitor (B-RAFi) therapy for melanoma patients. Here we show (V600E)B-RAF copy number gain as a mechanism of acquired B-RAFi resistance in four out of twenty (20%) patients treated with B-RAFi. In cell lines, (V600E)B-RAF over-expression and knockdown conferred B-RAFi resistance and sensitivity, respectively. In (V600E)B-RAF amplification-driven (vs. mutant N-RAS-driven) B-RAFi resistance, ERK reactivation is saturable, with higher doses of vemurafenib down-regulating pERK and re-sensitizing melanoma cells to B-RAFi. These two mechanisms of ERK reactivation are sensitive to the MEK1/2 inhibitor AZD6244/selumetinib or its combination with the B-RAFi vemurafenib. In contrast to mutant N-RAS-mediated (V600E)B-RAF bypass, which is sensitive to C-RAF knockdown, (V600E)B-RAF amplification-mediated resistance functions largely independently of C-RAF. Thus, alternative clinical strategies may potentially overcome distinct modes of ERK reactivation underlying acquired B-RAFi resistance in melanoma.
|
294
|
Tantravahi SK, Williams LB, Digre KB, Creel DJ, Smock KJ, DeAngelis MM, Clayton FC, Vitale AT, Rodgers GM. An inherited disorder with splenomegaly, cytopenias, and vision loss. Am J Med Genet A 2012; 158A:475-81. [PMID: 22307799 DOI: 10.1002/ajmg.a.34437] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/20/2011] [Accepted: 11/21/2011] [Indexed: 12/13/2022]
Abstract
We describe a novel inherited disorder consisting of idiopathic massive splenomegaly, cytopenias, anhidrosis, chronic optic nerve edema, and vision loss. This disorder involves three affected patients in a single non-consanguineous Caucasian family, a mother and two daughters, who are half-sisters. All three patients have had splenectomies; histopathology revealed congestion of the red pulp, but otherwise no abnormalities. Electron microscopic studies of splenic tissue showed no evidence for a storage disorder or other ultrastructural abnormality. Two of the three patients had bone marrow examinations that were non-diagnostic. All three patients developed progressive vision loss such that the two oldest patients are now blind, possibly due to a cone-rod dystrophy. Characteristics of vision loss in this family include early chronic optic nerve edema, and progressive vision loss, particularly central and color vision. Despite numerous medical and ophthalmic evaluations, no diagnosis has been discovered.
Affiliation(s)
- Srinivas K Tantravahi
- Department of Internal Medicine, University of Utah Health Sciences Center, Salt Lake City, Utah 84132, USA
|
295
|
Klambauer G, Schwarzbauer K, Mayr A, Clevert DA, Mitterecker A, Bodenhofer U, Hochreiter S. cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate. Nucleic Acids Res 2012; 40:e69. [PMID: 22302147 PMCID: PMC3351174 DOI: 10.1093/nar/gks003] [Citation(s) in RCA: 317] [Impact Index Per Article: 26.4] [Indexed: 11/14/2022] Open
Abstract
Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
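To make the mixture-of-Poissons idea in this abstract concrete, the Python sketch below computes, for one genomic segment, a posterior over integer copy numbers for each sample from its read count. It is a simplified illustration and not the cn.MOPS algorithm: the mixture weights are fixed rather than fitted, the diploid rate is taken as the median count across samples, and the function names, the prior on copy number 2 and the `eps` rate factor for copy number 0 are all assumptions.

```python
import math
from statistics import median


def poisson_logpmf(k, rate):
    """Log of the Poisson probability of observing k reads at the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def copy_number_posteriors(counts, copy_numbers=(0, 1, 2, 3, 4),
                           prior_cn2=0.9, eps=0.05):
    """Posterior over copy numbers for each sample at one segment.

    Copy number cn is modeled as Poisson with rate lam * cn / 2, where lam is
    the diploid rate (assumed to be the median count, which must be positive).
    Copy number 0 uses the small rate lam * eps to allow for stray reads.
    """
    lam = median(counts)
    other_prior = (1.0 - prior_cn2) / (len(copy_numbers) - 1)
    posteriors = []
    for k in counts:
        log_weights = []
        for cn in copy_numbers:
            rate = lam * (cn / 2 if cn > 0 else eps)
            prior = prior_cn2 if cn == 2 else other_prior
            log_weights.append(math.log(prior) + poisson_logpmf(k, rate))
        # Normalize in log space for numerical stability.
        top = max(log_weights)
        weights = [math.exp(w - top) for w in log_weights]
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors


def call_copy_numbers(counts, copy_numbers=(0, 1, 2, 3, 4)):
    """Most probable copy number per sample."""
    posteriors = copy_number_posteriors(counts, copy_numbers)
    return [copy_numbers[p.index(max(p))] for p in posteriors]
```

Because the rate is estimated across samples at the same position, a segment that is systematically under-covered in every sample does not look like a loss. Only samples whose counts depart from the cohort at that segment receive a non-diploid call.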
Affiliation(s)
- Günter Klambauer
- Institute of Bioinformatics, Johannes Kepler University, A-4040 Linz, Austria
|
296
|
Castle JC, Kreiter S, Diekmann J, Löwer M, van de Roemer N, de Graaf J, Selmi A, Diken M, Boegel S, Paret C, Koslowski M, Kuhn AN, Britten CM, Huber C, Türeci O, Sahin U. Exploiting the mutanome for tumor vaccination. Cancer Res 2012; 72:1081-91. [PMID: 22237626 DOI: 10.1158/0008-5472.can-11-3722] [Citation(s) in RCA: 594] [Impact Index Per Article: 49.5] [Indexed: 01/02/2023]
Abstract
Multiple genetic events and subsequent clonal evolution drive carcinogenesis, making disease elimination with single-targeted drugs difficult. The multiplicity of gene mutations derived from clonal heterogeneity therefore represents an ideal setting for multiepitope tumor vaccination. Here, we used next generation sequencing exome resequencing to identify 962 nonsynonymous somatic point mutations in B16F10 murine melanoma cells, with 563 of those mutations in expressed genes. Potential driver mutations occurred in classical tumor suppressor genes and genes involved in proto-oncogenic signaling pathways that control cell proliferation, adhesion, migration, and apoptosis. Aim1 and Trrap mutations known to be altered in human melanoma were included among those found. The immunogenicity and specificity of 50 validated mutations was determined by immunizing mice with long peptides encoding the mutated epitopes. One-third of these peptides were found to be immunogenic, with 60% in this group eliciting immune responses directed preferentially against the mutated sequence as compared with the wild-type sequence. In tumor transplant models, peptide immunization conferred in vivo tumor control in protective and therapeutic settings, thereby qualifying mutated epitopes that include single amino acid substitutions as effective vaccines. Together, our findings provide a comprehensive picture of the mutanome of B16F10 melanoma which is used widely in immunotherapy studies. In addition, they offer insight into the extent of the immunogenicity of nonsynonymous base substitution mutations. Lastly, they argue that the use of deep sequencing to systematically analyze immunogenicity mutations may pave the way for individualized immunotherapy of cancer patients.
Affiliation(s)
- John C Castle
- TRON-Translational Oncology at the University Medical Center Mainz, Germany
|
297
|
Thompson R, Drew CJG, Thomas RH. Next generation sequencing in the clinical domain: clinical advantages, practical, and ethical challenges. Adv Protein Chem Struct Biol 2012; 89:27-63. [PMID: 23046881 DOI: 10.1016/b978-0-12-394287-6.00002-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 12/28/2022]
Abstract
There has been an academic "gold rush" with researchers mining the deep seams of whole-exome and whole-genome sequencing since 2008. Although these techniques were initially a major advance for identifying new disease-associated genes in rare monogenetic disorders, common and complex conditions have more recently been studied successfully with them as well. With great power comes great responsibility, however, and we must not forget that next generation sequencing produces unique ethical conundrums and validation challenges. We review the progression of published papers using whole-exome sequencing from a clinical and technical viewpoint, then reflect on the key arguments that need to be fully understood before these tools can become a routine part of clinical practice, and ask what the role of biomedical scientists may be.
Affiliation(s)
- Rose Thompson
- Welsh Centre for Learning Disabilities, Cardiff University, Cardiff, UK
|
298
|
Boeva V, Popova T, Bleakley K, Chiche P, Cappo J, Schleiermacher G, Janoueix-Lerosey I, Delattre O, Barillot E. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics 2011; 28:423-5. [PMID: 22155870 PMCID: PMC3268243 DOI: 10.1093/bioinformatics/btr670] [Citation(s) in RCA: 655] [Impact Index Per Article: 50.4] [Indexed: 12/31/2022]
Abstract
Summary: More and more cancer studies use next-generation sequencing (NGS) data to detect various types of genomic variation. However, even when researchers have such data at hand, single-nucleotide polymorphism arrays have been considered necessary to assess copy number alterations and especially loss of heterozygosity (LOH). Here, we present the tool Control-FREEC that enables automatic calculation of copy number and allelic content profiles from NGS data, and consequently predicts regions of genomic alteration such as gains, losses and LOH. Taking as input aligned reads, Control-FREEC constructs copy number and B-allele frequency profiles. The profiles are then normalized, segmented and analyzed in order to assign genotype status (copy number and allelic content) to each genomic region. When a matched normal sample is provided, Control-FREEC discriminates somatic from germline events. Control-FREEC is able to analyze overdiploid tumor samples and samples contaminated by normal cells. Low mappability regions can be excluded from the analysis using provided mappability tracks. Availability: C++ source code is available at: http://bioinfo.curie.fr/projects/freec/ Contact: freec@curie.fr Supplementary information: Supplementary data are available at Bioinformatics online.
|
299
|
Mei L, Ding X, Tsang SY, Pun FW, Ng SK, Yang J, Zhao C, Li D, Wan W, Yu CH, Tan TC, Poon WS, Leung GKK, Ng HK, Zhang L, Xue H. AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome. BMC Genomics 2011; 12:564. [PMID: 22087792 PMCID: PMC3228862 DOI: 10.1186/1471-2164-12-564] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 08/22/2011] [Accepted: 11/17/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND To complement next-generation sequencing technologies, there is a pressing need for efficient pre-sequencing capture methods with reduced costs and DNA requirement. The Alu family of short interspersed nucleotide elements is the most abundant type of transposable elements in the human genome and a recognized source of genome instability. With over one million Alu elements distributed throughout the genome, they are well positioned to facilitate genome-wide sequence amplification and capture of regions likely to harbor genetic variation hotspots of biological relevance. RESULTS Here we report on the use of inter-Alu PCR with an enhanced range of amplicons in conjunction with next-generation sequencing to generate an Alu-anchored scan, or 'AluScan', of DNA sequences between Alu transposons, where Alu consensus sequence-based 'H-type' PCR primers that elongate outward from the head of an Alu element are combined with 'T-type' primers elongating from the poly-A containing tail to achieve huge amplicon range. To illustrate the method, glioma DNA was compared with white blood cell control DNA of the same patient by means of AluScan. The over 10 Mb sequences obtained, derived from more than 8,000 genes spread over all the chromosomes, revealed a highly reproducible capture of genomic sequences enriched in genic sequences and cancer candidate gene regions. Requiring only sub-micrograms of sample DNA, the power of AluScan as a discovery tool for genetic variations was demonstrated by the identification of 357 instances of loss of heterozygosity, 341 somatic indels, 274 somatic SNVs, and seven potential somatic SNV hotspots between control and glioma DNA. CONCLUSIONS AluScan, implemented with just a small number of H-type and T-type inter-Alu PCR primers, provides an effective capture of a diversity of genome-wide sequences for analysis. 
By enabling an examination of gene-enriched regions containing exons, introns, and intergenic sequences with modest capture and sequencing costs, computation workload, and DNA sample requirement, the method is particularly well suited for accelerating the discovery of somatic mutations, as well as the analysis of disease-predisposing germline polymorphisms, because it makes possible the comparative genome-wide scanning of DNA sequences from large human cohorts.
Affiliation(s)
- Lingling Mei
- Division of Life Science and Applied Genomics Centre, Hong Kong University of Science and Technology, 1 University Road, Clear Water Bay, Kowloon, Hong Kong, China.
|
300
|
Modeling read counts for CNV detection in exome sequencing data. Stat Appl Genet Mol Biol 2011; 10(1). [PMID: 23089826 DOI: 10.2202/1544-6115.1732] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
Abstract
Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
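The idea of a hidden Markov model over read counts can be sketched as follows. This is a simplified illustration rather than the exomeCopy model: it assumes Poisson emissions whose mean is the control-set background depth scaled by copy number, a fixed self-transition probability, and only three copy-number states, and it omits covariates such as GC-content. The function names and parameter values are hypothetical.

```python
import math

def poisson_logpmf(k, lam):
    """Log-probability of observing k reads under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def viterbi_cnv(counts, background, states=(1, 2, 3), stay=0.99):
    """Most likely copy-number path for per-target read counts.

    counts:     observed reads per target in the sample
    background: expected reads per target at copy number 2 (must be > 0)
    states:     candidate copy numbers (assumed set, excludes 0)
    stay:       assumed probability of keeping the same state between targets
    """
    n = len(states)
    switch = (1.0 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]

    def emit(t, s):
        # Expected depth scales linearly with copy number relative to diploid.
        return poisson_logpmf(counts[t], background[t] * states[s] / 2.0)

    score = [math.log(1.0 / n) + emit(0, s) for s in range(n)]
    back = []
    for t in range(1, len(counts)):
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j] + emit(t, j))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [states[s] for s in path]

counts = [100, 98, 103, 50, 52, 49, 101, 99, 100]
background = [100] * 9
path = viterbi_cnv(counts, background)
```

In this toy input the three middle targets have roughly half the background depth, so the decoded path drops to copy number 1 there; the high self-transition probability keeps single noisy targets from being called as separate events.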
|