251
|
Prevalence and clinical significance of the MYD88 (L265P) somatic mutation in Waldenström’s macroglobulinemia and related lymphoid neoplasms. Blood 2013; 121:2522-8. [DOI: 10.1182/blood-2012-09-457101] [Citation(s) in RCA: 242] [Impact Index Per Article: 20.2] [Indexed: 12/25/2022] Open
Abstract
Key Points
Using a sensitive method, the MYD88 (L265P) mutation is detectable in all patients with Waldenström’s macroglobulinemia, therefore representing a hallmark of the disease. MYD88 (L265P) is also found in a substantial proportion of patients with IgM-MGUS.
|
252
|
Mayrhofer M, DiLorenzo S, Isaksson A. Patchwork: allele-specific copy number analysis of whole-genome sequenced tumor tissue. Genome Biol 2013; 14:R24. [PMID: 23531354 PMCID: PMC4053982 DOI: 10.1186/gb-2013-14-3-r24] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Received: 03/27/2012] [Revised: 02/27/2013] [Accepted: 03/25/2013] [Indexed: 12/28/2022] Open
Abstract
Whole-genome sequencing of tumor tissue has the potential to provide comprehensive characterization of genomic alterations in tumor samples. We present Patchwork, a new bioinformatic tool for allele-specific copy number analysis using whole-genome sequencing data. Patchwork can be used to determine the copy number of homologous sequences throughout the genome, even in aneuploid samples with moderate sequence coverage and tumor cell content. No prior knowledge of average ploidy or tumor cell content is required. Patchwork is freely available as an R package, installable via R-Forge (http://patchwork.r-forge.r-project.org/).
Affiliation(s)
- Markus Mayrhofer
  - Science for Life Laboratory, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden
- Sebastian DiLorenzo
  - Science for Life Laboratory, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden
- Anders Isaksson
  - Science for Life Laboratory, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden
|
253
|
Duan J, Zhang JG, Deng HW, Wang YP. Comparative studies of copy number variation detection methods for next-generation sequencing technologies. PLoS One 2013; 8:e59128. [PMID: 23527109 PMCID: PMC3604020 DOI: 10.1371/journal.pone.0059128] [Citation(s) in RCA: 116] [Impact Index Per Article: 9.7] [Received: 06/20/2012] [Accepted: 02/12/2013] [Indexed: 11/25/2022] Open
Abstract
Copy number variation (CNV) has played an important role in studies of susceptibility or resistance to complex diseases. Traditional methods such as fluorescence in situ hybridization (FISH) and array comparative genomic hybridization (aCGH) suffer from low resolution of genomic regions. Following the emergence of next generation sequencing (NGS) technologies, CNV detection methods based on the short read data have recently been developed. However, due to the relatively young age of the procedures, their performance is not fully understood. To help investigators choose suitable methods to detect CNVs, comparative studies are needed. We compared six publicly available CNV detection methods: CNV-seq, FREEC, readDepth, CNVnator, SegSeq and event-wise testing (EWT). They are evaluated both on simulated and real data with different experiment settings. The receiver operating characteristic (ROC) curve is employed to demonstrate the detection performance in terms of sensitivity and specificity, box plot is employed to compare their performances in terms of breakpoint and copy number estimation, Venn diagram is employed to show the consistency among these methods, and F-score is employed to show the overlapping quality of detected CNVs. The computational demands are also studied. The results of our work provide a comprehensive evaluation on the performances of the selected CNV detection methods, which will help biological investigators choose the best possible method.
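The overlap-based F-score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a base-pair-level definition of precision and recall, which is one possible choice and not necessarily the one used in the paper; the function names and the example intervals are hypothetical.

```python
def overlap_bp(a, b):
    """Number of base pairs shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def f_score(called, truth):
    """Base-pair-level precision, recall and F-score for two CNV call sets.

    Assumption: intervals within each set do not overlap one another,
    so shared base pairs are not counted twice.
    """
    shared = sum(overlap_bp(c, t) for c in called for t in truth)
    called_bp = sum(end - start for start, end in called)
    truth_bp = sum(end - start for start, end in truth)
    precision = shared / called_bp if called_bp else 0.0
    recall = shared / truth_bp if truth_bp else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical calls from one detector and a hypothetical reference set.
called = [(100, 200), (500, 600)]
truth = [(150, 250), (900, 1000)]
precision, recall, f1 = f_score(called, truth)
print(precision, recall, f1)
```

Counting overlap in base pairs rather than in whole events rewards calls whose breakpoints are close to the reference, which is why a partially overlapping call contributes only part of its length here.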
Affiliation(s)
- Junbo Duan
  - Department of Biomedical Engineering, Tulane University, New Orleans, Louisiana, United States of America
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, Louisiana, United States of America
- Ji-Gang Zhang
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, Louisiana, United States of America
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, Louisiana, United States of America
- Hong-Wen Deng
  - Department of Biomedical Engineering, Tulane University, New Orleans, Louisiana, United States of America
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, Louisiana, United States of America
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, Louisiana, United States of America
- Yu-Ping Wang
  - Department of Biomedical Engineering, Tulane University, New Orleans, Louisiana, United States of America
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, Louisiana, United States of America
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, Louisiana, United States of America
|
254
|
Zahreddine H, Borden KLB. Mechanisms and insights into drug resistance in cancer. Front Pharmacol 2013; 4:28. [PMID: 23504227 PMCID: PMC3596793 DOI: 10.3389/fphar.2013.00028] [Citation(s) in RCA: 466] [Impact Index Per Article: 38.8] [Received: 12/31/2012] [Accepted: 02/25/2013] [Indexed: 11/24/2022] Open
Abstract
Cancer drug resistance continues to be a major impediment in medical oncology. Clinically, resistance can arise prior to or as a result of cancer therapy. In this review, we discuss different mechanisms adapted by cancerous cells to resist treatment, including alteration in drug transport and metabolism, mutation and amplification of drug targets, as well as genetic rewiring which can lead to impaired apoptosis. Tumor heterogeneity may also contribute to resistance, where small subpopulations of cells may acquire or stochastically already possess some of the features enabling them to emerge under selective drug pressure. Making the problem even more challenging, some of these resistance pathways lead to multidrug resistance, generating an even more difficult clinical problem to overcome. We provide examples of these mechanisms and some insights into how understanding these processes can influence the next generation of cancer therapies.
Affiliation(s)
- Hiba Zahreddine
  - Department of Pathology and Cell Biology, Institute of Research in Immunology and Cancer, Université de Montréal, Montreal, QC, Canada
|
255
|
Kim HS, Mitsudomi T, Soo RA, Cho BC. Personalized therapy on the horizon for squamous cell carcinoma of the lung. Lung Cancer 2013; 80:249-55. [PMID: 23489560 DOI: 10.1016/j.lungcan.2013.02.015] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 11/29/2012] [Revised: 02/15/2013] [Accepted: 02/16/2013] [Indexed: 12/28/2022]
Abstract
Squamous cell carcinoma (SQCC) of the lung is the second-largest subtype of non-small cell lung cancer (NSCLC), causing an estimated 400,000 deaths per year worldwide. Recent developments in cancer genome sequencing technology expanded our knowledge of driver mutations, which were identified as novel candidates for targeted therapy in various cancers. Successful targeted treatments for lung adenocarcinoma, NSCLC's primary subtype, with EGFR mutation or ALK fusion are clinically available, and a clinical trial of personalized targeted therapy in patients with lung adenocarcinoma is underway by the Lung Cancer Mutation Consortium. Although there are targeted treatments for lung adenocarcinoma, no personalized therapies currently exist for SQCC. Recently, comprehensive genomic characterization of lung SQCC using massively parallel sequencing has enabled us to identify several potential driver mutations/signaling pathways. These are FGFR1 amplifications, PIK3CA mutations, PTEN mutations/deletions, PDGFRA amplifications/mutations, and DDR2 mutations. The march toward personalized therapy may have taken a step forward with the discovery of these potential biomarkers for the treatment of SQCC of the lung. This article reviews the current knowledge of the genomic landscape of lung SQCC and summarizes ongoing clinical trials of targeted agents for lung SQCC. We also suggest several other actionable mutations with matching drugs that should be investigated in future clinical trials for the personalized treatment of lung SQCC.
Affiliation(s)
- Han Sang Kim
  - Division of Medical Oncology, Yonsei Cancer Center, Yonsei University College of Medicine, Seoul, Republic of Korea
|
256
|
Gullapalli RR, Lyons-Weiler M, Petrosko P, Dhir R, Becich MJ, LaFramboise WA. Clinical integration of next-generation sequencing technology. Clin Lab Med 2013; 32:585-99. [PMID: 23078661 DOI: 10.1016/j.cll.2012.07.005] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 12/23/2022]
Abstract
Recent advances in next-generation sequencing (NGS) methods and technology have substantially reduced costs and operational complexity leading to production of benchtop sequencers and commercial software solutions for implementation in small research and clinical laboratories. This article addresses requirements and limitations to successful implementation of these systems, including (1) calibration and validation of the instrumentation, experimental paradigm, and primary readout, (2) secure data transfer, storage, and secondary processing, (3) implementation of software tools for targeted analysis, and (4) training of research and clinical personnel to evaluate data fidelity and interpret the molecular significance of the genomic output.
Affiliation(s)
- R R Gullapalli
  - Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
|
257
|
Grimm D, Hagmann J, Koenig D, Weigel D, Borgwardt K. Accurate indel prediction using paired-end short reads. BMC Genomics 2013; 14:132. [PMID: 23442375 PMCID: PMC3614465 DOI: 10.1186/1471-2164-14-132] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 10/22/2012] [Accepted: 02/06/2013] [Indexed: 11/12/2022] Open
Abstract
Background One of the major open challenges in next generation sequencing (NGS) is the accurate identification of structural variants such as insertions and deletions (indels). Current methods for indel calling assign scores to different types of evidence or counter-evidence for the presence of an indel, such as the number of split read alignments spanning the boundaries of a deletion candidate or reads that map within a putative deletion. Candidates with a score above a manually defined threshold are then predicted to be true indels. As a consequence, structural variants detected in this manner contain many false positives. Results Here, we present a machine learning based method which is able to discover and distinguish true from false indel candidates in order to reduce the false positive rate. Our method identifies indel candidates using a discriminative classifier based on features of split read alignment profiles and trained on true and false indel candidates that were validated by Sanger sequencing. We demonstrate the usefulness of our method with paired-end Illumina reads from 80 genomes of the first phase of the 1001 Genomes Project (http://www.1001genomes.org) in Arabidopsis thaliana. Conclusion In this work we show that indel classification is a necessary step to reduce the number of false positive candidates. We demonstrate that missing classification may lead to spurious biological interpretations. The software is available at: http://agkb.is.tuebingen.mpg.de/Forschung/SV-M/.
Affiliation(s)
- Dominik Grimm
  - Machine Learning and Computational Biology Research Group, Max Planck Institute for Developmental Biology and Max Planck Institute for Intelligent Systems, Tübingen, Germany.
|
258
|
Sepúlveda N, Campino SG, Assefa SA, Sutherland CJ, Pain A, Clark TG. A Poisson hierarchical modelling approach to detecting copy number variation in sequence coverage data. BMC Genomics 2013; 14:128. [PMID: 23442253 PMCID: PMC3679970 DOI: 10.1186/1471-2164-14-128] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 09/15/2012] [Accepted: 02/11/2013] [Indexed: 11/23/2022] Open
Abstract
Background The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model. Results Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates. Conclusions In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data.
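The overdispersion claim in the abstract above can be checked on any set of per-window read counts with a variance-to-mean ratio, which is close to 1 under a plain Poisson model. The sketch below is only that diagnostic, not the Poisson-Gamma or Poisson-Lognormal models of the paper; the function name and the example counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Values well above 1 suggest overdispersion relative to a Poisson model.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean coverage is zero")
    return variance(counts) / m


# Hypothetical read counts in eight equally sized genomic windows.
window_counts = [12, 30, 8, 45, 19, 27, 5, 38]
ratio = dispersion_index(window_counts)
print(ratio)
```

A ratio far above 1, as with these made-up counts, is the kind of pattern that motivates replacing the Poisson model with a hierarchical alternative that has an extra variance parameter.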
Affiliation(s)
- Nuno Sepúlveda
  - London School of Hygiene and Tropical Medicine, London, UK.
|
259
|
McCallum KJ, Wang JP. Quantifying copy number variations using a hidden Markov model with inhomogeneous emission distributions. Biostatistics 2013; 14:600-11. [PMID: 23428932 DOI: 10.1093/biostatistics/kxt003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023] Open
Abstract
Copy number variations (CNVs) are a significant source of genetic variation and have been found frequently associated with diseases such as cancers and autism. High-throughput sequencing data are increasingly being used to detect and quantify CNVs; however, the distributional properties of the data are not fully understood. A hidden Markov model (HMM) is proposed using inhomogeneous emission distributions based on negative binomial regression to account for the sequencing biases. The model is tested on the whole genome sequencing data and simulated data sets. An algorithm for CNV detection is implemented in the R package CNVfinder. The model based on negative binomial regression is shown to provide a good fit to the data and provides competitive performance compared with methods based on normalization of read counts.
|
260
|
Weischenfeldt J, Simon R, Feuerbach L, Schlangen K, Weichenhan D, Minner S, Wuttig D, Warnatz HJ, Stehr H, Rausch T, Jäger N, Gu L, Bogatyrova O, Stütz AM, Claus R, Eils J, Eils R, Gerhäuser C, Huang PH, Hutter B, Kabbe R, Lawerenz C, Radomski S, Bartholomae CC, Fälth M, Gade S, Schmidt M, Amschler N, Haß T, Galal R, Gjoni J, Kuner R, Baer C, Masser S, von Kalle C, Zichner T, Benes V, Raeder B, Mader M, Amstislavskiy V, Avci M, Lehrach H, Parkhomchuk D, Sultan M, Burkhardt L, Graefen M, Huland H, Kluth M, Krohn A, Sirma H, Stumm L, Steurer S, Grupp K, Sültmann H, Sauter G, Plass C, Brors B, Yaspo ML, Korbel JO, Schlomm T. Integrative genomic analyses reveal an androgen-driven somatic alteration landscape in early-onset prostate cancer. Cancer Cell 2013; 23:159-70. [PMID: 23410972 DOI: 10.1016/j.ccr.2013.01.002] [Citation(s) in RCA: 267] [Impact Index Per Article: 22.3] [Received: 05/10/2012] [Revised: 08/16/2012] [Accepted: 01/03/2013] [Indexed: 12/11/2022]
Abstract
Early-onset prostate cancer (EO-PCA) represents the earliest clinical manifestation of prostate cancer. To compare the genomic alteration landscapes of EO-PCA with "classical" (elderly-onset) PCA, we performed deep sequencing-based genomics analyses in 11 tumors diagnosed at young age, and pursued comparative assessments with seven elderly-onset PCA genomes. Remarkable age-related differences in structural rearrangement (SR) formation became evident, suggesting distinct disease pathomechanisms. Whereas EO-PCAs harbored a prevalence of balanced SRs, with a specific abundance of androgen-regulated ETS gene fusions including TMPRSS2:ERG, elderly-onset PCAs displayed primarily non-androgen-associated SRs. Data from a validation cohort of > 10,000 patients showed age-dependent androgen receptor levels and a prevalence of SRs affecting androgen-regulated genes, further substantiating the activity of a characteristic "androgen-type" pathomechanism in EO-PCA.
Affiliation(s)
- Joachim Weischenfeldt
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
|
261
|
Duan J, Zhang JG, Deng HW, Wang YP. Detection of common copy number variation with application to population clustering from next generation sequencing data. Annu Int Conf IEEE Eng Med Biol Soc 2013; 2012:1246-9. [PMID: 23366124 DOI: 10.1109/embc.2012.6346163] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Abstract
Copy number variation (CNV) is a structural variation in the human genome that has been associated with many complex diseases. In this paper we present a method to detect common copy number variation from next generation sequencing data. First, copy number variations are detected from each individual sample, which is formulated as a total variation penalized least square problem. Second, the common copy number discovery from multiple samples is obtained using source separation techniques such as the non-negative matrix factorization (NMF). Finally, the method is applied to population clustering. The results on real data analysis show that two family trios with different ancestries can be clustered into two ethnic groups based on their common CNVs, demonstrating the potential of the proposed method for application to population genetics.
Affiliation(s)
- Junbo Duan
  - Department of Biomedical Engineering, Tulane University, New Orleans, USA.
|
262
|
Amarasinghe KC, Li J, Halgamuge SK. CoNVEX: copy number variation estimation in exome sequencing data using HMM. BMC Bioinformatics 2013; 14 Suppl 2:S2. [PMID: 23368785 PMCID: PMC3549847 DOI: 10.1186/1471-2105-14-s2-s2] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 11/24/2022] Open
Abstract
Background One of the main types of genetic variations in cancer is Copy Number Variations (CNV). Whole exome sequencing (WES) is a popular alternative to whole genome sequencing (WGS) to study disease specific genomic variations. However, finding CNV in cancer samples using WES data has not been fully explored. Results We present a new method, called CoNVEX, to estimate copy number variation in whole exome sequencing data. It uses the ratio of tumour and matched normal average read depths at each exonic region to predict the copy gain or loss. The useful signal produced by WES data will be hindered by the intrinsic noise present in the data itself. This limits its capacity to be used as a highly reliable CNV detection source. Here, we propose a method that consists of discrete wavelet transform (DWT) to reduce noise. The identification of copy number gains/losses of each targeted region is performed by a Hidden Markov Model (HMM). Conclusion HMM is frequently used to identify CNV in data produced by various technologies including Array Comparative Genomic Hybridization (aCGH) and WGS. Here, we propose an HMM to detect CNV in cancer exome data. We used modified data from 1000 Genomes project to evaluate the performance of the proposed method. Using these data we have shown that CoNVEX outperforms the existing methods significantly in terms of precision. Overall, CoNVEX achieved a sensitivity of more than 92% and a precision of more than 50%.
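The per-exon depth ratio described in the abstract above can be sketched as a log2 ratio of tumour to matched normal read depth. This is a simplified illustration, not the CoNVEX method: it replaces the wavelet denoising and HMM steps with fixed cut-offs, ignores library-size normalization, and the thresholds, function name and example depths are all assumptions made for the example.

```python
import math


def classify_exons(tumour_depth, normal_depth, gain=0.3, loss=-0.3):
    """Label each exon from its log2(tumour / normal) depth ratio.

    The gain and loss cut-offs are illustrative assumptions, standing in
    for the denoising and HMM segmentation that a real method would use.
    """
    calls = []
    for t, n in zip(tumour_depth, normal_depth):
        if t <= 0 or n <= 0:
            # A zero depth makes the log ratio undefined.
            calls.append((None, "no-call"))
            continue
        ratio = math.log2(t / n)
        if ratio >= gain:
            label = "gain"
        elif ratio <= loss:
            label = "loss"
        else:
            label = "neutral"
        calls.append((ratio, label))
    return calls


# Hypothetical average read depths for four exonic regions.
tumour = [60.0, 20.0, 41.0, 0.0]
normal = [40.0, 40.0, 40.0, 35.0]
calls = classify_exons(tumour, normal)
for ratio, label in calls:
    print(ratio, label)
```

Thresholding each exon independently is sensitive to noise, which is the limitation the abstract gives as the reason for smoothing the signal and modelling neighbouring regions jointly.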
Affiliation(s)
- Kaushalya C Amarasinghe
  - Department of Mechanical Engineering, University of Melbourne, Parkville, VIC 3010, Australia.
|
263
|
Abstract
The rapid technological developments following the Human Genome Project have made possible the availability of personalized genomes. As the focus now shifts from characterizing genomes to making personalized disease associations, in combination with the availability of other omics technologies, the next big push will be not only to obtain a personalized genome, but to quantitatively follow other omics. This will include transcriptomes, proteomes, metabolomes, antibodyomes, and new emerging technologies, enabling the profiling of thousands of molecular components in individuals. Furthermore, omics profiling performed longitudinally can probe the temporal patterns associated with both molecular changes and associated physiological health and disease states. Such data necessitates the development of computational methodology to not only handle and descriptively assess such data, but also construct quantitative biological models. Here we describe the availability of personal genomes and developing omics technologies that can be brought together for personalized implementations and how these novel integrated approaches may effectively provide a precise personalized medicine that focuses on not only characterization and treatment but ultimately the prevention of disease.
|
264
|
Schaffer ME, Platero JS. Pharmacogenomics in Cancer Therapeutics. Pharmacogenomics 2013. [DOI: 10.1016/b978-0-12-391918-2.00004-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open
|
265
|
Day E, Dear PH, McCaughan F. Digital PCR strategies in the development and analysis of molecular biomarkers for personalized medicine. Methods 2013; 59:101-7. [DOI: 10.1016/j.ymeth.2012.08.001] [Citation(s) in RCA: 144] [Impact Index Per Article: 12.0] [Received: 01/31/2012] [Revised: 07/30/2012] [Accepted: 08/02/2012] [Indexed: 12/18/2022] Open
|
266
|
Szatkiewicz JP, Wang W, Sullivan PF, Wang W, Sun W. Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation. Nucleic Acids Res 2012; 41:1519-32. [PMID: 23275535 PMCID: PMC3561969 DOI: 10.1093/nar/gks1363] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 02/01/2023] Open
Abstract
Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth-based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth-based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.
Affiliation(s)
- Jin P Szatkiewicz
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599-7264, USA.
|
267
|
Abstract
Differences between individual human genomes, or between human and cancer genomes, range in scale from single nucleotide variants (SNVs) through intermediate and large-scale duplications, deletions, and rearrangements of genomic segments. The latter class, called structural variants (SVs), has received considerable attention in the past several years as they are a previously underappreciated source of variation in human genomes. Much of this recent attention is the result of the availability of higher-resolution technologies for measuring these variants, including both microarray-based techniques and, more recently, high-throughput DNA sequencing. We describe the genomic technologies and computational techniques currently used to measure SVs, focusing on applications in human and cancer genomics.
Affiliation(s)
- Benjamin J Raphael
  - Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America.
|
268
|
Schulte I, Batty EM, Pole JCM, Blood KA, Mo S, Cooke SL, Ng C, Howe KL, Chin SF, Brenton JD, Caldas C, Howarth KD, Edwards PAW. Structural analysis of the genome of breast cancer cell line ZR-75-30 identifies twelve expressed fusion genes. BMC Genomics 2012; 13:719. [PMID: 23260012 PMCID: PMC3548764 DOI: 10.1186/1471-2164-13-719] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Received: 11/27/2012] [Accepted: 12/14/2012] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND It has recently emerged that common epithelial cancers such as breast cancers have fusion genes like those in leukaemias. In a representative breast cancer cell line, ZR-75-30, we searched for fusion genes, by analysing genome rearrangements. RESULTS We first analysed rearrangements of the ZR-75-30 genome, to around 10 kb resolution, by molecular cytogenetic approaches, combining array painting and array CGH. We then compared this map with genomic junctions determined by paired-end sequencing. Most of the breakpoints found by array painting and array CGH were identified in the paired-end sequencing: 55% of the unamplified breakpoints and 97% of the amplified breakpoints (as these are represented by more sequence reads). From this analysis we identified 9 expressed fusion genes: APPBP2-PHF20L1, BCAS3-HOXB9, COL14A1-SKAP1, TAOK1-PCGF2, TIAM1-NRIP1, TIMM23-ARHGAP32, TRPS1-LASP1, USP32-CCDC49 and ZMYM4-OPRD1. We also determined the genomic junctions of a further three expressed fusion genes that had been described by others, BCAS3-ERBB2, DDX5-DEPDC6/DEPTOR and PLEC1-ENPP2. Of this total of 12 expressed fusion genes, 9 were in the coamplification. Due to the sensitivity of the technologies used, we estimate these 12 fusion genes to be around two-thirds of the true total. Many of the fusions seem likely to be driver mutations. For example, PHF20L1, BCAS3, TAOK1, PCGF2, and TRPS1 are fused in other breast cancers. HOXB9 and PHF20L1 are members of gene families that are fused in other neoplasms. Several of the other genes are relevant to cancer: in addition to ERBB2, SKAP1 is an adaptor for Src, DEPTOR regulates the mTOR pathway and NRIP1 is an estrogen-receptor coregulator. CONCLUSIONS This is the first structural analysis of a breast cancer genome that combines classical molecular cytogenetic approaches with sequencing. Paired-end sequencing was able to detect almost all breakpoints, where there was adequate read depth. It supports the view that gene breakage and gene fusion are important classes of mutation in breast cancer, with a typical breast cancer expressing many fusion genes.
Affiliation(s)
- Ina Schulte
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
- Elizabeth M Batty
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
  - Current addresses: Department of Statistics, University of Oxford, 1 South Parks Road, Oxford, OX1 3TG, UK
- Jessica CM Pole
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
  - Current addresses: BlueGnome Ltd, CPC4, Capital Park, Fulbourn, Cambridge, CB21 5XE, UK
- Katherine A Blood
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
  - Current addresses: Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 2N1, Canada
- Steven Mo
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
  - Current addresses: Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
- Susanna L Cooke
  - Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK
  - Current addresses: Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK
- Charlotte Ng
  - Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK
  - Current addresses: Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, 237 Fulham Road, London, SW3 6JB, UK
- Kevin L Howe
  - Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK
  - Current addresses: European Bioinformatics Institute, Hinxton, Cambridgeshire, CB10 1SD, UK
- Suet-Feung Chin
  - Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK
- James D Brenton
  - Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK
- Carlos Caldas
  - Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK
- Karen D Howarth
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
- Paul AW Edwards
  - Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
|
269
|
Valdés-Mas R, Bea S, Puente DA, López-Otín C, Puente XS. Estimation of copy number alterations from exome sequencing data. PLoS One 2012; 7:e51422. [PMID: 23284693 PMCID: PMC3526607 DOI: 10.1371/journal.pone.0051422] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 06/14/2012] [Accepted: 10/31/2012] [Indexed: 11/18/2022] Open
Abstract
Exome sequencing constitutes an important technology for the study of human hereditary diseases and cancer. However, the ability of this approach to identify copy number alterations in primary tumor samples has not been fully addressed. Here we show that somatic copy number alterations can be reliably estimated using exome sequencing data through a strategy that we have termed exome2cnv. Using data from 86 paired normal and primary tumor samples, we identified losses and gains of complete chromosomes or large genomic regions, as well as smaller regions affecting a minimum of one gene. Comparison with high-resolution comparative genomic hybridization (CGH) arrays revealed a high sensitivity and a low number of false positives in the copy number estimation between both approaches. We explore the main factors affecting sensitivity and false positives with real data, and provide a side by side comparison with CGH arrays. Together, these results underscore the utility of exome sequencing to study cancer samples by allowing not only the identification of substitutions and indels, but also the accurate estimation of copy number alterations.
Affiliation(s)
- Rafael Valdés-Mas
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
- Silvia Bea
- Hematopathology Unit, Hospital Clinic, IDIBAPS, Barcelona, Spain
- Diana A. Puente
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
- Carlos López-Otín
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
- Xose S. Puente
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, Spain
|
270
|
Lin CY, Lovén J, Rahl PB, Paranal RM, Burge CB, Bradner JE, Lee TI, Young RA. Transcriptional amplification in tumor cells with elevated c-Myc. Cell 2012; 151:56-67. [PMID: 23021215 DOI: 10.1016/j.cell.2012.08.026] [Citation(s) in RCA: 1184] [Impact Index Per Article: 91.1] [Received: 03/08/2012] [Revised: 05/29/2012] [Accepted: 08/08/2012] [Indexed: 12/12/2022]
Abstract
Elevated expression of the c-Myc transcription factor occurs frequently in human cancers and is associated with tumor aggression and poor clinical outcome. The effect of high levels of c-Myc on global gene regulation is poorly understood but is widely thought to involve newly activated or repressed "Myc target genes." We report here that in tumor cells expressing high levels of c-Myc the transcription factor accumulates in the promoter regions of active genes and causes transcriptional amplification, producing increased levels of transcripts within the cell's gene expression program. Thus, rather than binding and regulating a new set of genes, c-Myc amplifies the output of the existing gene expression program. These results provide an explanation for the diverse effects of oncogenic c-Myc on gene expression in different tumor cells and suggest that transcriptional amplification reduces rate-limiting constraints for tumor cell growth and proliferation.
Affiliation(s)
- Charles Y Lin
- Whitehead Institute for Biomedical Research, Cambridge Center, MA 02142, USA
|
271
|
Wilkerson PM, Reis-Filho JS. The 11q13-q14 amplicon: Clinicopathological correlations and potential drivers. Genes Chromosomes Cancer 2012; 52:333-55. [DOI: 10.1002/gcc.22037] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 08/26/2012] [Accepted: 11/01/2012] [Indexed: 01/04/2023] Open
|
272
|
Marotta M, Chen X, Inoshita A, Stephens R, Budd GT, Crowe JP, Lyons J, Kondratova A, Tubbs R, Tanaka H. A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications. Breast Cancer Res 2012. [PMID: 23181561 PMCID: PMC4053137 DOI: 10.1186/bcr3362] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022] Open
Abstract
Introduction Segmental duplications (low-copy repeats) are the recently duplicated genomic segments in the human genome that display nearly identical (> 90%) sequences and account for about 5% of euchromatic regions. In the germline, duplicated segments mediate nonallelic homologous recombination and thus cause both non-disease-causing copy-number variants and genomic disorders. To what extent duplicated segments play a role in somatic DNA rearrangements in cancer remains elusive. Duplicated segments often cluster and form genomic blocks enriched with both direct and inverted repeats (complex genomic regions). Such complex regions could be fragile and play a mechanistic role in the amplification of the ERBB2 gene in breast tumors, because repeated sequences are known to initiate gene amplification in model systems. Methods We conducted polymerase chain reaction (PCR)-based assays for primary breast tumors and analyzed publicly available array-comparative genomic hybridization data to map a common copy-number breakpoint in ERBB2-amplified primary breast tumors. We further used molecular, bioinformatics, and population-genetics approaches to define duplication contents, structural variants, and haplotypes within the common breakpoint. Results We found a large (> 300-kb) block of duplicated segments that was colocalized with a common copy-number breakpoint for ERBB2 amplification. The breakpoint that potentially initiated ERBB2 amplification localized in a region 1.5 megabases (Mb) on the telomeric side of ERBB2. The region is very complex, with extensive duplications of KRTAP genes, structural variants, and, as a result, a paucity of single-nucleotide polymorphism (SNP) markers. Duplicated segments vary in size and degree of sequence homology, indicating that duplications have occurred recurrently during genome evolution.
Conclusions Amplification of the ERBB2 gene in breast tumors is potentially initiated by a complex region that has unusual genomic features and thus requires rigorous, labor-intensive investigation. The haplotypes we provide could be useful to identify the potential association between the complex region and ERBB2 amplification.
|
273
|
Abstract
Genomic sequencing has provided critical insights into the etiology of both simple and complex diseases. The enormous reductions in cost for whole genome sequencing have allowed this technology to gain increasing use. Whole genome analysis has impacted research of complex diseases including cancer by allowing the systematic analysis of entire genomes in a single experiment, thereby facilitating the discovery of somatic and germline mutations, and identification of the insertions, deletions, and structural rearrangements, including translocations and inversions, in novel disease genes. Whole-genome sequencing can be used to provide the most comprehensive characterization of the cancer genome, the complexity of which we are only beginning to understand. Hence in this review, we focus on whole-genome sequencing in cancer.
Affiliation(s)
- Musaffe Tuna
- Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
|
274
|
Hussein SMI, Elbaz J, Nagy AA. Genome damage in induced pluripotent stem cells: Assessing the mechanisms and their consequences. Bioessays 2012; 35:152-62. [DOI: 10.1002/bies.201200114] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/21/2022]
|
275
|
MacConaill LE, Van Hummelen P, Meyerson M, Hahn WC. Clinical implementation of comprehensive strategies to characterize cancer genomes: opportunities and challenges. Cancer Discov 2012; 1:297-311. [PMID: 21935500 DOI: 10.1158/2159-8290.cd-11-0110] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
Abstract
An increasing number of anticancer therapeutic agents target specific mutant proteins that are expressed by many different tumor types. Recent evidence suggests that the selection of patients whose tumors harbor specific genetic alterations identifies the subset of patients who are most likely to benefit from the use of such agents. As the number of genetic alterations that provide diagnostic and/or therapeutic information increases, the comprehensive characterization of cancer genomes will be necessary to understand the spectrum of distinct genomic alterations in cancer, to identify patients who are likely to respond to particular therapies, and to facilitate the selection of treatment modalities. Rapid developments in new technologies for genomic analysis now provide the means to perform comprehensive analyses of cancer genomes. In this article, we review the current state of cancer genome analysis and discuss the challenges and opportunities necessary to implement these technologies in a clinical setting.
Affiliation(s)
- Laura E MacConaill
- Center for Cancer Genome Discovery, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts 02215, USA
|
276
|
Zhou JB, Zhang T, Wang BF, Gao HZ, Xu X. Identification of a novel gene fusion RNF213‑SLC26A11 in chronic myeloid leukemia by RNA-Seq. Mol Med Rep 2012; 7:591-7. [PMID: 23151810 DOI: 10.3892/mmr.2012.1183] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 06/29/2012] [Accepted: 10/25/2012] [Indexed: 11/05/2022] Open
Abstract
Chronic myeloid leukemia (CML) was the first hematological malignancy to be associated with a specific genetic lesion. The Philadelphia translocation, producing a BCR‑ABL hybrid oncogene, is the most common mechanism of CML development. However, in the present study, b3a2, b2a2 and e1a2 fusion junctions of the breakpoint cluster region (BCR)-V-abl Abelson murine leukemia viral oncogene homolog 1 (ABL) gene were not detected in patients diagnosed with CML three and four years previously. RNA-Seq technology, with an average coverage of ~30‑fold, was used to detect gene fusion events in a patient with a 6-year history of CML, identified to be in the chronic phase of the disease. Using deFuse and TopHat‑fusion programs with improved filtering methods, we identified two reliable gene fusions in a blood sample obtained from the CML patient, including extremely low expression levels of the classic BCR‑ABL1 gene fusion. In addition, a novel gene fusion involving the ring finger protein 213 (RNF213)-solute carrier family 26, member 11 (SLC26A11) was identified and validated by reverse transcription polymerase chain reaction. Further bioinformatic analysis revealed that specific domains of SLC26A11 were damaged, which may affect the sulfate transport function of the normal gene. The present study demonstrated that, in specific cases, alternative gene fusions, besides BCR‑ABL, may be associated with the development of CML.
Affiliation(s)
- Jian-Bo Zhou
- Department of Clinical Laboratories, Jiang Yin People's Hospital, Jiang Yin, Jiangsu 214400, P.R. China.
|
277
|
Frenkel-Morgenstern M, Gorohovski A, Lacroix V, Rogers M, Ibanez K, Boullosa C, Andres Leon E, Ben-Hur A, Valencia A. ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data. Nucleic Acids Res 2012; 41:D142-51. [PMID: 23143107 PMCID: PMC3531201 DOI: 10.1093/nar/gks1041] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 11/23/2022] Open
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS database of Chimeric Transcripts and RNA-Sequencing data (http://chitars.bioinfo.cnio.es/) collects more than 16 000 chimeric RNAs from humans, mice and fruit flies, 233 chimeras confirmed by RNA-seq reads and ∼2000 cancer breakpoints. The database indicates the expression and tissue specificity of these chimeras, as confirmed by RNA-seq data, and it includes mass spectrometry results for some human entries at their junctions. Moreover, the database has advanced features to analyze junction consistency and to rank chimeras based on the evidence of repeated junction sites. Finally, ‘Junction Search’ screens through the RNA-seq reads found at the chimeras’ junction sites to identify putative junctions in novel sequences entered by users. Thus, ChiTaRS is an extensive catalog of human, mouse and fruit fly chimeras that will extend our understanding of the evolution of chimeric transcripts in eukaryotes and can be advantageous in the analysis of human cancer breakpoints.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
|
278
|
Microhomology directs diverse DNA break repair pathways and chromosomal translocations. PLoS Genet 2012; 8:e1003026. [PMID: 23144625 PMCID: PMC3493447 DOI: 10.1371/journal.pgen.1003026] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Received: 03/26/2012] [Accepted: 08/24/2012] [Indexed: 11/23/2022] Open
Abstract
Chromosomal structural change triggers carcinogenesis and the formation of other genetic diseases. The breakpoint junctions of these rearrangements often contain small overlapping sequences called “microhomology,” yet the genetic pathway(s) responsible have yet to be defined. We report a simple genetic system to detect microhomology-mediated repair (MHMR) events after a DNA double-strand break (DSB) in budding yeast cells. MHMR using >15 bp operates as a single-strand annealing variant, requiring the non-essential DNA polymerase subunit Pol32. MHMR is inhibited by sequence mismatches, but independent of extensive DNA synthesis like break-induced replication. However, MHMR using less than 14 bp is genetically distinct from that using longer microhomology and far less efficient for the repair of distant DSBs. MHMR catalyzes chromosomal translocation almost as efficiently as intra-chromosomal repair. The results suggest that the intrinsic annealing propensity between microhomology sequences efficiently leads to chromosomal rearrangements.
Cancer results from an accumulation of mutations that transform a normal cell into one that proliferates uncontrollably. DNA double-strand breaks (DSBs) can lead to genetic mutations and chromosome rearrangements, underscoring the importance of functional DNA DSB repair pathways in the maintenance of chromosome integrity and tumor suppression. Ample evidence suggests that cells possess multiple DSB repair mechanisms with distinct mutational potentials, and one or more of these pathways is likely responsible for the formation of chromosomal translocations. Importantly, at the junctions of many rearrangements, small (2–20 bp in length) overlapping sequences from each of the original sequences, termed “microhomology,” are found, and they may provide a clue as to how these rearrangements form. Here, we describe our genetic investigation into how flanking microhomology influences the type and frequency of DSB repair. We also show that microhomology-mediated repair (MHMR) efficiently induces chromosomal translocations. This research provides a basic understanding of the mechanisms that utilize microhomology for mutagenic repair.
|
279
|
Holland AJ, Cleveland DW. Chromoanagenesis and cancer: mechanisms and consequences of localized, complex chromosomal rearrangements. Nat Med 2012; 18:1630-8. [PMID: 23135524 DOI: 10.1038/nm.2988] [Citation(s) in RCA: 208] [Impact Index Per Article: 16.0] [Received: 05/15/2012] [Accepted: 10/01/2012] [Indexed: 02/08/2023]
Abstract
Next-generation sequencing of DNA from human tumors or individuals with developmental abnormalities has led to the discovery of a process we term chromoanagenesis, in which large numbers of complex rearrangements occur at one or a few chromosomal loci in a single catastrophic event. Two mechanisms underlie these rearrangements, both of which can be facilitated by a mitotic chromosome segregation error to produce a micronucleus containing the chromosome to undergo rearrangement. In the first, chromosome shattering (chromothripsis) is produced by mitotic entry before completion of DNA replication within the micronucleus, with a failure to disassemble the micronuclear envelope encapsulating the chromosomal fragments for random reassembly in the subsequent interphase. Alternatively, locally defective DNA replication initiates serial, microhomology-mediated template switching (chromoanasynthesis) that produces local rearrangements with altered gene copy numbers. Complex rearrangements are present in a broad spectrum of tumors and in individuals with congenital or developmental defects, highlighting the impact of chromoanagenesis on human disease.
Affiliation(s)
- Andrew J Holland
- Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA.
|
280
|
Drier Y, Lawrence MS, Carter SL, Stewart C, Gabriel SB, Lander ES, Meyerson M, Beroukhim R, Getz G. Somatic rearrangements across cancer reveal classes of samples with distinct patterns of DNA breakage and rearrangement-induced hypermutability. Genome Res 2012; 23:228-35. [PMID: 23124520 PMCID: PMC3561864 DOI: 10.1101/gr.141382.112] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.7] [Indexed: 12/31/2022]
Abstract
Whole-genome sequencing using massively parallel sequencing technologies enables accurate detection of somatic rearrangements in cancer. Pinpointing large numbers of rearrangement breakpoints to base-pair resolution allows analysis of rearrangement microhomology and genomic location for every sample. Here we analyze 95 tumor genome sequences from breast, head and neck, colorectal, and prostate carcinomas, and from melanoma, multiple myeloma, and chronic lymphocytic leukemia. We discover three genomic factors that are significantly correlated with the distribution of rearrangements: replication time, transcription rate, and GC content. The correlation is complex, and different patterns are observed between tumor types, within tumor types, and even between different types of rearrangements. Mutations in the APC gene correlate with and, hence, potentially contribute to DNA breakage in late-replicating, low %GC, untranscribed regions of the genome. We show that somatic rearrangements display less microhomology than germline rearrangements, and that breakpoint loci are correlated with local hypermutability with a particular enrichment for transversions.
Affiliation(s)
- Yotam Drier
- Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel
|
281
|
Kangaspeska S, Hultsch S, Edgren H, Nicorici D, Murumägi A, Kallioniemi O. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms. PLoS One 2012; 7:e48745. [PMID: 23119097 PMCID: PMC3485361 DOI: 10.1371/journal.pone.0048745] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 02/10/2012] [Accepted: 10/01/2012] [Indexed: 11/18/2022] Open
Abstract
RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.
|
282
|
Yasuda T, Suzuki S, Nagasaki M, Miyano S. ChopSticks: High-resolution analysis of homozygous deletions by exploiting concordant read pairs. BMC Bioinformatics 2012; 13:279. [PMID: 23110596 PMCID: PMC3582528 DOI: 10.1186/1471-2105-13-279] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/07/2012] [Accepted: 09/05/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Structural variations (SVs) in genomes are commonly observed even in healthy individuals and play key roles in biological functions. To understand their functional impact or to infer molecular mechanisms of SVs, they have to be characterized with the maximum resolution. However, high-resolution analysis is a difficult task because it requires investigation of the complex structures involved in an enormous number of alignments of next-generation sequencing (NGS) reads and genome sequences that contain errors. RESULTS We propose a new method called ChopSticks that improves the resolution of SV detection for homozygous deletions even when the depth of coverage is low. Conventional methods based on read pairs use only discordant pairs to localize the positions of deletions, where a discordant pair is a read pair whose alignment has an aberrant strand or distance. In contrast, our method exploits concordant reads as well. We theoretically proved that when the depth of coverage approaches zero or infinity, the expected resolution of our method is asymptotically equal to that of methods based only on discordant pairs under double coverage. To confirm the effectiveness of ChopSticks, we conducted computational experiments against both simulated NGS reads and real NGS sequences. The resolution of deletion calls by other methods was significantly improved, thus demonstrating the usefulness of ChopSticks. CONCLUSIONS ChopSticks can generate high-resolution deletion calls of homozygous deletions using information independent of other methods, and it is therefore useful to examine the functional impact of SVs or to infer SV generation mechanisms.
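The abstract above defines a discordant pair as a read pair whose alignment has an aberrant strand or distance. The following minimal Python sketch illustrates only that definition; it is not the ChopSticks algorithm. The `ReadPair` type, the `is_discordant` function, the expected forward/reverse orientation, and the three-standard-deviation insert-size cutoff are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical minimal record of one aligned read pair."""
    left_pos: int       # leftmost aligned coordinate of the first read
    right_pos: int      # leftmost aligned coordinate of the second read
    left_strand: str    # "+" or "-"
    right_strand: str   # "+" or "-"


def is_discordant(pair, mean_insert, sd_insert, n_sd=3.0):
    """Return True if the pair has an aberrant strand or distance.

    Assumptions (not from the source): a concordant pair is oriented
    forward/reverse ("+", "-"), and a distance is aberrant when it lies
    more than n_sd standard deviations from the mean insert size.
    """
    span = pair.right_pos - pair.left_pos
    aberrant_strand = (pair.left_strand, pair.right_strand) != ("+", "-")
    aberrant_distance = abs(span - mean_insert) > n_sd * sd_insert
    return aberrant_strand or aberrant_distance


pairs = [
    ReadPair(100, 400, "+", "-"),   # expected orientation and distance
    ReadPair(100, 2400, "+", "-"),  # far too distant: consistent with a deletion
    ReadPair(100, 400, "+", "+"),   # same-strand orientation
]
flags = [is_discordant(p, mean_insert=300, sd_insert=30) for p in pairs]
```

Under these assumptions only the first pair is concordant. Conventional methods, as the abstract notes, would localize a deletion using the discordant pairs alone, whereas ChopSticks additionally exploits the concordant ones.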
Affiliation(s)
- Tomohiro Yasuda
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.
|
283
|
COPS: a sensitive and accurate tool for detecting somatic Copy Number Alterations using short-read sequence data from paired samples. PLoS One 2012; 7:e47812. [PMID: 23110103 PMCID: PMC3478291 DOI: 10.1371/journal.pone.0047812] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 05/07/2012] [Accepted: 09/18/2012] [Indexed: 01/05/2023] Open
Abstract
Copy Number Alterations (CNAs), such as deletions and duplications, compose a larger percentage of genetic variations than single nucleotide polymorphisms or other structural variations in cancer genomes that undergo major chromosomal re-arrangements. It is, therefore, imperative to identify cancer-specific somatic copy number alterations (SCNAs), with respect to matched normal tissue, in order to understand their association with the disease. We have devised an accurate, sensitive, and easy-to-use tool, COPS, COpy number using Paired Samples, for detecting SCNAs. We rigorously tested the performance of COPS using short sequence simulated reads at various sizes and coverage of SCNAs, read depths, read lengths and also with real tumor:normal paired samples. We found COPS to perform better in comparison to other known SCNA detection tools for all evaluated parameters, namely, sensitivity (detection of true positives), specificity (detection of false positives) and size accuracy. COPS performed well for sequencing reads of all lengths when used with most upstream read alignment tools. Additionally, by incorporating a downstream boundary segmentation detection tool, the accuracy of SCNA boundaries was further improved. Here, we report an accurate, sensitive and easy-to-use tool for detecting cancer-specific SCNAs using short-read sequence data. In addition to cancer, COPS can be used for any disease as long as sequence reads from both disease and normal samples from the same individual are available. An added boundary segmentation detection module makes COPS-detected SCNA boundaries more specific for the samples studied. COPS is available at ftp://115.119.160.213 with username “cops” and password “cops”.
|
284
|
Kalyana-Sundaram S, Shanmugam A, Chinnaiyan AM. Gene Fusion Markup Language: a prototype for exchanging gene fusion data. BMC Bioinformatics 2012; 13:269. [PMID: 23072312 PMCID: PMC3607969 DOI: 10.1186/1471-2105-13-269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/21/2011] [Accepted: 10/11/2012] [Indexed: 12/26/2022] Open
Abstract
Background An avalanche of next generation sequencing (NGS) studies has generated an unprecedented amount of genomic structural variation data. These studies have also identified many novel gene fusion candidates with more detailed resolution than previously achieved. However, in the excitement and necessity of publishing the observations from this recently developed cutting-edge technology, no community standardization approach has arisen to organize and represent the data with the essential attributes in an interchangeable manner. As transcriptome studies have been widely used for gene fusion discoveries, the current non-standard mode of data representation could potentially impede data accessibility, critical analyses, and further discoveries in the near future. Results Here we propose a prototype, Gene Fusion Markup Language (GFML) as an initiative to provide a standard format for organizing and representing the significant features of gene fusion data. GFML will offer the advantage of representing the data in a machine-readable format to enable data exchange, automated analysis interpretation, and independent verification. As this database-independent exchange initiative evolves it will further facilitate the formation of related databases, repositories, and analysis tools. The GFML prototype is made available at http://code.google.com/p/gfml-prototype/. Conclusion The Gene Fusion Markup Language (GFML) presented here could facilitate the development of a standard format for organizing, integrating and representing the significant features of gene fusion data in an inter-operable and query-able fashion that will enable biologically intuitive access to gene fusion findings and expedite functional characterization. A similar model is envisaged for other NGS data analyses.
Affiliation(s)
- Shanker Kalyana-Sundaram
- Michigan Center for Translational Pathology, Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
|
285
|
Marschall T, Costa IG, Canzar S, Bauer M, Klau GW, Schliep A, Schönhuth A. CLEVER: clique-enumerating variant finder. Bioinformatics 2012; 28:2875-82. [DOI: 10.1093/bioinformatics/bts566] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Indexed: 11/12/2022] Open
|
286
|
DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat Methods 2012; 9:1107-12. [PMID: 23042453 DOI: 10.1038/nmeth.2206] [Citation(s) in RCA: 138] [Impact Index Per Article: 10.6] [Received: 04/24/2012] [Accepted: 09/04/2012] [Indexed: 01/18/2023]
Abstract
DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it possible to map SCEs at orders-of-magnitude greater resolution than was previously possible. On average, murine embryonic stem (mES) cells exhibit eight SCEs, which are detected at a resolution of up to 23 bp. Strikingly, Strand-seq of 62 single mES cells predicts that the mm9 mouse reference genome assembly contains at least 17 incorrectly oriented segments totaling nearly 1% of the genome. These misoriented contigs and fragments have persisted through several iterations of the mouse reference genome and have been difficult to detect using conventional sequencing techniques. The ability to map SCE events at high resolution and fine-tune reference genomes by Strand-seq dramatically expands the scope of single-cell sequencing.
|
287
|
Fromer M, Moran J, Chambert K, Banks E, Bergen S, Ruderfer D, Handsaker R, McCarroll S, O’Donovan M, Owen M, Kirov G, Sullivan P, Hultman C, Sklar P, Purcell S. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet 2012; 91:597-607. [PMID: 23040492 DOI: 10.1016/j.ajhg.2012.08.005] [Citation(s) in RCA: 459] [Impact Index Per Article: 35.3] [Received: 04/12/2012] [Revised: 06/23/2012] [Accepted: 08/09/2012] [Indexed: 12/20/2022] Open
Abstract
Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene.
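The HMM step described in this abstract can be illustrated with a minimal sketch. It assumes the read depth has already been normalized to per-exon z-scores (the PCA normalization is omitted), and every parameter here (emission means, `p_start`, `p_stay`) is a hypothetical value chosen for illustration, not one taken from XHMM.

```python
import math

# Three hidden copy-number states; emission means are illustrative assumptions.
STATES = ("DEL", "DIP", "DUP")
MEANS = {"DEL": -3.0, "DIP": 0.0, "DUP": 3.0}


def log_normal_pdf(x, mean, sd=1.0):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def transition_logp(src, dst, p_start, p_stay):
    """Log transition probability; each row of the matrix sums to 1."""
    if src == "DIP":
        p = 1 - 2 * p_start if dst == "DIP" else p_start
    elif dst == src:
        p = p_stay
    elif dst == "DIP":
        p = 1 - p_stay - p_start
    else:
        p = p_start  # direct DEL <-> DUP switch is rare
    return math.log(p)


def viterbi_cnv(zscores, p_start=1e-4, p_stay=0.9):
    """Return the most likely state path for a list of per-exon depth z-scores."""
    if not zscores:
        return []
    score = {}
    for s in STATES:
        p_init = 1 - 2 * p_start if s == "DIP" else p_start
        score[s] = math.log(p_init) + log_normal_pdf(zscores[0], MEANS[s])
    backpointers = []
    for z in zscores[1:]:
        new_score, pointers = {}, {}
        for dst in STATES:
            # Best predecessor state for arriving in dst at this exon.
            best = max(
                STATES,
                key=lambda src: score[src] + transition_logp(src, dst, p_start, p_stay),
            )
            pointers[dst] = best
            new_score[dst] = (
                score[best]
                + transition_logp(best, dst, p_start, p_stay)
                + log_normal_pdf(z, MEANS[dst])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi_cnv([0.1, -0.2, -3.1, -2.8, -3.3, -3.0, 0.0, 0.2]))
```

The small `p_start` makes the model reluctant to call a CNV from a single noisy exon, while a run of consistently low z-scores outweighs the entry cost.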
|
288
|
Yao F, Ariyaratne PN, Hillmer AM, Lee WH, Li G, Teo ASM, Woo XY, Zhang Z, Chen JP, Poh WT, Zawack KFB, Chan CS, Leong ST, Neo SC, Choi PSD, Gao S, Nagarajan N, Thoreau H, Shahab A, Ruan X, Cacheux-Rataboul V, Wei CL, Bourque G, Sung WK, Liu ET, Ruan Y. Long span DNA paired-end-tag (DNA-PET) sequencing strategy for the interrogation of genomic structural mutations and fusion-point-guided reconstruction of amplicons. PLoS One 2012; 7:e46152. [PMID: 23029419 PMCID: PMC3461012 DOI: 10.1371/journal.pone.0046152] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/04/2012] [Accepted: 08/28/2012] [Indexed: 01/23/2023] Open
Abstract
Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10–20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.
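The claim about physical genome coverage at equal sequencing cost follows from simple arithmetic: with the same number of sequenced pairs, the span covered scales with insert size. The sketch below assumes the usual definition of physical (span) coverage as pairs times insert size divided by genome size. The pair count and genome size are hypothetical round numbers, not figures from the study.

```python
def physical_coverage(n_pairs, insert_size_bp, genome_size_bp):
    """Physical (span) coverage: total bases spanned by inserts per genome base."""
    return n_pairs * insert_size_bp / genome_size_bp


if __name__ == "__main__":
    n_pairs = 30_000_000            # hypothetical number of paired-end tags
    genome_size = 3_000_000_000     # approximate haploid human genome size
    for insert_size in (1_000, 10_000):
        cov = physical_coverage(n_pairs, insert_size, genome_size)
        print(f"{insert_size} bp inserts: {cov:.0f}x physical coverage")
```

Under these assumptions a tenfold larger insert yields tenfold higher physical coverage for the same number of tags, which is why more pairs span any given breakpoint.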
Affiliation(s)
- Fei Yao
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Department of Epidemiology and Public Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Pramila N. Ariyaratne
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Axel M. Hillmer
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Wah Heng Lee
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Guoliang Li
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Audrey S. M. Teo
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Xing Yi Woo
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Zhenshui Zhang
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Jieqi P. Chen
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Wan Ting Poh
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Kelson F. B. Zawack
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Chee Seng Chan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- See Ting Leong
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Say Chuan Neo
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Poh Sum D. Choi
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Song Gao
- Graduate School for Integrative Sciences and Engineering, Centre for Life Sciences, National University of Singapore, Singapore, Singapore
- Niranjan Nagarajan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Hervé Thoreau
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Atif Shahab
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Xiaoan Ruan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Valère Cacheux-Rataboul
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Chia-Lin Wei
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Guillaume Bourque
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Wing-Kin Sung
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Edison T. Liu
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Yijun Ruan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- * E-mail:
|
289
|
Arlt MF, Rajendran S, Birkeland SR, Wilson TE, Glover TW. De novo CNV formation in mouse embryonic stem cells occurs in the absence of Xrcc4-dependent nonhomologous end joining. PLoS Genet 2012; 8:e1002981. [PMID: 23028374 PMCID: PMC3447954 DOI: 10.1371/journal.pgen.1002981] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 05/31/2012] [Accepted: 08/01/2012] [Indexed: 11/20/2022] Open
Abstract
Spontaneous copy number variant (CNV) mutations are an important factor in genomic structural variation, genomic disorders, and cancer. A major class of CNVs, termed nonrecurrent CNVs, is thought to arise by nonhomologous DNA repair mechanisms due to the presence of short microhomologies, blunt ends, or short insertions at junctions of normal and de novo pathogenic CNVs, features recapitulated in experimental systems in which CNVs are induced by exogenous replication stress. To test whether the canonical nonhomologous end joining (NHEJ) pathway of double-strand break (DSB) repair is involved in the formation of this class of CNVs, chromosome integrity was monitored in NHEJ–deficient Xrcc4−/− mouse embryonic stem (ES) cells following treatment with low doses of aphidicolin, a DNA replicative polymerase inhibitor. Mouse ES cells exhibited replication stress-induced CNV formation in the same manner as human fibroblasts, including the existence of syntenic hotspot regions, such as in the Auts2 and Wwox loci. The frequency and location of spontaneous and aphidicolin-induced CNV formation were not altered by loss of Xrcc4, as would be expected if canonical NHEJ were the predominant pathway of CNV formation. Moreover, de novo CNV junctions displayed a typical pattern of microhomology and blunt end use that did not change in the absence of Xrcc4. A number of complex CNVs were detected in both wild-type and Xrcc4−/− cells, including an example of a catastrophic, chromothripsis event. These results establish that nonrecurrent CNVs can be, and frequently are, formed by mechanisms other than Xrcc4-dependent NHEJ.
Copy number variants (CNVs) are a major factor in genetic variation and are a common and important class of mutation in genomic disorders, yet there is limited understanding of how many CNVs arise and the risk factors involved. One DNA damage response pathway implicated in CNV formation is nonhomologous end joining (NHEJ), which repairs broken DNA ends by Xrcc4-dependent direct ligation. We examined the effects of loss of Xrcc4 and NHEJ on CNV formation following replication stress in mouse cells. Cells lacking NHEJ displayed unaltered CNV frequencies, locations, and breakpoint structures compared to normal cells. These results establish that CNV mutations in a cell model system, and likely in vivo, arise by a mutagenic mechanism other than canonical NHEJ, a pattern similar to that reported for model translocation events. Potential roles of alternative end joining and template switching are discussed.
Affiliation(s)
- Martin F. Arlt
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Sountharia Rajendran
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Shanda R. Birkeland
- Department of Pathology, University of Michigan, Ann Arbor, Michigan, United States of America
- Thomas E. Wilson
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Pathology, University of Michigan, Ann Arbor, Michigan, United States of America
- * E-mail: (TEW); (TWG)
- Thomas W. Glover
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- * E-mail: (TEW); (TWG)
|
290
|
Rausch T, Zichner T, Schlattl A, Stütz AM, Benes V, Korbel JO. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 2012; 28:i333-i339. [PMID: 22962449 PMCID: PMC3436805 DOI: 10.1093/bioinformatics/bts378] [Citation(s) in RCA: 1551] [Impact Index Per Article: 119.3] [Indexed: 12/22/2022] Open
Abstract
MOTIVATION: The discovery of genomic structural variants (SVs) at high sensitivity and specificity is an essential requirement for characterizing naturally occurring variation and for understanding pathological somatic rearrangements in personal genome sequencing data. Of particular interest are integrated methods that accurately identify simple and complex rearrangements in heterogeneous sequencing datasets at single-nucleotide resolution, as an optimal basis for investigating the formation mechanisms and functional consequences of SVs. RESULTS: We have developed an SV discovery method, called DELLY, that integrates short insert paired-ends, long-range mate-pairs and split-read alignments to accurately delineate genomic rearrangements at single-nucleotide resolution. DELLY is suitable for detecting copy-number variable deletion and tandem duplication events as well as balanced rearrangements such as inversions or reciprocal translocations. DELLY thus makes it possible to ascertain the full spectrum of genomic rearrangements, including complex events. On simulated data, DELLY compares favorably to other SV prediction methods across a wide range of sequencing parameters. On real data, DELLY reliably uncovers SVs from the 1000 Genomes Project and cancer genomes, and validation experiments of randomly selected deletion loci show a high specificity. AVAILABILITY: DELLY is available at www.korbel.embl.de/software.html. CONTACT: tobias.rausch@embl.de.
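The paired-end half of such an approach rests on a generic signal: read pairs that map much farther apart than the library's insert size suggest a deletion between them. The sketch below illustrates only that signal and is not DELLY's implementation. The function name, the MAD-based cutoff, and the overlap clustering are assumptions made for illustration, and the split-read refinement to single-nucleotide breakpoints is omitted.

```python
from statistics import median


def deletion_candidates(pairs, cutoff_mads=5.0, min_support=2):
    """Cluster abnormally long read pairs into candidate deletion intervals.

    pairs: list of (start, end) outer mapping coordinates on one chromosome.
    Returns a list of (region_start, region_end, n_supporting_pairs).
    """
    sizes = [end - start for start, end in pairs]
    med = median(sizes)
    # Median absolute deviation as a robust spread estimate (fallback avoids 0).
    mad = median([abs(s - med) for s in sizes]) or 1.0
    threshold = med + cutoff_mads * mad
    discordant = sorted(p for p in pairs if p[1] - p[0] > threshold)

    clusters = []  # each cluster: [shared_start, shared_end, support]
    for start, end in discordant:
        if clusters and start < clusters[-1][1]:
            # Pair overlaps the region shared by the current cluster: narrow it.
            current = clusters[-1]
            current[0] = max(current[0], start)
            current[1] = min(current[1], end)
            current[2] += 1
        else:
            clusters.append([start, end, 1])
    return [tuple(c) for c in clusters if c[2] >= min_support]


if __name__ == "__main__":
    concordant = [(100, 400), (500, 810), (900, 1195), (1500, 1800),
                  (2000, 2305), (2600, 2900), (5000, 5298), (6000, 6302)]
    long_pairs = [(3000, 4300), (3050, 4350), (3100, 4400), (9000, 10500)]
    print(deletion_candidates(concordant + long_pairs))
```

Requiring at least two supporting pairs discards the isolated long pair, which is more likely a mapping artifact than a real event.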
Affiliation(s)
- Tobias Rausch
- European Molecular Biology Laboratory, Genome Biology, Meyerhofstr. 1, 69117 Heidelberg, Germany.
|
291
|
Nickles D, Madireddy L, Yang S, Khankhanian P, Lincoln S, Hauser SL, Oksenberg JR, Baranzini SE. In depth comparison of an individual's DNA and its lymphoblastoid cell line using whole genome sequencing. BMC Genomics 2012; 13:477. [PMID: 22974163 PMCID: PMC3473256 DOI: 10.1186/1471-2164-13-477] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 12/20/2011] [Accepted: 09/05/2012] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: A detailed analysis of whole genomes can now be achieved with next generation sequencing. Epstein Barr Virus (EBV) transformation is a widely used strategy in clinical research to obtain an unlimited source of a subject's DNA. Although the mechanism of transformation and immortalization by EBV is relatively well known at the transcriptional and proteomic level, the genetic consequences of EBV transformation are less well understood. A detailed analysis of the genetic alterations introduced by EBV transformation is highly relevant, as it will inform on the usefulness and limitations of this approach. RESULTS: We used whole genome sequencing to assess the genomic signature of a low-passage lymphoblastoid cell line (LCL). Specifically, we sequenced the full genome (40X) of an individual using DNA purified from fresh whole blood as well as DNA from his LCL. A total of 217.33 Gb of sequence were generated from the cell line and 238.95 Gb from the normal genomic DNA. We determined with high confidence that 99.2% of the genomes were identical, with no reproducible changes in structural variation (chromosomal rearrangements and copy number variations) or insertion/deletion polymorphisms (indels). CONCLUSIONS: Our results suggest that, at this level of resolution, the LCL is genetically indistinguishable from its genomic counterpart and therefore its use in clinical research is not likely to introduce a significant bias.
Affiliation(s)
- Dorothee Nickles
- Department of Neurology, University of California San Francisco, San Francisco, CA 94143-0435, USA
|
292
|
Moorthie S, Hall A, Wright CF. Informatics and clinical genome sequencing: opening the black box. Genet Med 2012; 15:165-71. [PMID: 22975759 DOI: 10.1038/gim.2012.116] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023] Open
Abstract
Adoption of whole-genome sequencing as a routine biomedical tool is dependent not only on the availability of new high-throughput sequencing technologies, but also on the concomitant development of methods and tools for data collection, analysis, and interpretation. It would also be enormously facilitated by the development of decision support systems for clinicians and consideration of how such information can best be incorporated into care pathways. Here we present an overview of the data analysis and interpretation pipeline, the wider informatics needs, and some of the relevant ethical and legal issues.
|
293
|
Abstract
Cancer initiation, progression, and the emergence of therapeutic resistance are evolutionary phenomena of clonal somatic cell populations. Studies in microbial experimental evolution and the theoretical work inspired by such studies are yielding deep insights into the evolutionary dynamics of clonal populations, yet there has been little explicit consideration of the relevance of this rapidly growing field to cancer biology. Here, we examine how the understanding of mutation, selection, and spatial structure in clonal populations that is emerging from experimental evolution may be applicable to cancer. Along the way, we discuss some significant ways in which cancer differs from the model systems used in experimental evolution. Despite these differences, we argue that enhanced prediction and control of cancer may be possible using ideas developed in the context of experimental evolution, and we point out some prospects for future research at the interface between these traditionally separate areas.
Affiliation(s)
- Kathleen Sprouffske
- Institute for Evolutionary Biology and Environmental Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Lauren M.F. Merlo
- Lankenau Institute for Medical Research, 100 Lancaster Ave., Wynnewood, PA 19096, USA
- Philip J. Gerrish
- Department of Biology, University of New Mexico, Albuquerque, NM 87131-0001, USA; Centro de Matemática e Aplicações Fundamentais, Department of Mathematics, University of Lisbon, 1649-003 Lisbon, Portugal
- Carlo C. Maley
- Center for Evolution and Cancer, Helen Diller Family Comprehensive Cancer Center, Department of Surgery, University of California, 2340 Sutter Street, PO Box 1351, San Francisco, CA 94115, USA
- Paul D. Sniegowski
- Department of Biology, University of Pennsylvania, 415 S. University Avenue, Philadelphia, PA 19104-6018, USA
|
294
|
Abstract
Massively parallel approaches to nucleic acid sequencing have matured from proof-of-concept to commercial products during the past 5 years. These technologies are now widely accessible, increasingly affordable, and have already exerted a transformative influence on the study of human cancer. Here, we review new features of cancer genomes that are being revealed by large-scale applications of these technologies. We focus on those insights most likely to affect future clinical practice. Foremost among these lessons, we summarize the formidable genetic heterogeneity within given cancer types that is appreciable with higher resolution profiling and larger sample sets. We discuss the inherent challenges of defining driving genomic events in a given cancer genome amidst thousands of other somatic events. Finally, we explore the organizational, regulatory and societal challenges impeding precision cancer medicine based on genomic profiling from assuming its place as standard-of-care.
|
295
|
Lane AB, Clarke DJ. Genome instability: does genetic diversity amplification drive tumorigenesis? Bioessays 2012; 34:963-72. [PMID: 22948965 DOI: 10.1002/bies.201200082] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Abstract
Recent data show that catastrophic events during one cell cycle can cause massive genome damage producing viable clones with unstable genomes. This is in contrast with the traditional view that tumorigenesis requires a long-term process in which mutations gradually accumulate over decades. These sudden events are likely to result in a large increase in genomic diversity within a relatively short time, providing the opportunity for selective advantages to be gained by a subset of cells within a population. This genetic diversity amplification, arising from a single aberrant cell cycle, may drive a population conversion from benign to malignant. However, there is likely a period of relative genome stability during the clonal expansion of tumors - this may provide an opportunity for therapeutic intervention, especially if mechanisms that limit tolerance of aneuploidy are exploited.
Affiliation(s)
- Andrew B Lane
- Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, MN, USA
|
296
|
Robbiani DF, Nussenzweig MC. Chromosome translocation, B cell lymphoma, and activation-induced cytidine deaminase. Annu Rev Pathol 2012; 8:79-103. [PMID: 22974238 DOI: 10.1146/annurev-pathol-020712-164004] [Citation(s) in RCA: 138] [Impact Index Per Article: 10.6] [Indexed: 11/09/2022]
Abstract
Studies of B cell lymphomas in the early 1980s led to the cloning of genes (c-MYC and IGH) at a chromosome translocation breakpoint. A rush followed to identify recurrently translocated genes in all types of cancer, which led to remarkable advances in our understanding of cancer genetics. B lymphocyte tumors commonly bear chromosome translocations to immunoglobulin genes, which points to a role for antibody gene diversification processes in tumorigenesis. The discovery of activation-induced cytidine deaminase (AID) and the use of murine models to study translocation have led to a new understanding of how these events contribute to the genesis of lymphomas. Here, we review these advances with a focus on AID and insights gained from the study of translocations in primary cells.
Affiliation(s)
- Davide F Robbiani
- Laboratory of Molecular Immunology and Howard Hughes Medical Institute, The Rockefeller University, New York, NY 10065, USA.
|
297
|
Meyer CA, Tang Q, Liu XS. Minireview: applications of next-generation sequencing on studies of nuclear receptor regulation and function. Mol Endocrinol 2012; 26:1651-9. [PMID: 22930692 DOI: 10.1210/me.2012-1150] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023] Open
Abstract
Next-generation sequencing technologies have expanded the experimental possibilities for studying the genome-wide regulation of transcription by nuclear receptors, their collaborating transcription factors, and coregulators. These technologies allow investigators to obtain abundance and DNA sequence information in a single experiment. In this review, we highlight proven and potential uses of next-generation sequencing in the study of gene regulation by nuclear receptors. We also provide suggestions on how to effectively leverage this technology in a collaborative environment.
Affiliation(s)
- Clifford A Meyer
- Department of Biostatistics and Computational Biology, Harvard School of Public Health, Biostatistics and Computational Biology, 450 Brookline Street, Boston, Massachusetts 02215, USA
|
298
|
Daniels M, Goh F, Wright CM, Sriram KB, Relan V, Clarke BE, Duhig EE, Bowman RV, Yang IA, Fong KM. Whole genome sequencing for lung cancer. J Thorac Dis 2012; 4:155-63. [PMID: 22833821 DOI: 10.3978/j.issn.2072-1439.2012.02.01] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 01/02/2012] [Accepted: 02/01/2012] [Indexed: 02/06/2023]
Abstract
Lung cancer is a leading cause of cancer related morbidity and mortality globally, and carries a dismal prognosis. Improved understanding of the biology of cancer is required to improve patient outcomes. Next-generation sequencing (NGS) is a powerful tool for whole genome characterisation, enabling comprehensive examination of somatic mutations that drive oncogenesis. Most NGS methods are based on polymerase chain reaction (PCR) amplification of platform-specific DNA fragment libraries, which are then sequenced. These techniques are well suited to high-throughput sequencing and are able to detect the full spectrum of genomic changes present in cancer. However, they require considerable investments in time, laboratory infrastructure, computational analysis and bioinformatic support. Next-generation sequencing has been applied to studies of the whole genome, exome, transcriptome and epigenome, and is changing the paradigm of lung cancer research and patient care. The results of this new technology will transform current knowledge of oncogenic pathways and provide molecular targets of use in the diagnosis and treatment of cancer. Somatic mutations in lung cancer have already been identified by NGS, and large scale genomic studies are underway. Personalised treatment strategies will improve care for those likely to benefit from available therapies, while sparing others the expense and morbidity of futile intervention. Organisational, computational and bioinformatic challenges of NGS are driving technological advances as well as raising ethical issues relating to informed consent and data release. Differentiation between driver and passenger mutations requires careful interpretation of sequencing data. Challenges in the interpretation of results arise from the types of specimens used for DNA extraction, sample processing techniques and tumour content. Tumour heterogeneity can reduce power to detect mutations implicated in oncogenesis. 
Next-generation sequencing will facilitate investigation of the biological and clinical implications of such variation. These techniques can now be applied to single cells and free circulating DNA, and possibly in the future to DNA obtained from body fluids and from subpopulations of tumour. As costs reduce, and speed and processing accuracy increase, NGS technology will become increasingly accessible to researchers and clinicians, with the ultimate goal of improving the care of patients with lung cancer.
|
299
|
Monitoring chronic lymphocytic leukemia progression by whole genome sequencing reveals heterogeneous clonal evolution patterns. Blood 2012; 120:4191-6. [PMID: 22915640 DOI: 10.1182/blood-2012-05-433540] [Citation(s) in RCA: 262] [Impact Index Per Article: 20.2] [Indexed: 11/20/2022] Open
Abstract
Chronic lymphocytic leukemia is characterized by relapse after treatment and chemotherapy resistance. As in other malignancies, leukemia cells accumulate mutations during growth, forming heterogeneous cell populations that are subject to Darwinian selection and may respond differentially to treatment. There is therefore a clinical need to monitor changes in the subclonal composition of cancers during disease progression. Here, we use whole-genome sequencing to track subclonal heterogeneity in 3 chronic lymphocytic leukemia patients subjected to repeated cycles of therapy. We reveal different somatic mutation profiles in each patient and use these to establish probable hierarchical patterns of subclonal evolution, to identify subclones that decline or expand over time, and to detect founder mutations. We show that clonal evolution patterns are heterogeneous in individual patients. We conclude that genome sequencing is a powerful and sensitive approach to monitor disease progression repeatedly at the molecular level. If applied to future clinical trials, this approach might eventually influence treatment strategies as a tool to individualize and direct cancer treatment.
|
300
|
Ramsay AJ, Martínez-Trillos A, Jares P, Rodríguez D, Kwarciak A, Quesada V. Next-generation sequencing reveals the secrets of the chronic lymphocytic leukemia genome. Clin Transl Oncol 2012; 15:3-8. [PMID: 22911550 DOI: 10.1007/s12094-012-0922-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 06/29/2012] [Accepted: 07/23/2012] [Indexed: 02/06/2023]
Abstract
The study of the detailed molecular history of cancer development is one of the most promising techniques to understand and fight this diverse and prevalent disease. Unfortunately, this history is as diverse as cancer itself. Therefore, even with next-generation sequencing techniques, it is not easy to distinguish significant (driver) from random (passenger) events. The International Cancer Genome Consortium (ICGC) was formed to solve this fundamental issue by coordinating the sequencing of samples from 50 different cancer types and/or sub-types that are of clinical and societal importance. The contribution of Spain in this consortium has been focused on chronic lymphocytic leukemia (CLL). This approach has unveiled new and unexpected events in the development of CLL. In this review, we introduce the approaches utilized by the consortium for the study of the CLL genome and discuss the recent results and future perspectives of this work.
Affiliation(s)
- Andrew J Ramsay
- Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología, Universidad de Oviedo, Oviedo, Spain
|