551
LaFramboise T. Single nucleotide polymorphism arrays: a decade of biological, computational and technological advances. Nucleic Acids Res 2009; 37:4181-93. [PMID: 19570852 PMCID: PMC2715261 DOI: 10.1093/nar/gkp552] [Citation(s) in RCA: 255] [Impact Index Per Article: 15.9] [Indexed: 12/16/2022] Open
Abstract
Array manufacturers originally designed single nucleotide polymorphism (SNP) arrays to genotype human DNA at thousands of SNPs across the genome simultaneously. In the decade since their initial development, the platform's applications have expanded to include the detection and characterization of copy number variation—whether somatic, inherited, or de novo—as well as loss-of-heterozygosity in cancer cells. The technology's impressive contributions to insights in population and molecular genetics have been fueled by advances in computational methodology, and indeed these insights and methodologies have spurred developments in the arrays themselves. This review describes the most commonly used SNP array platforms, surveys the computational methodologies used to convert the raw data into inferences at the DNA level, and details the broad range of applications. Although the long-term future of SNP arrays is unclear, cost considerations ensure their relevance for at least the next several years. Even as emerging technologies seem poised to take over for at least some applications, researchers working with these new sources of data are adopting the computational approaches originally developed for SNP arrays.
Affiliation(s)
- Thomas LaFramboise
- Department of Genetics, Case Western Reserve University, Cleveland, OH 44106, USA
552
Shah SP, Köbel M, Senz J, Morin RD, Clarke BA, Wiegand KC, Leung G, Zayed A, Mehl E, Kalloger SE, Sun M, Giuliany R, Yorida E, Jones S, Varhol R, Swenerton KD, Miller D, Clement PB, Crane C, Madore J, Provencher D, Leung P, DeFazio A, Khattra J, Turashvili G, Zhao Y, Zeng T, Glover JNM, Vanderhyden B, Zhao C, Parkinson CA, Jimenez-Linan M, Bowtell DDL, Mes-Masson AM, Brenton JD, Aparicio SA, Boyd N, Hirst M, Gilks CB, Marra M, Huntsman DG. Mutation of FOXL2 in granulosa-cell tumors of the ovary. N Engl J Med 2009; 360:2719-29. [PMID: 19516027 DOI: 10.1056/nejmoa0902542] [Citation(s) in RCA: 558] [Impact Index Per Article: 34.9] [Indexed: 11/19/2022]
Abstract
BACKGROUND: Granulosa-cell tumors (GCTs) are the most common type of malignant ovarian sex cord-stromal tumor (SCST). The pathogenesis of these tumors is unknown. Moreover, their histopathological diagnosis can be challenging, and there is no curative treatment beyond surgery. METHODS: We analyzed four adult-type GCTs using whole-transcriptome paired-end RNA sequencing. We identified putative GCT-specific mutations that were present in at least three of these samples but were absent from the transcriptomes of 11 epithelial ovarian tumors, published human genomes, and databases of single-nucleotide polymorphisms. We confirmed these variants by direct sequencing of complementary DNA and genomic DNA. We then analyzed additional tumors and matched normal genomic DNA, using a combination of direct sequencing, analyses of restriction-fragment-length polymorphisms, and TaqMan assays. RESULTS: All four index GCTs had a missense point mutation, 402C→G (C134W), in FOXL2, a gene encoding a transcription factor known to be critical for granulosa-cell development. The FOXL2 mutation was present in 86 of 89 additional adult-type GCTs (97%), in 3 of 14 thecomas (21%), and in 1 of 10 juvenile-type GCTs (10%). The mutation was absent in 49 SCSTs of other types and in 329 unrelated ovarian or breast tumors. CONCLUSIONS: Whole-transcriptome sequencing of four GCTs identified a single, recurrent somatic mutation (402C→G) in FOXL2 that was present in almost all morphologically identified adult-type GCTs. Mutant FOXL2 is a potential driver in the pathogenesis of adult-type GCTs.
Affiliation(s)
- Sohrab P Shah
- Centre for Translational and Applied Genomics, British Columbia Cancer Agency, Vancouver, Canada
553
Lin CH, Lin YC, Wu JY, Pan WH, Chen YT, Fann CSJ. A genome-wide survey of copy number variations in Han Chinese residing in Taiwan. Genomics 2009; 94:241-6. [PMID: 19559783 DOI: 10.1016/j.ygeno.2009.06.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 01/23/2009] [Revised: 06/16/2009] [Accepted: 06/16/2009] [Indexed: 11/29/2022]
Abstract
Copy number variation (CNV) is a form of DNA sequence variation in the human genome. CNVs can affect expression of nearby and distant genes, and some of them might cause certain phenotypic differences. CNVs vary slightly in location and frequency among different populations. Because currently available CNV information for Asian populations was limited to a few small-scale studies with only dozens of subjects, a high-resolution CNV survey was conducted using a large number of Han Chinese in this study. The Illumina HumanHap550K single-nucleotide polymorphism array was used to identify CNVs from 813 unrelated Han Chinese residing in Taiwan. A total of 365 CNV regions were identified in this population, with an average size of 235 kb (covering a total of 2.86% of the human genome); 67 (18.4%) were newly discovered CNV regions. Two hundred and seventy-nine CNV regions (76%) were verified from 304 randomly selected samples by Affymetrix 500K GeneChip and qPCR experiments. These regions contain 1029 genes, some of which are associated with diseases. Consistent with previous studies, most CNVs were rare structural variations in the human genome, and only 64 regions (17.5%) had a CNV allele frequency greater than 1%. Our discovery of 67 new CNV regions indicates that previous CNV coverage of the human genome is incomplete and that there is diversity among different ethnic populations. Comprehensive knowledge of CNVs in the human genome is very important and useful in further genetic studies.
Affiliation(s)
- Chien-Hsing Lin
- Institute of Biomedical Sciences, Academia Sinica, 128, Academia Road, Section 2 Nankang, Taipei 115, Taiwan
554
Chromosomal translocations induced at specified loci in human stem cells. Proc Natl Acad Sci U S A 2009; 106:10620-5. [PMID: 19549848 DOI: 10.1073/pnas.0902076106] [Citation(s) in RCA: 164] [Impact Index Per Article: 10.3] [Indexed: 11/18/2022] Open
Abstract
The precise genetic manipulation of stem and precursor cells offers extraordinary potential for the analysis, prevention, and treatment of human malignancies. Chromosomal translocations are hallmarks of several tumor types where they are thought to have arisen in stem or precursor cells. Although approaches exist to study factors involved in translocation formation in mouse cells, approaches in human cells have been lacking, especially in relevant cell types. The technology of zinc finger nucleases (ZFNs) allows DNA double-strand breaks (DSBs) to be introduced into specified chromosomal loci. We harnessed this technology to induce chromosomal translocations in human cells by generating concurrent DSBs at 2 endogenous loci, the PPP1R12C/p84 gene on chromosome 19 and the IL2Rgamma gene on the X chromosome. Translocation breakpoint junctions for t(19;X) were detected with nested quantitative PCR in a high throughput 96-well format using denaturation curves and DNA sequencing in a variety of human cell types, including embryonic stem (hES) cells and hES cell-derived mesenchymal precursor cells. Although readily detected, translocations were less frequent than repair of a single DSB by gene targeting or nonhomologous end-joining, neither of which leads to gross chromosomal rearrangements. While previous studies have relied on laborious genetic modification of cells and extensive growth in culture, the approach described in this report is readily applicable to primary human cells, including multipotent and pluripotent cells, to uncover both the underlying mechanisms and phenotypic consequences of targeted translocations and other genomic rearrangements.
555
Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA. Circos: an information aesthetic for comparative genomics. Genome Res 2009; 19:1639-45. [PMID: 19541911 DOI: 10.1101/gr.092759.109] [Citation(s) in RCA: 7419] [Impact Index Per Article: 463.7] [Indexed: 11/24/2022]
Abstract
We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.
Affiliation(s)
- Martin Krzywinski
- Canada's Michael Smith Genome Sciences Center, Vancouver, British Columbia V5Z 4S6, Canada.
556
Abstract
Copy number variation (CNV) contributes in phenotypically relevant ways to the genetic variability of many organisms. Cost-effective genomewide methods for identifying copy number variation are necessary to elucidate the contribution that these structural variants make to the genomes of model organisms. We have developed a novel approach for the identification of copy number variation by next generation sequencing. As a proof of concept our method has been applied to map the deletions of three Drosophila deficiency strains. We demonstrate that low sequence coverage is sufficient for identifying and mapping large deletions at kilobase resolution, suggesting that data generated from high-throughput sequencing experiments are sufficient for simultaneously analyzing many strains. Genomic DNA from two Drosophila deficiency stocks was barcoded and sequenced in multiplex, and the breakpoints associated with each deletion were successfully identified. The approach we describe is immediately applicable to the systematic exploration of copy number variation in model organisms and humans.
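The abstract above does not spell out how deletions are mapped from low-coverage reads, so the following is only a minimal read-depth sketch under stated assumptions, not the authors' method. It assumes reads are already aligned and reduced to 0-based start positions on one chromosome, bins them into fixed windows, and reports runs of windows whose depth falls well below the chromosome median. The function name `call_deletions` and the parameters `window`, `max_ratio`, and `min_windows` are hypothetical.

```python
from statistics import median


def call_deletions(read_starts, chrom_length, window=1000,
                   max_ratio=0.6, min_windows=3):
    """Return (start, end) intervals whose read depth is unusually low.

    Illustrative only: real pipelines also correct for GC content,
    mappability, and sample-specific noise.
    """
    # Count reads per fixed-size window.
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // window] += 1

    # Use the median window depth as the "normal copy number" baseline.
    baseline = median(counts)
    if baseline == 0:
        return []

    calls = []
    run_start = None
    # The trailing None is a sentinel that closes a run at the chromosome end.
    for i, count in enumerate(counts + [None]):
        low = count is not None and count / baseline <= max_ratio
        if low and run_start is None:
            run_start = i
        elif not low and run_start is not None:
            # Keep only runs long enough to be unlikely by chance.
            if i - run_start >= min_windows:
                calls.append((run_start * window,
                              min(i * window, chrom_length)))
            run_start = None
    return calls
```

With a 1 kb window, breakpoints are resolved only to the nearest kilobase, which matches the resolution the abstract describes; a threshold near 0.5 of baseline is a natural choice when the deletion is carried on one of two homologs.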
557
Silva JM, Ezhkova E, Silva J, Heart S, Castillo M, Campos Y, Castro V, Bonilla F, Cordon-Cardo C, Muthuswamy SK, Powers S, Fuchs E, Hannon GJ. Cyfip1 is a putative invasion suppressor in epithelial cancers. Cell 2009; 137:1047-61. [PMID: 19524508 PMCID: PMC2754270 DOI: 10.1016/j.cell.2009.04.013] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Received: 06/29/2008] [Revised: 01/23/2009] [Accepted: 04/02/2009] [Indexed: 12/20/2022]
Abstract
Identification of bona fide tumor suppressors is often challenging because of the large number of genetic alterations present in most human cancers. To evaluate candidate genes present within chromosomal regions recurrently deleted in human cancers, we coupled high-resolution genomic analysis with a two-stage genetic study using RNA interference (RNAi). We found that Cyfip1, a subunit of the WAVE complex, which regulates cytoskeletal dynamics, is commonly deleted in human epithelial cancers. Reduced expression of CYFIP1 is commonly observed during invasion of epithelial tumors and is associated with poor prognosis in this setting. Silencing of Cyfip1 disturbed normal epithelial morphogenesis in vitro and cooperated with oncogenic Ras to produce invasive carcinomas in vivo. Mechanistically, we have linked alterations in WAVE-regulated actin dynamics with impaired cell-cell adhesion and cell-ECM interactions. Thus, we propose Cyfip1 as an invasion suppressor gene.
Affiliation(s)
- Jose M Silva
- Watson School Biological Sciences, Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
558
Abstract
Chromosomal translocations and fusion oncogenes serve as the ultimate biomarker for clinicians as they show specificity for distinct histopathologic malignancies while simultaneously encoding an etiologic mutation and a therapeutic target. Previously considered a minor mutational event in epithelial solid tumors, new methodologies that do not rely on the detection of macroscopic cytogenetic alterations, as well as access to large series of annotated clinical material, are expanding the inventory of recurrent fusion oncogenes in both common and rare solid epithelial tumors. Unexpectedly, related assays are also revealing a high number of tandem or chimeric transcripts in normal tissues including, in one provocative case, a template for a known fusion oncogene. These observations may force us to reassess long-held views on the definition of a gene. They also raise the possibility that some rearrangements might represent constitutive forms of a physiological chimeric transcript. Defining the chimeric transcriptome in both health (transcription-induced chimerism and intergenic splicing) and disease (mutation-associated fusion oncogenes) will play an increasingly important role in the diagnosis, prognosis, and therapy of patients with cancer.
559
Turner DJ, Keane TM, Sudbery I, Adams DJ. Next-generation sequencing of vertebrate experimental organisms. Mamm Genome 2009; 20:327-38. [PMID: 19452216 PMCID: PMC2714443 DOI: 10.1007/s00335-009-9187-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 02/25/2009] [Accepted: 04/21/2009] [Indexed: 12/22/2022]
Abstract
Next-generation sequencing technologies are revolutionizing biology by allowing for genome-wide transcription factor binding-site profiling, transcriptome sequencing, and more recently, whole-genome resequencing. While it is currently not possible to generate complete de novo assemblies of higher-vertebrate genomes using next-generation sequencing, improvements in sequence read lengths and throughput, coupled with new assembly algorithms for large data sets, will soon make this a reality. These developments will in turn spawn a revolution in how genomic data are used to understand genetics and how model organisms are used for disease gene discovery. This review provides an overview of the current next-generation sequencing platforms and the newest computational tools for the analysis of next-generation sequencing data. We also describe how next-generation sequencing may be applied in the context of vertebrate model organism genetics.
Affiliation(s)
- Daniel J. Turner
- Experimental Cancer Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1HH UK
- Thomas M. Keane
- Experimental Cancer Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1HH UK
- Ian Sudbery
- Experimental Cancer Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1HH UK
- David J. Adams
- Experimental Cancer Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1HH UK
560
Forshew T, Tatevossian RG, Lawson ARJ, Ma J, Neale G, Ogunkolade BW, Jones TA, Aarum J, Dalton J, Bailey S, Chaplin T, Carter RL, Gajjar A, Broniscer A, Young BD, Ellison DW, Sheer D. Activation of the ERK/MAPK pathway: a signature genetic defect in posterior fossa pilocytic astrocytomas. J Pathol 2009; 218:172-81. [PMID: 19373855 DOI: 10.1002/path.2558] [Citation(s) in RCA: 219] [Impact Index Per Article: 13.7] [Received: 03/02/2009] [Accepted: 03/16/2009] [Indexed: 12/25/2022]
Abstract
We report genetic aberrations that activate the ERK/MAP kinase pathway in 100% of posterior fossa pilocytic astrocytomas, with a high frequency of gene fusions between KIAA1549 and BRAF among these tumours. These fusions were identified from analysis of focal copy number gains at 7q34, detected using Affymetrix 250K and 6.0 SNP arrays. PCR and sequencing confirmed the presence of five KIAA1549-BRAF fusion variants, along with a single fusion between SRGAP3 and RAF1. The resulting fusion genes lack the auto-inhibitory domains of BRAF and RAF1, which are replaced in-frame by the beginning of KIAA1549 and SRGAP3, respectively, conferring constitutive kinase activity. An activating mutation of KRAS was identified in the single pilocytic astrocytoma without a BRAF or RAF1 fusion. Further fusions and activating mutations in BRAF were identified in 28% of grade II astrocytomas, highlighting the importance of the ERK/MAP kinase pathway in the development of paediatric low-grade gliomas.
Affiliation(s)
- Tim Forshew
- Neuroscience Centre, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, Institute of Cell and Molecular Science, London, UK
561
Fullwood MJ, Wei CL, Liu ET, Ruan Y. Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Res 2009; 19:521-32. [PMID: 19339662 DOI: 10.1101/gr.074906.107] [Citation(s) in RCA: 234] [Impact Index Per Article: 14.6] [Indexed: 01/02/2023]
Abstract
Comprehensive understanding of functional elements in the human genome will require thorough interrogation and comparison of individual human genomes and genomic structures. Such an endeavor will require improvements in the throughputs and costs of DNA sequencing. Next-generation sequencing platforms have impressively low costs and high throughputs but are limited by short read lengths. An immediate and widely recognized solution to this critical limitation is the paired-end tag (PET) sequencing for various applications, collectively called the PET sequencing strategy, in which short and paired tags are extracted from the ends of long DNA fragments for ultra-high-throughput sequencing. The PET sequences can be accurately mapped to the reference genome, thus demarcating the genomic boundaries of PET-represented DNA fragments and revealing the identities of the target DNA elements. PET protocols have been developed for the analyses of transcriptomes, transcription factor binding sites, epigenetic sites such as histone modification sites, and genome structures. The exclusive advantage of the PET technology is its ability to uncover linkages between the two ends of DNA fragments. Using this unique feature, unconventional fusion transcripts, genome structural variations, and even molecular interactions between distant genomic elements can be unraveled by PET analysis. Extensive use of PET data could lead to efficient assembly of individual human genomes, transcriptomes, and interactomes, enabling new biological and clinical insights. With its versatile and powerful nature for DNA analysis, the PET sequencing strategy has a bright future ahead.
Affiliation(s)
- Melissa J Fullwood
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore 138672, Singapore
562
Dubey S, Powell CA. Update in lung cancer 2008. Am J Respir Crit Care Med 2009; 179:860-8. [PMID: 19423719 PMCID: PMC2720086 DOI: 10.1164/rccm.200902-0289up] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 02/23/2009] [Accepted: 02/23/2009] [Indexed: 12/31/2022] Open
Affiliation(s)
- Sarita Dubey
- Division of Hematology and Oncology, University of California, San Francisco, California, USA
563
Hormozdiari F, Alkan C, Eichler EE, Sahinalp SC. Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes. Genome Res 2009; 19:1270-8. [PMID: 19447966 DOI: 10.1101/gr.088633.108] [Citation(s) in RCA: 202] [Impact Index Per Article: 12.6] [Indexed: 11/24/2022]
Abstract
Recent studies show that along with single nucleotide polymorphisms and small indels, larger structural variants among human individuals are common. The Human Genome Structural Variation Project aims to identify and classify deletions, insertions, and inversions (>5 Kbp) in a small number of normal individuals with a fosmid-based paired-end sequencing approach using traditional sequencing technologies. The realization of new ultra-high-throughput sequencing platforms now makes it feasible to detect the full spectrum of genomic variation among many individual genomes, including cancer patients and others suffering from diseases of genomic origin. Unfortunately, existing algorithms for identifying structural variation (SV) among individuals have not been designed to handle the short read lengths and the errors implied by the "next-gen" sequencing (NGS) technologies. In this paper, we give combinatorial formulations for the SV detection between a reference genome sequence and a next-gen-based, paired-end, whole genome shotgun-sequenced individual. We describe efficient algorithms for each of the formulations we give, which all turn out to be fast and quite reliable; they are also applicable to all next-gen sequencing methods (Illumina, 454 Life Sciences [Roche], ABI SOLiD, etc.) and traditional capillary sequencing technology. We apply our algorithms to identify SV among individual genomes very recently sequenced by Illumina technology.
Affiliation(s)
- Fereydoun Hormozdiari
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
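The Hormozdiari et al. abstract describes structural variation detection from paired-end reads mapped to a reference. A simplified sketch of the signal such methods start from is given below; it is not the paper's combinatorial formulation, which resolves ambiguous mappings across many read pairs. The sketch assumes each pair has a single mapping and that the library's expected span range is known. The names `ReadPair` and `classify_pair` and the parameters `min_span` and `max_span` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    left_pos: int      # reference coordinate of the leftmost end
    right_pos: int     # reference coordinate of the rightmost end
    left_strand: str   # "+" or "-"
    right_strand: str  # "+" or "-"


def classify_pair(pair, min_span, max_span):
    """Label one mapped pair by the kind of variant it may support.

    A normal pair maps with opposite strands and a span inside
    [min_span, max_span]; anything else is discordant.
    """
    # Both ends on the same strand suggest one end lies in an inverted segment.
    if pair.left_strand == pair.right_strand:
        return "inversion"
    span = pair.right_pos - pair.left_pos
    # Ends farther apart on the reference than in the sequenced fragment:
    # the sample is missing sequence that the reference contains.
    if span > max_span:
        return "deletion"
    # Ends closer together on the reference: the sample carries extra sequence.
    if span < min_span:
        return "insertion"
    return "concordant"
```

A single discordant pair is weak evidence on its own; in practice a call is made only when several pairs with overlapping, compatible positions support the same event.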
564
Schweiger MR, Kerick M, Timmermann B, Albrecht MW, Borodina T, Parkhomchuk D, Zatloukal K, Lehrach H. Genome-wide massively parallel sequencing of formaldehyde fixed-paraffin embedded (FFPE) tumor tissues for copy-number- and mutation-analysis. PLoS One 2009; 4:e5548. [PMID: 19440246 PMCID: PMC2678265 DOI: 10.1371/journal.pone.0005548] [Citation(s) in RCA: 134] [Impact Index Per Article: 8.4] [Received: 02/03/2009] [Accepted: 04/01/2009] [Indexed: 01/01/2023] Open
Abstract
Background: Cancer re-sequencing programs rely on DNA isolated from fresh snap frozen tissues, the preparation of which is combined with additional preservation efforts. Tissue samples at pathology departments are routinely stored as formalin-fixed and paraffin-embedded (FFPE) samples and their use would open up access to a variety of clinical trials. However, FFPE preparation is incompatible with many down-stream molecular biology techniques such as PCR based amplification methods and gene expression studies. Methodology/Principal Findings: Here we investigated the sample quality requirements of FFPE tissues for massively parallel short-read sequencing approaches. We evaluated key variables of pre-fixation, fixation related and post-fixation processes that occur in routine medical service (e.g. degree of autolysis, duration of fixation and of storage). We also investigated the influence of tissue storage time on sequencing quality by using material that was up to 18 years old. Finally, we analyzed normal and tumor breast tissues using the Sequencing by Synthesis technique (Illumina Genome Analyzer, Solexa) to simultaneously localize genome-wide copy number alterations and to detect genomic variations such as substitutions and point-deletions and/or insertions in FFPE tissue samples. Conclusions/Significance: The application of second generation sequencing techniques on small amounts of FFPE material opens up the possibility to analyze tissue samples which have been collected during routine clinical work as well as in the context of clinical trials. This is particularly important since FFPE samples are amply available from surgical tumor resections and histopathological diagnosis, and comprise tissue from precursor lesions, primary tumors, lymphogenic and/or hematogenic metastases. Large-scale studies using this tissue material will result in a better prediction of the prognosis of cancer patients and the early identification of patients who will respond to therapy.
Affiliation(s)
- Michal R. Schweiger
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Medical Genetics, Charité Universitätsmedizin, Berlin, Germany
- Martin Kerick
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Bernd Timmermann
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Marcus W. Albrecht
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Tatjana Borodina
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Dmitri Parkhomchuk
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Kurt Zatloukal
- Institute of Pathology, Medical University, Graz, Austria
- Hans Lehrach
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
565
Abstract
Recent findings of gene fusions in carcinomas recapitulate the discovery of chromosomal abnormalities in leukemias and sarcomas decades ago. A recurring feature of carcinoma gene fusions, in contrast to those in hematopoietic and mesenchymal malignancies, is that they result in aberrant cell signaling. This may reflect differences in the differentiation programs of these tissues.
566
McCaughan F. Molecular copy-number counting: potential of single-molecule diagnostics. Expert Rev Mol Diagn 2009; 9:309-12. [PMID: 19435452 DOI: 10.1586/erm.09.14] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/31/2025]
567
Abstract
It is now becoming generally accepted that a significant amount of human genetic variation is due to structural changes of the genome rather than to base-pair changes in the DNA. As for base-pair changes, knowledge of gene and genome function has been informed by structural alterations that convey clinical phenotypes. Genomic disorders are a class of human conditions that result from structural changes of the human genome that convey traits or susceptibility to traits. The path to the delineation of genomic disorders is intertwined with the evolving technologies that have enabled the resolution of human genome analyses to continue increasing. Similarly, the ability to perform high-resolution human genome analysis has fueled the current and future clinical implementation of such discoveries in the evolving field of genome medicine.
Affiliation(s)
- James R Lupski
- Departments of Molecular and Human Genetics, and Pediatrics, Baylor College of Medicine, and Texas Children's Hospital, Houston, TX 77030, USA.
568
Tomlins SA, Bjartell A, Chinnaiyan AM, Jenster G, Nam RK, Rubin MA, Schalken JA. ETS gene fusions in prostate cancer: from discovery to daily clinical practice. Eur Urol 2009; 56:275-86. [PMID: 19409690 DOI: 10.1016/j.eururo.2009.04.036] [Citation(s) in RCA: 275] [Impact Index Per Article: 17.2] [Received: 03/17/2009] [Accepted: 04/15/2009] [Indexed: 11/28/2022]
Abstract
CONTEXT: In 2005, fusions between the androgen-regulated transmembrane protease serine 2 gene, TMPRSS2, and E twenty-six (ETS) transcription factors were discovered in prostate cancer. OBJECTIVE: To review advances in our understanding of ETS gene fusions, focusing on challenges affecting translation to clinical application. EVIDENCE ACQUISITION: The PubMed database was searched for reports on ETS fusions in prostate cancer. EVIDENCE SYNTHESIS: Since the discovery of ETS fusions, novel 5' and 3' fusion partners and multiple splice isoforms have been reported. The most common fusion, TMPRSS2:ERG, is present in approximately 50% of prostate-specific antigen (PSA)-screened localized prostate cancers and in 15-35% of population-based cohorts. ETS fusions can be detected noninvasively in the urine of men with prostate cancer, with a specificity rate in PSA-screened cohorts of >90%. Reports from untreated population-based cohorts suggest an association between ETS fusions and cancer-specific death and metastatic spread. In retrospective prostatectomy cohorts, conflicting results have been published regarding associations between ETS fusions and cancer aggressiveness. In addition to serving as a potential biomarker, tissue and functional studies suggest a specific role for ETS fusions in the transition to carcinoma. Finally, recent results suggest that the 5' and 3' ends of ETS fusions as well as downstream targets may be targeted therapeutically. CONCLUSIONS: Recent studies suggest that the first clinical applications of ETS fusions are likely to be in noninvasive detection of prostate cancer and in aiding with difficult diagnostic cases. Additional studies are needed to clarify the association between gene fusions and cancer aggressiveness, particularly those studies that take into account the multifocal and heterogeneous nature of localized prostate cancer. Multiple promising strategies have been identified to potentially target ETS fusions. Together, these results suggest that ETS fusions will affect multiple aspects of prostate cancer diagnosis and management.
Affiliation(s)
- Scott A Tomlins
- Michigan Center for Translational Pathology, Department of Pathology, University of Michigan Medical School, Ann Arbor, MI, USA.
569
Guffanti A, Iacono M, Pelucchi P, Kim N, Soldà G, Croft LJ, Taft RJ, Rizzi E, Askarian-Amiri M, Bonnal RJ, Callari M, Mignone F, Pesole G, Bertalot G, Bernardi LR, Albertini A, Lee C, Mattick JS, Zucchi I, De Bellis G. A transcriptional sketch of a primary human breast cancer by 454 deep sequencing. BMC Genomics 2009; 10:163. [PMID: 19379481 PMCID: PMC2678161 DOI: 10.1186/1471-2164-10-163] [Citation(s) in RCA: 197] [Impact Index Per Article: 12.3] [Received: 08/28/2008] [Accepted: 04/20/2009] [Indexed: 02/07/2023] Open
Abstract
Background The cancer transcriptome is difficult to explore due to the heterogeneity of quantitative and qualitative changes in gene expression linked to the disease status. An increasing number of "unconventional" transcripts, such as novel isoforms, non-coding RNAs, somatic gene fusions and deletions have been associated with the tumoral state. Massively parallel sequencing techniques provide a framework for exploring the transcriptional complexity inherent to cancer with limited laboratory and financial effort. We developed a deep sequencing and bioinformatics analysis protocol to investigate the molecular composition of a breast cancer poly(A)+ transcriptome. This method utilizes a cDNA library normalization step to diminish the representation of highly expressed transcripts and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts. Results We analyzed over 132,000 Roche 454 high-confidence deep sequencing reads from a primary human lobular breast cancer tissue specimen, and detected a range of unusual transcriptional events that were subsequently validated by RT-PCR in eight additional primary human breast cancer samples. We identified and validated one deletion, two novel ncRNAs (one intergenic and one intragenic), ten previously unknown or rare transcript isoforms and a novel gene fusion specific to a single primary tissue sample. We also explored the non-protein-coding portion of the breast cancer transcriptome, identifying thousands of novel non-coding transcripts and more than three hundred reads corresponding to the non-coding RNA MALAT1, which is highly expressed in many human carcinomas.
Conclusion Our results demonstrate that combining 454 deep sequencing with a normalization step and careful bioinformatic analysis facilitates the discovery and quantification of rare transcripts or ncRNAs, and can be used as a qualitative tool to characterize transcriptome complexity, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs, even at a relatively low sequence sampling.
Affiliation(s)
- Alessandro Guffanti
- Institute of Biomedical Technologies, National Research Council, Milan, Italy.
|
570
|
Abstract
All cancers arise as a result of changes that have occurred in the DNA sequence of the genomes of cancer cells. Over the past quarter of a century much has been learnt about these mutations and the abnormal genes that operate in human cancers. We are now, however, moving into an era in which it will be possible to obtain the complete DNA sequence of large numbers of cancer genomes. These studies will provide us with a detailed and comprehensive perspective on how individual cancers have developed.
Affiliation(s)
- Michael R Stratton
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
|
571
|
The evolution and application of techniques in molecular biology to human brain tumors: a 25 year perspective. J Neurooncol 2009; 92:261-73. [PMID: 19357954 DOI: 10.1007/s11060-009-9829-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/01/2008] [Accepted: 02/23/2009] [Indexed: 12/19/2022]
Abstract
Since the establishment of the AANS/CNS Section on Tumors in 1984, neurosurgeons have been actively involved in basic science research of human brain tumors that has moved the field forward considerably. Here, we chronicle the major advances that have been made with respect to our understanding of the concepts guiding the biology of human malignant brain tumors. Numerous technical advances in science, such as the development of gene transfer techniques, the polymerase chain reaction, the discovery of oncogenes and tumor suppressor genes, and the refinement of approaches to cancer cytogenetics, have enabled researchers to identify many of the non-random genetic alterations associated with brain tumor growth, invasion, immunology, angiogenesis and apoptosis. These data led to some astounding progress, for example with the use of gene therapy, whereby in the 1990s several human clinical trials were conducted for patients with brain tumors. More recently, the human genome project has been completed, providing a blueprint for the human species. What has followed are exciting new techniques in molecular biology such as transcriptional profiling, single nucleotide polymorphism (SNP)-arrays, array comparative genomic hybridization (array-CGH), microRNA profiling, and detection of epigenetic silencing of tumor suppressor genes. The cancer genome is now being sequenced at breakneck speed using advanced DNA sequencing techniques. We are on the threshold of cataloguing the major genetic alterations observed in all human brain tumors. What will follow is modeling of these genetic alterations in systems that will allow for the development of novel pharmacotherapeutics and translational research therapies.
|
572
|
Tyner JW, Rutenberg-Schoenberg ML, Erickson H, Willis SG, O'Hare T, Deininger MW, Druker BJ, Loriaux MM. Functional characterization of an activating TEK mutation in acute myeloid leukemia: a cellular context-dependent activating mutation. Leukemia 2009; 23:1345-8. [PMID: 19340004 DOI: 10.1038/leu.2009.66] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
|
573
|
Teschendorff AE, Caldas C. The breast cancer somatic 'muta-ome': tackling the complexity. Breast Cancer Res 2009; 11:301. [PMID: 19344493 PMCID: PMC2688941 DOI: 10.1186/bcr2236] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/10/2022] Open
Abstract
Acquired somatic mutations are responsible for approximately 90% of breast tumours. However, only one somatic aberration, amplification of the HER2 locus, is currently used to define a clinical subtype, one that accounts for approximately 10% to 15% of breast tumours. In recent years, a number of mutational profiling studies have attempted to further identify clinically relevant mutations. While these studies have confirmed the oncogenic or tumour suppressor role of many known suspects, they have exposed complexity as a main feature of the breast cancer mutational landscape (the 'muta-ome'). The two defining features of this complexity are (a) a surprising richness of low-frequency mutants contrasting with the relative rarity of high-frequency events and (b) the relatively large number of somatic genomic aberrations (approximately 20 to 50) driving an average tumour. Structural features of this complex landscape have begun to emerge from follow-up studies that have tackled the complexity by integrating the spectrum of genomic mutations with a variety of complementary biological knowledge databases. Among these structural features are the growing links between somatic gene disruptions and those conferring breast cancer risk, mutually exclusive coexistence and synergistic mutational patterns, and a clearly non-random distribution of mutations implicating specific molecular pathways in breast tumour initiation and progression. Recognising that a shift from a gene-centric to a pathway-centric approach is necessary, we envisage that further progress in identifying clinically relevant genomic aberration patterns and associated breast cancer subtypes will require not only multi-dimensional integrative analyses that combine mutational and functional profiles, but also larger profiling studies that use second- and third-generation sequencing technologies in order to fill out the important gaps in the current mutational landscape.
Affiliation(s)
- Andrew E Teschendorff
- Medical Genomics Group, Paul O'Gorman Building, UCL Cancer Institute, University College London, London, UK.
|
574
|
Wilhelm BT, Landry JR. RNA-Seq-quantitative measurement of expression through massively parallel RNA-sequencing. Methods 2009; 48:249-57. [PMID: 19336255 DOI: 10.1016/j.ymeth.2009.03.016] [Citation(s) in RCA: 328] [Impact Index Per Article: 20.5] [Received: 12/12/2008] [Revised: 03/14/2009] [Accepted: 03/17/2009] [Indexed: 01/20/2023] Open
Abstract
The ability to quantitatively survey the global behavior of transcriptomes has been a key milestone in the field of systems biology, enabled by the advent of DNA microarrays. While this approach has literally transformed our vision and approach to cellular physiology, microarray technology has always been limited by the requirement to decide, a priori, what regions of the genome to examine. While very high density tiling arrays have reduced this limitation for simpler organisms, it remains an obstacle for larger, more complex, eukaryotic genomes. The recent development of "next-generation" massively parallel sequencing (MPS) technologies by companies such as Roche (454 GS FLX), Illumina (Genome Analyzer II), and ABI (AB SOLiD) has completely transformed the way in which quantitative transcriptomics can be done. These new technologies have reduced both the cost-per-reaction and time required by orders of magnitude, making the use of sequencing a cost-effective option for many experimental approaches. One such method that has recently been developed uses MPS technology to directly survey the RNA content of cells, without requiring any of the traditional cloning associated with EST sequencing. This approach, called "RNA-seq", can generate quantitative expression scores that are comparable to microarrays, with the added benefit that the entire transcriptome is surveyed without the requirement of a priori knowledge of transcribed regions. The important advantage of this technique is that not only can quantitative expression measures be made, but transcript structures including alternatively spliced transcript isoforms, can also be identified. This article discusses the experimental approach for both sample preparation and data analysis for the technique of RNA-seq.
Affiliation(s)
- Brian T Wilhelm
- Laboratory of Molecular Genetics of Stem Cells, C.P. 6128 Succursale Centre-Ville, Montréal, Que. H3C3J7, Canada.
|
575
|
From cancer genomes to cancer models: bridging the gaps. EMBO Rep 2009; 10:359-66. [PMID: 19305388 DOI: 10.1038/embor.2009.46] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 11/13/2008] [Accepted: 02/23/2009] [Indexed: 11/08/2022] Open
Abstract
Cancer genome projects are now being expanded in an attempt to provide complete landscapes of the mutations that exist in tumours. Although the importance of cataloguing genome variations is well recognized, there are obvious difficulties in bridging the gaps between high-throughput resequencing information and the molecular mechanisms of cancer evolution. Here, we describe the current status of the high-throughput genomic technologies, and the current limitations of the associated computational analysis and experimental validation of cancer genetic variants. We emphasize how the current cancer-evolution models will be influenced by the high-throughput approaches, in particular through efforts devoted to monitoring tumour progression, and how, in turn, the integration of data and models will be translated into mechanistic knowledge and clinical applications.
|
576
|
Abstract
PURPOSE OF REVIEW Recent rapid progress in DNA sequencing has permitted projects to be undertaken that are aimed at building unbiased genome-wide portraits of the underlying mutations in human tumors. This review sets out the highlights of the recent progress in this area and the rapidly evolving picture of the underlying genetic basis of human epithelial cancers. RECENT FINDINGS Individual tumors are estimated to contain around 80 point mutations in protein coding genes, of which 15 are likely to be tumorigenic. It is likely that there are hundreds of different genes that, when mutated, contribute to human tumorigenesis, most in only a small fraction of tumors. Mutations caused by large chromosomal rearrangements also appear to be common in tumors. In prostate and lung cancers, recurrent chromosomal translocations resulting in tumorigenic fusion proteins have been identified. SUMMARY The multitude of new mutated genes being identified in human tumors represents many new directions for experimental research into the molecular pathways that lead to tumor formation. These studies, in turn, are likely to lead to many novel approaches to targeted therapy useful in subsets of tumors with particular types of gene mutation.
|
577
|
Ortiz de Mendíbil I, Vizmanos JL, Novo FJ. Signatures of selection in fusion transcripts resulting from chromosomal translocations in human cancer. PLoS One 2009; 4:e4805. [PMID: 19279687 PMCID: PMC2653638 DOI: 10.1371/journal.pone.0004805] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 10/08/2008] [Accepted: 01/30/2009] [Indexed: 11/27/2022] Open
Abstract
Background The recurrence and non-random distribution of translocation breakpoints in human tumors are usually attributed to local sequence features present in the vicinity of the breakpoints. However, it has also been suggested that functional constraints might help delimit the position of translocation breakpoints within the genes involved, but a quantitative analysis of this contribution has been lacking. Methodology We have analyzed two well-known signatures of functional selection, reading-frame compatibility and non-random combinations of protein domains, in an extensive dataset of fusion proteins resulting from chromosomal translocations in cancer. Conclusions Our data provide strong experimental support for the concept that the position of translocation breakpoints in the genome of cancer cells is determined, to a large extent, by the need to combine certain protein domains and to keep an intact reading frame in fusion transcripts. Additionally, the information that we have assembled affords a global view of the oncogenic mechanisms and domain architectures that are used by fusion proteins. This can be used to assess the functional impact of novel chromosomal translocations and to predict the position of breakpoints in the genes involved.
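Editorial illustration: the reading-frame compatibility signature mentioned in this abstract can be sketched in a few lines. The function name and both parameters below are hypothetical and are not taken from the paper; the sketch assumes the breakpoint falls within the coding sequence of both partners and ignores introns, UTRs and splicing.

```python
def fusion_is_in_frame(upstream_coding_bases: int, downstream_cds_offset: int) -> bool:
    """Return True if a fusion transcript keeps the 3' partner's reading frame.

    upstream_coding_bases: number of coding bases contributed by the 5' partner
        (counted from its start codon up to the junction). Hypothetical input.
    downstream_cds_offset: 0-based position, within the 3' partner's own coding
        sequence, of the first base retained after the junction. Hypothetical input.
    """
    if upstream_coding_bases < 0 or downstream_cds_offset < 0:
        raise ValueError("base counts must be non-negative")
    # The first retained 3' base sat at codon position (offset % 3) in its native
    # transcript; in the fusion it sits at codon position (upstream bases % 3).
    # The frame is preserved only when these two codon positions agree.
    return upstream_coding_bases % 3 == downstream_cds_offset % 3


if __name__ == "__main__":
    print(fusion_is_in_frame(300, 150))  # both at codon position 0
    print(fusion_is_in_frame(301, 150))  # shifted by one base
```

Under random breakage roughly one junction in three would pass this check, so an excess of in-frame fusions in a dataset is the kind of signal the abstract describes as a signature of selection.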
Affiliation(s)
- Francisco J. Novo
- Department of Genetics, University of Navarra, Pamplona, Spain
|
578
|
Cancer gene discovery in mouse and man. Biochim Biophys Acta Rev Cancer 2009; 1796:140-61. [PMID: 19285540 PMCID: PMC2756404 DOI: 10.1016/j.bbcan.2009.03.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 01/02/2009] [Revised: 03/03/2009] [Accepted: 03/05/2009] [Indexed: 12/31/2022]
Abstract
The elucidation of the human and mouse genome sequence and developments in high-throughput genome analysis, and in computational tools, have made it possible to profile entire cancer genomes. In parallel with these advances mouse models of cancer have evolved into a powerful tool for cancer gene discovery. Here we discuss the approaches that may be used for cancer gene identification in both human and mouse and discuss how a cross-species 'oncogenomics' approach to cancer gene discovery represents a powerful strategy for finding genes that drive tumourigenesis.
|
579
|
Abstract
Copy number variation is a defining characteristic of human subtelomeres. Human subtelomeric segmental duplication regions ('Subtelomeric Repeats') comprise about 25% of the most distal 500 kb and 80% of the most distal 100 kb in human DNA. Huge allelic disparities seen in subtelomeric DNA sequence content and organization are postulated to have an impact on the dosage of transcripts embedded within the duplicated sequences, on the transcription of genes in adjacent single copy DNA regions, and on the chromatin structures mediating telomere functions including chromosome stability. In addition to the complex duplicon substructure and huge allelic variations in extended subtelomere regions, both copy number variation and alternative sequence organizations for DNA characterize the sequences immediately adjacent to terminal (TTAGGG)n tracts ('subterminal DNA'). The structural variation in subterminal DNA is likely to have important consequences for expression of subterminal transcripts such as a newly-discovered gene family encoding actin-interacting proteins and a non-coding telomeric repeat containing RNA (TERRA) transcript family critical for telomere integrity. Major immediate challenges include discovering the full extent and nature of subtelomeric structural and copy number variation in humans, and developing methods for tracking individual allelic variants in the context of total genomic DNA.
Affiliation(s)
- H. Riethman
- The Wistar Institute, Philadelphia, PA (USA)
|
581
|
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 2009; 10:R25. [PMID: 19261174 PMCID: PMC2690996 DOI: 10.1186/gb-2009-10-3-r25] [Citation(s) in RCA: 16182] [Impact Index Per Article: 1011.4] [Received: 10/21/2008] [Revised: 12/19/2008] [Accepted: 03/04/2009] [Indexed: 12/19/2022] Open
Abstract
Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source (http://bowtie.cbcb.umd.edu).
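Editorial illustration: the Burrows-Wheeler indexing named in this abstract can be sketched with a toy exact-match search. This is not Bowtie's implementation: it omits the quality-aware backtracking that permits mismatches, builds the suffix array naively, and stores full occurrence tables, so it is only suitable for short example strings. All function names are hypothetical, and the reference is assumed not to contain the `$` terminator character.

```python
def build_fm_index(reference: str):
    """Build a naive FM-index (suffix array, first-column offsets, occurrence counts)."""
    text = reference + "$"  # '$' sorts before A, C, G, T and marks the end
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # first[c]: number of characters in the text that sort strictly before c.
    first, total = {}, 0
    for ch in alphabet:
        first[ch] = total
        total += text.count(ch)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] for ch in alphabet}
    for b in bwt:
        for ch in alphabet:
            occ[ch].append(occ[ch][-1] + (b == ch))
    return suffix_array, first, occ


def find_exact(read: str, index) -> list:
    """Return sorted 0-based reference positions where `read` occurs exactly."""
    suffix_array, first, occ = index
    if not read:
        return []
    lo, hi = 0, len(suffix_array)  # half-open interval of matching suffixes
    for ch in reversed(read):      # backward search: extend the match leftwards
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


if __name__ == "__main__":
    idx = build_fm_index("ACGTACGTTACG")
    print(find_exact("ACG", idx))  # [0, 4, 9]
```

Each read character costs two table lookups regardless of reference length, which is the property that makes this family of indexes attractive for aligning many short reads against a large genome.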
Affiliation(s)
- Ben Langmead
- Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
- Cole Trapnell
- Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
- Mihai Pop
- Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
- Steven L Salzberg
- Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
|
582
|
Arlt MF, Mulle JG, Schaibley VM, Ragland RL, Durkin SG, Warren ST, Glover TW. Replication stress induces genome-wide copy number changes in human cells that resemble polymorphic and pathogenic variants. Am J Hum Genet 2009; 84:339-50. [PMID: 19232554 PMCID: PMC2667984 DOI: 10.1016/j.ajhg.2009.01.024] [Citation(s) in RCA: 122] [Impact Index Per Article: 7.6] [Received: 12/12/2008] [Revised: 01/23/2009] [Accepted: 01/30/2009] [Indexed: 11/28/2022] Open
Abstract
Copy number variants (CNVs) are an important component of genomic variation in humans and other mammals. Similar de novo deletions and duplications, or copy number changes (CNCs), are now known to be a major cause of genetic and developmental disorders and to arise somatically in many cancers. A major mechanism leading to both CNVs and disease-associated CNCs is meiotic unequal crossing over, or nonallelic homologous recombination (NAHR), mediated by flanking repeated sequences or segmental duplications. Others appear to involve nonhomologous end joining (NHEJ) or aberrant replication suggesting a mitotic cell origin. Here we show that aphidicolin-induced replication stress in normal human cells leads to a high frequency of CNCs of tens to thousands of kilobases across the human genome that closely resemble CNVs and disease-associated CNCs. Most deletion and duplication breakpoint junctions were characterized by short (<6 bp) microhomologies, consistent with the hypothesis that these rearrangements were formed by NHEJ or a replication-coupled process, such as template switching. This is a previously unrecognized consequence of replication stress and suggests that replication fork stalling and subsequent error-prone repair are important mechanisms in the formation of CNVs and pathogenic CNCs in humans.
Affiliation(s)
- Martin F. Arlt
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Jennifer G. Mulle
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Ryan L. Ragland
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Sandra G. Durkin
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Stephen T. Warren
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Thomas W. Glover
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
|
583
|
Tanaka H, Yao MC. Palindromic gene amplification--an evolutionarily conserved role for DNA inverted repeats in the genome. Nat Rev Cancer 2009; 9:216-24. [PMID: 19212324 DOI: 10.1038/nrc2591] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.0] [Indexed: 12/11/2022]
Abstract
The clinical importance of gene amplification in the diagnosis and treatment of cancer has been widely recognized, as it is often evident in advanced stages of diseases. However, our knowledge of the underlying mechanisms is still limited. Gene amplification is an essential process in several organisms including the ciliate Tetrahymena thermophila, in which the initiating mechanism has been well characterized. Lessons from such simple eukaryotes may provide useful information regarding how gene amplification occurs in tumour cells.
Affiliation(s)
- Hisashi Tanaka
- Department of Molecular Genetics, Cleveland Clinic Lerner Research Institute, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA.
|
584
|
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJM, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res 2009; 19:1117-23. [PMID: 19251739 DOI: 10.1101/gr.089532.108] [Citation(s) in RCA: 2489] [Impact Index Per Article: 155.6] [Indexed: 11/24/2022]
Abstract
Widespread adoption of massively parallel deoxyribonucleic acid (DNA) sequencing instruments has prompted the recent development of de novo short read assembly algorithms. A common shortcoming of the available tools is their inability to efficiently assemble vast amounts of data generated from large-scale sequencing projects, such as the sequencing of individual human genomes to catalog natural genetic variation. To address this limitation, we developed ABySS (Assembly By Short Sequences), a parallelized sequence assembler. As a demonstration of the capability of our software, we assembled 3.5 billion paired-end reads from the genome of an African male publicly released by Illumina, Inc. Approximately 2.76 million contigs ≥100 base pairs (bp) in length were created with an N50 size of 1499 bp, representing 68% of the reference human genome. Analysis of these contigs identified polymorphic and novel sequences not present in the human reference assembly, which were validated by alignment to alternate human assemblies and to other primate genomes.
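Editorial illustration: the N50 statistic quoted in this abstract is the contig length at which the longest contigs, added up in decreasing order, first cover at least half of the total assembled bases. A minimal sketch follows; the function name and the `min_length` parameter are hypothetical, with `min_length` mirroring the 100 bp reporting threshold mentioned above. It is not code from ABySS.

```python
def n50(contig_lengths, min_length: int = 0) -> int:
    """Return the N50 of the contigs that are at least `min_length` bases long."""
    kept = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs pass the length filter")
    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


if __name__ == "__main__":
    # Contigs shorter than 100 bp are dropped before the statistic is computed.
    print(n50([80, 100, 200, 300, 400], min_length=100))  # 300
```

Because the filter changes the total, an N50 is only comparable between assemblies when the same minimum contig length is applied to both.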
Affiliation(s)
- Jared T Simpson
- Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia V5Z 4E6, Canada
|
585
|
Genome-wide profiling of genetic alterations in acute lymphoblastic leukemia: recent insights and future directions. Leukemia 2009; 23:1209-18. [DOI: 10.1038/leu.2009.18] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.6] [Indexed: 12/12/2022]
|
586
|
Voelkerding KV, Dames SA, Durtschi JD. Next-generation sequencing: from basic research to diagnostics. Clin Chem 2009; 55:641-58. [PMID: 19246620 DOI: 10.1373/clinchem.2008.112789] [Citation(s) in RCA: 449] [Impact Index Per Article: 28.1] [Indexed: 12/13/2022]
Abstract
BACKGROUND For the past 30 years, the Sanger method has been the dominant approach and gold standard for DNA sequencing. The commercial launch of the first massively parallel pyrosequencing platform in 2005 ushered in the new era of high-throughput genomic analysis now referred to as next-generation sequencing (NGS). CONTENT This review describes fundamental principles of commercially available NGS platforms. Although the platforms differ in their engineering configurations and sequencing chemistries, they share a technical paradigm in that sequencing of spatially separated, clonally amplified DNA templates or single DNA molecules is performed in a flow cell in a massively parallel manner. Through iterative cycles of polymerase-mediated nucleotide extensions or, in one approach, through successive oligonucleotide ligations, sequence outputs in the range of hundreds of megabases to gigabases are now obtained routinely. Highlighted in this review are the impact of NGS on basic research, bioinformatics considerations, and translation of this technology into clinical diagnostics. Also presented is a view into future technologies, including real-time single-molecule DNA sequencing and nanopore-based sequencing. SUMMARY In the relatively short time frame since 2005, NGS has fundamentally altered genomics research and allowed investigators to conduct experiments that were previously not technically feasible or affordable. The various technologies that constitute this new paradigm continue to evolve, and further improvements in technology robustness and process streamlining will pave the path for translation into clinical diagnostics.
Affiliation(s)
- Karl V Voelkerding
- ARUP Institute for Experimental and Clinical Pathology, Salt Lake City, Utah 84108, USA.
|
587
|
Korbel JO, Abyzov A, Mu XJ, Carriero N, Cayting P, Zhang Z, Snyder M, Gerstein MB. PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. Genome Biol 2009; 10:R23. [PMID: 19236709 PMCID: PMC2688268 DOI: 10.1186/gb-2009-10-2-r23] [Citation(s) in RCA: 171] [Impact Index Per Article: 10.7] [Received: 09/01/2008] [Revised: 12/22/2008] [Accepted: 02/23/2009] [Indexed: 11/10/2022] Open
Abstract
Personal-genomics endeavors, such as the 1000 Genomes project, are generating maps of genomic structural variants by analyzing ends of massively sequenced genome fragments. To process these we developed Paired-End Mapper (PEMer; http://sv.gersteinlab.org/pemer). This comprises an analysis pipeline, compatible with several next-generation sequencing platforms; simulation-based error models, yielding confidence-values for each structural variant; and a back-end database. The simulations demonstrated high structural variant reconstruction efficiency for PEMer's coverage-adjusted multi-cutoff scoring-strategy and showed its relative insensitivity to base-calling errors.
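Editorial illustration: the general idea behind paired-end mapping can be sketched as a comparison between the observed span of a mapped read pair and the span expected from the library's fragment size. This is not PEMer's coverage-adjusted multi-cutoff scoring strategy or its error model; it uses a single hypothetical cutoff, and the class, field and label names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MappedPair:
    """One read pair mapped to a single chromosome (hypothetical record layout)."""
    left_start: int    # leftmost mapped coordinate of the pair
    right_end: int     # rightmost mapped coordinate of the pair
    left_strand: str   # '+' or '-'
    right_strand: str  # '+' or '-'


def classify_pair(pair: MappedPair, mean_span: float, sd_span: float,
                  cutoff_sd: float = 3.0) -> str:
    """Label a pair by how its mapped span and orientation deviate from expectation."""
    # Assumed library layout: the left read maps forward, the right read maps reverse.
    if (pair.left_strand, pair.right_strand) != ("+", "-"):
        return "inversion_candidate"
    span = pair.right_end - pair.left_start
    if span > mean_span + cutoff_sd * sd_span:
        # Ends map too far apart: the sample likely lacks sequence present in the reference.
        return "deletion_candidate"
    if span < mean_span - cutoff_sd * sd_span:
        # Ends map too close together: the sample likely carries extra sequence.
        return "insertion_candidate"
    return "concordant"


if __name__ == "__main__":
    pair = MappedPair(left_start=10_000, right_end=16_000,
                      left_strand="+", right_strand="-")
    print(classify_pair(pair, mean_span=3_000, sd_span=300))
```

A single discordant pair is weak evidence, so in practice several pairs supporting the same event are clustered before a variant is reported, and confidence values of the kind the abstract describes are attached to each cluster.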
Affiliation(s)
- Jan O Korbel
- Gene Expression Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr, Heidelberg, 69117, Germany.
|
588
|
Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009. [PMID: 19015660 DOI: 10.1038/nrg2484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.
Affiliation(s)
- Zhong Wang
- Department of Molecular, Cellular and Developmental Biology, Yale University, 219 Prospect Street, New Haven, Connecticut 06520, USA
|
590
|
Skotheim RI, Thomassen GOS, Eken M, Lind GE, Micci F, Ribeiro FR, Cerveira N, Teixeira MR, Heim S, Rognes T, Lothe RA. A universal assay for detection of oncogenic fusion transcripts by oligo microarray analysis. Mol Cancer 2009; 8:5. [PMID: 19152679 PMCID: PMC2633275 DOI: 10.1186/1476-4598-8-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 12/20/2008] [Accepted: 01/19/2009] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND: The ability to detect neoplasia-specific fusion genes is important not only in cancer research, but also increasingly in clinical settings to ensure that correct diagnosis is made and the optimal treatment is chosen. However, the available methodologies to detect such fusions all have their distinct short-comings.
RESULTS: We describe a novel oligonucleotide microarray strategy whereby one can screen for all known oncogenic fusion transcripts in a single experiment. To accomplish this, we combine measurements of chimeric transcript junctions with exon-wise measurements of individual fusion partners. To demonstrate the usefulness of the approach, we designed a DNA microarray containing 68,861 oligonucleotide probes that includes oligos covering all combinations of chimeric exon-exon junctions from 275 pairs of fusion genes, as well as sets of oligos internal to all the exons of the fusion partners. Using this array, proof of principle was demonstrated by detection of known fusion genes (such as TCF3:PBX1, ETV6:RUNX1, and TMPRSS2:ERG) from all six positive controls consisting of leukemia cell lines and prostate cancer biopsies.
CONCLUSION: This new method bears promise of an important complement to currently used diagnostic and research tools for the detection of fusion genes in neoplastic diseases.
Affiliation(s)
- Rolf I Skotheim
  - Department of Cancer Prevention, Institute for Cancer Research, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Gard OS Thomassen
  - Department of Cancer Prevention, Institute for Cancer Research, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
  - Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Rikshospitalet University Hospital, Oslo, Norway
- Marthe Eken
  - Department of Cancer Prevention, Institute for Cancer Research, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
  - Department of Molecular Biosciences, University of Oslo, Oslo, Norway
- Guro E Lind
  - Department of Cancer Prevention, Institute for Cancer Research, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Francesca Micci
  - Department of Cancer Genetics, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
- Franclim R Ribeiro
  - Department of Cancer Prevention, Institute for Cancer Research, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
  - Department of Genetics, Portuguese Oncology Institute, Porto, Portugal
- Nuno Cerveira
  - Department of Genetics, Portuguese Oncology Institute, Porto, Portugal
- Manuel R Teixeira
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
  - Department of Genetics, Portuguese Oncology Institute, Porto, Portugal
- Sverre Heim
  - Department of Cancer Genetics, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Medical Faculty, University of Oslo, Oslo, Norway
- Torbjørn Rognes
  - Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Rikshospitalet University Hospital, Oslo, Norway
  - Department of Informatics, University of Oslo, Oslo, Norway
- Ragnhild A Lothe
  - Department of Cancer Prevention, Institute for Cancer Research, Norwegian Radium Hospital, Rikshospitalet University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
  - Department of Molecular Biosciences, University of Oslo, Oslo, Norway
|
592
|
RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009. [DOI: 10.1038/nrg2484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
593
|
Chiang DY, Getz G, Jaffe DB, O’Kelly MJT, Zhao X, Carter SL, Russ C, Nusbaum C, Meyerson M, Lander ES. High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat Methods 2009; 6:99-103. [PMID: 19043412 PMCID: PMC2630795 DOI: 10.1038/nmeth.1276] [Citation(s) in RCA: 382] [Impact Index Per Article: 23.9] [Received: 07/28/2008] [Accepted: 10/28/2008] [Indexed: 12/29/2022]
Abstract
Cancer results from somatic alterations in key genes, including point mutations, copy-number alterations and structural rearrangements. A powerful way to discover cancer-causing genes is to identify genomic regions that show recurrent copy-number alterations (gains and losses) in tumor genomes. Recent advances in sequencing technologies suggest that massively parallel sequencing may provide a feasible alternative to DNA microarrays for detecting copy-number alterations. Here we present: (i) a statistical analysis of the power to detect copy-number alterations of a given size; (ii) SegSeq, an algorithm to segment equal copy numbers from massively parallel sequence data; and (iii) analysis of experimental data from three matched pairs of tumor and normal cell lines. We show that a collection of approximately 14 million aligned sequence reads from human cell lines has comparable power to detect events as the current generation of DNA microarrays and has over twofold better precision for localizing breakpoints (typically, to within approximately 1 kilobase).
Affiliation(s)
- Derek Y. Chiang
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
  - Department of Medical Oncology and Center for Cancer Genome Discovery, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
- Gad Getz
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
- David B. Jaffe
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
- Xiaojun Zhao
  - Novartis Institutes for Biomedical Research, 250 Massachusetts Avenue, Cambridge, MA 02139, USA
- Scott L. Carter
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
  - The Harvard-MIT Division of Health Sciences and Technology, 77 Massachusetts Ave., Cambridge, MA 02139, USA
- Carsten Russ
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
- Chad Nusbaum
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
- Matthew Meyerson
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
  - Department of Medical Oncology and Center for Cancer Genome Discovery, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
- Eric S. Lander
  - Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
|
595
|
Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009. [DOI: 10.1038/nrg2484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
596
|
Abstract
A key goal in cancer research is to identify the total complement of genetic and epigenetic alterations that contribute to tumorigenesis. We are currently witnessing the rapid evolution and convergence of multiple genome-wide platforms that are making this goal a reality. Leading this effort are studies of the molecular lesions that underlie pediatric acute lymphoblastic leukemia (ALL). The recent application of microarray-based analyses of DNA copy number abnormalities (CNAs) in pediatric ALL, complemented by transcriptional profiling, resequencing and epigenetic approaches, has identified a high frequency of common genetic alterations in both B-progenitor and T-lineage ALL. These approaches have identified abnormalities in key pathways, including lymphoid differentiation, cell cycle regulation, tumor suppression, and drug responsiveness. Moreover, the nature and frequency of CNAs differ markedly among ALL genetic subtypes. In this article, we review the key findings from the published data on genome-wide analyses of ALL and highlight some of the technical aspects of data generation and analysis that must be carefully controlled to obtain optimal results.
|
597
|
Abstract
The 454 Sequencer has dramatically increased the volume of sequencing conducted by the scientific community and expanded the range of problems that can be addressed by the direct readouts of DNA sequence. Key breakthroughs in the development of the 454 sequencing platform included higher throughput, simplified all in vitro sample preparation and the miniaturization of sequencing chemistries, enabling massively parallel sequencing reactions to be carried out at a scale and cost not previously possible. Together with other recently released next-generation technologies, the 454 platform has started to democratize sequencing, providing individual laboratories with access to capacities that rival those previously found only at a handful of large sequencing centers. Over the past 18 months, 454 sequencing has led to a better understanding of the structure of the human genome, allowed the first non-Sanger sequence of an individual human and opened up new approaches to identify small RNAs. To make next-generation technologies more widely accessible, they must become easier to use and less costly. In the longer term, the principles established by 454 sequencing might reduce cost further, potentially enabling personalized genomics.
|
598
|
Abstract
DNA sequence represents a single format onto which a broad range of biological phenomena can be projected for high-throughput data collection. Over the past three years, massively parallel DNA sequencing platforms have become widely available, reducing the cost of DNA sequencing by over two orders of magnitude, and democratizing the field by putting the sequencing capacity of a major genome center in the hands of individual investigators. These new technologies are rapidly evolving, and near-term challenges include the development of robust protocols for generating sequencing libraries, building effective new approaches to data-analysis, and often a rethinking of experimental design. Next-generation DNA sequencing has the potential to dramatically accelerate biological and biomedical research, by enabling the comprehensive analysis of genomes, transcriptomes and interactomes to become inexpensive, routine and widespread, rather than requiring significant production-scale efforts.
Affiliation(s)
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195-5065, USA.
|
599
|
Large-scale genomic analysis of ovarian carcinomas. Mol Oncol 2008; 3:157-64. [PMID: 19383377 DOI: 10.1016/j.molonc.2008.12.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Received: 10/21/2008] [Revised: 12/08/2008] [Accepted: 12/11/2008] [Indexed: 01/31/2023] Open
Abstract
Epithelial ovarian cancers are typified by frequent genomic aberrations that have been difficult to unravel. Recently, high-resolution array technologies have provided the first glimpse of the remarkable complexity of these aberrations with some ovarian cancers containing hundreds of copy number breakpoints, micro-deletions and amplifications. Many of these alterations contain cancer-related genes suggesting that the majority is disease-associated and not just the product of random genomic instability. Future developments such as next-generation sequencing and integrated analysis of data from multiple array platforms on large numbers of samples are poised to revolutionize our understanding of this complex disease.
|
600
|
Hampton OA, Den Hollander P, Miller CA, Delgado DA, Li J, Coarfa C, Harris RA, Richards S, Scherer SE, Muzny DM, Gibbs RA, Lee AV, Milosavljevic A. A sequence-level map of chromosomal breakpoints in the MCF-7 breast cancer cell line yields insights into the evolution of a cancer genome. Genome Res 2008; 19:167-77. [PMID: 19056696 DOI: 10.1101/gr.080259.108] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Indexed: 11/24/2022]
Abstract
By applying a method that combines end-sequence profiling and massively parallel sequencing, we obtained a sequence-level map of chromosomal aberrations in the genome of the MCF-7 breast cancer cell line. A total of 157 distinct somatic breakpoints of two distinct types, dispersed and clustered, were identified. A total of 89 breakpoints are evenly dispersed across the genome. A majority of dispersed breakpoints are in regions of low copy repeats (LCRs), indicating a possible role for LCRs in chromosome breakage. The remaining 68 breakpoints form four distinct clusters of closely spaced breakpoints that coincide with the four highly amplified regions in MCF-7 detected by array CGH located in the 1p13.1-p21.1, 3p14.1-p14.2, 17q22-q24.3, and 20q12-q13.33 chromosomal cytobands. The clustered breakpoints are not significantly associated with LCRs. Sequences flanking most (95%) breakpoint junctions are consistent with double-stranded DNA break repair by nonhomologous end-joining or template switching. A total of 79 known or predicted genes are involved in rearrangement events, including 10 fusions of coding exons from different genes and 77 other rearrangements. Four fusions result in novel expressed chimeric mRNA transcripts. One of the four expressed fusion products (RAD51C-ATXN7) and one gene truncation (BRIP1 or BACH1) involve genes coding for members of protein complexes responsible for homology-driven repair of double-stranded DNA breaks. Another one of the four expressed fusion products (ARFGEF2-SULF2) involves SULF2, a regulator of cell growth and angiogenesis. We show that knock-down of SULF2 in cell lines causes tumorigenic phenotypes, including increased proliferation, enhanced survival, and increased anchorage-independent growth.
Affiliation(s)
- Oliver A Hampton
- Bioinformatics Research Laboratory, Baylor College of Medicine, Houston, Texas 77030, USA
|