1
Temiz NA, Donohue DE, Bacolla A, Vasquez KM, Cooper DN, Mudunuri U, Ivanic J, Cer RZ, Yi M, Stephens RM, Collins JR, Luke BT. The somatic autosomal mutation matrix in cancer genomes. Hum Genet 2015; 134:851-64. [PMID: 26001532 PMCID: PMC4495249 DOI: 10.1007/s00439-015-1566-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/17/2015] [Accepted: 05/12/2015] [Indexed: 01/26/2023]
Abstract
DNA damage in somatic cells originates from both environmental and endogenous sources, giving rise to mutations through multiple mechanisms. When these mutations affect the function of critical genes, cancer may ensue. Although identifying genomic subsets of mutated genes may inform therapeutic options, a systematic survey of tumor mutational spectra is required to improve our understanding of the underlying mechanisms of mutagenesis involved in cancer etiology. Recent studies have presented genome-wide sets of somatic mutations as a 96-element vector, a procedure that only captures the immediate neighbors of the mutated nucleotide. Herein, we present a 32 × 12 mutation matrix that captures the nucleotide pattern two nucleotides upstream and downstream of the mutation. A somatic autosomal mutation matrix (SAMM) was constructed from tumor-specific mutations derived from each of 909 individual cancer genomes harboring a total of 10,681,843 single-base substitutions. In addition, mechanistic template mutation matrices (MTMMs) representing oxidative DNA damage, ultraviolet-induced DNA damage, 5mCpG deamination, and APOBEC-mediated cytosine mutation are presented. MTMMs were mapped to the individual tumor SAMMs to determine the maximum contribution of each mutational mechanism to the overall mutation pattern. A Manhattan distance across all SAMM elements between any two tumor genomes was used to determine their relative distance. Employing this metric, 89.5% of all tumor genomes were found to have a nearest neighbor from the same tissue of origin. When a distance-dependent 6-nearest neighbor classifier was used, 10.4% of the SAMMs had an Undetermined tissue of origin, and 92.2% of the remaining SAMMs were assigned to the correct tissue of origin. Thus, although tumors from different tissues may have similar mutation patterns, their SAMMs often display signatures that are characteristic of specific tissues.
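The Manhattan-distance comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the genome labels, and the tiny 2 × 2 toy matrices are hypothetical (the paper's matrices are 32 × 12), and the abstract does not state how matrix elements are normalized, so the values below are arbitrary fractions.

```python
from typing import Dict, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def manhattan_distance(a: Matrix, b: Matrix) -> float:
    """Sum of absolute element-wise differences between two same-shaped matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def nearest_neighbor(query_id: str, matrices: Dict[str, Matrix]) -> Tuple[str, float]:
    """Return (id, distance) of the matrix closest to query_id, excluding itself."""
    query = matrices[query_id]
    candidates = (
        (other_id, manhattan_distance(query, other))
        for other_id, other in matrices.items()
        if other_id != query_id
    )
    return min(candidates, key=lambda pair: pair[1])


# Hypothetical toy data: three "genomes", each reduced to a 2 x 2 matrix.
toy = {
    "colon_1": [[0.5, 0.5], [0.0, 0.0]],
    "colon_2": [[0.25, 0.75], [0.0, 0.0]],
    "skin_1": [[0.0, 0.0], [0.75, 0.25]],
}

if __name__ == "__main__":
    print(nearest_neighbor("colon_1", toy))
```

In this toy set the nearest neighbor of `colon_1` is `colon_2`, mirroring the abstract's observation that most tumor genomes have a nearest neighbor from the same tissue of origin. The distance-dependent 6-nearest neighbor classifier is not sketched here because the abstract does not describe its weighting.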
Affiliation(s)
- Nuri A. Temiz
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
  - Masonic Cancer Center, University of Minnesota, 2-120 CCRB, 2231 6th St SE, Minneapolis, MN 55455 USA
- Duncan E. Donohue
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
  - US Army Medical Research and Material Command, 568 Doughten Dr., Fort Detrick, Frederick, MD 21702 USA
- Albino Bacolla
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Austin, TX 78723 USA
- Karen M. Vasquez
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Austin, TX 78723 USA
- David N. Cooper
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN UK
- Uma Mudunuri
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Joseph Ivanic
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Regina Z. Cer
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
  - Naval Medical Research Center-Frederick, 8400 Research Plaza, Fort Detrick, Frederick, MD 21702 USA
- Ming Yi
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Robert M. Stephens
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Jack R. Collins
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Brian T. Luke
  - In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
2
Kovac MB, Kovacova M, Bachraty H, Bachrata K, Piscuoglio S, Hutter P, Ilencikova D, Bartosova Z, Tomlinson I, Roethlisberger B, Heinimann K. High-resolution breakpoint analysis provides evidence for the sequence-directed nature of genome rearrangements in hereditary disorders. Hum Mutat 2014; 36:250-9. [PMID: 25418510 DOI: 10.1002/humu.22734] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/04/2014] [Accepted: 11/10/2014] [Indexed: 01/09/2023]
Abstract
Although most of the pertinent data on the sequence-directed processes leading to genome rearrangements (GRs) have come from studies on somatic tissues, little is known about GRs in the germ line of patients with hereditary disorders. This study aims at identifying DNA motifs and higher order structures of genome architecture, which can result in losses and gains of genetic material in the germ line. We first identified candidate motifs by studying 112 pathogenic germ-line GRs in hereditary colorectal cancer patients, and subsequently created an algorithm, termed recombination type ratio, which correctly predicts the propensity of rearrangements with respect to homologous versus nonhomologous recombination events.
Affiliation(s)
- Michal B Kovac
- Research Group Human Genomics, Department of Biomedicine, University of Basel, Basel, Switzerland; Medical Genetics, University Hospital Basel, Basel, Switzerland; The Wellcome Trust Centre for Human Genetics, University of Oxford, Old Road Campus, Oxford, UK
3
Chen C, Liu J, Zhou F, Sun J, Li L, Jin C, Shao J, Jiang H, Zhao N, Zheng S, Lin B. Next-generation sequencing of colorectal cancers in Chinese: identification of a recurrent frame-shift and gain-of-function indel mutation in the TFDP1 gene. OMICS: A Journal of Integrative Biology 2014; 18:625-35. [PMID: 25133581 DOI: 10.1089/omi.2014.0058] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
Re-sequencing of target genes is a highly effective approach for identifying mutations in cancers. Mutations, including indels (insertions, deletions, and the combination of the two), play important roles in carcinogenesis. Combining genomic DNA capture using high-density oligonucleotide microarrays (NimbleGen, Inc.) with next-generation high-throughput sequencing, we identified approximately 1600 indels for colorectal cancers in the Chinese population. Among them, 5 indels were localized to exonic regions of genes, including the TFDP1 (transcription factor Dp-1) gene. TFDP1 is an important transcription factor that coordinates with E2F proteins, thereby promoting transcription of E2F target genes and regulating the cell cycle and differentiation. We report here the identification of a recurrent frame-shift indel mutation (named indel84) in the TFDP1 gene in colorectal cancers by next-generation sequencing. We found in a validation set that TFDP1 indel84 is present in 70% of colorectal cancer (CRC) tissues. Wild-type TFDP1 encodes a protein of 410 amino acids with a potential DNA binding site at its N-terminus followed by several functional protein domains. The TFDP1 indel cDNA would generate an alternative TFDP1 protein missing the first 120 amino acids and potentially affecting the DNA binding domain. We further demonstrated that the TFDP1 indel84 mutation generated a gain-of-function phenotype by increasing cell proliferation, migration, and invasion of CRC cells. Our study identified a key molecular event for CRC that might have great diagnostic and therapeutic potential.
Affiliation(s)
- Chen Chen
- Cancer Institute (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), Second Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang Province, China
4
Bacolla A, Cooper DN, Vasquez KM. Mechanisms of base substitution mutagenesis in cancer genomes. Genes (Basel) 2014; 5:108-46. [PMID: 24705290 PMCID: PMC3978516 DOI: 10.3390/genes5010108] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 12/09/2013] [Revised: 02/07/2014] [Accepted: 02/11/2014] [Indexed: 01/24/2023]
Abstract
Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules.
Affiliation(s)
- Albino Bacolla
  - Dell Pediatric Research Institute, Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, 1400 Barbara Jordan Blvd., Austin, TX 78723, USA.
- David N Cooper
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff CF14 4XN, UK.
- Karen M Vasquez
  - Dell Pediatric Research Institute, Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, 1400 Barbara Jordan Blvd., Austin, TX 78723, USA.
5
Bacolla A, Temiz NA, Yi M, Ivanic J, Cer RZ, Donohue DE, Ball EV, Mudunuri US, Wang G, Jain A, Volfovsky N, Luke BT, Stephens RM, Cooper DN, Collins JR, Vasquez KM. Guanine holes are prominent targets for mutation in cancer and inherited disease. PLoS Genet 2013; 9:e1003816. [PMID: 24086153 PMCID: PMC3784513 DOI: 10.1371/journal.pgen.1003816] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 03/04/2013] [Accepted: 08/07/2013] [Indexed: 12/27/2022]
Abstract
Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G • C bp in the context of all 64 5'-NGNN-3' motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.
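The motif bookkeeping mentioned in this abstract (64 5'-NGNN-3' contexts with the mutated guanine at the second position) can be sketched as follows. This is an illustrative assumption rather than the authors' pipeline: the function name `count_ngnn`, the 0-based position convention, and the example sequence are hypothetical, and the sketch only handles guanines on the given strand (a mutated C would need the reverse-complement strand, which is omitted here).

```python
from collections import Counter
from itertools import product
from typing import Iterable

BASES = "ACGT"

# All 4 * 4 * 4 = 64 motifs of the form N-G-N-N, with G fixed at the second position.
NGNN_MOTIFS = ["".join((n1, "G", n3, n4)) for n1, n3, n4 in product(BASES, repeat=3)]


def count_ngnn(sequence: str, mutated_positions: Iterable[int]) -> Counter:
    """Tally mutated guanines by their 5'-NGNN-3' context.

    mutated_positions are 0-based indices of the mutated base in `sequence`
    (a hypothetical convention chosen for this sketch).
    """
    seq = sequence.upper()
    counts = Counter({motif: 0 for motif in NGNN_MOTIFS})
    for pos in mutated_positions:
        # One base upstream and two downstream are needed for the full motif.
        if pos < 1 or pos + 3 > len(seq):
            continue
        motif = seq[pos - 1 : pos + 3]
        # Skips targets that are not G and contexts with ambiguous bases such as N.
        if motif in counts:
            counts[motif] += 1
    return counts


if __name__ == "__main__":
    tallies = count_ngnn("AAGGTCC", [2, 3])
    print({motif: n for motif, n in tallies.items() if n})
```

Per-motif tallies of this kind are the quantity the abstract correlates with base-stacking free energies and guanine ionization potentials; those energy values are not reproduced here.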
Affiliation(s)
- Albino Bacolla
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Nuri A. Temiz
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Ming Yi
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Joseph Ivanic
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Regina Z. Cer
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Duncan E. Donohue
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Edward V. Ball
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Uma S. Mudunuri
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Guliang Wang
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
- Aklank Jain
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
- Natalia Volfovsky
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Brian T. Luke
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Robert M. Stephens
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- David N. Cooper
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Jack R. Collins
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Karen M. Vasquez
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
6
Laycock-van Spyk S, Thomas N, Cooper DN, Upadhyaya M. Neurofibromatosis type 1-associated tumours: their somatic mutational spectrum and pathogenesis. Hum Genomics 2012; 5:623-90. [PMID: 22155606 PMCID: PMC3525246 DOI: 10.1186/1479-7364-5-6-623] [Citation(s) in RCA: 101] [Impact Index Per Article: 8.4] [Indexed: 01/31/2023]
Abstract
Somatic gene mutations constitute key events in the malignant transformation of human cells. Somatic mutation can either actively speed up the growth of tumour cells or relax the growth constraints normally imposed upon them, thereby conferring a selective (proliferative) advantage at the cellular level. Neurofibromatosis type-1 (NF1) affects 1/3,000-4,000 individuals worldwide and is caused by the inactivation of the NF1 tumour suppressor gene, which encodes the protein neurofibromin. Consistent with Knudson's two-hit hypothesis, NF1 patients harbouring a heterozygous germline NF1 mutation develop neurofibromas upon somatic mutation of the second, wild-type, NF1 allele. While the identification of somatic mutations in NF1 patients has always been problematic on account of the extensive cellular heterogeneity manifested by neurofibromas, the classification of NF1 somatic mutations is a prerequisite for understanding the complex molecular mechanisms underlying NF1 tumorigenesis. Here, the known somatic mutational spectrum for the NF1 gene in a range of NF1-associated neoplasms, including peripheral nerve sheath tumours (neurofibromas), malignant peripheral nerve sheath tumours, gastrointestinal stromal tumours, gastric carcinoid, juvenile myelomonocytic leukaemia, glomus tumours, astrocytomas and phaeochromocytomas, has been collated and analysed.
7
Naidoo N, Pawitan Y, Soong R, Cooper DN, Ku CS. Human genetics and genomics a decade after the release of the draft sequence of the human genome. Hum Genomics 2012; 5:577-622. [PMID: 22155605 PMCID: PMC3525251 DOI: 10.1186/1479-7364-5-6-577] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Indexed: 02/06/2023]
Abstract
Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.
Affiliation(s)
- Nasheen Naidoo
- Centre for Molecular Epidemiology, Department of Epidemiology and Public Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
8
Cooper DN, Bacolla A, Férec C, Vasquez KM, Kehrer-Sawatzki H, Chen JM. On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease. Hum Mutat 2011; 32:1075-99. [PMID: 21853507 PMCID: PMC3177966 DOI: 10.1002/humu.21557] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Received: 03/27/2011] [Accepted: 06/17/2011] [Indexed: 12/21/2022]
Abstract
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom.