Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Fu Y, Liu Z, Lou S, Bedford J, Mu XJ, Yip KY, Khurana E, Gerstein M. FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer. Genome Biol 2015;15:480. [PMID: 25273974 PMCID: PMC4203974 DOI: 10.1186/s13059-014-0480-5] [Citation(s) in RCA: 226] [Impact Index Per Article: 25.1] [Received: 04/17/2014] [Indexed: 12/15/2022]

Number

Cited by Other Article(s)

201

van der Velde KJ, de Boer EN, van Diemen CC, Sikkema-Raddatz B, Abbott KM, Knopperts A, Franke L, Sijmons RH, de Koning TJ, Wijmenga C, Sinke RJ, Swertz MA. GAVIN: Gene-Aware Variant INterpretation for medical sequencing. Genome Biol 2017;18:6. [PMID: 28093075 PMCID: PMC5240400 DOI: 10.1186/s13059-016-1141-7] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 07/13/2016] [Accepted: 12/19/2016] [Indexed: 01/08/2023]

202

Piraino SW, Furney SJ. Identification of coding and non-coding mutational hotspots in cancer genomes. BMC Genomics 2017;18:17. [PMID: 28056774 PMCID: PMC5217664 DOI: 10.1186/s12864-016-3420-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/05/2016] [Accepted: 12/14/2016] [Indexed: 12/21/2022]

Abstract

Background

The identification of mutations that play a causal role in tumour development, so-called “driver” mutations, is of critical importance for understanding how cancers form and how they might be treated. Several large cancer sequencing projects have identified genes that are recurrently mutated in cancer patients, suggesting a role in tumourigenesis. While the landscape of coding drivers has been extensively studied and many of the most prominent driver genes are well characterised, comparatively less is known about the role of mutations in the non-coding regions of the genome in cancer development. The continuing fall in genome sequencing costs has resulted in a concomitant increase in the number of cancer whole genome sequences being produced, facilitating systematic interrogation of both the coding and non-coding regions of cancer genomes.

Results

To examine the mutational landscapes of tumour genomes we have developed a novel method to identify mutational hotspots in tumour genomes using both mutational data and information on evolutionary conservation. We have applied our methodology to over 1300 whole cancer genomes and show that it identifies prominent coding and non-coding regions that are known or highly suspected to play a role in cancer. Importantly, we applied our method to the entire genome, rather than relying on predefined annotations (e.g. promoter regions) and we highlight recurrently mutated regions that may have resulted from increased exposure to mutational processes rather than selection, some of which have been identified previously as targets of selection. Finally, we implicate several pan-cancer and cancer-specific candidate non-coding regions, which could be involved in tumourigenesis.

Conclusions

We have developed a framework to identify mutational hotspots in cancer genomes, which is applicable to the entire genome. This framework identifies known and novel coding and non-coding mutational hotspots and can be used to differentiate candidate driver regions from likely passenger regions susceptible to somatic mutation.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-3420-9) contains supplementary material, which is available to authorized users.

203

Handel AE, Gallone G, Zameel Cader M, Ponting CP. Most brain disease-associated and eQTL haplotypes are not located within transcription factor DNase-seq footprints in brain. Hum Mol Genet 2017;26:79-89. [PMID: 27798116 PMCID: PMC5351933 DOI: 10.1093/hmg/ddw369] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/19/2016] [Revised: 09/19/2016] [Accepted: 10/24/2016] [Indexed: 11/20/2022]

204

Chadaeva IV, Ponomarenko MP, Rasskazov DA, Sharypova EB, Kashina EV, Matveeva MY, Arshinova TV, Ponomarenko PM, Arkova OV, Bondar NP, Savinkova LK, Kolchanov NA. Candidate SNP markers of aggressiveness-related complications and comorbidities of genetic diseases are predicted by a significant change in the affinity of TATA-binding protein for human gene promoters. BMC Genomics 2016;17:995. [PMID: 28105927 PMCID: PMC5249025 DOI: 10.1186/s12864-016-3353-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]

Abstract

BACKGROUND

Aggressiveness in humans is a hereditary behavioral trait that mobilizes all systems of the body (first of all, the nervous and endocrine systems, and then the respiratory, vascular, muscular, and others), e.g., for the defense of oneself, children, family, shelter, territory, and other possessions as well as personal interests. The level of aggressiveness of a person determines many other characteristics of quality of life and lifespan, acting as a stress factor. Aggressive behavior depends on many parameters such as age, gender, diseases and treatment, diet, and environmental conditions. Among them, genetic factors are believed to be the main parameters that are well-studied at the factual level, but in actuality, genome-wide studies of aggressive behavior appeared relatively recently. One of the biggest projects of modern science, 1000 Genomes, involves identification of single nucleotide polymorphisms (SNPs), i.e., differences of individual genomes from the reference genome. SNPs can be associated with hereditary diseases, their complications, comorbidities, and responses to stress or a drug. Clinical comparisons between cohorts of patients and healthy volunteers (as a control) allow for identifying SNPs whose allele frequencies significantly separate them from one another as markers of the above conditions. Computer-based preliminary analysis of millions of SNPs detected by the 1000 Genomes project can accelerate clinical search for SNP markers due to preliminary whole-genome search for the most meaningful candidate SNP markers and discarding of neutral and poorly substantiated SNPs.

RESULTS

Here, we combine two computer-based search methods for SNPs that alter gene expression: (i) the Web service SNP_TATA_Comparator (DNA sequence analysis) and (ii) a PubMed-based manual search for articles on aggressiveness using heuristic keywords. Near the known binding sites for TATA-binding protein (TBP) in human gene promoters, we found aggressiveness-related candidate SNP markers, including rs1143627 (associated with higher aggressiveness in patients undergoing cytokine immunotherapy), rs544850971 (higher aggressiveness in old women taking lipid-lowering medication), and rs10895068 (childhood aggressiveness-related obesity in adolescence with cardiovascular complications in adulthood).

CONCLUSIONS

After validation of these candidate markers by clinical protocols, these SNPs may become useful for physicians (may help to improve treatment of patients) and for the general population (a lifestyle choice preventing aggressiveness-related complications).

Affiliation(s)

Irina V. Chadaeva: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia; Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090 Russia
Mikhail P. Ponomarenko: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia; Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090 Russia
Dmitry A. Rasskazov: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Ekaterina B. Sharypova: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Elena V. Kashina: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Marina Yu Matveeva: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Tatjana V. Arshinova: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Petr M. Ponomarenko: Children’s Hospital Los Angeles, 4640 Hollywood Boulevard, University of Southern California, Los Angeles, CA 90027 USA
Olga V. Arkova: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia; Vector-Best Inc, Koltsovo, Novosibirsk Region 630559 Russia
Natalia P. Bondar: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Ludmila K. Savinkova: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia
Nikolay A. Kolchanov: Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, 10 Lavrentyev Avenue, Novosibirsk, 630090 Russia; Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090 Russia

205

Dong C, Guo Y, Yang H, He Z, Liu X, Wang K. iCAGES: integrated CAncer GEnome Score for comprehensively prioritizing driver genes in personal cancer genomes. Genome Med 2016;8:135. [PMID: 28007024 PMCID: PMC5180414 DOI: 10.1186/s13073-016-0390-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 06/07/2016] [Accepted: 12/05/2016] [Indexed: 12/31/2022]

206

Lu Y, Quan C, Chen H, Bo X, Zhang C. 3DSNP: a database for linking human noncoding SNPs to their three-dimensional interacting genes. Nucleic Acids Res 2016;45:D643-D649. [PMID: 27789693 PMCID: PMC5210526 DOI: 10.1093/nar/gkw1022] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 08/11/2016] [Revised: 10/16/2016] [Accepted: 10/18/2016] [Indexed: 12/02/2022]

207

Li H, He Z, Gu Y, Fang L, Lv X. Prioritization of non-coding disease-causing variants and long non-coding RNAs in liver cancer. Oncol Lett 2016;12:3987-3994. [PMID: 27895760 DOI: 10.3892/ol.2016.5135] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/20/2015] [Accepted: 06/16/2016] [Indexed: 01/10/2023]

Abstract

There are multiple bioinformatics tools available for the detection of coding driver mutations in cancers. However, the prioritization of pathogenic non-coding variants remains a challenging and demanding task. The present study was performed to discriminate non-coding disease-causing mutations and prioritize potential cancer-implicated long non-coding RNAs (lncRNAs) in liver cancer using a logistic regression model. A logistic regression model was constructed by combining 19,153 disease-associated ClinVar and Human Gene Mutation Database pathogenic variants as the response variable and non-coding features as the predictor variable. Genome-wide association study (GWAS) disease or trait-associated variants and recurrent somatic mutations were used to validate the model. Non-coding gene features with the highest fractions of load were characterized and potential cancer-associated lncRNA candidates were prioritized by combining the fraction of high-scoring regions and average score predicted by the logistic regression model. H3K9me3 and conserved regions were the most negatively and positively informative for the model, respectively. The area under the receiver operating characteristic curve of the model was 0.92. The average score of GWAS disease-associated variants was significantly increased compared with neutral single nucleotide polymorphisms (5.8642 vs. 5.4707; P<0.001); the average score of recurrent somatic mutations of liver cancer was significantly increased compared with non-recurrent somatic mutations (5.4101 vs. 5.2768; P=0.0125). The present study found regions in lncRNAs and introns/untranslated regions of protein coding genes where mutations are most likely to be damaging. In total, 847 lncRNAs were filtered out from the background. Characterization of this subset of lncRNAs showed that these lncRNAs are more conserved, less mutated and more highly expressed compared with other control lncRNAs. In addition, 23 of these lncRNAs were differentially expressed between 12 pairs of liver cancer and adjacent normal specimens. The logistic regression model is a useful tool to prioritize non-coding pathogenic variants and lncRNAs, and paves the way for the detection of non-coding driver lncRNAs in liver cancer.
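
As a rough illustration of the kind of scoring the abstract above describes, the following sketch scores variants with a logistic function over binary non-coding features and evaluates the scores by the area under the ROC curve. It is not the authors' model: the two features, the weights and the bias are hypothetical values chosen only so that conservation contributes positively and H3K9me3 negatively, as the abstract reports.

```python
import math

def logistic_score(features, weights, bias):
    """Map a linear combination of feature values to a probability in (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a positive
    example outranks a negative one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical features per variant: [in_conserved_region, overlaps_H3K9me3]
WEIGHTS = [2.0, -1.5]  # illustrative values only, not fitted coefficients
BIAS = -0.5

# Toy labelled variants: (features, 1 = pathogenic, 0 = neutral)
variants = [([1, 0], 1), ([1, 1], 1), ([0, 0], 0), ([0, 1], 0)]
scores = [logistic_score(f, WEIGHTS, BIAS) for f, _ in variants]
labels = [y for _, y in variants]
toy_auc = auc(scores, labels)
```

In practice the coefficients would be fitted to the labelled variants rather than set by hand; the pairwise AUC shown here is quadratic in the number of examples and is meant for clarity, not for large datasets.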

208

Candidate SNP Markers of Chronopathologies Are Predicted by a Significant Change in the Affinity of TATA-Binding Protein for Human Gene Promoters. BioMed Research International 2016;2016:8642703. [PMID: 27635400 PMCID: PMC5011241 DOI: 10.1155/2016/8642703] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 03/04/2016] [Revised: 06/25/2016] [Accepted: 06/28/2016] [Indexed: 01/14/2023]

209

Poulos RC, Sloane MA, Hesson LB, Wong JWH. The search for cis-regulatory driver mutations in cancer genomes. Oncotarget 2016;6:32509-25. [PMID: 26356674 PMCID: PMC4741709 DOI: 10.18632/oncotarget.5085] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/11/2015] [Accepted: 08/06/2015] [Indexed: 12/16/2022]

210

Li MJ, Pan Z, Liu Z, Wu J, Wang P, Zhu Y, Xu F, Xia Z, Sham PC, Kocher JPA, Li M, Liu JS, Wang J. Predicting regulatory variants with composite statistic. Bioinformatics 2016;32:2729-36. [PMID: 27273672 DOI: 10.1093/bioinformatics/btw288] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 12/28/2015] [Accepted: 04/29/2016] [Indexed: 11/14/2022]

211

Umer HM, Cavalli M, Dabrowski MJ, Diamanti K, Kruczyk M, Pan G, Komorowski J, Wadelius C. A Significant Regulatory Mutation Burden at a High-Affinity Position of the CTCF Motif in Gastrointestinal Cancers. Hum Mutat 2016;37:904-13. [PMID: 27174533 DOI: 10.1002/humu.23014] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 03/04/2016] [Accepted: 05/03/2016] [Indexed: 12/22/2022]

212

Bendl J, Musil M, Štourač J, Zendulka J, Damborský J, Brezovský J. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions. PLoS Comput Biol 2016;12:e1004962. [PMID: 27224906 PMCID: PMC4880439 DOI: 10.1371/journal.pcbi.1004962] [Citation(s) in RCA: 133] [Impact Index Per Article: 16.6] [Received: 12/16/2015] [Accepted: 05/05/2016] [Indexed: 12/20/2022]

Abstract

An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools’ predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.

213

Li H, Lv X. Functional annotation of noncoding variants and prioritization of cancer-associated lncRNAs in lung cancer. Oncol Lett 2016;12:222-230. [PMID: 27347129 DOI: 10.3892/ol.2016.4604] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/12/2015] [Accepted: 04/01/2016] [Indexed: 11/05/2022]

Abstract

Multiple computational tools have been widely applied to the detection of coding driver mutations in cancer; however, the prioritization of pathogenic non-coding variants remains a difficult and demanding task. The present study was performed to distinguish non-coding disease-causing mutations from neutral ones, and to prioritize potential cancer-associated long non-coding RNAs (lncRNAs) with a logistic regression model in lung cancer. A logistic regression model was constructed, combining 19,153 disease-associated ClinVar and Human Gene Mutation Database pathogenic variants as the response variable and non-coding features as the predictor variable. Validation of the model was conducted with genome-wide association study (GWAS) disease- or trait-associated single nucleotide polymorphisms (SNPs) and recurrent somatic mutations. High scoring regions were characterized with respect to their distribution in various features and gene classes; potential cancer-associated lncRNA candidates were prioritized, combining the fraction of high-scoring regions and average score predicted by the logistic regression model. H3K79me2 was the most negative factor that contributed to the model, while conserved regions were most positively informative to the model. The area under the receiver operating characteristic curve of the model was 0.89. The model assigned a significantly higher score to GWAS SNPs and recurrent somatic mutations compared with neutral SNPs (mean, 5.9012 vs. 5.5238; P<0.001, Mann-Whitney U test) and non-recurrent mutations (mean, 5.4677 vs. 5.2277; P<0.001, Mann-Whitney U test), respectively. It was observed that regions, including splicing sites and untranslated regions, and gene classes, including cancer genes and cancer-associated lncRNAs, had an increased enrichment of high-scoring regions. In total, 2,679 cancer-associated lncRNAs were determined and characterized. A total of 104 of these lncRNAs were differentially expressed between lung cancer and normal specimens. The logistic regression model is a useful and efficient scoring system to prioritize non-coding pathogenic variants and lncRNAs, and may provide the basis for detecting non-coding driver lncRNAs in lung cancer.
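
The abstract above compares score distributions with the Mann-Whitney U test. A minimal sketch of how the U statistic itself is obtained from pooled ranks is given below; the score values in the example are made up for illustration, and the sketch stops at the statistic rather than deriving a P value. Where SciPy is available, `scipy.stats.mannwhitneyu` would normally be used instead.

```python
def rank_with_ties(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples x and y."""
    ranks = rank_with_ties(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y

# Hypothetical model scores for two groups of variants.
gwas_scores = [5.9, 6.1, 5.8]
neutral_scores = [5.5, 5.4, 5.6]
u_gwas, u_neutral = mann_whitney_u(gwas_scores, neutral_scores)
```

U_x counts the pairs in which a value from x exceeds a value from y, with ties counted as one half, so a U of zero for one group means complete separation of the two samples.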

214

Whole genome sequencing and its applications in medical genetics. Quantitative Biology 2016. [DOI: 10.1007/s40484-016-0067-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]

215

Kelley DR, Snoek J, Rinn JL. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res 2016;26:990-9. [PMID: 27197224 PMCID: PMC4937568 DOI: 10.1101/gr.200535.115] [Citation(s) in RCA: 519] [Impact Index Per Article: 64.9] [Received: 10/04/2015] [Accepted: 04/26/2016] [Indexed: 12/22/2022]

216

A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals. Nat Commun 2016;7:11101. [PMID: 27089393 PMCID: PMC4837449 DOI: 10.1038/ncomms11101] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 06/21/2015] [Accepted: 02/19/2016] [Indexed: 02/07/2023]

217

Ponomarenko MP, Arkova O, Rasskazov D, Ponomarenko P, Savinkova L, Kolchanov N. Candidate SNP Markers of Gender-Biased Autoimmune Complications of Monogenic Diseases Are Predicted by a Significant Change in the Affinity of TATA-Binding Protein for Human Gene Promoters. Front Immunol 2016;7:130. [PMID: 27092142 PMCID: PMC4819121 DOI: 10.3389/fimmu.2016.00130] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 09/25/2015] [Accepted: 03/21/2016] [Indexed: 12/17/2022]

218

Turnaev II, Rasskazov DA, Arkova OV, Ponomarenko MP, Ponomarenko PM, Savinkova LK, Kolchanov NA. Hypothetical SNP markers that significantly affect the affinity of the TATA-binding protein to VEGFA, ERBB2, IGF1R, FLT1, KDR, and MET oncogene promoters as chemotherapy targets. Mol Biol 2016. [DOI: 10.1134/s0026893316010209] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]

219

Quantitative Trait Loci Identify Functional Noncoding Variation in Cancer. PLoS Genet 2016;12:e1005826. [PMID: 26938653 PMCID: PMC4777413 DOI: 10.1371/journal.pgen.1005826] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/17/2022]

220

Du C, Wu X, Li J. Mutation pattern is an influential factor on functional mutation rates in cancer. Cancer Cell Int 2016;16:2. [PMID: 26865835 PMCID: PMC4748466 DOI: 10.1186/s12935-016-0278-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/11/2015] [Accepted: 02/03/2016] [Indexed: 11/11/2022]

Abstract

BACKGROUND

Mutation rates vary consistently across cancer genomes and play an important role in tumorigenesis; however, little is known about their functional potential and their impact on the distribution of functional mutations. In this study, we investigated genomic features that affect mutation patterns and the functional importance of mutation patterns in cancer.

METHODS

Somatic mutations of clear-cell renal cell carcinoma, liver cancer, lung cancer and melanoma, together with single nucleotide polymorphisms (SNPs), were intersected with 54 distinct genomic features. Somatic mutation and SNP densities were then computed for each feature type. We constructed 2856 1-Mb windows, each represented as a row containing the somatic mutation density, the SNP density and the 54 feature values. Correlation analyses were conducted between the somatic mutation density, the SNP density and each feature vector. We also built two random forest models, namely a somatic mutation model (CSM) and an SNP model, to predict somatic mutation and SNP densities on a 1-kb scale. The relation between CSM and SNP scores was further analyzed with respect to the distributions of deleterious coding variants predicted by SIFT and Mutation Assessor, non-coding functional variants evaluated with FunSeq2 and GWAVA, and disease-causing variants from the HGMD and ClinVar databases.
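
The windowing and correlation steps described in these methods can be sketched as follows. This is only an illustration under stated assumptions: positions are 0-based coordinates on a single chromosome, the mutation positions and the per-window feature values are invented, and Pearson correlation is used because the abstract does not say which correlation measure the authors applied.

```python
import math
from collections import Counter

WINDOW = 1_000_000  # 1-Mb windows, as in the abstract

def window_counts(positions, window=WINDOW):
    """Count events per window index for one chromosome."""
    return Counter(pos // window for pos in positions)

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical somatic mutation positions on one chromosome.
mutations = [120_000, 850_000, 1_400_000, 2_100_000, 2_300_000, 2_900_000]
counts = window_counts(mutations)

# Mutations per Mb for the first three windows (windows without events get 0).
density = [counts.get(w, 0) for w in range(3)]

# Hypothetical fraction of each window covered by some genomic feature.
feature = [0.4, 0.2, 0.6]
r = pearson(density, feature)
```

With real data, one such density vector per cancer type and one feature vector per annotation would be correlated in turn, giving one coefficient per feature.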

RESULTS

We observed a wide range of genomic features that affect local mutation rates, such as replication time, transcription levels, histone marks and regulatory elements. Repressive histone marks, replication time and promoters contributed most to the CSM model, while recombination rate and chromatin organization were most important for the SNP model. We showed that regions with low mutation rates preferentially have higher densities of deleterious coding mutations, higher average scores of non-coding variants, a higher fraction of functional regions and higher enrichment of disease-causing variants than highly mutated regions.

CONCLUSIONS

Somatic mutation densities vary widely across the cancer genome; mutation frequency is a major indicator of function and influences the distribution of functional mutations in cancer.

221

Khurana E, Fu Y, Chakravarty D, Demichelis F, Rubin MA, Gerstein M. Role of non-coding sequence variants in cancer. Nat Rev Genet 2016;17:93-108. [PMID: 26781813 DOI: 10.1038/nrg.2015.17] [Citation(s) in RCA: 319] [Impact Index Per Article: 39.9] [Indexed: 02/07/2023]

222

Arkova OV, Ponomarenko MP, Rasskazov DA, Drachkova IA, Arshinova TV, Ponomarenko PM, Savinkova LK, Kolchanov NA. Obesity-related known and candidate SNP markers can significantly change affinity of TATA-binding protein for human gene promoters. BMC Genomics 2015;16 Suppl 13:S5. [PMID: 26694100 PMCID: PMC4686794 DOI: 10.1186/1471-2164-16-s13-s5] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 01/11/2023]

Abstract

BACKGROUND

Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs.

RESULTS

We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). Using an electrophoretic mobility shift assay under nonequilibrium conditions, we empirically validated the statistical significance (α < 0.00025) of the differences in TBP affinity values between the minor and ancestral alleles of 4 out of the 22 SNPs: rs200487063, rs201381696, rs34104384, and rs183433761. We also measured half-life (t1/2), Gibbs free energy change (ΔG), and the association and dissociation rate constants, ka and kd, of the TBP-DNA complex for these SNPs.

CONCLUSIONS

Validation of the 22 candidate SNP markers by proper clinical protocols appears to have a strong rationale and may advance postgenomic predictive preventive personalized medicine.

223

Piraino SW, Furney SJ. Beyond the exome: the role of non-coding somatic mutations in cancer. Ann Oncol 2015;27:240-8. [PMID: 26598542 DOI: 10.1093/annonc/mdv561] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 07/27/2015] [Accepted: 11/04/2015] [Indexed: 02/06/2023]

224

A Dual Model for Prioritizing Cancer Mutations in the Non-coding Genome Based on Germline and Somatic Events. PLoS Comput Biol 2015;11:e1004583. [PMID: 26588488 PMCID: PMC4654583 DOI: 10.1371/journal.pcbi.1004583] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 06/11/2015] [Accepted: 10/04/2015] [Indexed: 11/19/2022]

Abstract

We address here the issue of prioritizing non-coding mutations in the tumoral genome. To this aim, we created two independent computational models. The first (germline) model estimates purifying selection based on population SNP data. The second (somatic) model estimates tumor mutation density based on whole genome tumor sequencing. We show that each model reflects a different set of constraints acting either on the normal or tumor genome, and we identify the specific genome features that most contribute to these constraints. Importantly, we show that the somatic mutation model carries independent functional information that can be used to narrow down the non-coding regions that may be relevant to cancer progression. On this basis, we identify positions in non-coding RNAs and the non-coding parts of mRNAs that are both under purifying selection in the germline and protected from mutation in tumors, thus introducing a new strategy for future detection of cancer driver elements in the expressed non-coding genome.

Cancer cells undergo a mutation/selection process that resembles that of any living cell. Most mutations in cancer cell DNA occur in the so-called "non-coding" regions that represent 98.5% of the genome length. Pinning down which of these mutations contribute to the fitness of cancer cells would be important for identifying new "cancer drivers", which may in turn lead to future treatments. Unfortunately, predicting the impact of a non-coding DNA alteration remains extremely difficult. In this study, we analyze millions of non-coding cancer mutations and show cancer-specific mutational patterns can be used to predict non-coding regions that are preserved from mutations and may thus be important for cancer cell survival. Combining this information with population data, we propose a new scoring system that should help prioritize important non-coding mutations in future studies.

225

Svetlichnyy D, Imrichova H, Fiers M, Kalender Atak Z, Aerts S. Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models. PLoS Comput Biol 2015;11:e1004590. [PMID: 26562774 PMCID: PMC4642938 DOI: 10.1371/journal.pcbi.1004590] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2015] [Accepted: 10/10/2015] [Indexed: 02/02/2023] Open

Abstract

Cancer genomes contain vast amounts of somatic mutations, many of which are passenger mutations not involved in oncogenesis. Whereas driver mutations in protein-coding genes can be distinguished from passenger mutations based on their recurrence, non-coding mutations are usually not recurrent at the same position. Therefore, it is still unclear how to identify cis-regulatory driver mutations, particularly when chromatin data from the same patient is not available, thus relying only on sequence and expression information. Here we use machine-learning methods to predict functional regulatory regions using sequence information alone, and compare the predicted activity of the mutated region with the reference sequence. This way we define the Predicted Regulatory Impact of a Mutation in an Enhancer (PRIME). We find that the recently identified driver mutation in the TAL1 enhancer has a high PRIME score, representing a “gain-of-target” for MYB, whereas the highly recurrent TERT promoter mutation has a surprisingly low PRIME score. We trained Random Forest models for 45 cancer-related transcription factors, and used these to score variations in the HeLa genome and somatic mutations across more than five hundred cancer genomes. Each model predicts only a small fraction of non-coding mutations with a potential impact on the function of the encompassing regulatory region. Nevertheless, as these few candidate driver mutations are often linked to gains in chromatin activity and gene expression, they may contribute to the oncogenic program by altering the expression levels of specific oncogenes and tumor suppressor genes.

Precise regulation of gene expression is controlled by cis-regulatory modules (CRM) containing binding sites for transcription factors (TF). The genome-wide location of all TF binding sites can often be obtained by ChIP-seq (chromatin immunoprecipitation followed by deep sequencing), yet in most cases only a minority of the binding peaks actually represent functional CRMs that control the transcription initiation of a bona fide TF target gene. Here, we investigated for 45 cancer-related TFs how machine-learning approaches can be used to predict functional TF target CRMs. After careful evaluation of their performance, we used these TF-target classifiers to predict which cis-regulatory mutations may have a significant impact on gene regulation by evaluating whether the mutation causes a significant gain or loss in the probability that the CRM is a functional TF target. We found that Random Forest classifiers can achieve more than 100-fold higher specificity for mutation prediction compared to the simple approaches based on scanning with position weight matrices. By scanning somatic mutations in breast cancer genomes and in the HeLa genome, we finally show that our TF-target classifiers can identify high impact non-coding mutations that are associated with concordant TF binding, gene expression changes and chromatin activity. In conclusion, TF-specific Random Forest classifiers can be used to prioritize cis-regulatory mutations in cancer genomes with high accuracy.

226

Lu Q, Yao X, Hu Y, Zhao H. GenoWAP: GWAS signal prioritization through integrated analysis of genomic functional annotation. Bioinformatics 2015;32:542-8. [PMID: 26504140 DOI: 10.1093/bioinformatics/btv610] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2015] [Accepted: 10/16/2015] [Indexed: 12/29/2022] Open

227

Ponomarenko M, Rasskazov D, Arkova O, Ponomarenko P, Suslov V, Savinkova L, Kolchanov N. How to Use SNP_TATA_Comparator to Find a Significant Change in Gene Expression Caused by the Regulatory SNP of This Gene's Promoter via a Change in Affinity of the TATA-Binding Protein for This Promoter. Biomed Res Int 2015;2015:359835. [PMID: 26516624 PMCID: PMC4609514 DOI: 10.1155/2015/359835] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/03/2015] [Accepted: 08/24/2015] [Indexed: 01/11/2023]

228

Li J, Drubay D, Michiels S, Gautheret D. Mining the coding and non-coding genome for cancer drivers. Cancer Lett 2015;369:307-15. [PMID: 26433158 DOI: 10.1016/j.canlet.2015.09.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2015] [Revised: 09/24/2015] [Accepted: 09/24/2015] [Indexed: 12/20/2022]

229

Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods 2015;12:931-4. [PMID: 26301843 PMCID: PMC4768299 DOI: 10.1038/nmeth.3547] [Citation(s) in RCA: 1116] [Impact Index Per Article: 124.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2015] [Accepted: 06/11/2015] [Indexed: 12/18/2022]

230

Lochovsky L, Zhang J, Fu Y, Khurana E, Gerstein M. LARVA: an integrative framework for large-scale analysis of recurrent variants in noncoding annotations. Nucleic Acids Res 2015;43:8123-34. [PMID: 26304545 PMCID: PMC4787796 DOI: 10.1093/nar/gkv803] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2015] [Accepted: 07/28/2015] [Indexed: 01/22/2023] Open

231

Poulos RC, Thoms JAI, Shah A, Beck D, Pimanda JE, Wong JWH. Systematic Screening of Promoter Regions Pinpoints Functional Cis-Regulatory Mutations in a Cutaneous Melanoma Genome. Mol Cancer Res 2015;13:1218-26. [PMID: 26082173 DOI: 10.1158/1541-7786.mcr-15-0146] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2015] [Accepted: 06/04/2015] [Indexed: 11/16/2022]

Abstract

With the recent discovery of recurrent mutations in the TERT promoter in melanoma, identification of other somatic causal promoter mutations is of considerable interest. Yet, the impact of sequence variation on the regulatory potential of gene promoters has not been systematically evaluated. This study assesses the impact of promoter mutations on promoter activity in the whole-genome sequenced malignant melanoma cell line COLO-829. Combining somatic mutation calls from COLO-829 with genome-wide chromatin accessibility and histone modification data revealed mutations within promoter elements. Interestingly, a high number of potential promoter mutations (n = 23) were found, a result mirrored in subsequent analysis of TCGA whole-melanoma genomes. The impact of wild-type and mutant promoter sequences was evaluated by subcloning into luciferase reporter vectors and testing their transcriptional activity in COLO-829 cells. Of the 23 promoter regions tested, four mutations significantly altered reporter activity relative to wild-type sequences. These data were then subjected to multiple computational algorithms that score the cis-regulatory altering potential of mutations. These analyses identified one mutation, located within the promoter region of NDUFB9, which encodes the mitochondrial NADH dehydrogenase (ubiquinone) 1 beta subcomplex 9, to be recurrent in 4.4% (19 of 432) of TCGA whole-melanoma exomes. The mutation is predicted to disrupt a highly conserved SP1/KLF transcription factor binding motif, and its frequent co-occurrence with mutations in the coding sequence of NF1 supports a pathologic role for this mutation in melanoma. Taken together, these data show the relatively high prevalence of promoter mutations in the COLO-829 melanoma genome, and indicate that a proportion of these significantly alter the regulatory potential of gene promoters.

IMPLICATIONS

Genomic-based screening within gene promoter regions suggests that functional cis-regulatory mutations may be common in melanoma genomes, highlighting the need to examine their role in tumorigenesis.

232

Kircher M, Shendure J. Running spell-check to identify regulatory variants. Nat Genet 2015. [DOI: 10.1038/ng.3364] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

233

Wang J, Batmanov K. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations. Nucleic Acids Res 2015. [PMID: 26202972 PMCID: PMC4666384 DOI: 10.1093/nar/gkv733] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

234

Wang Q, Lu Q, Zhao H. A review of study designs and statistical methods for genomic epidemiology studies using next generation sequencing. Front Genet 2015;6:149. [PMID: 25941534 PMCID: PMC4403555 DOI: 10.3389/fgene.2015.00149] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2015] [Accepted: 03/30/2015] [Indexed: 12/22/2022] Open

235

Vuong H, Che A, Ravichandran S, Luke BT, Collins JR, Mudunuri US. AVIA v2.0: annotation, visualization and impact analysis of genomic variants and genes. Bioinformatics 2015;31:2748-50. [PMID: 25861966 DOI: 10.1093/bioinformatics/btv200] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2014] [Accepted: 04/05/2015] [Indexed: 12/17/2022] Open

236

Mathelier A, Shi W, Wasserman WW. Identification of altered cis-regulatory elements in human disease. Trends Genet 2015;31:67-76. [DOI: 10.1016/j.tig.2014.12.003] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2014] [Revised: 12/19/2014] [Accepted: 12/19/2014] [Indexed: 02/01/2023]