Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Han A, Kang HJ, Cho Y, Lee S, Kim YJ, Gong S. SNP@Domain: a web resource of single nucleotide polymorphisms (SNPs) within protein domain structures and sequences. Nucleic Acids Res 2006;34:W642-4. [PMID: 16845090 PMCID: PMC1538855 DOI: 10.1093/nar/gkl323] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

For:	Han A, Kang HJ, Cho Y, Lee S, Kim YJ, Gong S. SNP@Domain: a web resource of single nucleotide polymorphisms (SNPs) within protein domain structures and sequences. Nucleic Acids Res 2006;34:W642-4. [PMID: 16845090 PMCID: PMC1538855 DOI: 10.1093/nar/gkl323] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

Number

Cited by Other Article(s)

Andrades R, Recamonde-Mendoza M. Machine learning methods for prediction of cancer driver genes: a survey paper. Brief Bioinform 2022;23:6551145. [PMID: 35323900 DOI: 10.1093/bib/bbac062] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2021] [Revised: 02/06/2022] [Accepted: 02/08/2022] [Indexed: 12/21/2022] Open

Prediction and prioritization of rare oncogenic mutations in the cancer Kinome using novel features and multiple classifiers. PLoS Comput Biol 2014;10:e1003545. [PMID: 24743239 PMCID: PMC3990476 DOI: 10.1371/journal.pcbi.1003545] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2013] [Accepted: 02/18/2014] [Indexed: 01/18/2023] Open

Abstract

Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays.

Cancer progresses by accumulation of mutations in a subset of genes that confer growth advantage. The 518 protein kinase genes encoded in the human genome, collectively called the kinome, represent one of the largest families of oncogenes. Targeted sequencing studies of many different cancers have shown that the mutational landscape comprises both cancer-causing “driver” mutations and harmless “passenger” mutations. While the frequent recurrence of some driver mutations in human cancers helps distinguish them from the large number of passenger mutations, a significant challenge is to identify the rare “driver” mutations that are less frequently observed in patient samples and yet are causative. Here we combine computational and experimental approaches to identify rare cancer-associated mutations in Epidermal Growth Factor receptor kinase (EGFR), a signaling protein frequently mutated in cancers. Specifically, we evaluate a novel multiple-classifier approach and features specific to the protein kinase super-family in distinguishing known cancer-associated mutations from benign mutations. We then apply the multiple classifier to identify and test the functional impact of rare cancer-associated mutations in EGFR. We report, for the first time, that the EGFR mutations T725M and L861R, which are infrequently observed in cancers, constitutively activate EGFR in a manner analogous to the frequently observed driver mutations.

Collapse

Computational Approaches and Resources in Single Amino Acid Substitutions Analysis Toward Clinical Research. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2014;94:365-423. [DOI: 10.1016/b978-0-12-800168-4.00010-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]

Mele A, Cervantes JR, Chien V, Friedman D, Ferran C. Single nucleotide polymorphisms at the TNFAIP3/A20 locus and susceptibility/resistance to inflammatory and autoimmune diseases. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2014;809:163-83. [PMID: 25302371 DOI: 10.1007/978-1-4939-0398-6_10] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Peterson TA, Doughty E, Kann MG. Towards precision medicine: advances in computational approaches for the analysis of human variants. J Mol Biol 2013;425:4047-63. [PMID: 23962656 PMCID: PMC3807015 DOI: 10.1016/j.jmb.2013.08.008] [Citation(s) in RCA: 106] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2013] [Revised: 08/07/2013] [Accepted: 08/08/2013] [Indexed: 12/26/2022]

Yates CM, Sternberg MJE. Proteins and domains vary in their tolerance of non-synonymous single nucleotide polymorphisms (nsSNPs). J Mol Biol 2013;425:1274-86. [PMID: 23357174 DOI: 10.1016/j.jmb.2013.01.026] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2012] [Revised: 01/11/2013] [Accepted: 01/19/2013] [Indexed: 02/05/2023]

Gong S, Worth CL, Cheng TMK, Blundell TL. Meet Me Halfway: When Genomics Meets Structural Bioinformatics. J Cardiovasc Transl Res 2011;4:281-303. [DOI: 10.1007/s12265-011-9259-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/30/2010] [Accepted: 02/08/2011] [Indexed: 01/08/2023]

Li J, Duncan DT, Zhang B. CanProVar: a human cancer proteome variation database. Hum Mutat 2010;31:219-28. [PMID: 20052754 DOI: 10.1002/humu.21176] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

A survey of proteins encoded by non-synonymous single nucleotide polymorphisms reveals a significant fraction with altered stability and activity. Biochem J 2009;424:15-26. [DOI: 10.1042/bj20090723] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Kooloos WM, Wessels JA, van der Straaten T, Huizinga TW, Guchelaar HJ. Criteria for the selection of single nucleotide polymorphisms in pathway pharmacogenetics: TNF inhibitors as a case study. Drug Discov Today 2009;14:837-44. [DOI: 10.1016/j.drudis.2009.05.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2008] [Revised: 05/20/2009] [Accepted: 05/27/2009] [Indexed: 12/11/2022]

Moses AM, Durbin R. Inferring selection on amino acid preference in protein domains. Mol Biol Evol 2008;26:527-36. [PMID: 19095755 PMCID: PMC2716081 DOI: 10.1093/molbev/msn286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Yang JO, Hwang S, Oh J, Bhak J, Sohn TK. An integrated database-pipeline system for studying single nucleotide polymorphisms and diseases. BMC Bioinformatics 2008;9 Suppl 12:S19. [PMID: 19091018 PMCID: PMC2638159 DOI: 10.1186/1471-2105-9-s12-s19] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022] Open

Abstract

BACKGROUND

Studies on the relationship between disease and genetic variations such as single nucleotide polymorphisms (SNPs) are important. Genetic variations can cause disease by influencing important biological regulation processes. Despite the needs for analyzing SNP and disease correlation, most existing databases provide information only on functional variants at specific locations on the genome, or deal with only a few genes associated with disease. There is no combined resource to widely support gene-, SNP-, and disease-related information, and to capture relationships among such data. Therefore, we developed an integrated database-pipeline system for studying SNPs and diseases.

RESULTS

To implement the pipeline system for the integrated database, we first unified complicated and redundant disease terms and gene names using the Unified Medical Language System (UMLS) for classification and noun modification, and the HUGO Gene Nomenclature Committee (HGNC) and NCBI gene databases. Next, we collected and integrated representative databases for three categories of information. For genes and proteins, we examined the NCBI mRNA, UniProt, UCSC Table Track and MitoDat databases. For genetic variants we used the dbSNP, JSNP, ALFRED, and HGVbase databases. For disease, we employed OMIM, GAD, and HGMD databases. The database-pipeline system provides a disease thesaurus, including genes and SNPs associated with disease. The search results for these categories are available on the web page http://diseasome.kobic.re.kr/, and a genome browser is also available to highlight findings, as well as to permit the convenient review of potentially deleterious SNPs among genes strongly associated with specific diseases and clinical phenotypes.

CONCLUSION

Our system is designed to capture the relationships between SNPs associated with disease and disease-causing genes. The integrated database-pipeline provides a list of candidate genes and SNP markers for evaluation in both epidemiological and molecular biological approaches to diseases-gene association studies. Furthermore, researchers then can decide semi-automatically the data set for association studies while considering the relationships between genetic variation and diseases. The database can also be economical for disease-association studies, as well as to facilitate an understanding of the processes which cause disease. Currently, the database contains 14,674 SNP records and 109,715 gene records associated with human diseases and it is updated at regular intervals.

Collapse

Kim YU, Kim SH, Jin H, Park YK, Ji MH, Kim YJ. The Korean HapMap Project Website. Genomics Inform 2008. [DOI: 10.5808/gi.2008.6.2.091] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Kim BC, Kim WY, Park D, Chung WH, Shin KS, Bhak J. SNP@Promoter: a database of human SNPs (single nucleotide polymorphisms) within the putative promoter regions. BMC Bioinformatics 2008;9 Suppl 1:S2. [PMID: 18315851 PMCID: PMC2259403 DOI: 10.1186/1471-2105-9-s1-s2] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Uzun A, Leslin CM, Abyzov A, Ilyin V. Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways. Nucleic Acids Res 2007;35:W384-92. [PMID: 17537826 PMCID: PMC1933130 DOI: 10.1093/nar/gkm232] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Park J, Hwang S, Lee YS, Kim SC, Lee D. SNP@Ethnos: a database of ethnically variant single-nucleotide polymorphisms. Nucleic Acids Res 2006;35:D711-5. [PMID: 17135185 PMCID: PMC1747186 DOI: 10.1093/nar/gkl962] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open